date:20231108

Re: [PATCH 15/22] arch: vdso: consolidate gettime prototypes

2023-11-08 Thread Christophe Leroy

On 08/11/2023 at 20:37, Arnd Bergmann wrote:
> On Wed, Nov 8, 2023, at 19:31, Christophe Leroy wrote:
>> On 08/11/2023 at 13:58, Arnd Bergmann wrote:
> 
>> powerpc has functions doing more or less the same; they are called
>> __c_kernel_clock_gettime() and the like, with their prototypes sitting in
>> arch/powerpc/include/asm/vdso/gettimeofday.h
>>
>> Should those prototypes be moved to include/vdso/gettime.h too and
>> eventually renamed, or are they considered too powerpc-specific?
> 
> I don't actually know, my initial interpretation was that
> these function names are part of the user ABI for the vdso,
> but I never looked closely enough at how vdso works to
> be sure what the actual ABI is.
> 
> If __c_kernel_clock_gettime() etc are not part of the user-facing
> ABI, I think renaming them for consistency with the other
> architectures would be best.
> 

The user-facing ABI function is __kernel_clock_gettime(), defined in
arch/powerpc/kernel/vdso/gettimeofday.S; see man vdso.
There is no prototype defined for it anywhere. That goes undetected
because it is assembly. Should a prototype be added somewhere anyway?

__kernel_clock_gettime() sets up a stack frame, retrieves the address of
the datapage, then calls __c_kernel_clock_gettime(), which in turn calls
__cvdso_clock_gettime_data(), which is part of the generic CVDSO.
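
For reference, user space normally reaches this chain through the C library
rather than by calling __kernel_clock_gettime() directly. A minimal sketch,
assuming Linux with glibc; the helper names vdso_is_mapped() and
read_monotonic() are made up for illustration and are not part of any ABI:

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <sys/auxv.h>
#include <time.h>

/* Hypothetical helper: true if the kernel mapped a vDSO into this process. */
bool vdso_is_mapped(void)
{
	/* AT_SYSINFO_EHDR holds the address of the vDSO ELF header, or 0. */
	return getauxval(AT_SYSINFO_EHDR) != 0;
}

/*
 * Hypothetical helper: read CLOCK_MONOTONIC. Whether the call is served by
 * the vDSO or falls back to a real syscall is decided by the C library and
 * the kernel; the caller cannot tell from the return value.
 */
int read_monotonic(struct timespec *ts)
{
	return clock_gettime(CLOCK_MONOTONIC, ts);
}
```

The prototype question above only concerns the kernel build; nothing in
this sketch depends on it.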

Maybe that's too different from what other architectures do?

Christophe

[RFC PATCH 2/2] vc4: introduce DMA-BUF heap

2023-11-08 Thread Simon Ser

User-space sometimes needs to allocate scanout-capable memory for
GPU rendering purposes. On a vc4/v3d split render/display SoC, this
is achieved via DRM dumb buffers: the v3d user-space driver opens
the primary vc4 node, allocates a DRM dumb buffer there, exports it
as a DMA-BUF, imports it into the v3d render node, and renders to it.

However, DRM dumb buffers are only meant for CPU rendering; they are
not intended to be used for GPU rendering. Primary nodes should only
be used for mode-setting purposes, and other programs should not attempt
to open them. Moreover, opening the primary node is already broken on
some setups: systemd grants permission to open primary nodes to
physically logged-in users, but this breaks when the user is not
physically logged in (e.g. headless setup) and when the distribution
is using a different init (e.g. Alpine Linux uses openrc).

We need an alternate way for v3d to allocate scanout-capable memory.
Leverage DMA heaps for this purpose: expose a CMA heap to user-space.
Preliminary testing has been done with wlroots [1].
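
For illustration, this is roughly what the user-space side of such an
allocation looks like with the existing DMA heap uAPI. A minimal sketch,
assuming this RFC is applied so that the heap shows up as /dev/dma_heap/vc4
(the name is one of the open questions below); dma_heap_alloc_fd() is a
made-up helper name:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

/*
 * Hypothetical helper: allocate `len` bytes from the DMA heap at
 * `heap_path`. Returns a DMA-BUF file descriptor, or -1 with errno set.
 */
int dma_heap_alloc_fd(const char *heap_path, size_t len)
{
	struct dma_heap_allocation_data data;
	int heap_fd, ret, saved_errno;

	heap_fd = open(heap_path, O_RDONLY | O_CLOEXEC);
	if (heap_fd < 0)
		return -1;

	memset(&data, 0, sizeof(data));
	data.len = len;
	data.fd_flags = O_RDWR | O_CLOEXEC;

	ret = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data);

	/* close() may clobber errno, so keep the ioctl's value around. */
	saved_errno = errno;
	close(heap_fd);
	errno = saved_errno;

	if (ret < 0)
		return -1;
	return (int)data.fd;
}
```

The returned fd would then be imported into the v3d render node like any
other DMA-BUF, e.g. dma_heap_alloc_fd("/dev/dma_heap/vc4", size).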

This is still an RFC. Open questions:

- Does this approach make sense to y'all in general?
- What would be a good name for the heap? "vc4" is maybe a bit naive and
  not precise enough. Something with "cma"? Do we need to plan a naming
  scheme to accommodate multiple vc4 devices?
- Right now root privileges are necessary to open the heap. Should we
  allow everybody to open the heap by default (after all, user-space can
  already allocate arbitrary amounts of GPU memory)? Should we leave it
  up to user-space to solve this issue (via logind/seatd or a Wayland
  protocol or something else)?

TODO:

- Need to add !vc5 support.
- Need to extend the DMA heaps API to allow vc4 to unregister the heap
  on unload.

[1]: https://gitlab.freedesktop.org/wlroots/wlroots/-/merge_requests/4414

Signed-off-by: Simon Ser 
Cc: Maxime Ripard 
Cc: Daniel Vetter 
Cc: Daniel Stone 
Cc: Erico Nunes 
Cc: Iago Toral Quiroga 
Cc: Maíra Canal 
Cc: Thomas Zimmermann 
---
 drivers/gpu/drm/vc4/Makefile   |  2 ++
 drivers/gpu/drm/vc4/vc4_dma_heap.c | 51 ++
 drivers/gpu/drm/vc4/vc4_drv.c  |  6 
 drivers/gpu/drm/vc4/vc4_drv.h  |  5 +++
 4 files changed, 64 insertions(+)
 create mode 100644 drivers/gpu/drm/vc4/vc4_dma_heap.c

diff --git a/drivers/gpu/drm/vc4/Makefile b/drivers/gpu/drm/vc4/Makefile
index c41f89a15a55..e4048870cec7 100644
--- a/drivers/gpu/drm/vc4/Makefile
+++ b/drivers/gpu/drm/vc4/Makefile
@@ -34,4 +34,6 @@ vc4-$(CONFIG_DRM_VC4_KUNIT_TEST) += \
 
 vc4-$(CONFIG_DEBUG_FS) += vc4_debugfs.o
 
+vc4-$(CONFIG_DMABUF_HEAPS) += vc4_dma_heap.o
+
 obj-$(CONFIG_DRM_VC4)  += vc4.o
diff --git a/drivers/gpu/drm/vc4/vc4_dma_heap.c b/drivers/gpu/drm/vc4/vc4_dma_heap.c
new file mode 100644
index ..010d0a88f3fa
--- /dev/null
+++ b/drivers/gpu/drm/vc4/vc4_dma_heap.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  Copyright © 2023 Simon Ser
+ */
+
+#include 
+#include 
+
+#include "vc4_drv.h"
+
+static struct dma_buf *vc4_dma_heap_allocate(struct dma_heap *heap,
+					     unsigned long size,
+					     unsigned long fd_flags,
+					     unsigned long heap_flags)
+{
+	struct vc4_dev *vc4 = dma_heap_get_drvdata(heap);
+	//DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+	struct drm_gem_dma_object *dma_obj;
+	struct dma_buf *dmabuf;
+
+	if (WARN_ON_ONCE(!vc4->is_vc5))
+		return ERR_PTR(-ENODEV);
+
+	dma_obj = drm_gem_dma_create(&vc4->base, size);
+	if (IS_ERR(dma_obj))
+		return ERR_CAST(dma_obj);
+
+	dmabuf = drm_gem_prime_export(&dma_obj->base, fd_flags);
+	drm_gem_object_put(&dma_obj->base);
+	return dmabuf;
+}
+
+static const struct dma_heap_ops vc4_dma_heap_ops = {
+	.allocate = vc4_dma_heap_allocate,
+};
+
+int vc4_dma_heap_create(struct vc4_dev *vc4)
+{
+	struct dma_heap_export_info exp_info;
+	struct dma_heap *heap;
+
+	exp_info.name = "vc4"; /* TODO: allow multiple? */
+	exp_info.ops = &vc4_dma_heap_ops;
+	exp_info.priv = vc4; /* TODO: unregister when unloading */
+
+	heap = dma_heap_add(&exp_info);
+	if (IS_ERR(heap))
+		return PTR_ERR(heap);
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index c133e96b8aca..c7297dd7d9d5 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -391,6 +391,12 @@ static int vc4_drm_bind(struct device *dev)
 
 	drm_fbdev_dma_setup(drm, 16);
 
+#if IS_ENABLED(CONFIG_DMABUF_HEAPS)
+	ret = vc4_dma_heap_create(vc4);
+	if (ret)
+		goto err;
+#endif
+
 	return 0;
 
 err:
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index ab61e96e7e14..d5c5dd18815c 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@

[RFC PATCH 1/2] dma-buf/dma-heap: export dma_heap_add and dma_heap_get_drvdata

2023-11-08 Thread Simon Ser

This is necessary to create DMA heaps in other modules
(e.g. graphics drivers).

Signed-off-by: Simon Ser 
Cc: Sumit Semwal 
Cc: Benjamin Gaignard 
Cc: Brian Starkey 
Cc: John Stultz 
Cc: "T.J. Mercier" 
---
 drivers/dma-buf/dma-heap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 84ae708fafe7..51cd58b775d8 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -203,6 +203,7 @@ void *dma_heap_get_drvdata(struct dma_heap *heap)
 {
 	return heap->priv;
 }
+EXPORT_SYMBOL_GPL(dma_heap_get_drvdata);
 
 /**
  * dma_heap_get_name() - get heap name
@@ -300,6 +301,7 @@ struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info)
 	kfree(heap);
 	return err_ret;
 }
+EXPORT_SYMBOL_GPL(dma_heap_add);
 
 static char *dma_heap_devnode(const struct device *dev, umode_t *mode)
 {
-- 
2.42.0

[PATCH v2 05/11] drm/tidss: Return error value from softreset

2023-11-08 Thread Tomi Valkeinen

Return an error value from dispc_softreset() so that the caller can
handle the errors.

Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_dispc.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index deb665d0ede8..50b475b285d9 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -2702,7 +2702,7 @@ static void dispc_init_errata(struct dispc_device *dispc)
 	}
 }
 
-static void dispc_softreset(struct dispc_device *dispc)
+static int dispc_softreset(struct dispc_device *dispc)
 {
 	u32 val;
 	int ret;
@@ -2712,8 +2712,12 @@ static void dispc_softreset(struct dispc_device *dispc)
 	/* Wait for reset to complete */
 	ret = readl_poll_timeout(dispc->base_common + DSS_SYSSTATUS,
 				 val, val & 1, 100, 5000);
-	if (ret)
-		dev_warn(dispc->dev, "failed to reset dispc\n");
+	if (ret) {
+		dev_err(dispc->dev, "failed to reset dispc\n");
+		return ret;
+	}
+
+	return 0;
 }
 
 int dispc_init(struct tidss_device *tidss)
@@ -2826,8 +2830,11 @@ int dispc_init(struct tidss_device *tidss)
 			     &dispc->memory_bandwidth_limit);
 
 	/* K2G display controller does not support soft reset */
-	if (feat->subrev != DISPC_K2G)
-		dispc_softreset(dispc);
+	if (feat->subrev != DISPC_K2G) {
+		r = dispc_softreset(dispc);
+		if (r)
+			return r;
+	}
 
 	tidss->dispc = dispc;
 

-- 
2.34.1

[PATCH v2 08/11] drm/tidss: Fix dss reset

2023-11-08 Thread Tomi Valkeinen

The probe function calls dispc_softreset() before runtime PM is enabled
and without enabling any of the DSS clocks. This happens to work by
luck, and we need to make sure the DSS HW is active and the fclk is
enabled.

To fix the above, add a new function, dispc_init_hw(), which does:

- pm_runtime_set_active()
- clk_prepare_enable(fclk)
- dispc_softreset().

This ensures that the reset can be successfully accomplished.
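
The ordering can be modelled outside the kernel. The following is only a
sketch: the struct, the enum and model_init_hw() are invented stand-ins that
log which step ran, not driver code. One deliberate choice of the sketch is
that the unwind path hands back the first error it saw rather than the
result of the final "set suspended" step.

```c
#include <stddef.h>

/* Invented labels for the steps described above. */
enum step { SET_ACTIVE, CLK_ENABLE, SOFTRESET, CLK_DISABLE, SET_SUSPENDED };

struct hw_model {
	enum step log[8];	/* order in which the steps were attempted */
	size_t n;
	int clk_err;		/* error injected into clock enable, 0 for none */
	int reset_err;		/* error injected into soft reset, 0 for none */
};

static void record(struct hw_model *hw, enum step s)
{
	if (hw->n < sizeof(hw->log) / sizeof(hw->log[0]))
		hw->log[hw->n++] = s;
}

int model_init_hw(struct hw_model *hw)
{
	int ret;

	record(hw, SET_ACTIVE);		/* stands in for pm_runtime_set_active() */

	record(hw, CLK_ENABLE);		/* stands in for clk_prepare_enable(fclk) */
	ret = hw->clk_err;
	if (ret)
		goto err_runtime_suspend;

	record(hw, SOFTRESET);		/* stands in for dispc_softreset() */
	ret = hw->reset_err;

	/* Success or failure, the clock is released after the reset attempt. */
	record(hw, CLK_DISABLE);

err_runtime_suspend:
	record(hw, SET_SUSPENDED);	/* device goes back to "suspended" */
	return ret;
}
```

With no injected errors the log reads set-active, clock-enable, reset,
clock-disable, set-suspended; a failing clock enable skips straight to
set-suspended.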

Note that we use pm_runtime_set_active(), not the normal
pm_runtime_get(). The reason for this is that at this point we haven't
enabled the runtime PM yet and also we don't want the normal resume
callback to be called: the dispc resume callback does some initial HW
setup, and it expects that the HW was off (no video ports are
streaming). If the bootloader has enabled the DSS and has set up a
boot time splash-screen, the DSS would be enabled and streaming which
might lead to issues with the normal resume callback.

Fixes: c9b2d923befd ("drm/tidss: Soft Reset DISPC on startup")
Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_dispc.c | 45 -
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index a5c0b72596e8..8d6ac618c539 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -2738,6 +2738,49 @@ static int dispc_softreset(struct dispc_device *dispc)
 	return 0;
 }
 
+static int dispc_init_hw(struct dispc_device *dispc)
+{
+	struct device *dev = dispc->dev;
+	int ret;
+
+	ret = pm_runtime_set_active(dev);
+	if (ret) {
+		dev_err(dev, "Failed to set DSS PM to active\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(dispc->fclk);
+	if (ret) {
+		dev_err(dev, "Failed to enable DSS fclk\n");
+		goto err_runtime_suspend;
+	}
+
+	ret = dispc_softreset(dispc);
+	if (ret)
+		goto err_clk_disable;
+
+	clk_disable_unprepare(dispc->fclk);
+	ret = pm_runtime_set_suspended(dev);
+	if (ret) {
+		dev_err(dev, "Failed to set DSS PM to suspended\n");
+		return ret;
+	}
+
+	return 0;
+
+err_clk_disable:
+	clk_disable_unprepare(dispc->fclk);
+
+err_runtime_suspend:
+	ret = pm_runtime_set_suspended(dev);
+	if (ret) {
+		dev_err(dev, "Failed to set DSS PM to suspended\n");
+		return ret;
+	}
+
+	return ret;
+}
+
 int dispc_init(struct tidss_device *tidss)
 {
 	struct device *dev = tidss->dev;
@@ -2847,7 +2890,7 @@ int dispc_init(struct tidss_device *tidss)
 	of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
 			     &dispc->memory_bandwidth_limit);
 
-	r = dispc_softreset(dispc);
+	r = dispc_init_hw(dispc);
 	if (r)
 		return r;
 

-- 
2.34.1

[PATCH v2 10/11] drm/tidss: Fix atomic_flush check

2023-11-08 Thread Tomi Valkeinen

tidss_crtc_atomic_flush() checks if the crtc is enabled, and if not,
returns immediately as there's no reason to do any register changes.

However, the code checks for 'crtc->state->enable', which does not
reflect the actual HW state. We should instead look at the
'crtc->state->active' flag.

This causes the tidss_crtc_atomic_flush() to proceed with the flush even
if the active state is false, which then causes us to hit the
WARN_ON(!crtc->state->event) check.

Fix this by checking the active flag, and while at it, fix the related
debug print which had "active" and "needs modeset" the wrong way around.

Cc: sta...@vger.kernel.org
Fixes: 32a1795f57ee ("drm/tidss: New driver for TI Keystone platform Display SubSystem")
Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_crtc.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c b/drivers/gpu/drm/tidss/tidss_crtc.c
index 5e5e466f35d1..7c78c074e3a2 100644
--- a/drivers/gpu/drm/tidss/tidss_crtc.c
+++ b/drivers/gpu/drm/tidss/tidss_crtc.c
@@ -169,13 +169,13 @@ static void tidss_crtc_atomic_flush(struct drm_crtc *crtc,
 	struct tidss_device *tidss = to_tidss(ddev);
 	unsigned long flags;
 
-	dev_dbg(ddev->dev,
-		"%s: %s enabled %d, needs modeset %d, event %p\n", __func__,
-		crtc->name, drm_atomic_crtc_needs_modeset(crtc->state),
-		crtc->state->enable, crtc->state->event);
+	dev_dbg(ddev->dev, "%s: %s is %sactive, %s modeset, event %p\n",
+		__func__, crtc->name, crtc->state->active ? "" : "not ",
+		drm_atomic_crtc_needs_modeset(crtc->state) ? "needs" : "doesn't need",
+		crtc->state->event);
 
 	/* There is nothing to do if CRTC is not going to be enabled. */
-	if (!crtc->state->enable)
+	if (!crtc->state->active)
 		return;
 
 	/*

-- 
2.34.1

[PATCH v2 11/11] drm/tidss: Use DRM_PLANE_COMMIT_ACTIVE_ONLY

2023-11-08 Thread Tomi Valkeinen

At the moment the driver does not use DRM_PLANE_COMMIT_ACTIVE_ONLY, but
still checks for crtc->state->active in tidss_crtc_atomic_flush(), and
skips the flush if the crtc is not active.

The exact reason why DRM_PLANE_COMMIT_ACTIVE_ONLY is not used has been
lost in history. DRM_PLANE_COMMIT_ACTIVE_ONLY does also affect the plane
updates, and I think the issue was related to multi-display systems and
moving planes between the displays. However, it is possible the issue
was only present on the older DSS hardware, handled by the omapdrm
driver (on which the tidss driver is loosely based).

Reviewing the code related to DRM_PLANE_COMMIT_ACTIVE_ONLY does not show
any issues, and testing on J7 EVM with two displays works fine.

Change the driver to use DRM_PLANE_COMMIT_ACTIVE_ONLY.

Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_crtc.c | 4 
 drivers/gpu/drm/tidss/tidss_kms.c  | 2 +-
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c b/drivers/gpu/drm/tidss/tidss_crtc.c
index 7c78c074e3a2..5f838980c7a1 100644
--- a/drivers/gpu/drm/tidss/tidss_crtc.c
+++ b/drivers/gpu/drm/tidss/tidss_crtc.c
@@ -174,10 +174,6 @@ static void tidss_crtc_atomic_flush(struct drm_crtc *crtc,
 		drm_atomic_crtc_needs_modeset(crtc->state) ? "needs" : "doesn't need",
 		crtc->state->event);
 
-	/* There is nothing to do if CRTC is not going to be enabled. */
-	if (!crtc->state->active)
-		return;
-
 	/*
 	 * Flush CRTC changes with go bit only if new modeset is not
 	 * coming, so CRTC is enabled trough out the commit.
diff --git a/drivers/gpu/drm/tidss/tidss_kms.c b/drivers/gpu/drm/tidss/tidss_kms.c
index d096d8d2bc8f..a0e494c806a9 100644
--- a/drivers/gpu/drm/tidss/tidss_kms.c
+++ b/drivers/gpu/drm/tidss/tidss_kms.c
@@ -29,7 +29,7 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
 	tidss_runtime_get(tidss);
 
 	drm_atomic_helper_commit_modeset_disables(ddev, old_state);
-	drm_atomic_helper_commit_planes(ddev, old_state, 0);
+	drm_atomic_helper_commit_planes(ddev, old_state, DRM_PLANE_COMMIT_ACTIVE_ONLY);
 	drm_atomic_helper_commit_modeset_enables(ddev, old_state);
 
 	drm_atomic_helper_commit_hw_done(old_state);

-- 
2.34.1

[PATCH v2 00/11] drm/tidss: Probe related fixes and cleanups

2023-11-08 Thread Tomi Valkeinen

While working on the TI BSP kernel, adding bootloader splash-screen
support, I noticed some issues with the driver and opportunities for
cleanups and improvements.

 Tomi

Signed-off-by: Tomi Valkeinen 
---
Changes in v2:
- Add missing pm_runtime_dont_use_autosuspend() in error path
- Add simple manual "reset" for K2G
- Leave tidss->dispc NULL if dispc_init fails
- Add Fixes tags
- Drop "drm/tidss: Add dispc_is_idle()"
- Add "drm/tidss: Use DRM_PLANE_COMMIT_ACTIVE_ONLY"
- Link to v1: https://lore.kernel.org/r/20231101-tidss-probe-v1-0-45149e0f9...@ideasonboard.com

---
Tomi Valkeinen (11):
  drm/tidss: Use pm_runtime_resume_and_get()
  drm/tidss: Use PM autosuspend
  drm/tidss: Drop useless variable init
  drm/tidss: Move reset to the end of dispc_init()
  drm/tidss: Return error value from softreset
  drm/tidss: Check for K2G in dispc_softreset()
  drm/tidss: Add simple K2G manual reset
  drm/tidss: Fix dss reset
  drm/tidss: IRQ code cleanup
  drm/tidss: Fix atomic_flush check
  drm/tidss: Use DRM_PLANE_COMMIT_ACTIVE_ONLY

 drivers/gpu/drm/tidss/tidss_crtc.c  | 12 ++
 drivers/gpu/drm/tidss/tidss_dispc.c | 79 +
 drivers/gpu/drm/tidss/tidss_drv.c   | 15 +--
 drivers/gpu/drm/tidss/tidss_irq.c   | 54 -
 drivers/gpu/drm/tidss/tidss_kms.c   |  2 +-
 5 files changed, 97 insertions(+), 65 deletions(-)
---
base-commit: 9d7c8c066916f231ca0ed4e4fce6c4b58ca3e451
change-id: 20231030-tidss-probe-854b1098c3af

Best regards,
-- 
Tomi Valkeinen

[PATCH v2 09/11] drm/tidss: IRQ code cleanup

2023-11-08 Thread Tomi Valkeinen

The IRQ setup code is overly complex. All we really need to do is
initialize the related fields in struct tidss_device, and request the
IRQ.

We can drop all the HW accesses, as they are pointless: the driver will
set the IRQs correctly when it needs any of the IRQs, and at probe time
we have done a reset, so we know that all the IRQs are masked by default
in the hardware.

Thus we can combine the tidss_irq_preinstall() and
tidss_irq_postinstall() into the tidss_irq_install() function, drop the
HW accesses, and drop the use of spinlock, as this is done at init time
and there can be no races.

We can also drop the HW access from the tidss_irq_uninstall(), as the
driver will anyway disable and suspend the hardware at remove time.

Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_drv.c |  2 ++
 drivers/gpu/drm/tidss/tidss_irq.c | 54 ++-
 2 files changed, 10 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c
index f51c87e26e10..490d15ed2216 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -138,6 +138,8 @@ static int tidss_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, tidss);
 
+	spin_lock_init(&tidss->wait_lock);
+
 	ret = dispc_init(tidss);
 	if (ret) {
 		dev_err(dev, "failed to initialize dispc: %d\n", ret);
diff --git a/drivers/gpu/drm/tidss/tidss_irq.c b/drivers/gpu/drm/tidss/tidss_irq.c
index 0c681c7600bc..604334ef526a 100644
--- a/drivers/gpu/drm/tidss/tidss_irq.c
+++ b/drivers/gpu/drm/tidss/tidss_irq.c
@@ -93,33 +93,21 @@ void tidss_irq_resume(struct tidss_device *tidss)
 	spin_unlock_irqrestore(&tidss->wait_lock, flags);
 }
 
-static void tidss_irq_preinstall(struct drm_device *ddev)
-{
-	struct tidss_device *tidss = to_tidss(ddev);
-
-	spin_lock_init(&tidss->wait_lock);
-
-	tidss_runtime_get(tidss);
-
-	dispc_set_irqenable(tidss->dispc, 0);
-	dispc_read_and_clear_irqstatus(tidss->dispc);
-
-	tidss_runtime_put(tidss);
-}
-
-static void tidss_irq_postinstall(struct drm_device *ddev)
+int tidss_irq_install(struct drm_device *ddev, unsigned int irq)
 {
 	struct tidss_device *tidss = to_tidss(ddev);
-	unsigned long flags;
-	unsigned int i;
+	int ret;
 
-	tidss_runtime_get(tidss);
+	if (irq == IRQ_NOTCONNECTED)
+		return -ENOTCONN;
 
-	spin_lock_irqsave(&tidss->wait_lock, flags);
+	ret = request_irq(irq, tidss_irq_handler, 0, ddev->driver->name, ddev);
+	if (ret)
+		return ret;
 
 	tidss->irq_mask = DSS_IRQ_DEVICE_OCP_ERR;
 
-	for (i = 0; i < tidss->num_crtcs; ++i) {
+	for (unsigned int i = 0; i < tidss->num_crtcs; ++i) {
 		struct tidss_crtc *tcrtc = to_tidss_crtc(tidss->crtcs[i]);
 
 		tidss->irq_mask |= DSS_IRQ_VP_SYNC_LOST(tcrtc->hw_videoport);
@@ -127,28 +115,6 @@ static void tidss_irq_postinstall(struct drm_device *ddev)
 		tidss->irq_mask |= DSS_IRQ_VP_FRAME_DONE(tcrtc->hw_videoport);
 	}
 
-	tidss_irq_update(tidss);
-
-	spin_unlock_irqrestore(&tidss->wait_lock, flags);
-
-	tidss_runtime_put(tidss);
-}
-
-int tidss_irq_install(struct drm_device *ddev, unsigned int irq)
-{
-	int ret;
-
-	if (irq == IRQ_NOTCONNECTED)
-		return -ENOTCONN;
-
-	tidss_irq_preinstall(ddev);
-
-	ret = request_irq(irq, tidss_irq_handler, 0, ddev->driver->name, ddev);
-	if (ret)
-		return ret;
-
-	tidss_irq_postinstall(ddev);
-
 	return 0;
 }
 
@@ -156,9 +122,5 @@ void tidss_irq_uninstall(struct drm_device *ddev)
 {
 	struct tidss_device *tidss = to_tidss(ddev);
 
-	tidss_runtime_get(tidss);
-	dispc_set_irqenable(tidss->dispc, 0);
-	tidss_runtime_put(tidss);
-
 	free_irq(tidss->irq, ddev);
 }

-- 
2.34.1

[PATCH v2 07/11] drm/tidss: Add simple K2G manual reset

2023-11-08 Thread Tomi Valkeinen

K2G display controller does not support soft reset, but we can do the
most important steps manually: mask the IRQs and disable the VPs.

Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_dispc.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index c82fd5bc9be4..a5c0b72596e8 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -2702,14 +2702,28 @@ static void dispc_init_errata(struct dispc_device *dispc)
 	}
 }
 
+/*
+ * K2G display controller does not support soft reset, so we do a basic manual
+ * reset here: make sure the IRQs are masked and VPs are disabled.
+ */
+static void dispc_softreset_k2g(struct dispc_device *dispc)
+{
+	dispc_set_irqenable(dispc, 0);
+	dispc_read_and_clear_irqstatus(dispc);
+
+	for (unsigned int vp_idx = 0; vp_idx < dispc->feat->num_vps; ++vp_idx)
+		VP_REG_FLD_MOD(dispc, vp_idx, DISPC_VP_CONTROL, 0, 0, 0);
+}
+
 static int dispc_softreset(struct dispc_device *dispc)
 {
 	u32 val;
 	int ret;
 
-	/* K2G display controller does not support soft reset */
-	if (dispc->feat->subrev == DISPC_K2G)
+	if (dispc->feat->subrev == DISPC_K2G) {
+		dispc_softreset_k2g(dispc);
 		return 0;
+	}
 
 	/* Soft reset */
 	REG_FLD_MOD(dispc, DSS_SYSCONFIG, 1, 1, 1);

-- 
2.34.1

[PATCH v2 06/11] drm/tidss: Check for K2G in dispc_softreset()

2023-11-08 Thread Tomi Valkeinen

K2G doesn't have softreset feature. Instead of having every caller of
dispc_softreset() check for K2G, move the check into dispc_softreset(),
and make dispc_softreset() return 0 in case of K2G.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_dispc.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index 50b475b285d9..c82fd5bc9be4 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -2707,6 +2707,10 @@ static int dispc_softreset(struct dispc_device *dispc)
 	u32 val;
 	int ret;
 
+	/* K2G display controller does not support soft reset */
+	if (dispc->feat->subrev == DISPC_K2G)
+		return 0;
+
 	/* Soft reset */
 	REG_FLD_MOD(dispc, DSS_SYSCONFIG, 1, 1, 1);
 	/* Wait for reset to complete */
@@ -2829,12 +2833,9 @@ int dispc_init(struct tidss_device *tidss)
 	of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
 			     &dispc->memory_bandwidth_limit);
 
-	/* K2G display controller does not support soft reset */
-	if (feat->subrev != DISPC_K2G) {
-		r = dispc_softreset(dispc);
-		if (r)
-			return r;
-	}
+	r = dispc_softreset(dispc);
+	if (r)
+		return r;
 
 	tidss->dispc = dispc;
 

-- 
2.34.1

[PATCH v2 02/11] drm/tidss: Use PM autosuspend

2023-11-08 Thread Tomi Valkeinen

Use runtime PM autosuspend feature, with 1s timeout, to avoid
unnecessary suspend-resume cycles when, e.g. the userspace temporarily
turns off the crtcs when configuring the outputs.

Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_drv.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c
index f403db11b846..f51c87e26e10 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -43,7 +43,9 @@ void tidss_runtime_put(struct tidss_device *tidss)
 
 	dev_dbg(tidss->dev, "%s\n", __func__);
 
-	r = pm_runtime_put_sync(tidss->dev);
+	pm_runtime_mark_last_busy(tidss->dev);
+
+	r = pm_runtime_put_autosuspend(tidss->dev);
 	WARN_ON(r < 0);
 }
 
@@ -144,6 +146,9 @@ static int tidss_probe(struct platform_device *pdev)
 
 	pm_runtime_enable(dev);
 
+	pm_runtime_set_autosuspend_delay(dev, 1000);
+	pm_runtime_use_autosuspend(dev);
+
 #ifndef CONFIG_PM
 	/* If we don't have PM, we need to call resume manually */
 	dispc_runtime_resume(tidss->dispc);
@@ -192,6 +197,7 @@ static int tidss_probe(struct platform_device *pdev)
 #ifndef CONFIG_PM
 	dispc_runtime_suspend(tidss->dispc);
 #endif
+	pm_runtime_dont_use_autosuspend(dev);
 	pm_runtime_disable(dev);
 
 	return ret;
@@ -215,6 +221,7 @@ static void tidss_remove(struct platform_device *pdev)
 	/* If we don't have PM, we need to call suspend manually */
 	dispc_runtime_suspend(tidss->dispc);
 #endif
+	pm_runtime_dont_use_autosuspend(dev);
 	pm_runtime_disable(dev);
 
 	/* devm allocated dispc goes away with the dev so mark it NULL */

-- 
2.34.1

[PATCH v2 04/11] drm/tidss: Move reset to the end of dispc_init()

2023-11-08 Thread Tomi Valkeinen

We do a DSS reset in the middle of dispc_init(). While that happens
to work now, we should really make sure that e.g. the fclk, which is
acquired only later in the function, is enabled when doing a reset. This
will be handled in a later patch, but for now, let's move the
dispc_softreset() call to the end of dispc_init(), which is a sensible
place for it anyway.

Reviewed-by: Laurent Pinchart 
Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_dispc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index ad7999434299..deb665d0ede8 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -2777,10 +2777,6 @@ int dispc_init(struct tidss_device *tidss)
 		return r;
 	}
 
-	/* K2G display controller does not support soft reset */
-	if (feat->subrev != DISPC_K2G)
-		dispc_softreset(dispc);
-
 	for (i = 0; i < dispc->feat->num_vps; i++) {
 		u32 gamma_size = dispc->feat->vp_feat.color.gamma_size;
 		u32 *gamma_table;
@@ -2829,6 +2825,10 @@ int dispc_init(struct tidss_device *tidss)
 	of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
 			     &dispc->memory_bandwidth_limit);
 
+	/* K2G display controller does not support soft reset */
+	if (feat->subrev != DISPC_K2G)
+		dispc_softreset(dispc);
+
 	tidss->dispc = dispc;
 
 	return 0;

-- 
2.34.1

[PATCH v2 03/11] drm/tidss: Drop useless variable init

2023-11-08 Thread Tomi Valkeinen

No need to initialize the ret to 0 in dispc_softreset().

Reviewed-by: Laurent Pinchart 
Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_dispc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index 9d9dee7abaef..ad7999434299 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -2705,7 +2705,7 @@ static void dispc_init_errata(struct dispc_device *dispc)
 static void dispc_softreset(struct dispc_device *dispc)
 {
 	u32 val;
-	int ret = 0;
+	int ret;
 
 	/* Soft reset */
 	REG_FLD_MOD(dispc, DSS_SYSCONFIG, 1, 1, 1);

-- 
2.34.1

[PATCH v2 01/11] drm/tidss: Use pm_runtime_resume_and_get()

2023-11-08 Thread Tomi Valkeinen

Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync(), which
will handle error situations better. Also fix the return, as there
should be no reason for the current complex return.
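
The difference between the two calls can be shown with a small user-space
model. This is only a sketch of the documented behaviour, not kernel code:
struct pm_model and both model_*() functions are invented for illustration.
The point is that pm_runtime_get_sync() keeps its usage-count reference even
on failure and may return 1 when the device is already active, while
pm_runtime_resume_and_get() drops the reference on failure and returns 0 on
any success, which is why the return value can be passed through unchanged.

```c
/* Invented model of the runtime-PM state that matters here. */
struct pm_model {
	int usage_count;
	int active;	/* non-zero once the device has been resumed */
	int resume_err;	/* error injected into the resume step, 0 for none */
};

/* Models pm_runtime_get_sync(): always takes a reference, even on error. */
int model_get_sync(struct pm_model *pm)
{
	pm->usage_count++;
	if (pm->active)
		return 1;	/* already active, nothing to resume */
	if (pm->resume_err)
		return pm->resume_err;
	pm->active = 1;
	return 0;
}

/* Models pm_runtime_resume_and_get(): undo the reference on failure. */
int model_resume_and_get(struct pm_model *pm)
{
	int ret = model_get_sync(pm);

	if (ret < 0) {
		pm->usage_count--;
		return ret;
	}
	return 0;	/* both 0 and 1 from the inner call count as success */
}
```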

Reviewed-by: Laurent Pinchart 
Signed-off-by: Tomi Valkeinen 
---
 drivers/gpu/drm/tidss/tidss_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c
index 4d063eb9cd0b..f403db11b846 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -32,9 +32,9 @@ int tidss_runtime_get(struct tidss_device *tidss)
 
 	dev_dbg(tidss->dev, "%s\n", __func__);
 
-	r = pm_runtime_get_sync(tidss->dev);
+	r = pm_runtime_resume_and_get(tidss->dev);
 	WARN_ON(r < 0);
-	return r < 0 ? r : 0;
+	return r;
 }
 
 void tidss_runtime_put(struct tidss_device *tidss)

-- 
2.34.1

Re: [PATCH] drm/sched: Don't disturb the entity when in RR-mode scheduling

2023-11-08 Thread Luben Tuikov

Hi,

On 2023-11-07 19:41, Danilo Krummrich wrote:
> On 11/7/23 05:10, Luben Tuikov wrote:
>> Don't call drm_sched_select_entity() in drm_sched_run_job_queue().  In fact,
>> rename __drm_sched_run_job_queue() to just drm_sched_run_job_queue(), and let
>> it do just that, schedule the work item for execution.
>>
>> The problem is that drm_sched_run_job_queue() calls drm_sched_select_entity()
>> to determine if the scheduler has an entity ready in one of its run-queues,
>> and in the case of the Round-Robin (RR) scheduling, the function
>> drm_sched_rq_select_entity_rr() does just that, selects the _next_ entity
>> which is ready, sets up the run-queue and completion and returns that
>> entity. The FIFO scheduling algorithm is unaffected.
>>
>> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(), then
>> in the case of RR scheduling, that would result in drm_sched_select_entity()
>> having been called twice, which may result in skipping a ready entity if more
>> than one entity is ready. This commit fixes this by eliminating the call to
>> drm_sched_select_entity() from drm_sched_run_job_queue(), and leaves it only
>> in drm_sched_run_job_work().
>>
>> v2: Rebased on top of Tvrtko's renames series of patches. (Luben)
>>  Add fixes-tag. (Tvrtko)
>>
>> Signed-off-by: Luben Tuikov 
>> Fixes: f7fe64ad0f22ff ("drm/sched: Split free_job into own work item")
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 16 +++-
>>   1 file changed, 3 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 27843e37d9b769..cd0dc3f81d05f0 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -256,10 +256,10 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>   }
>>   
>>   /**
>> - * __drm_sched_run_job_queue - enqueue run-job work
>> + * drm_sched_run_job_queue - enqueue run-job work
>>    * @sched: scheduler instance
>>    */
>> -static void __drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
>> +static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
>>   {
>>   	if (!READ_ONCE(sched->pause_submit))
>>   		queue_work(sched->submit_wq, &sched->work_run_job);
>> @@ -928,7 +928,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched)
>>   void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>>   {
>>   	if (drm_sched_can_queue(sched))
>> -		__drm_sched_run_job_queue(sched);
>> +		drm_sched_run_job_queue(sched);
>>   }
>>   
>>   /**
>> @@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>>   }
>>   EXPORT_SYMBOL(drm_sched_pick_best);
>>   
>> -/**
>> - * drm_sched_run_job_queue - enqueue run-job work if there are ready entities
>> - * @sched: scheduler instance
>> - */
>> -static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
>> -{
>> -	if (drm_sched_select_entity(sched))
> 
> Hm, now that I rebase my patch to implement dynamic job-flow control I
> recognize that we probably need the peek semantics here. If we do not select
> an entity here, we also do not check whether the corresponding job fits on
> the ring.
> 
> Alternatively, we simply can't do this check in drm_sched_wakeup(). The
> consequence would be that we don't detect that we need to wait for credits
> to free up before the run work is already executing and the run work selects
> an entity.

So I rebased v5 on top of the latest drm-misc-next, and looked around and found 
out that
drm_sched_wakeup() is missing drm_sched_entity_is_ready(). It should look like 
the following,

void drm_sched_wakeup(struct drm_gpu_scheduler *sched,
		      struct drm_sched_entity *entity)
{
	if (drm_sched_entity_is_ready(entity))
		if (drm_sched_can_queue(sched, entity))
			drm_sched_run_job_queue(sched);
}

See the attached patch. (Currently running with base-commit and the attached 
patch.)
-- 
Regards,
Luben
From 65b8b8be52e8c112d7350397cb54b4fb3470b008 Mon Sep 17 00:00:00 2001
From: Danilo Krummrich 
Date: Thu, 2 Nov 2023 01:10:34 +0100
Subject: [PATCH] drm/sched: implement dynamic job-flow control

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring running dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Uwe Kleine-König

Hello,

On Wed, Nov 08, 2023 at 10:57:00PM +0100, Helge Deller wrote:
> On 11/8/23 22:52, Uwe Kleine-König wrote:
> > But info and so info->fix live longer than the probe function, don't
> > they?
> 
> Yes, they do.
> But AFAICS info->fix contains a *copy* of the initial atmel_lcdfb_fix struct
> (and not a pointer to it). So that should be ok.

If you say so that's good. I grepped a bit around and didn't find a
place where a copy is made. But that's probably me and I'll consider the
case closed.

Thanks
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |



Re: [PATCH] drm/sched: fix potential page fault in drm_sched_job_init()

2023-11-08 Thread Luben Tuikov

On 2023-11-08 19:09, Danilo Krummrich wrote:
> On 11/8/23 06:46, Luben Tuikov wrote:
>> Hi,
>>
>> Could you please use my gmail address, the one I'm responding from--I 
>> don't want
>> to miss any DRM scheduler patches. BTW, the luben.tui...@amd.com email 
>> should bounce
>> as undeliverable.
>>
>> On 2023-11-07 21:26, Danilo Krummrich wrote:
>>> Commit 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable
>>> number of run-queues") introduces drm_err() in drm_sched_job_init(), in
>>> order to indicate that the given entity has no runq, however at this
>>> time job->sched is not yet set, likely to be NULL initialized, and hence
>>> shouldn't be used.
>>>
>>> Replace the corresponding drm_err() call with pr_err() to avoid a
>>> potential page fault.
>>>
>>> While at it, extend the documentation of drm_sched_job_init() to
>>> indicate that job->sched is not a valid pointer until
>>> drm_sched_job_arm() has been called.
>>>
>>> Fixes: 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable 
>>> number of run-queues")
>>> Signed-off-by: Danilo Krummrich 
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 5 -
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 27843e37d9b7..dd28389f0ddd 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -680,6 +680,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
>>>* This function returns -ENOENT in this case (which probably should be 
>>> -EIO as
>>>* a more meanigful return value).
>>>*
>>> + * Note that job->sched is not a valid pointer until drm_sched_job_arm() 
>>> has
>>> + * been called.
>>> + *
>>
>> Good catch!
>>
>> Did you actually get this to page-fault and have a kernel log?
> 
> No, I just found it because I was about to make the same mistake.
> 
>>
>> I'm asking because we see it correctly set in this kernel log coming from 
>> AMD,
> 
> I think that's because amdgpu just sets job->sched to *some* scheduler 
> instance after
> job allocation [1].
> 
> [1] 
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L108
> 
>>
>> [   11.886024] amdgpu :0a:00.0: [drm] *ERROR* drm_sched_job_init: entity 
>> has no rq!
>>
>> in this email,
>> https://lore.kernel.org/r/CADnq5_PS64jYS_Y3kGW27m-kuWP+FQFiaVcOaZiB=jlsgpn...@mail.gmail.com
>>
>>>* Returns 0 for success, negative error code otherwise.
>>>*/
>>>   int drm_sched_job_init(struct drm_sched_job *job,
>>> @@ -691,7 +694,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>  * or worse--a blank screen--leave a trail in the
>>>  * logs, so this can be debugged easier.
>>>  */
>>> -   drm_err(job->sched, "%s: entity has no rq!\n", __func__);
>>> +   pr_err("%s: entity has no rq!\n", __func__);
>>
>> Is it feasible to do something like the following?
>>
>>  dev_err(job->sched ? job->sched->dev : NULL, "%s: entity has no 
>> rq!\n", __func__);
> 
> I don't think that's a good idea. Although I'd assume that every driver 
> zero-initializes its job
> structures, I can't see a rule enforcing that. Hence, job->sched can be a 
> random value until
> drm_sched_job_arm() is called.

Okay. However, when using pr_err() we're losing "[drm] *ERROR* " prefix and we 
scan for that
in the logs to quickly find the cause of the error.

Perhaps we can define pr_fmt() and also include "*ERROR*" so that we can get 
the desired result
as the attached patch shows?
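
To illustrate the idea, here is a minimal userspace sketch of how a pr_fmt()
definition plus an "*ERROR* " tag in the message could reproduce the
"[drm] *ERROR* " prefix. It is not the attached patch: demo_pr_err() and
drm_sched_job_init_demo() are hypothetical names, and snprintf() stands in
for the kernel's printk machinery so the result can be inspected.

```c
#include <stdio.h>

/* Stand-in for the kernel convention: pr_fmt() wraps every format string.
 * In the kernel it has to be defined before <linux/printk.h> is included. */
#define pr_fmt(fmt) "[drm] " fmt

/* Hypothetical helper (assumption, not a kernel API): formats into a buffer
 * instead of the kernel log and adds the "*ERROR* " tag by hand. */
#define demo_pr_err(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt("*ERROR* " fmt), __VA_ARGS__)

/* Mirrors the shape of the message in drm_sched_job_init(), without
 * touching job->sched at all. */
static int drm_sched_job_init_demo(char *buf, size_t len)
{
	return demo_pr_err(buf, len, "%s: entity has no rq!\n", __func__);
}
```

The string literals are concatenated at compile time, so the grep-able
prefix costs nothing at run time and does not need a valid device pointer.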
-- 
Regards,
Luben
From 1f3ed97947a406a555a3efea05cab67da94172e7 Mon Sep 17 00:00:00 2001
From: Danilo Krummrich 
Date: Wed, 8 Nov 2023 03:26:07 +0100
Subject: [PATCH] drm/sched: fix potential page fault in drm_sched_job_init()

Commit 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable
number of run-queues") introduces drm_err() in drm_sched_job_init(), in
order to indicate that the given entity has no runq, however at this
time job->sched is not yet set, likely to be NULL initialized, and hence
shouldn't be used.

Replace the corresponding drm_err() call with pr_err() to avoid a
potential page fault.

While at it, extend the documentation of drm_sched_job_init() to
indicate that job->sched is not a valid pointer until
drm_sched_job_arm() has been called.

v2: Add pr_fmt to drm_printk.h. Add "*ERROR*" to this pr_err() message. (Luben)

Fixes: 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Signed-off-by: Danilo Krummrich 
Link: https://patchwork.freedesktop.org/patch/msgid/20231108022716.15250-1-d...@redhat.com
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/scheduler/sched_main.c | 5 -
 include/drm/drm_print.h| 9 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index

Re: [RFC PATCH v3 07/12] page-pool: device memory support

2023-11-08 Thread Mina Almasry

On Wed, Nov 8, 2023 at 2:56 AM Yunsheng Lin  wrote:
>
> On 2023/11/8 5:56, Mina Almasry wrote:
> > On Tue, Nov 7, 2023 at 12:00 AM Yunsheng Lin  wrote:
> >>
> >> On 2023/11/6 10:44, Mina Almasry wrote:
> >>> Overload the LSB of struct page* to indicate that it's a page_pool_iov.
> >>>
> >>> Refactor mm calls on struct page* into helpers, and add page_pool_iov
> >>> handling on those helpers. Modify callers of these mm APIs with calls to
> >>> these helpers instead.
> >>>
> >>> In areas where struct page* is dereferenced, add a check for special
> >>> handling of page_pool_iov.
> >>>
> >>> Signed-off-by: Mina Almasry 
> >>>
> >>> ---
> >>>  include/net/page_pool/helpers.h | 74 -
> >>>  net/core/page_pool.c| 63 
> >>>  2 files changed, 118 insertions(+), 19 deletions(-)
> >>>
> >>> diff --git a/include/net/page_pool/helpers.h 
> >>> b/include/net/page_pool/helpers.h
> >>> index b93243c2a640..08f1a2cc70d2 100644
> >>> --- a/include/net/page_pool/helpers.h
> >>> +++ b/include/net/page_pool/helpers.h
> >>> @@ -151,6 +151,64 @@ static inline struct page_pool_iov 
> >>> *page_to_page_pool_iov(struct page *page)
> >>>   return NULL;
> >>>  }
> >>>
> >>> +static inline int page_pool_page_ref_count(struct page *page)
> >>> +{
> >>> + if (page_is_page_pool_iov(page))
> >>> + return page_pool_iov_refcount(page_to_page_pool_iov(page));
> >>
> >> We have added a lot of 'if' for the devmem case, it would be better to
> >> make it more generic so that we can have more unified metadata handling
> >> for normal page and devmem. If we add another memory type here, do we
> >> need another 'if' here?
> >
> > Maybe, not sure. I'm guessing new memory types will either be pages or
> > iovs, so maybe no new if statements needed.
> >
> >> That is part of the reason I suggested using a more unified metadata for
> >> all the types of memory chunks used by page_pool.
> >
> > I think your suggestion was to use struct pages for devmem. That was
> > thoroughly considered and intensely argued about in the initial
> > conversations regarding devmem and the initial RFC, and from the
> > conclusions there it's extremely clear to me that devmem struct pages
> > are categorically a no-go.
>
> Not exactly, I was wondering if adding a more abstract structure specifically
> for page pool makes any sense, and each mem type can add its own specific
> fields, net stack only sees and handles the common fields so that it does not
> care about specific mem type, and each provider only sees and handles the
> specific fields belonging to it most of the time.
>
> Ideally something like below:
>
> struct netmem {
>         /* common fields */
>         refcount_t refcount;
>         struct page_pool *pp;
>         ..
>
>         union {
>                 struct devmem{
>                         struct dmabuf_genpool_chunk_owner *owner;
>                 };
>
>                 struct other_mem{
>                         ...
>                         ...
>                 };
>         };
> };
>
> But until we completely decouple the 'struct page' from the net stack,
> the above seems undoable in the near term.

Agreed everything above is undoable.

> But we might be able to do something as folio is doing now, mm subsystem
> is still seeing 'struct folio/page', but other subsystem like slab is using
> 'struct slab', and there is still some common fields shared between
> 'struct folio' and 'struct slab'.
>

In my eyes this is almost exactly what I suggested in RFC v1 and got
immediately nacked with no room to negotiate. What we did for v1 is to
allocate struct pages for dma-buf to make dma-bufs look like struct
page to mm subsystem. Almost exactly what you're describing above.
It's a no-go. I don't think renaming struct page to netmem is going to
move the needle (it also re-introduces code-churn). What I feel like I
learnt is that dma-bufs are not struct pages and can't be made to look
like one, I think.

> As the netmem patchset, is devmem able to reuse the below 'struct netmem'
> and rename it to 'struct page_pool_iov'?

I don't think so. For the reasons above, but also practically it
immediately falls apart. Consider this field in netmem:

+ * @flags: The same as the page flags.  Do not use directly.

dma-bufs don't have or support page flags, and making dma-bufs look
like they support page flags or any page-like features (other than
dma_addr) seems extremely unacceptable to mm folks.

> So that 'struct page' for normal
> memory and 'struct page_pool_iov' for devmem share the common fields used
> by page pool and net stack?

Are you suggesting that we'd cast a netmem* to a page* and call core
mm APIs on it? It's basically what was happening with RFC v1, where
things that are not struct pages were made to look like struct pages.
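
By contrast, the approach in this patch (overloading the LSB of struct page*,
as the commit message quoted at the top puts it) keeps the two types distinct
and only shares the pointer-sized slot. A minimal userspace sketch of that
tagging scheme follows. All demo_* names are hypothetical stand-ins, not the
helpers from the series, and it assumes both structs are at least 2-byte
aligned so the low bit of a real pointer is always zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_page { int refcount; };	/* stand-in for struct page */
struct demo_iov  { int refcount; };	/* stand-in for struct page_pool_iov */

#define DEMO_IOV_TAG ((uintptr_t)1)

/* Encode an iov as a "page pointer" with the low bit set. */
static inline struct demo_page *demo_iov_to_tagged(struct demo_iov *iov)
{
	return (struct demo_page *)((uintptr_t)iov | DEMO_IOV_TAG);
}

/* True if the pointer carries the iov tag. */
static inline bool demo_is_iov(const struct demo_page *p)
{
	return ((uintptr_t)p & DEMO_IOV_TAG) != 0;
}

/* Strip the tag; returns NULL for a genuine page pointer. */
static inline struct demo_iov *demo_to_iov(struct demo_page *p)
{
	if (!demo_is_iov(p))
		return NULL;
	return (struct demo_iov *)((uintptr_t)p & ~DEMO_IOV_TAG);
}

/* Helper in the spirit of page_pool_page_ref_count(): the tagged pointer is
 * never dereferenced as a page. */
static inline int demo_ref_count(struct demo_page *p)
{
	if (demo_is_iov(p))
		return demo_to_iov(p)->refcount;
	return p->refcount;
}
```

The cost is the extra branch in every helper, which is the 'if' that
Yunsheng is objecting to earlier in the thread.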

Also, there isn't much upside for what you're suggesting, I think. For
example I can align the refcount variable in struct page_pool_iov

Re: [RFC PATCH v3 10/12] tcp: RX path for devmem TCP

2023-11-08 Thread Mina Almasry

On Tue, Nov 7, 2023 at 4:01 PM David Ahern  wrote:
>
> On 11/7/23 4:55 PM, Mina Almasry wrote:
> > On Mon, Nov 6, 2023 at 4:03 PM Willem de Bruijn
> >  wrote:
> >>
> >> On Mon, Nov 6, 2023 at 3:55 PM David Ahern  wrote:
> >>>
> >>> On 11/6/23 4:32 PM, Stanislav Fomichev wrote:
> > The concise notification API returns tokens as a range for
> > compression, encoding as two 32-bit unsigned integers start + length.
> > It allows for even further batching by returning multiple such ranges
> > in a single call.
> 
>  Tangential: should tokens be u64? Otherwise we can't have more than
>  4gb unacknowledged. Or that's a reasonable constraint?
> 
> >>>
> >>> Was thinking the same and with bits reserved for a dmabuf id to allow
> >>> multiple dmabufs in a single rx queue (future extension, but build the
> >>> capability in now). e.g., something like a 37b offset (128GB dmabuf
> >>> size), 19b length (large GRO), 8b dmabuf id (lots of dmabufs to a queue).
> >>
> >> Agreed. Converting to 64b now sounds like a good forward looking revision.
> >
> > The concept of IDing a dma-buf came up in a couple of different
> > contexts. First, in the context of us giving the dma-buf ID to the
> > user on recvmsg() to tell the user the data is in this specific
> > dma-buf. The second context is here, to bind dma-bufs with multiple
> > user-visible IDs to an rx queue.
> >
> > My issue here is that I don't see anything in the struct dma_buf that
> > can practically serve as an ID:
> >
> > https://elixir.bootlin.com/linux/v6.6-rc7/source/include/linux/dma-buf.h#L302
> >
> > Actually, from the userspace, only the name of the dma-buf seems
> > queryable. That's only unique if the user sets it as such. The dmabuf
> > FD can't serve as an ID. For our use case we need to support 1 process
> > doing the dma-buf bind via netlink, sharing the dma-buf FD to another
> > process, and that process receives the data.  In this case the FDs
> > shown by the 2 processes may be different. Converting to 64b is a
> > trivial change I can make now, but I'm not sure how to ID these
> > dma-bufs. Suggestions welcome. I'm not sure the dma-buf guys will
> > allow adding a new ID + APIs to query said dma-buf ID.
> >
>
> The API can be unique to this usage: e.g., add a dmabuf id to the
> netlink API. Userspace manages the ids (tells the kernel what value to
> use with an instance), the kernel validates no 2 dmabufs have the same
> id and then returns the value here.
>
>

Seems reasonable, will do.
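
For reference, the 64-bit token layout David floated above (37b offset, 19b
length, 8b dmabuf id) could be packed roughly as below. This is only a
sketch: the field order within the word and all demo_* names are my
assumptions, nothing here is settled uapi.

```c
#include <stdint.h>

/* Field widths from the suggestion above: 37 + 19 + 8 = 64 bits. */
#define DEMO_TOK_OFFSET_BITS 37	/* up to 128GB of dmabuf */
#define DEMO_TOK_LEN_BITS    19	/* large GRO */
#define DEMO_TOK_ID_BITS      8	/* dmabuf id chosen by userspace */

#define DEMO_TOK_OFFSET_MASK ((UINT64_C(1) << DEMO_TOK_OFFSET_BITS) - 1)
#define DEMO_TOK_LEN_MASK    ((UINT64_C(1) << DEMO_TOK_LEN_BITS) - 1)
#define DEMO_TOK_ID_MASK     ((UINT64_C(1) << DEMO_TOK_ID_BITS) - 1)

/* Assumed layout, low to high bits: offset, length, dmabuf id. */
static inline uint64_t demo_token_pack(uint64_t offset, uint32_t len, uint8_t id)
{
	return (offset & DEMO_TOK_OFFSET_MASK) |
	       (((uint64_t)len & DEMO_TOK_LEN_MASK) << DEMO_TOK_OFFSET_BITS) |
	       ((uint64_t)id << (DEMO_TOK_OFFSET_BITS + DEMO_TOK_LEN_BITS));
}

static inline uint64_t demo_token_offset(uint64_t tok)
{
	return tok & DEMO_TOK_OFFSET_MASK;
}

static inline uint32_t demo_token_len(uint64_t tok)
{
	return (uint32_t)((tok >> DEMO_TOK_OFFSET_BITS) & DEMO_TOK_LEN_MASK);
}

static inline uint8_t demo_token_id(uint64_t tok)
{
	return (uint8_t)((tok >> (DEMO_TOK_OFFSET_BITS + DEMO_TOK_LEN_BITS)) &
			 DEMO_TOK_ID_MASK);
}
```

Values wider than their field are silently masked in this sketch; a real
interface would more likely reject them.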

On Wed, Nov 8, 2023 at 7:36 AM Edward Cree  wrote:
>
> On 06/11/2023 21:17, Stanislav Fomichev wrote:
> > I guess I'm just wondering whether other people have any suggestions
> > here. Not sure Jonathan's way was better, but we fundamentally
> > have two queues between the kernel and the userspace:
> > - userspace receiving tokens (recvmsg + magical flag)
> > - userspace refilling tokens (setsockopt + magical flag)
> >
> > So having some kind of shared memory producer-consumer queue feels natural.
> > And using 'classic' socket api here feels like a stretch, idk.
>
> Do 'refilled tokens' (returned memory areas) get used for anything other
>  than subsequent RX?

Hi Ed!

Not really, it's only the subsequent RX.

>  If not then surely the way to return a memory area
>  in an io_uring idiom is just to post a new read sqe ('RX descriptor')
>  pointing into it, rather than explicitly returning it with setsockopt.

We're interested in using this with regular TCP sockets, not
necessarily io_uring. The io_uring interface to devmem TCP may very
well use what you suggest and can drop the setsockopt.


> (Being async means you can post lots of these, unlike recvmsg(), so you
>  don't need any kernel management to keep the RX queue filled; it can
>  just be all handled by the userland thus simplifying APIs overall.)
> Or I'm misunderstanding something?
>
> -e


--
Thanks,
Mina

Re: [RFC PATCH v3 04/12] netdev: support binding dma-buf to netdevice

2023-11-08 Thread Mina Almasry

On Wed, Nov 8, 2023 at 3:47 PM David Wei  wrote:
>
> On 2023-11-05 18:44, Mina Almasry wrote:
> > Add a netdev_dmabuf_binding struct which represents the
> > dma-buf-to-netdevice binding. The netlink API will bind the dma-buf to
> > rx queues on the netdevice. On the binding, the dma_buf_attach
> > & dma_buf_map_attachment will occur. The entries in the sg_table from
> > mapping will be inserted into a genpool to make it ready
> > for allocation.
> >
> > The chunks in the genpool are owned by a dmabuf_chunk_owner struct which
> > holds the dma-buf offset of the base of the chunk and the dma_addr of
> > the chunk. Both are needed to use allocations that come from this chunk.
> >
> > We create a new type that represents an allocation from the genpool:
> > page_pool_iov. We setup the page_pool_iov allocation size in the
> > genpool to PAGE_SIZE for simplicity: to match the PAGE_SIZE normally
> > allocated by the page pool and given to the drivers.
> >
> > The user can unbind the dmabuf from the netdevice by closing the netlink
> > socket that established the binding. We do this so that the binding is
> > automatically unbound even if the userspace process crashes.
> >
> > The binding and unbinding leaves an indicator in struct netdev_rx_queue
> > that the given queue is bound, but the binding doesn't take effect until
> > the driver actually reconfigures its queues, and re-initializes its page
> > pool.
> >
> > The netdev_dmabuf_binding struct is refcounted, and releases its
> > resources only when all the refs are released.
> >
> > Signed-off-by: Willem de Bruijn 
> > Signed-off-by: Kaiyuan Zhang 
> > Signed-off-by: Mina Almasry 
> >
> > ---
> >
> > RFC v3:
> > - Support multi rx-queue binding
> >
> > ---
> >  include/linux/netdevice.h |  80 ++
> >  include/net/netdev_rx_queue.h |   1 +
> >  include/net/page_pool/types.h |  27 +
> >  net/core/dev.c| 203 ++
> >  net/core/netdev-genl.c| 116 ++-
> >  5 files changed, 425 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index b8bf669212cc..eeeda849115c 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -52,6 +52,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >
> >  struct netpoll_info;
> >  struct device;
> > @@ -808,6 +810,84 @@ bool rps_may_expire_flow(struct net_device *dev, u16 
> > rxq_index, u32 flow_id,
> >  #endif
> >  #endif /* CONFIG_RPS */
> >
> > +struct netdev_dmabuf_binding {
> > + struct dma_buf *dmabuf;
> > + struct dma_buf_attachment *attachment;
> > + struct sg_table *sgt;
> > + struct net_device *dev;
> > + struct gen_pool *chunk_pool;
> > +
> > + /* The user holds a ref (via the netlink API) for as long as they want
> > +  * the binding to remain alive. Each page pool using this binding 
> > holds
> > +  * a ref to keep the binding alive. Each allocated page_pool_iov 
> > holds a
> > +  * ref.
> > +  *
> > +  * The binding undos itself and unmaps the underlying dmabuf once all
> > +  * those refs are dropped and the binding is no longer desired or in
> > +  * use.
> > +  */
> > + refcount_t ref;
> > +
> > + /* The portid of the user that owns this binding. Used for netlink to
> > +  * notify us of the user dropping the bind.
> > +  */
> > + u32 owner_nlportid;
> > +
> > + /* The list of bindings currently active. Used for netlink to notify 
> > us
> > +  * of the user dropping the bind.
> > +  */
> > + struct list_head list;
> > +
> > + /* rxq's this binding is active on. */
> > + struct xarray bound_rxq_list;
> > +};
> > +
> > +#ifdef CONFIG_DMA_SHARED_BUFFER
> > +void __netdev_devmem_binding_free(struct netdev_dmabuf_binding *binding);
> > +int netdev_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
> > +struct netdev_dmabuf_binding **out);
> > +void netdev_unbind_dmabuf(struct netdev_dmabuf_binding *binding);
> > +int netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
> > + struct netdev_dmabuf_binding *binding);
> > +#else
> > +static inline void
> > +__netdev_devmem_binding_free(struct netdev_dmabuf_binding *binding)
> > +{
> > +}
> > +
> > +static inline int netdev_bind_dmabuf(struct net_device *dev,
> > +  unsigned int dmabuf_fd,
> > +  struct netdev_dmabuf_binding **out)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +static inline void netdev_unbind_dmabuf(struct netdev_dmabuf_binding 
> > *binding)
> > +{
> > +}
> > +
> > +static inline int
> > +netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
> > + struct netdev_dmabuf_binding *binding)
> > +{
> > + return -EOPNOTSUPP;
> > +}
> > +#endif
> > +
> > +static inline void
> >

Re: [RFC PATCH v3 04/12] netdev: support binding dma-buf to netdevice

2023-11-08 Thread Mina Almasry

On Tue, Nov 7, 2023 at 7:40 PM Yunsheng Lin  wrote:
>
> On 2023/11/8 5:59, Mina Almasry wrote:
> > On Mon, Nov 6, 2023 at 11:46 PM Yunsheng Lin  wrote:
> >>
> >> On 2023/11/6 10:44, Mina Almasry wrote:
> >>> +
> >>> +void __netdev_devmem_binding_free(struct netdev_dmabuf_binding *binding)
> >>> +{
> >>> + size_t size, avail;
> >>> +
> >>> + gen_pool_for_each_chunk(binding->chunk_pool,
> >>> + netdev_devmem_free_chunk_owner, NULL);
> >>> +
> >>> + size = gen_pool_size(binding->chunk_pool);
> >>> + avail = gen_pool_avail(binding->chunk_pool);
> >>> +
> >>> + if (!WARN(size != avail, "can't destroy genpool. size=%lu, 
> >>> avail=%lu",
> >>> +   size, avail))
> >>> + gen_pool_destroy(binding->chunk_pool);
> >>
> >>
> >> Is there any other place calling the gen_pool_destroy() when the above
> >> warning is triggered? Do we have a leaking for binding->chunk_pool?
> >>
> >
> > gen_pool_destroy() hits a BUG_ON() if the pool is not empty at the time of destroying.
> > Technically that should never happen, because
> > __netdev_devmem_binding_free() should only be called when the refcount
> > hits 0, so all the chunks have been freed back to the gen_pool. But,
> > just in case, I don't want to crash the server just because I'm
> > leaking a chunk... this is a bit of defensive programming that is
> > typically frowned upon, but the behavior of gen_pool is so severe I
> > think the WARN() + check is warranted here.
>
> It seems it is pretty normal for the above to happen nowadays because of
> retransmit timeouts and the NAPI defer schemes mentioned below:
>
> https://lkml.kernel.org/netdev/168269854650.2191653.8465259808498269815.stgit@firesoul/
>
> And currently page pool core handles that by using a workqueue.

Forgive me but I'm not understanding the concern here.

__netdev_devmem_binding_free() is called when binding->ref hits 0.

binding->ref is incremented when an iov slice of the dma-buf is
allocated, and decremented when an iov is freed. So,
__netdev_devmem_binding_free() can't really be called unless all the
iovs have been freed, and gen_pool_size() == gen_pool_avail(),
regardless of what's happening on the page_pool side of things, right?
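
To make the argument concrete, here is a toy single-threaded model of the
accounting I'm describing. It is a sketch only: the demo_* names are
hypothetical, a plain int stands in for refcount_t, and chunk counters stand
in for gen_pool_size()/gen_pool_avail().

```c
#include <stdbool.h>

struct demo_binding {
	int ref;		/* stand-in for the binding's refcount_t */
	int chunks_total;	/* analogue of gen_pool_size() */
	int chunks_avail;	/* analogue of gen_pool_avail() */
	bool freed;		/* set where the real free routine would run */
};

static void demo_binding_free(struct demo_binding *b)
{
	b->freed = true;
}

static void demo_binding_get(struct demo_binding *b)
{
	b->ref++;
}

static void demo_binding_put(struct demo_binding *b)
{
	if (--b->ref == 0)
		demo_binding_free(b);
}

/* Allocating an iov takes a chunk and a ref on the binding. */
static bool demo_alloc_iov(struct demo_binding *b)
{
	if (b->chunks_avail == 0)
		return false;
	b->chunks_avail--;
	demo_binding_get(b);
	return true;
}

/* Freeing an iov returns the chunk first, then drops the ref. */
static void demo_free_iov(struct demo_binding *b)
{
	b->chunks_avail++;
	demo_binding_put(b);
}
```

Because every outstanding iov pins the binding, the free routine cannot run
in this model while chunks_avail != chunks_total, whatever order the user ref
and the iov refs are dropped in.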

-- 
Thanks,
Mina

Re: [RFC PATCH v3 05/12] netdev: netdevice devmem allocator

2023-11-08 Thread Mina Almasry

> > On Mon, Nov 6, 2023 at 11:45 PM Yunsheng Lin  wrote:
> >>
> >> On 2023/11/6 10:44, Mina Almasry wrote:
> >>> +
> >>> +void netdev_free_devmem(struct page_pool_iov *ppiov)
> >>> +{
> >>> + struct netdev_dmabuf_binding *binding = 
> >>> page_pool_iov_binding(ppiov);
> >>> +
> > >>> + refcount_set(&ppiov->refcount, 1);
> >>> +
> >>> + if (gen_pool_has_addr(binding->chunk_pool,
> >>> +   page_pool_iov_dma_addr(ppiov), PAGE_SIZE))
> >>
> >> When gen_pool_has_addr() returns false, does it mean something has gone
> >> really wrong here?
> >>
> >
> > Yes, good eye. gen_pool_has_addr() should never return false, but then
> > again, gen_pool_free()  BUG_ON()s if it doesn't find the address,
> > which is an extremely severe reaction to what can be a minor bug in
> > the accounting. I prefer to leak rather than crash the machine. It's a
> > bit of defensive programming that is normally frowned upon, but I feel
> > like in this case it's maybe warranted due to the very severe reaction
> > (BUG_ON).
>
> I would ask why the above defensive programming is not done in the
> gen_pool core :)
>

I think gen_pool is not really that new, and suggesting removing
the BUG_ONs must have been proposed before and rejected. I'll try to
do some research and maybe suggest downgrading the BUG_ON to WARN_ON,
but my guess is there is some reason the maintainer wants it to be a
BUG_ON.
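
In the meantime, the check-then-free pattern from the patch boils down to
something like the toy model below. It is a sketch with hypothetical demo_*
names and a trivial range-based pool, not the genalloc implementation: the
point is only that an unknown address gets a warning and a leak instead of
reaching the free path that would BUG_ON().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy pool covering [base, base + size); avail counts free bytes. */
struct demo_pool {
	unsigned long base;
	size_t size;
	size_t avail;
};

/* Analogue of gen_pool_has_addr(): is [addr, addr + len) inside the pool? */
static bool demo_pool_has_addr(const struct demo_pool *p, unsigned long addr,
			       size_t len)
{
	return addr >= p->base && len <= p->size &&
	       addr - p->base <= p->size - len;
}

/* Returns true if the range was freed, false if it was leaked on purpose. */
static bool demo_free_checked(struct demo_pool *p, unsigned long addr,
			      size_t len)
{
	if (!demo_pool_has_addr(p, addr, len)) {
		/* Where the unguarded free would crash, warn and leak. */
		fprintf(stderr, "warn: addr %#lx not in pool, leaking\n", addr);
		return false;
	}
	p->avail += len;
	return true;
}
```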

On Wed, Nov 8, 2023 at 5:00 PM David Wei  wrote:
>
> On 2023-11-07 14:55, David Ahern wrote:
> > On 11/7/23 3:10 PM, Mina Almasry wrote:
> >> On Mon, Nov 6, 2023 at 3:44 PM David Ahern  wrote:
> >>>
> >>> On 11/5/23 7:44 PM, Mina Almasry wrote:
>  diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>  index eeeda849115c..1c351c138a5b 100644
>  --- a/include/linux/netdevice.h
>  +++ b/include/linux/netdevice.h
>  @@ -843,6 +843,9 @@ struct netdev_dmabuf_binding {
>   };
> 
>   #ifdef CONFIG_DMA_SHARED_BUFFER
>  +struct page_pool_iov *
>  +netdev_alloc_devmem(struct netdev_dmabuf_binding *binding);
>  +void netdev_free_devmem(struct page_pool_iov *ppiov);
> >>>
> >>> netdev_{alloc,free}_dmabuf?
> >>>
> >>
> >> Can do.
> >>
> >>> I say that because a dmabuf can be host memory, at least I am not aware
> >>> of a restriction that a dmabuf is device memory.
> >>>
> >>
> >> In my limited experience dma-buf is generally device memory, and
> >> that's really its use case. CONFIG_UDMABUF is a driver that mocks
> >> dma-buf with a memfd which I think is used for testing. But I can do
> >> the rename, it's more clear anyway, I think.
> >
> > config UDMABUF
> > bool "userspace dmabuf misc driver"
> > default n
> > depends on DMA_SHARED_BUFFER
> > depends on MEMFD_CREATE || COMPILE_TEST
> > help
> >   A driver to let userspace turn memfd regions into dma-bufs.
> >   Qemu can use this to create host dmabufs for guest framebuffers.
> >
> >
> > Qemu is just a userspace process; it is in no way a special one.
> >
> > Treating host memory as a dmabuf should radically simplify the io_uring
> > extension of this set. That the io_uring set needs to dive into
> > page_pools is just wrong - complicating the design and code and pushing
> > io_uring into a realm it does not need to be involved in.
>
> I think our io_uring proposal will already be vastly simplified once we
> rebase onto Kuba's page pool memory provider API. Using udmabuf means
> depending on a driver designed for testing, vs io_uring's registered
> buffers API that's been tried and tested.
>

FWIW I also get an impression that udmabuf is mostly targeting
testing, but I'm not aware of any deficiency that makes it concretely
unsuitable for you. You be the judge.

The only quirk of udmabuf I'm aware of is that it seems to cap the max
dma-buf size to 16000 pages. Not sure if that's due to a genuine
technical limitation or just convenience.

> I don't have an intuitive understanding of the trade offs yet, and would
> need to try out udmabuf and compare vs say using our own page pool
> memory provider.
>


On Wed, Nov 8, 2023 at 5:15 PM David Wei  wrote:
> How would TCP devmem change if we no longer assume that dmabuf is device
> memory?

It wouldn't. The code already never assumes that dmabuf is device
memory. Any dma-buf should work, as far as I can tell. I'm also quite
confident udmabuf works, I use it for testing.

(Jason Gunthorpe is much more of an expert and may chime in to say
'some dma-buf will not work'. My primitive understanding is that we're
using dma-bufs without any quirks and any dma-buf should work. I of
course haven't tested all dma-bufs :D)

> Pavel will know more on the perf side, but I wouldn't want to
> put any if/else on the hot path if we can avoid it. I could be wrong,
> but right now in my mind using different memory providers solves this
> neatly and the driver/networking stack doesn't need to care.
>
> Mina, I believe you said at NetDev conf

Re: [RFC PATCH v3 05/12] netdev: netdevice devmem allocator

2023-11-08 Thread David Wei

On 2023-11-07 15:03, Mina Almasry wrote:
> On Tue, Nov 7, 2023 at 2:55 PM David Ahern  wrote:
>>
>> On 11/7/23 3:10 PM, Mina Almasry wrote:
>>> On Mon, Nov 6, 2023 at 3:44 PM David Ahern  wrote:

 On 11/5/23 7:44 PM, Mina Almasry wrote:
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index eeeda849115c..1c351c138a5b 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -843,6 +843,9 @@ struct netdev_dmabuf_binding {
>  };
>
>  #ifdef CONFIG_DMA_SHARED_BUFFER
> +struct page_pool_iov *
> +netdev_alloc_devmem(struct netdev_dmabuf_binding *binding);
> +void netdev_free_devmem(struct page_pool_iov *ppiov);

 netdev_{alloc,free}_dmabuf?

>>>
>>> Can do.
>>>
 I say that because a dmabuf can be host memory, at least I am not aware
 of a restriction that a dmabuf is device memory.

>>>
>>> In my limited experience dma-buf is generally device memory, and
>>> that's really its use case. CONFIG_UDMABUF is a driver that mocks
>>> dma-buf with a memfd which I think is used for testing. But I can do
>>> the rename, it's more clear anyway, I think.
>>
>> config UDMABUF
>> bool "userspace dmabuf misc driver"
>> default n
>> depends on DMA_SHARED_BUFFER
>> depends on MEMFD_CREATE || COMPILE_TEST
>> help
>>   A driver to let userspace turn memfd regions into dma-bufs.
>>   Qemu can use this to create host dmabufs for guest framebuffers.
>>
>>
>> Qemu is just a userspace process; it is in no way a special one.
>>
>> Treating host memory as a dmabuf should radically simplify the io_uring
>> extension of this set.
> 
> I agree actually, and I was about to make that comment to David Wei's
> series once I have the time.
> 
> David, your io_uring RX zerocopy proposal actually works with devmem
> TCP, if you're inclined to do that instead, what you'd do roughly is
> (I think):
> 
> - Allocate a memfd,
> - Use CONFIG_UDMABUF to create a dma-buf out of that memfd.
> - Bind the dma-buf to the NIC using the netlink API in this RFC.
> - Your io_uring extensions and io_uring uapi should work as-is almost
> on top of this series, I think.
> 
> If you do this the incoming packets should land into your memfd, which
> may or may not work for you. In the future if you feel inclined to use
> device memory, this approach that I'm describing here would be more
> extensible to device memory, because you'd already be using dma-bufs
> for your user memory; you'd just replace one kind of dma-buf (UDMABUF)
> with another.
> 

How would TCP devmem change if we no longer assume that dmabuf is device
memory? Pavel will know more on the perf side, but I wouldn't want to
put any if/else on the hot path if we can avoid it. I could be wrong,
but right now in my mind using different memory providers solves this
neatly and the driver/networking stack doesn't need to care.

Mina, I believe you said at NetDev conf that you already had a udmabuf
implementation for testing. I would like to see this (you can send
privately) to see how TCP devmem would handle both user memory and
device memory.

>> That the io_uring set needs to dive into
>> page_pools is just wrong - complicating the design and code and pushing
>> io_uring into a realm it does not need to be involved in.
>>
>> Most (all?) of this patch set can work with any memory; only device
>> memory is unreadable.
>>
>>
> 
>

Re: [PATCH 02/22] [RESEND^2] jffs2: mark __jffs2_dbg_superblock_counts() static

2023-11-08 Thread Zhihao Cheng


在 2023/11/8 20:58, Arnd Bergmann 写道:

From: Arnd Bergmann 

This function is only called locally and does not need to be
global. Since there is no external prototype, gcc warns about
the non-static definition:

fs/jffs2/debug.c:160:6: error: no previous prototype for 
'__jffs2_dbg_superblock_counts' [-Werror=missing-prototypes]

Reviewed-by: Tudor Ambarus 
Signed-off-by: Arnd Bergmann 
---
  fs/jffs2/debug.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Zhihao Cheng

Re: [RFC PATCH v3 05/12] netdev: netdevice devmem allocator

2023-11-08 Thread David Wei

On 2023-11-07 14:55, David Ahern wrote:
> On 11/7/23 3:10 PM, Mina Almasry wrote:
>> On Mon, Nov 6, 2023 at 3:44 PM David Ahern  wrote:
>>>
>>> On 11/5/23 7:44 PM, Mina Almasry wrote:
 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
 index eeeda849115c..1c351c138a5b 100644
 --- a/include/linux/netdevice.h
 +++ b/include/linux/netdevice.h
 @@ -843,6 +843,9 @@ struct netdev_dmabuf_binding {
  };

  #ifdef CONFIG_DMA_SHARED_BUFFER
 +struct page_pool_iov *
 +netdev_alloc_devmem(struct netdev_dmabuf_binding *binding);
 +void netdev_free_devmem(struct page_pool_iov *ppiov);
>>>
>>> netdev_{alloc,free}_dmabuf?
>>>
>>
>> Can do.
>>
>>> I say that because a dmabuf can be host memory, at least I am not aware
>>> of a restriction that a dmabuf is device memory.
>>>
>>
>> In my limited experience dma-buf is generally device memory, and
>> that's really its use case. CONFIG_UDMABUF is a driver that mocks
>> dma-buf with a memfd which I think is used for testing. But I can do
>> the rename, it's more clear anyway, I think.
> 
> config UDMABUF
> bool "userspace dmabuf misc driver"
> default n
> depends on DMA_SHARED_BUFFER
> depends on MEMFD_CREATE || COMPILE_TEST
> help
>   A driver to let userspace turn memfd regions into dma-bufs.
>   Qemu can use this to create host dmabufs for guest framebuffers.
> 
> 
> Qemu is just a userspace process; it is in no way a special one.
> 
> Treating host memory as a dmabuf should radically simplify the io_uring
> extension of this set. That the io_uring set needs to dive into
> page_pools is just wrong - complicating the design and code and pushing
> io_uring into a realm it does not need to be involved in.

I think our io_uring proposal will already be vastly simplified once we
rebase onto Kuba's page pool memory provider API. Using udmabuf means
depending on a driver designed for testing, vs io_uring's registered
buffers API that's been tried and tested.

I don't have an intuitive understanding of the trade-offs yet, and would
need to try out udmabuf and compare vs say using our own page pool
memory provider.

> 
> Most (all?) of this patch set can work with any memory; only device
> memory is unreadable.
> 
>

BUG in drm_kms_helper_poll_enable() fixed by reverting "drm/ast: report connection status on Display Port."

2023-11-08 Thread Kim Phillips


Hi, current linux kernel commit 90450a06162e
("Merge tag 'rcu-fixes-v6.7' of 
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks")
and the attached config cause the following BUG when booting on
a reference AMD Zen4 development server:

[   59.995717] input: OpenBMC virtual_input as 
/devices/pci:00/:00:07.1/:02:00.4/usb3/3-2/3-2.6/3-2.6:1.0/0003:1D6B:0104.0002/input/input4
[   60.033135] ast :c2:00.0: vgaarb: deactivate vga console
[   60.066230] ast :c2:00.0: [drm] Using default configuration
[   60.070342] hid-generic 0003:1D6B:0104.0002: input,hidraw0: USB HID v1.01 
Keyboard [OpenBMC virtual_input] on usb-:02:00.4-2.6/input0
[   60.072843] ast :c2:00.0: [drm] AST 2600 detected
[   60.072851] ast :c2:00.0: [drm] Using ASPEED DisplayPort transmitter
[   60.099891] ast :c2:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16
[   60.115780] [drm] Initialized ast 0.1.0 20120228 for :c2:00.0 on minor 0
[   60.135643] fbcon: astdrmfb (fb0) is primary device
[   60.135649] fbcon: Deferring console take-over
[   60.146162] ast :c2:00.0: [drm] fb0: astdrmfb frame buffer device
[   60.331802] input: OpenBMC virtual_input as 
/devices/pci:00/:00:07.1/:02:00.4/usb3/3-2/3-2.6/3-2.6:1.0/0003:1D6B:0104.0002/input/input5
[   60.405807] hid-generic 0003:1D6B:0104.0002: input,hidraw0: USB HID v1.01 
Keyboard [OpenBMC virtual_input] on usb-:02:00.4-2.6/input0
[   60.423774] input: OpenBMC virtual_input as 
/devices/pci:00/:00:07.1/:02:00.4/usb3/3-2/3-2.6/3-2.6:1.1/0003:1D6B:0104.0004/input/input6
[   60.443170] hid-generic 0003:1D6B:0104.0004: input,hidraw1: USB HID v1.01 
Mouse [OpenBMC virtual_input] on usb-:02:00.4-2.6/input1
[   60.460675] ast :c2:00.0: vgaarb: deactivate vga console
[   60.479996] ast :c2:00.0: [drm] Using default configuration
[   60.486603] ast :c2:00.0: [drm] AST 2600 detected
[   60.492249] ast :c2:00.0: [drm] Using ASPEED DisplayPort transmitter
[   60.499732] ast :c2:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16
[   60.508955] BUG: unable to handle page fault for address: 8881e98109f0
[   60.516623] #PF: supervisor write access in kernel mode
[   60.522449] #PF: error_code(0x0002) - not-present page
[   60.528168] PGD 8dbc01067 P4D 8dbc01067 PUD 104c984067 PMD 104c837067 PTE 
800e167ef060
[   60.537394] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
[   60.543805] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: GW  
6.6.0+ #3
[   60.552251] Hardware name: AMD Corporation ONYX/ONYX, BIOS ROX100AB 
09/14/2023
[   60.560309] Workqueue: events work_for_cpu_fn
[   60.565173] RIP: 0010:enqueue_timer 
(/home/amd/git/linux/./include/linux/list.h:1034 
/home/amd/git/linux/kernel/time/timer.c:605)
[ 60.570129] Code: 44 00 00 55 48 89 e5 41 55 49 89 cd 41 54 49 89 fc 53 48 89 f3 89 
d6 48 8d 84 f7 b0 00 00 00 48 8b 08 48 89 0b 48 85 c9 74 04 <48> 89 59 08 48 89 
18 48 89 43 08 49 8d 44 24 68 48 0f ab 30 8b 4b
All code

   0:   44 00 00                add    %r8b,(%rax)
   3:   55                      push   %rbp
   4:   48 89 e5                mov    %rsp,%rbp
   7:   41 55                   push   %r13
   9:   49 89 cd                mov    %rcx,%r13
   c:   41 54                   push   %r12
   e:   49 89 fc                mov    %rdi,%r12
  11:   53                      push   %rbx
  12:   48 89 f3                mov    %rsi,%rbx
  15:   89 d6                   mov    %edx,%esi
  17:   48 8d 84 f7 b0 00 00    lea    0xb0(%rdi,%rsi,8),%rax
  1e:   00
  1f:   48 8b 08                mov    (%rax),%rcx
  22:   48 89 0b                mov    %rcx,(%rbx)
  25:   48 85 c9                test   %rcx,%rcx
  28:   74 04                   je     0x2e
  2a:*  48 89 59 08             mov    %rbx,0x8(%rcx)     <-- trapping instruction
  2e:   48 89 18                mov    %rbx,(%rax)
  31:   48 89 43 08             mov    %rax,0x8(%rbx)
  35:   49 8d 44 24 68          lea    0x68(%r12),%rax
  3a:   48 0f ab 30             bts    %rsi,(%rax)
  3e:   8b                      .byte 0x8b
  3f:   4b                      rex.WXB

Code starting with the faulting instruction
===
   0:   48 89 59 08             mov    %rbx,0x8(%rcx)
   4:   48 89 18                mov    %rbx,(%rax)
   7:   48 89 43 08             mov    %rax,0x8(%rbx)
   b:   49 8d 44 24 68          lea    0x68(%r12),%rax
  10:   48 0f ab 30             bts    %rsi,(%rax)
  14:   8b                      .byte 0x8b
  15:   4b                      rex.WXB
[   60.591081] RSP: 0018:c90dbbe0 EFLAGS: 00010086
[   60.596908] RAX: 888fd59e31b8 RBX: 8881ec87c9e8 RCX: 8881e98109e8
[   60.604866] RDX: 0099 RSI: 0099 RDI: 888fd59e2c40
[   60.612826] RBP: c90dbbf8 R08: 0001 R09: 888fd59e2c40
[   60.620787] R10: 550d R11:  R12: 888fd59e2c40
[   60.628748] R13: 1640 R14: 163c R15:

Re: [PATCH] drm/sched: fix potential page fault in drm_sched_job_init()

2023-11-08 Thread Danilo Krummrich


On 11/8/23 06:46, Luben Tuikov wrote:

Hi,

Could you please use my gmail address, the one I'm responding from--I don't want
to miss any DRM scheduler patches. BTW, the luben.tui...@amd.com email should
bounce as undeliverable.

On 2023-11-07 21:26, Danilo Krummrich wrote:

Commit 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable
number of run-queues") introduces drm_err() in drm_sched_job_init(), in
order to indicate that the given entity has no runq, however at this
time job->sched is not yet set, likely to be NULL initialized, and hence
shouldn't be used.

Replace the corresponding drm_err() call with pr_err() to avoid a
potential page fault.

While at it, extend the documentation of drm_sched_job_init() to
indicate that job->sched is not a valid pointer until
drm_sched_job_arm() has been called.

Fixes: 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/scheduler/sched_main.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 27843e37d9b7..dd28389f0ddd 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -680,6 +680,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
   * This function returns -ENOENT in this case (which probably should be -EIO 
as
   * a more meanigful return value).
   *
+ * Note that job->sched is not a valid pointer until drm_sched_job_arm() has
+ * been called.
+ *


Good catch!

Did you actually get this to page-fault and have a kernel log?


No, I just found it because I was about to make the same mistake.



I'm asking because we see it correctly set in this kernel log coming from AMD,


I think that's because amdgpu just sets job->sched to *some* scheduler instance
after job allocation [1].

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L108



[   11.886024] amdgpu :0a:00.0: [drm] *ERROR* drm_sched_job_init: entity has no rq!

in this email,
https://lore.kernel.org/r/CADnq5_PS64jYS_Y3kGW27m-kuWP+FQFiaVcOaZiB=jlsgpn...@mail.gmail.com


   * Returns 0 for success, negative error code otherwise.
   */
  int drm_sched_job_init(struct drm_sched_job *job,
@@ -691,7 +694,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 * or worse--a blank screen--leave a trail in the
 * logs, so this can be debugged easier.
 */
-   drm_err(job->sched, "%s: entity has no rq!\n", __func__);
+   pr_err("%s: entity has no rq!\n", __func__);


Is it feasible to do something like the following?

dev_err(job->sched ? job->sched->dev : NULL, "%s: entity has no rq!\n", __func__);


I don't think that's a good idea. Although I'd assume that every driver
zero-initializes its job structures, I can't see a rule enforcing that. Hence,
job->sched can be a random value until drm_sched_job_arm() is called.

However, I notice there are quite a few more fields of struct drm_sched_job
that are never initialized, hence there are either a couple more potential bugs
or missing documentation that drivers *must* ensure that a job is
zero-initialized.

Not quite sure if we really want to rely on the latter for core 
infrastructure...
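
The hazard discussed above can be modelled in plain C outside the kernel. The sketch below is an illustration only: struct job, struct sched, struct entity, job_init() and job_arm() are simplified stand-ins, not the real drm_sched API. What it shows is that the error path of the init function never reads job->sched, so it stays safe even when the caller did not zero-initialize the job.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the scheduler, entity and job objects. */
struct sched {
	const char *name;
};

struct entity {
	struct sched *rq_sched;	/* NULL models "entity has no rq" */
};

struct job {
	struct sched *sched;	/* only valid after job_arm() */
	struct entity *entity;
};

static int job_init(struct job *job, struct entity *entity)
{
	if (!entity->rq_sched) {
		/* Do not touch job->sched here: it may still hold garbage. */
		fprintf(stderr, "%s: entity has no rq!\n", __func__);
		return -ENOENT;
	}
	job->entity = entity;
	return 0;
}

static void job_arm(struct job *job)
{
	/* From this point on job->sched is a valid pointer. */
	job->sched = job->entity->rq_sched;
}
```

Logging through a device derived from job->sched would only be safe after job_arm() in this model, which mirrors the documentation note added by the patch.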




return -ENOENT;
}
  


base-commit: c015fb6d01adb616fb54824feb55ce5ab18e8ca1

Re: [PATCH 06/17] dt-bindings: rtc: s3c-rtc: add specific compatibles for existing SoC

2023-11-08 Thread Alexandre Belloni

On 08/11/2023 11:43:32+0100, Krzysztof Kozlowski wrote:
> Samsung Exynos SoC reuses several devices from older designs, thus
> historically we kept the old (block's) compatible only.  This works fine
> and there is no bug here, however guidelines expressed in
> Documentation/devicetree/bindings/writing-bindings.rst state that:
> 1. Compatibles should be specific.
> 2. We should add new compatibles in case of bugs or features.
> 
> Add compatibles specific to each SoC in front of all old-SoC-like
> compatibles.
> 
> Signed-off-by: Krzysztof Kozlowski 
Acked-by: Alexandre Belloni 

> 
> ---
> 
> I propose to take the patch through Samsung SoC (me). See cover letter
> for explanation.
> ---
>  Documentation/devicetree/bindings/rtc/s3c-rtc.yaml | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/rtc/s3c-rtc.yaml 
> b/Documentation/devicetree/bindings/rtc/s3c-rtc.yaml
> index d51b236939bf..bf4e11d6dffb 100644
> --- a/Documentation/devicetree/bindings/rtc/s3c-rtc.yaml
> +++ b/Documentation/devicetree/bindings/rtc/s3c-rtc.yaml
> @@ -17,6 +17,11 @@ properties:
>- samsung,s3c2416-rtc
>- samsung,s3c2443-rtc
>- samsung,s3c6410-rtc
> +  - items:
> +  - enum:
> +  - samsung,exynos7-rtc
> +  - samsung,exynos850-rtc
> +  - const: samsung,s3c6410-rtc
>- const: samsung,exynos3250-rtc
>  deprecated: true
>  
> -- 
> 2.34.1
> 
> 

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

Re: [RFC PATCH v3 04/12] netdev: support binding dma-buf to netdevice

2023-11-08 Thread David Wei

On 2023-11-05 18:44, Mina Almasry wrote:
> Add a netdev_dmabuf_binding struct which represents the
> dma-buf-to-netdevice binding. The netlink API will bind the dma-buf to
> rx queues on the netdevice. On the binding, the dma_buf_attach
> & dma_buf_map_attachment will occur. The entries in the sg_table from
> mapping will be inserted into a genpool to make it ready
> for allocation.
> 
> The chunks in the genpool are owned by a dmabuf_chunk_owner struct which
> holds the dma-buf offset of the base of the chunk and the dma_addr of
> the chunk. Both are needed to use allocations that come from this chunk.
> 
> We create a new type that represents an allocation from the genpool:
> page_pool_iov. We setup the page_pool_iov allocation size in the
> genpool to PAGE_SIZE for simplicity: to match the PAGE_SIZE normally
> allocated by the page pool and given to the drivers.
> 
> The user can unbind the dmabuf from the netdevice by closing the netlink
> socket that established the binding. We do this so that the binding is
> automatically unbound even if the userspace process crashes.
> 
> The binding and unbinding leaves an indicator in struct netdev_rx_queue
> that the given queue is bound, but the binding doesn't take effect until
> the driver actually reconfigures its queues, and re-initializes its page
> pool.
> 
> The netdev_dmabuf_binding struct is refcounted, and releases its
> resources only when all the refs are released.
> 
> Signed-off-by: Willem de Bruijn 
> Signed-off-by: Kaiyuan Zhang 
> Signed-off-by: Mina Almasry 
> 
> ---
> 
> RFC v3:
> - Support multi rx-queue binding
> 
> ---
>  include/linux/netdevice.h |  80 ++
>  include/net/netdev_rx_queue.h |   1 +
>  include/net/page_pool/types.h |  27 +
>  net/core/dev.c| 203 ++
>  net/core/netdev-genl.c| 116 ++-
>  5 files changed, 425 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index b8bf669212cc..eeeda849115c 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -52,6 +52,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  struct netpoll_info;
>  struct device;
> @@ -808,6 +810,84 @@ bool rps_may_expire_flow(struct net_device *dev, u16 
> rxq_index, u32 flow_id,
>  #endif
>  #endif /* CONFIG_RPS */
>  
> +struct netdev_dmabuf_binding {
> + struct dma_buf *dmabuf;
> + struct dma_buf_attachment *attachment;
> + struct sg_table *sgt;
> + struct net_device *dev;
> + struct gen_pool *chunk_pool;
> +
> + /* The user holds a ref (via the netlink API) for as long as they want
> +  * the binding to remain alive. Each page pool using this binding holds
> +  * a ref to keep the binding alive. Each allocated page_pool_iov holds a
> +  * ref.
> +  *
> +  * The binding undos itself and unmaps the underlying dmabuf once all
> +  * those refs are dropped and the binding is no longer desired or in
> +  * use.
> +  */
> + refcount_t ref;
> +
> + /* The portid of the user that owns this binding. Used for netlink to
> +  * notify us of the user dropping the bind.
> +  */
> + u32 owner_nlportid;
> +
> + /* The list of bindings currently active. Used for netlink to notify us
> +  * of the user dropping the bind.
> +  */
> + struct list_head list;
> +
> + /* rxq's this binding is active on. */
> + struct xarray bound_rxq_list;
> +};
> +
> +#ifdef CONFIG_DMA_SHARED_BUFFER
> +void __netdev_devmem_binding_free(struct netdev_dmabuf_binding *binding);
> +int netdev_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
> +struct netdev_dmabuf_binding **out);
> +void netdev_unbind_dmabuf(struct netdev_dmabuf_binding *binding);
> +int netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
> + struct netdev_dmabuf_binding *binding);
> +#else
> +static inline void
> +__netdev_devmem_binding_free(struct netdev_dmabuf_binding *binding)
> +{
> +}
> +
> +static inline int netdev_bind_dmabuf(struct net_device *dev,
> +  unsigned int dmabuf_fd,
> +  struct netdev_dmabuf_binding **out)
> +{
> + return -EOPNOTSUPP;
> +}
> +static inline void netdev_unbind_dmabuf(struct netdev_dmabuf_binding 
> *binding)
> +{
> +}
> +
> +static inline int
> +netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
> + struct netdev_dmabuf_binding *binding)
> +{
> + return -EOPNOTSUPP;
> +}
> +#endif
> +
> +static inline void
> +netdev_devmem_binding_get(struct netdev_dmabuf_binding *binding)
> +{
> + refcount_inc(&binding->ref);
> +}
> +
> +static inline void
> +netdev_devmem_binding_put(struct netdev_dmabuf_binding *binding)
> +{
> + if (!refcount_dec_and_test(&binding->ref))
> + return;
> +
> + __netdev_devmem_binding_free(binding);
> +}
>
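
The get/put pair at the end of the quoted hunk follows the usual "last reference frees" pattern. A user-space model of that pattern, using C11 atomics in place of the kernel's refcount_t, might look like the sketch below. All names in it (struct binding, binding_init(), binding_get(), binding_put(), the released flag) are invented for illustration; in the patch the final put calls __netdev_devmem_binding_free().

```c
#include <stdatomic.h>
#include <stdbool.h>

struct binding {
	atomic_int ref;
	bool released;	/* stand-in for "dmabuf unmapped and freed" */
};

static void binding_init(struct binding *b)
{
	/* The creator (the netlink user in the patch) holds the first ref. */
	atomic_init(&b->ref, 1);
	b->released = false;
}

static void binding_get(struct binding *b)
{
	atomic_fetch_add(&b->ref, 1);
}

static void binding_put(struct binding *b)
{
	/* fetch_sub returns the old value; 1 means this was the last ref. */
	if (atomic_fetch_sub(&b->ref, 1) != 1)
		return;

	b->released = true;
}
```

With one reference held by the user, one by a page pool and one by an outstanding allocation, the release only happens on the third put, whichever holder drops last.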

Re: [PATCH 03/22] [RESEND] kprobes: unify kprobes_exceptions_nofify() prototypes

2023-11-08 Thread Google

On Wed,  8 Nov 2023 13:58:24 +0100
Arnd Bergmann  wrote:

> From: Arnd Bergmann 
> 
> Most architectures that support kprobes declare this function in their
> own asm/kprobes.h header and provide an override, but some are missing
> the prototype, which causes a warning for the __weak stub implementation:
> 
> kernel/kprobes.c:1865:12: error: no previous prototype for 
> 'kprobe_exceptions_notify' [-Werror=missing-prototypes]
>  1865 | int __weak kprobe_exceptions_notify(struct notifier_block *self,
> 
> Move the prototype into linux/kprobes.h so it is visible to all
> the definitions.

Thanks, let me pick this to linux-trace tree.
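
As a small illustration of the warning being fixed: with -Wmissing-prototypes, gcc complains about any non-static function that is defined without a prior declaration, and a weak default is no exception. Declaring the prototype once, ahead of the definition, covers the stub and any override. The function name demo_exceptions_notify() and its shortened signature are placeholders, not the kernel's.

```c
/* In the kernel this declaration would live in a shared header. */
int demo_exceptions_notify(unsigned long val, void *data);

/* Weak default: a strong definition elsewhere would replace it at link time. */
__attribute__((weak)) int demo_exceptions_notify(unsigned long val, void *data)
{
	(void)val;
	(void)data;
	return 0;	/* the stub handles nothing */
}
```

Compiling this with -Wmissing-prototypes is quiet; deleting the first declaration brings the warning back.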

> 
> Acked-by: Masami Hiramatsu (Google) 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arc/include/asm/kprobes.h | 3 ---
>  arch/arm/include/asm/kprobes.h | 2 --
>  arch/arm64/include/asm/kprobes.h   | 2 --
>  arch/mips/include/asm/kprobes.h| 2 --
>  arch/powerpc/include/asm/kprobes.h | 2 --
>  arch/s390/include/asm/kprobes.h| 2 --
>  arch/sh/include/asm/kprobes.h  | 2 --
>  arch/sparc/include/asm/kprobes.h   | 2 --
>  arch/x86/include/asm/kprobes.h | 2 --
>  include/linux/kprobes.h| 4 
>  10 files changed, 4 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arc/include/asm/kprobes.h b/arch/arc/include/asm/kprobes.h
> index de1566e32cb8..68e8301c0df2 100644
> --- a/arch/arc/include/asm/kprobes.h
> +++ b/arch/arc/include/asm/kprobes.h
> @@ -32,9 +32,6 @@ struct kprobe;
>  
>  void arch_remove_kprobe(struct kprobe *p);
>  
> -int kprobe_exceptions_notify(struct notifier_block *self,
> -  unsigned long val, void *data);
> -
>  struct prev_kprobe {
>   struct kprobe *kp;
>   unsigned long status;
> diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h
> index e26a278d301a..5b8dbf1b0be4 100644
> --- a/arch/arm/include/asm/kprobes.h
> +++ b/arch/arm/include/asm/kprobes.h
> @@ -40,8 +40,6 @@ struct kprobe_ctlblk {
>  
>  void arch_remove_kprobe(struct kprobe *);
>  int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
> -int kprobe_exceptions_notify(struct notifier_block *self,
> -  unsigned long val, void *data);
>  
>  /* optinsn template addresses */
>  extern __visible kprobe_opcode_t optprobe_template_entry[];
> diff --git a/arch/arm64/include/asm/kprobes.h 
> b/arch/arm64/include/asm/kprobes.h
> index 05cd82eeca13..be7a3680dadf 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -37,8 +37,6 @@ struct kprobe_ctlblk {
>  
>  void arch_remove_kprobe(struct kprobe *);
>  int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
> -int kprobe_exceptions_notify(struct notifier_block *self,
> -  unsigned long val, void *data);
>  void __kretprobe_trampoline(void);
>  void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
>  
> diff --git a/arch/mips/include/asm/kprobes.h b/arch/mips/include/asm/kprobes.h
> index 68b1e5d458cf..bc27d99c9436 100644
> --- a/arch/mips/include/asm/kprobes.h
> +++ b/arch/mips/include/asm/kprobes.h
> @@ -71,8 +71,6 @@ struct kprobe_ctlblk {
>   struct prev_kprobe prev_kprobe;
>  };
>  
> -extern int kprobe_exceptions_notify(struct notifier_block *self,
> - unsigned long val, void *data);
>  
>  #endif /* CONFIG_KPROBES */
>  #endif /* _ASM_KPROBES_H */
> diff --git a/arch/powerpc/include/asm/kprobes.h 
> b/arch/powerpc/include/asm/kprobes.h
> index c8e4b4fd4e33..4525a9c68260 100644
> --- a/arch/powerpc/include/asm/kprobes.h
> +++ b/arch/powerpc/include/asm/kprobes.h
> @@ -84,8 +84,6 @@ struct arch_optimized_insn {
>   kprobe_opcode_t *insn;
>  };
>  
> -extern int kprobe_exceptions_notify(struct notifier_block *self,
> - unsigned long val, void *data);
>  extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
>  extern int kprobe_handler(struct pt_regs *regs);
>  extern int kprobe_post_handler(struct pt_regs *regs);
> diff --git a/arch/s390/include/asm/kprobes.h b/arch/s390/include/asm/kprobes.h
> index 21b9e5290c04..01f1682a73b7 100644
> --- a/arch/s390/include/asm/kprobes.h
> +++ b/arch/s390/include/asm/kprobes.h
> @@ -73,8 +73,6 @@ struct kprobe_ctlblk {
>  void arch_remove_kprobe(struct kprobe *p);
>  
>  int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
> -int kprobe_exceptions_notify(struct notifier_block *self,
> - unsigned long val, void *data);
>  
>  #define flush_insn_slot(p)   do { } while (0)
>  
> diff --git a/arch/sh/include/asm/kprobes.h b/arch/sh/include/asm/kprobes.h
> index eeba83e0a7d2..65d4c3316a5b 100644
> --- a/arch/sh/include/asm/kprobes.h
> +++ b/arch/sh/include/asm/kprobes.h
> @@ -46,8 +46,6 @@ struct kprobe_ctlblk {
>  };
>  
>  extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
> -extern int kprobe_exceptions_notify(struct notifier_block *self,
> - unsigned long val,

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Helge Deller


On 11/8/23 22:52, Uwe Kleine-König wrote:

On Wed, Nov 08, 2023 at 10:24:09PM +0100, Helge Deller wrote:

On 11/8/23 22:00, Uwe Kleine-König wrote:

On Wed, Nov 08, 2023 at 11:48:05AM -0700, Nathan Chancellor wrote:

diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
b/drivers/video/fbdev/atmel_lcdfb.c
index 88c75ae7d315..9e391e5eaf9d 100644
--- a/drivers/video/fbdev/atmel_lcdfb.c
+++ b/drivers/video/fbdev/atmel_lcdfb.c
@@ -220,7 +220,7 @@ static inline void atmel_lcdfb_power_control(struct 
atmel_lcdfb_info *sinfo, int
}
   }

-static const struct fb_fix_screeninfo atmel_lcdfb_fix __initconst = {
+static const struct fb_fix_screeninfo atmel_lcdfb_fix = {
.type   = FB_TYPE_PACKED_PIXELS,
.visual = FB_VISUAL_TRUECOLOR,
.xpanstep   = 0,


I wonder if this was broken already before my patch. atmel_lcdfb_probe()
does

info->fix = atmel_lcdfb_fix;

and unless I miss something (this is well possible) that is used e.g. in
atmel_lcdfb_set_par() -> atmel_lcdfb_update_dma(). So atmel_lcdfb_fix
should better not live in .init memory?! Someone with more knowledge
about fbdev might want to take a look and decide if this justifies a
separate fix that should then be backported to stable, too?!


I don't think a backport is necessary.
The "__initconst" atmel_lcdfb_fix struct was only copied in the
"__init" atmel_lcdfb_probe() function.
So, both were dropped at the same time in older kernels.


But info and so info->fix live longer than the probe function, don't
they?


Yes, they do.
But AFAICS info->fix contains a *copy* of the initial atmel_lcdfb_fix struct
(and not a pointer to it). So that should be ok.
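
The copy-versus-pointer distinction can be checked with a few lines of ordinary C: assigning one struct to another copies the contents, so the copy does not depend on the source staying around. The types below are simplified stand-ins for fb_fix_screeninfo and fb_info, and overwriting the template merely simulates .init memory being discarded.

```c
#include <string.h>

/* Simplified stand-ins for fb_fix_screeninfo and fb_info. */
struct fix_info {
	int type;
	int visual;
	int xpanstep;
};

struct fb {
	struct fix_info fix;
};

static struct fix_info template_fix = {
	.type = 1,
	.visual = 2,
	.xpanstep = 0,
};

static void probe(struct fb *info)
{
	info->fix = template_fix;	/* struct assignment copies by value */
}

static void discard_template(void)
{
	/* Simulate the template's memory being reused after init. */
	memset(&template_fix, 0xff, sizeof(template_fix));
}
```

After probe() followed by discard_template(), info->fix still holds the original values, which is why only the copy itself has to happen while the template is still present.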

Helge

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Uwe Kleine-König

On Wed, Nov 08, 2023 at 10:24:09PM +0100, Helge Deller wrote:
> On 11/8/23 22:00, Uwe Kleine-König wrote:
> > On Wed, Nov 08, 2023 at 11:48:05AM -0700, Nathan Chancellor wrote:
> > > diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
> > > b/drivers/video/fbdev/atmel_lcdfb.c
> > > index 88c75ae7d315..9e391e5eaf9d 100644
> > > --- a/drivers/video/fbdev/atmel_lcdfb.c
> > > +++ b/drivers/video/fbdev/atmel_lcdfb.c
> > > @@ -220,7 +220,7 @@ static inline void atmel_lcdfb_power_control(struct 
> > > atmel_lcdfb_info *sinfo, int
> > >   }
> > >   }
> > > 
> > > -static const struct fb_fix_screeninfo atmel_lcdfb_fix __initconst = {
> > > +static const struct fb_fix_screeninfo atmel_lcdfb_fix = {
> > >   .type   = FB_TYPE_PACKED_PIXELS,
> > >   .visual = FB_VISUAL_TRUECOLOR,
> > >   .xpanstep   = 0,
> > 
> > I wonder if this was broken already before my patch. atmel_lcdfb_probe()
> > does
> > 
> > info->fix = atmel_lcdfb_fix;
> > 
> > and unless I miss something (this is well possible) that is used e.g. in
> > atmel_lcdfb_set_par() -> atmel_lcdfb_update_dma(). So atmel_lcdfb_fix
> > should better not live in .init memory?! Someone with more knowledge
> > about fbdev might want to take a look and decide if this justifies a
> > separate fix that should then be backported to stable, too?!
> 
> I don't think a backport is necessary.
> The "__initconst" atmel_lcdfb_fix struct was only copied in the
> "__init" atmel_lcdfb_probe() function.
> So, both were dropped at the same time in older kernels.

But info and so info->fix live longer than the probe function, don't
they? So a call to atmel_lcdfb_update_dma() should better not happen
when .init is already discarded, right?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |



Re: [PATCH 01/22] fb: amifb: Stop using platform_driver_probe()

2023-11-08 Thread Geert Uytterhoeven

Hi Helge,

On Wed, Nov 8, 2023 at 10:32 PM Helge Deller  wrote:
> On 11/8/23 22:06, Geert Uytterhoeven wrote:
> > On Tue, Nov 7, 2023 at 10:20 AM Uwe Kleine-König
> >  wrote:
> >> On today's platforms the benefit of platform_driver_probe() isn't that
> >> relevant any more. It allows to drop some code after booting (or module
> >> loading) for .probe() and discard the .remove() function completely if
> >> the driver is built-in. This typically saves a few 100k.
> >
> > Which is a lot on platforms with only a few MiBs of RAM...
>
> True.
> Given the warnings below, what is your suggestion?
> Better to drop the amifb patch ?

I think so. There is a reason these drivers use platform_driver_probe().

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH 01/22] fb: amifb: Stop using platform_driver_probe()

2023-11-08 Thread Helge Deller


On 11/8/23 22:06, Geert Uytterhoeven wrote:

On Tue, Nov 7, 2023 at 10:20 AM Uwe Kleine-König
 wrote:

On today's platforms the benefit of platform_driver_probe() isn't that
relevant any more. It allows to drop some code after booting (or module
loading) for .probe() and discard the .remove() function completely if
the driver is built-in. This typically saves a few 100k.


Which is a lot on platforms with only a few MiBs of RAM...


True.
Given the warnings below, what is your suggestion?
Better to drop the amifb patch ?

Helge
...

WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x15c (section: .text) -> ami_modedb (section: .init.data)
...

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Helge Deller


On 11/8/23 22:00, Uwe Kleine-König wrote:

On Wed, Nov 08, 2023 at 11:48:05AM -0700, Nathan Chancellor wrote:

diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
b/drivers/video/fbdev/atmel_lcdfb.c
index 88c75ae7d315..9e391e5eaf9d 100644
--- a/drivers/video/fbdev/atmel_lcdfb.c
+++ b/drivers/video/fbdev/atmel_lcdfb.c
@@ -220,7 +220,7 @@ static inline void atmel_lcdfb_power_control(struct 
atmel_lcdfb_info *sinfo, int
}
  }

-static const struct fb_fix_screeninfo atmel_lcdfb_fix __initconst = {
+static const struct fb_fix_screeninfo atmel_lcdfb_fix = {
.type   = FB_TYPE_PACKED_PIXELS,
.visual = FB_VISUAL_TRUECOLOR,
.xpanstep   = 0,


I wonder if this was broken already before my patch. atmel_lcdfb_probe()
does

info->fix = atmel_lcdfb_fix;

and unless I miss something (this is well possible) that is used e.g. in
atmel_lcdfb_set_par() -> atmel_lcdfb_update_dma(). So atmel_lcdfb_fix
should better not live in .init memory?! Someone with more knowledge
about fbdev might want to take a look and decide if this justifies a
separate fix that should then be backported to stable, too?!


I don't think a backport is necessary.
The "__initconst" atmel_lcdfb_fix struct was only copied in the
"__init" atmel_lcdfb_probe() function.
So, both were dropped at the same time in older kernels.

Since your patch dropped the "__init" from atmel_lcdfb_probe(),
the __initconst from atmel_lcdfb_fix has to be removed too.

So, I believe folding in Nathan's patch is OK and we don't need
a separate (or backport) patch.

Helge

Re: [PATCH 10/22] microblaze: include linux/cpu.h for trap_init() prototype

2023-11-08 Thread Geert Uytterhoeven

Hi Arnd,

On Wed, Nov 8, 2023 at 10:07 PM Arnd Bergmann  wrote:
> On Wed, Nov 8, 2023, at 21:42, Geert Uytterhoeven wrote:
> > On Wed, Nov 8, 2023 at 2:01 PM Arnd Bergmann  wrote:
> >> From: Arnd Bergmann 
> >>
> >> Microblaze runs into a single -Wmissing-prototypes warning when that is
> >> enabled:
> >>
> >> arch/microblaze/kernel/traps.c:21:6: warning: no previous prototype for 
> >> 'trap_init' [-Wmissing-prototypes]
> >>
> >> Include the right header to avoid this.
> >>
> >> Signed-off-by: Arnd Bergmann 
> >
> > Thanks for your patch!
> >
> >>  arch/alpha/kernel/traps.c  | 1 +
> >>  arch/csky/include/asm/traps.h  | 2 --
> >>  arch/csky/kernel/traps.c   | 1 +
> >>  arch/m68k/coldfire/vectors.c   | 3 +--
> >>  arch/m68k/coldfire/vectors.h   | 3 ---
> >
> > Ah, so this is where the m68k changes listed in the cover letter are
> > hiding ;-)
> >
> >>  arch/microblaze/kernel/traps.c | 1 +
> >>  arch/sparc/kernel/traps_32.c   | 1 +
> >>  arch/sparc/kernel/traps_64.c   | 1 +
> >>  arch/x86/include/asm/traps.h   | 1 -
> >>  arch/x86/kernel/traps.c| 1 +
> >>  10 files changed, 7 insertions(+), 8 deletions(-)
> >>  delete mode 100644 arch/m68k/coldfire/vectors.h
> >
> > Obviously the non-microblaze changes should be spun off in separate
> > patches.
>
> I messed up one of my rebases here and accidentally sent
> the wrong changelog text. My intention was to have the
> combined patch but with this text:
>
> arch: include linux/cpu.h for trap_init() prototype
>
> some architectures run into a -Wmissing-prototypes warning
> for trap_init()
>
> arch/microblaze/kernel/traps.c:21:6: warning: no previous prototype for 
> 'trap_init' [-Wmissing-prototypes]
>
> Include the right header to avoid this consistently, removing
> the extra declarations on m68k and x86 that were added as local
> workarounds already.
>
> Signed-off-by: Arnd Bergmann 

That makes sense, although it's hard to combine this with "my preference
would be for the patches to make it through the respective subsystem
maintainer trees"...

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH 10/22] microblaze: include linux/cpu.h for trap_init() prototype

2023-11-08 Thread Arnd Bergmann

On Wed, Nov 8, 2023, at 21:42, Geert Uytterhoeven wrote:
>
> On Wed, Nov 8, 2023 at 2:01 PM Arnd Bergmann  wrote:
>> From: Arnd Bergmann 
>>
>> Microblaze runs into a single -Wmissing-prototypes warning when that is
>> enabled:
>>
>> arch/microblaze/kernel/traps.c:21:6: warning: no previous prototype for 
>> 'trap_init' [-Wmissing-prototypes]
>>
>> Include the right header to avoid this.
>>
>> Signed-off-by: Arnd Bergmann 
>
> Thanks for your patch!
>
>>  arch/alpha/kernel/traps.c  | 1 +
>>  arch/csky/include/asm/traps.h  | 2 --
>>  arch/csky/kernel/traps.c   | 1 +
>>  arch/m68k/coldfire/vectors.c   | 3 +--
>>  arch/m68k/coldfire/vectors.h   | 3 ---
>
> Ah, so this is where the m68k changes listed in the cover letter are
> hiding ;-)
>
>>  arch/microblaze/kernel/traps.c | 1 +
>>  arch/sparc/kernel/traps_32.c   | 1 +
>>  arch/sparc/kernel/traps_64.c   | 1 +
>>  arch/x86/include/asm/traps.h   | 1 -
>>  arch/x86/kernel/traps.c| 1 +
>>  10 files changed, 7 insertions(+), 8 deletions(-)
>>  delete mode 100644 arch/m68k/coldfire/vectors.h
>
> Obviously the non-microblaze changes should be spun off in separate
> patches.

I messed up one of my rebases here and accidentally sent
the wrong changelog text. My intention was to have the
combined patch but with this text:

arch: include linux/cpu.h for trap_init() prototype

some architectures run into a -Wmissing-prototypes warning
for trap_init()

arch/microblaze/kernel/traps.c:21:6: warning: no previous prototype for 
'trap_init' [-Wmissing-prototypes]

Include the right header to avoid this consistently, removing
the extra declarations on m68k and x86 that were added as local
workarounds already.

Signed-off-by: Arnd Bergmann 


I made the same mistake with the "arch: add do_page_fault prototypes"
patch that was missing an explanation.

  Arnd
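
To illustrate the warning discussed in this thread, here is a minimal
userspace sketch, not kernel code. It assumes GCC or Clang with
-Wmissing-prototypes. The first declaration stands in for the one a
definition picks up by including linux/cpu.h; traps_ready and
traps_initialized() are hypothetical names added only to make the
sketch testable.

```c
/* Stand-in for the shared header: in the kernel the trap_init()
 * declaration comes from <linux/cpu.h>; here it is written inline. */
void trap_init(void);
int traps_initialized(void);	/* hypothetical helper, not a kernel API */

static int traps_ready;		/* hypothetical state for this sketch */

/* A prototype is already in scope, so -Wmissing-prototypes has
 * nothing to report for this non-static definition. */
void trap_init(void)
{
	traps_ready = 1;
}

int traps_initialized(void)
{
	return traps_ready;
}
```

Removing the first declaration and compiling with -Wmissing-prototypes
should bring back a "no previous prototype" warning of the kind quoted
above.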

Re: [PATCH 01/22] fb: amifb: Stop using platform_driver_probe()

2023-11-08 Thread Geert Uytterhoeven

Hi Uwe,

On Tue, Nov 7, 2023 at 10:20 AM Uwe Kleine-König
 wrote:
> On today's platforms the benefit of platform_driver_probe() isn't that
> relevant any more. It allows to drop some code after booting (or module
> loading) for .probe() and discard the .remove() function completely if
> the driver is built-in. This typically saves a few 100k.

Which is a lot on platforms with only a few MiBs of RAM...

> The downside of platform_driver_probe() is that the driver cannot be
> bound and unbound at runtime which is ancient and also slightly
> complicates testing. There are also thoughts to deprecate
> platform_driver_probe() because it adds some complexity in the driver
> core for little gain. Also many drivers don't use it correctly. This
> driver for example misses to mark the driver struct with __refdata which
> is needed to suppress a (W=1) modpost warning:
>
> WARNING: modpost: drivers/video/fbdev/amifb: section mismatch in 
> reference: amifb_driver+0x4 (section: .data) -> amifb_remove (section: 
> .exit.text)
>
> Signed-off-by: Uwe Kleine-König 

Thanks for your patch!

> --- a/drivers/video/fbdev/amifb.c
> +++ b/drivers/video/fbdev/amifb.c
> @@ -3530,7 +3530,7 @@ static inline void chipfree(void)
>  * Initialisation
>  */
>
> -static int __init amifb_probe(struct platform_device *pdev)
> +static int amifb_probe(struct platform_device *pdev)

nore...@ellerman.id.au reported the following build failure for
e.g. m68k-defconfig in next-20231108:

WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x15c (section: .text) -> ami_modedb (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x17a (section: .text) -> amifb_hfmin (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x188 (section: .text) -> amifb_hfmax (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x190 (section: .text) -> amifb_vfmin (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x198 (section: .text) -> amifb_vfmax (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x1ba (section: .text) -> ami_modedb (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x1c4 (section: .text) -> ami_modedb (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x1ca (section: .text) -> mode_option (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x1ee (section: .text) -> ami_modedb (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x398 (section: .text) -> amifb_hfmin (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x39e (section: .text) -> amifb_hfmax (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x3a4 (section: .text) -> amifb_vfmin (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x3aa (section: .text) -> amifb_vfmax (section:
.init.data)
WARNING: modpost: vmlinux: section mismatch in reference:
amifb_probe+0x3f0 (section: .text) -> mode_option (section:
.init.data)
ERROR: modpost: Section mismatches detected.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
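
The modpost warnings in the amifb report above all have the same shape:
code in .text refers to data in .init.data, which may be discarded after
init. The following is a simplified userspace model of that rule, offered
as a sketch only. It is not modpost's actual logic (which also honours
annotations such as __refdata), and struct sym_ref, sec_is_discardable()
and ref_is_mismatch() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One reference in the form the warnings use:
 * from_sym (from_sec) -> to_sym (to_sec). */
struct sym_ref {
	const char *from_sym;
	const char *from_sec;
	const char *to_sym;
	const char *to_sec;
};

/* Assumption for this sketch: any section named .init.* or .exit.*
 * may be discarded, everything else stays resident. */
static bool sec_is_discardable(const char *sec)
{
	return strncmp(sec, ".init.", 6) == 0 ||
	       strncmp(sec, ".exit.", 6) == 0;
}

/* Flag a reference when resident code or data points into a
 * section that may no longer exist when the reference is used. */
static bool ref_is_mismatch(const struct sym_ref *ref)
{
	assert(ref != NULL);
	return !sec_is_discardable(ref->from_sec) &&
	       sec_is_discardable(ref->to_sec);
}
```

Under this model, dropping __init from amifb_probe() moves the function to
.text, so each of its references to __initdata variables is flagged, which
matches the list of warnings above.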

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Uwe Kleine-König

Hello,

On Wed, Nov 08, 2023 at 11:48:05AM -0700, Nathan Chancellor wrote:
> diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
> b/drivers/video/fbdev/atmel_lcdfb.c
> index 88c75ae7d315..9e391e5eaf9d 100644
> --- a/drivers/video/fbdev/atmel_lcdfb.c
> +++ b/drivers/video/fbdev/atmel_lcdfb.c
> @@ -220,7 +220,7 @@ static inline void atmel_lcdfb_power_control(struct 
> atmel_lcdfb_info *sinfo, int
>   }
>  }
>  
> -static const struct fb_fix_screeninfo atmel_lcdfb_fix __initconst = {
> +static const struct fb_fix_screeninfo atmel_lcdfb_fix = {
>   .type   = FB_TYPE_PACKED_PIXELS,
>   .visual = FB_VISUAL_TRUECOLOR,
>   .xpanstep   = 0,

I wonder if this was broken already before my patch. atmel_lcdfb_probe()
does

info->fix = atmel_lcdfb_fix;

and unless I miss something (this is well possible) that is used e.g. in
atmel_lcdfb_set_par() -> atmel_lcdfb_update_dma(). So atmel_lcdfb_fix
should better not live in .init memory?! Someone with more knowledge
about fbdev might want to take a look and decide if this justifies a
separate fix that should then be backported to stable, too?!

> @@ -841,7 +841,7 @@ static void atmel_lcdfb_task(struct work_struct *work)
>   atmel_lcdfb_reset(sinfo);
>  }
>  
> -static int __init atmel_lcdfb_init_fbinfo(struct atmel_lcdfb_info *sinfo)
> +static int atmel_lcdfb_init_fbinfo(struct atmel_lcdfb_info *sinfo)
>  {
>   struct fb_info *info = sinfo->info;
>   int ret = 0;

This is only a problem since my patch.

Thanks for your report and patch.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |



Re: [PATCH 10/22] microblaze: include linux/cpu.h for trap_init() prototype

2023-11-08 Thread Geert Uytterhoeven

Hi Arnd,

On Wed, Nov 8, 2023 at 2:01 PM Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> Microblaze runs into a single -Wmissing-prototypes warning when that is
> enabled:
>
> arch/microblaze/kernel/traps.c:21:6: warning: no previous prototype for 
> 'trap_init' [-Wmissing-prototypes]
>
> Include the right header to avoid this.
>
> Signed-off-by: Arnd Bergmann 

Thanks for your patch!

>  arch/alpha/kernel/traps.c  | 1 +
>  arch/csky/include/asm/traps.h  | 2 --
>  arch/csky/kernel/traps.c   | 1 +
>  arch/m68k/coldfire/vectors.c   | 3 +--
>  arch/m68k/coldfire/vectors.h   | 3 ---

Ah, so this is where the m68k changes listed in the cover letter are
hiding ;-)

>  arch/microblaze/kernel/traps.c | 1 +
>  arch/sparc/kernel/traps_32.c   | 1 +
>  arch/sparc/kernel/traps_64.c   | 1 +
>  arch/x86/include/asm/traps.h   | 1 -
>  arch/x86/kernel/traps.c| 1 +
>  10 files changed, 7 insertions(+), 8 deletions(-)
>  delete mode 100644 arch/m68k/coldfire/vectors.h

Obviously the non-microblaze changes should be spun off in separate
patches.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Helge Deller


On 11/8/23 19:48, Nathan Chancellor wrote:

On Tue, Nov 07, 2023 at 10:17:43AM +0100, Uwe Kleine-König wrote:

On today's platforms the benefit of platform_driver_probe() isn't that
relevant any more. It allows to drop some code after booting (or module
loading) for .probe() and discard the .remove() function completely if
the driver is built-in. This typically saves a few 100k.

The downside of platform_driver_probe() is that the driver cannot be
bound and unbound at runtime which is ancient and also slightly
complicates testing. There are also thoughts to deprecate
platform_driver_probe() because it adds some complexity in the driver
core for little gain. Also many drivers don't use it correctly. This
driver for example misses to mark the driver struct with __refdata which
is needed to suppress a (W=1) modpost warning:

WARNING: modpost: drivers/video/fbdev/atmel_lcdfb: section mismatch in 
reference: atmel_lcdfb_driver+0x4 (section: .data) -> atmel_lcdfb_remove 
(section: .exit.text)

Signed-off-by: Uwe Kleine-König 
---
  drivers/video/fbdev/atmel_lcdfb.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
b/drivers/video/fbdev/atmel_lcdfb.c
index a908db233409..b218731ef732 100644
--- a/drivers/video/fbdev/atmel_lcdfb.c
+++ b/drivers/video/fbdev/atmel_lcdfb.c
@@ -1017,7 +1017,7 @@ static int atmel_lcdfb_of_init(struct atmel_lcdfb_info 
*sinfo)
return ret;
  }

-static int __init atmel_lcdfb_probe(struct platform_device *pdev)
+static int atmel_lcdfb_probe(struct platform_device *pdev)
  {
struct device *dev = &pdev->dev;
struct fb_info *info;
@@ -1223,7 +1223,7 @@ static int __init atmel_lcdfb_probe(struct 
platform_device *pdev)
return ret;
  }

-static int __exit atmel_lcdfb_remove(struct platform_device *pdev)
+static int atmel_lcdfb_remove(struct platform_device *pdev)
  {
struct device *dev = &pdev->dev;
struct fb_info *info = dev_get_drvdata(dev);
@@ -1301,7 +1301,8 @@ static int atmel_lcdfb_resume(struct platform_device 
*pdev)
  #endif

  static struct platform_driver atmel_lcdfb_driver = {
-   .remove = __exit_p(atmel_lcdfb_remove),
+   .probe  = atmel_lcdfb_probe,
+   .remove = atmel_lcdfb_remove,
.suspend= atmel_lcdfb_suspend,
.resume = atmel_lcdfb_resume,
.driver = {
@@ -1310,7 +1311,7 @@ static struct platform_driver atmel_lcdfb_driver = {
},
  };

-module_platform_driver_probe(atmel_lcdfb_driver, atmel_lcdfb_probe);
+module_platform_driver(atmel_lcdfb_driver, );

  MODULE_DESCRIPTION("AT91 LCD Controller framebuffer driver");
  MODULE_AUTHOR("Nicolas Ferre ");
--
2.42.0



For what it's worth, this introduces a warning when building certain
configurations (such as ARCH=arm multi_v5_defconfig) with clang:

   WARNING: modpost: vmlinux: section mismatch in reference: 
atmel_lcdfb_probe+0x6c4 (section: .text) -> atmel_lcdfb_init_fbinfo (section: 
.init.text)
   WARNING: modpost: vmlinux: section mismatch in reference: 
atmel_lcdfb_probe+0x858 (section: .text) -> atmel_lcdfb_fix (section: 
.init.rodata)

This appears to be legitimate to me? GCC did not warn but I assume that
is due to differences in inlining. The following clears it up for me,
should I send a standalone patch or should this be squashed in?


I've squashed it into the original patch.

Thank you!
Helge



Cheers,
Nathan

diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
b/drivers/video/fbdev/atmel_lcdfb.c
index 88c75ae7d315..9e391e5eaf9d 100644
--- a/drivers/video/fbdev/atmel_lcdfb.c
+++ b/drivers/video/fbdev/atmel_lcdfb.c
@@ -220,7 +220,7 @@ static inline void atmel_lcdfb_power_control(struct 
atmel_lcdfb_info *sinfo, int
}
  }

-static const struct fb_fix_screeninfo atmel_lcdfb_fix __initconst = {
+static const struct fb_fix_screeninfo atmel_lcdfb_fix = {
.type   = FB_TYPE_PACKED_PIXELS,
.visual = FB_VISUAL_TRUECOLOR,
.xpanstep   = 0,
@@ -841,7 +841,7 @@ static void atmel_lcdfb_task(struct work_struct *work)
atmel_lcdfb_reset(sinfo);
  }

-static int __init atmel_lcdfb_init_fbinfo(struct atmel_lcdfb_info *sinfo)
+static int atmel_lcdfb_init_fbinfo(struct atmel_lcdfb_info *sinfo)
  {
struct fb_info *info = sinfo->info;
int ret = 0;

Re: [PATCH 16/22] bcachefs: mark bch2_target_to_text_sb() static

2023-11-08 Thread Kent Overstreet

On Wed, Nov 08, 2023 at 01:58:37PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> bch2_target_to_text_sb() is only called in the file it is defined in,
> and it has no extern prototype:
> 
> fs/bcachefs/disk_groups.c:583:6: error: no previous prototype for 
> 'bch2_target_to_text_sb' [-Werror=missing-prototypes]
> 
> Mark it static to avoid the warning and have the code better optimized.
> 
> Fixes: bf0d9e89de2e ("bcachefs: Split apart bch2_target_to_text(), 
> bch2_target_to_text_sb()")
> Signed-off-by: Arnd Bergmann 

This is already fixed in my tree.

Re: [PATCH 15/22] arch: vdso: consolidate gettime prototypes

2023-11-08 Thread Arnd Bergmann

On Wed, Nov 8, 2023, at 19:31, Christophe Leroy wrote:
> Le 08/11/2023 à 13:58, Arnd Bergmann a écrit :

> powerpc has functions doing more or less the same; they are called
> __c_kernel_clock_gettime() and the like, with their prototypes sitting in
> arch/powerpc/include/asm/vdso/gettimeofday.h
>
> Should those prototypes be moved to include/vdso/gettime.h too and
> eventually renamed, or are they considered too powerpc-specific?

I don't actually know, my initial interpretation was that
these function names are part of the user ABI for the vdso,
but I never looked closely enough at how vdso works to
be sure what the actual ABI is.

If __c_kernel_clock_gettime() etc are not part of the user-facing
ABI, I think renaming them for consistency with the other
architectures would be best.

 Arnd

Re: [PATCH 09/22] [v2] arch: fix asm-offsets.c building with -Wmissing-prototypes

2023-11-08 Thread Sam Ravnborg

On Wed, Nov 08, 2023 at 01:58:30PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> When -Wmissing-prototypes is enabled, some asm-offsets.c files fail
> to build, even when this warning is disabled in the Makefile for normal
> files:
> 
> arch/sparc/kernel/asm-offsets.c:22:5: error: no previous prototype for 
> 'sparc32_foo' [-Werror=missing-prototypes]
> arch/sparc/kernel/asm-offsets.c:48:5: error: no previous prototype for 'foo' 
> [-Werror=missing-prototypes]
> 
> Address this by making use of the same trick as x86, marking these
> functions as 'static __used' to avoid the need for a prototype
> without having them dropped in dead-code elimination.
> 
> Suggested-by: Masahiro Yamada 
> Link: 
> https://lore.kernel.org/lkml/cak7lnarfemfk0du4hed19ex_g6tuc5wg0zp+l1ayvdpof4y...@mail.gmail.com/
> Signed-off-by: Arnd Bergmann 
Looks good. I sometimes look at sparc patches, so I looked at this one.
Reviewed-by: Sam Ravnborg
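
A rough userspace sketch of the 'static __used' trick from the commit
message above, assuming GCC or Clang. SKETCH_USED is a hypothetical
stand-in for the kernel's __used macro, and offsets_emitted is a made-up
counter; the real asm-offsets.c functions contain DEFINE()-style
statements instead of this body.

```c
/* Stand-in for the kernel's __used: ask the compiler to keep the
 * symbol even if nothing in this file references it. */
#define SKETCH_USED __attribute__((__used__))

static int offsets_emitted;	/* hypothetical counter for this sketch */

/* 'static' means no prototype is expected, so -Wmissing-prototypes
 * stays quiet; the 'used' attribute keeps the otherwise unreferenced
 * function from being dropped as dead code. */
static SKETCH_USED int foo(void)
{
	offsets_emitted++;
	return 0;
}
```

Without the attribute, a static function that is never called would
normally be discarded (and reported by -Wunused-function), which would
defeat the purpose of a file whose only job is to emit output at build
time.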

Re: [PATCH v3] drm: amd: Resolve Sphinx unexpected indentation warning

2023-11-08 Thread Alex Deucher

Applied.  Thanks!

On Tue, Nov 7, 2023 at 11:17 PM Lazar, Lijo  wrote:
>
>
>
> On 11/7/2023 9:58 PM, Hunter Chasens wrote:
> > Resolves Sphinx unexpected indentation warning when compiling
> > documentation (e.g. `make htmldocs`). Replaces tabs with spaces and adds
> > a literal block to keep vertical formatting of the
> > example power state list.
> >
> > Signed-off-by: Hunter Chasens 
>
> Reviewed-by: Lijo Lazar 
>
> Thanks,
> Lijo
>
> > ---
> >   drivers/gpu/drm/amd/pm/amdgpu_pm.c | 13 +++--
> >   1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
> > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > index 517b9fb4624c..576202bf64f3 100644
> > --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > @@ -989,12 +989,13 @@ static ssize_t amdgpu_get_pp_features(struct device 
> > *dev,
> >* Reading back the files will show you the available power levels within
> >* the power state and the clock information for those levels. If deep 
> > sleep is
> >* applied to a clock, the level will be denoted by a special level 'S:'
> > - * E.g.,
> > - *   S: 19Mhz *
> > - *   0: 615Mhz
> > - *   1: 800Mhz
> > - *   2: 888Mhz
> > - *   3: 1000Mhz
> > + * E.g., ::
> > + *
> > + *  S: 19Mhz *
> > + *  0: 615Mhz
> > + *  1: 800Mhz
> > + *  2: 888Mhz
> > + *  3: 1000Mhz
> >*
> >*
> >* To manually adjust these states, first select manual using

Re: [PATCH] drm/amd/display: remove duplicated argument

2023-11-08 Thread Alex Deucher

Applied.  Thanks!

On Wed, Nov 8, 2023 at 3:52 AM José Pekkarinen
 wrote:
>
> On 2023-10-30 15:54, Aurabindo Pillai wrote:
> > On 10/29/2023 5:39 AM, José Pekkarinen wrote:
> >> Spotted by coccicheck, there is a redundant check for
> >> v->SourcePixelFormat[k] != dm_444_16. This patch will
> >> remove it. The corresponding output follows.
> >>
> >> drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c:5130:86-122:
> >> duplicated argument to && or ||
> >>
> >> Signed-off-by: José Pekkarinen 
> >> ---
> >>   drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 2
> >> +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git
> >> a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> >> b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> >> index ad741a723c0e..3686f1e7de3a 100644
> >> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> >> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> >> @@ -5128,7 +5128,7 @@ void
> >> dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib
> >> *mode_l
> >>  ViewportExceedsSurface = true;
> >>  if (v->SourcePixelFormat[k] != dm_444_64 &&
> >> v->SourcePixelFormat[k] != dm_444_32 && v->SourcePixelFormat[k] !=
> >> dm_444_16
> >> -&& v->SourcePixelFormat[k] != dm_444_16 &&
> >> v->SourcePixelFormat[k] != dm_444_8 && v->SourcePixelFormat[k] !=
> >> dm_rgbe) {
> >> +&& v->SourcePixelFormat[k] != dm_444_8 && 
> >> v->SourcePixelFormat[k]
> >> != dm_rgbe) {
> >>  if (v->ViewportWidthChroma[k] > v->SurfaceWidthC[k] ||
> >> v->ViewportHeightChroma[k] > v->SurfaceHeightC[k]) {
> >>  ViewportExceedsSurface = true;
> >>  }
> >
> > Thanks for catching.
> >
> > Reviewed-by: Aurabindo Pillai 
>
>  Sorry to bring this up, I just wanted to check whether this
> has been applied in the following pulls or not.
>
>  Thanks!
>
>  José.

Re: [PATCH 01/17] dt-bindings: hwinfo: samsung,exynos-chipid: add specific compatibles for existing SoC

2023-11-08 Thread Rob Herring



On Wed, 08 Nov 2023 11:43:27 +0100, Krzysztof Kozlowski wrote:
> Samsung Exynos SoC reuses several devices from older designs, thus
> historically we kept the old (block's) compatible only.  This works fine
> and there is no bug here, however guidelines expressed in
> Documentation/devicetree/bindings/writing-bindings.rst state that:
> 1. Compatibles should be specific.
> 2. We should add new compatibles in case of bugs or features.
> 
> Add compatibles specific to each SoC in front of all old-SoC-like
> compatibles.
> 
> Signed-off-by: Krzysztof Kozlowski 
> 
> ---
> 
> I propose to take the patch through Samsung SoC (me). See cover letter
> for explanation.
> ---
>  .../bindings/hwinfo/samsung,exynos-chipid.yaml  | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 

Acked-by: Rob Herring

Re: [PATCH 02/22] fb: atmel_lcdfb: Stop using platform_driver_probe()

2023-11-08 Thread Nathan Chancellor

On Tue, Nov 07, 2023 at 10:17:43AM +0100, Uwe Kleine-König wrote:
> On today's platforms the benefit of platform_driver_probe() isn't that
> relevant any more. It allows to drop some code after booting (or module
> loading) for .probe() and discard the .remove() function completely if
> the driver is built-in. This typically saves a few 100k.
> 
> The downside of platform_driver_probe() is that the driver cannot be
> bound and unbound at runtime which is ancient and also slightly
> complicates testing. There are also thoughts to deprecate
> platform_driver_probe() because it adds some complexity in the driver
> core for little gain. Also many drivers don't use it correctly. This
> driver for example misses to mark the driver struct with __refdata which
> is needed to suppress a (W=1) modpost warning:
> 
>   WARNING: modpost: drivers/video/fbdev/atmel_lcdfb: section mismatch in 
> reference: atmel_lcdfb_driver+0x4 (section: .data) -> atmel_lcdfb_remove 
> (section: .exit.text)
> 
> Signed-off-by: Uwe Kleine-König 
> ---
>  drivers/video/fbdev/atmel_lcdfb.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
> b/drivers/video/fbdev/atmel_lcdfb.c
> index a908db233409..b218731ef732 100644
> --- a/drivers/video/fbdev/atmel_lcdfb.c
> +++ b/drivers/video/fbdev/atmel_lcdfb.c
> @@ -1017,7 +1017,7 @@ static int atmel_lcdfb_of_init(struct atmel_lcdfb_info 
> *sinfo)
>   return ret;
>  }
>  
> -static int __init atmel_lcdfb_probe(struct platform_device *pdev)
> +static int atmel_lcdfb_probe(struct platform_device *pdev)
>  {
>   struct device *dev = &pdev->dev;
>   struct fb_info *info;
> @@ -1223,7 +1223,7 @@ static int __init atmel_lcdfb_probe(struct 
> platform_device *pdev)
>   return ret;
>  }
>  
> -static int __exit atmel_lcdfb_remove(struct platform_device *pdev)
> +static int atmel_lcdfb_remove(struct platform_device *pdev)
>  {
>   struct device *dev = &pdev->dev;
>   struct fb_info *info = dev_get_drvdata(dev);
> @@ -1301,7 +1301,8 @@ static int atmel_lcdfb_resume(struct platform_device 
> *pdev)
>  #endif
>  
>  static struct platform_driver atmel_lcdfb_driver = {
> - .remove = __exit_p(atmel_lcdfb_remove),
> + .probe  = atmel_lcdfb_probe,
> + .remove = atmel_lcdfb_remove,
>   .suspend= atmel_lcdfb_suspend,
>   .resume = atmel_lcdfb_resume,
>   .driver = {
> @@ -1310,7 +1311,7 @@ static struct platform_driver atmel_lcdfb_driver = {
>   },
>  };
>  
> -module_platform_driver_probe(atmel_lcdfb_driver, atmel_lcdfb_probe);
> +module_platform_driver(atmel_lcdfb_driver, );
>  
>  MODULE_DESCRIPTION("AT91 LCD Controller framebuffer driver");
>  MODULE_AUTHOR("Nicolas Ferre ");
> -- 
> 2.42.0
> 

For what it's worth, this introduces a warning when building certain
configurations (such as ARCH=arm multi_v5_defconfig) with clang:

  WARNING: modpost: vmlinux: section mismatch in reference: 
atmel_lcdfb_probe+0x6c4 (section: .text) -> atmel_lcdfb_init_fbinfo (section: 
.init.text)
  WARNING: modpost: vmlinux: section mismatch in reference: 
atmel_lcdfb_probe+0x858 (section: .text) -> atmel_lcdfb_fix (section: 
.init.rodata)

This appears to be legitimate to me? GCC did not warn but I assume that
is due to differences in inlining. The following clears it up for me,
should I send a standalone patch or should this be squashed in?

Cheers,
Nathan

diff --git a/drivers/video/fbdev/atmel_lcdfb.c 
b/drivers/video/fbdev/atmel_lcdfb.c
index 88c75ae7d315..9e391e5eaf9d 100644
--- a/drivers/video/fbdev/atmel_lcdfb.c
+++ b/drivers/video/fbdev/atmel_lcdfb.c
@@ -220,7 +220,7 @@ static inline void atmel_lcdfb_power_control(struct 
atmel_lcdfb_info *sinfo, int
}
 }
 
-static const struct fb_fix_screeninfo atmel_lcdfb_fix __initconst = {
+static const struct fb_fix_screeninfo atmel_lcdfb_fix = {
.type   = FB_TYPE_PACKED_PIXELS,
.visual = FB_VISUAL_TRUECOLOR,
.xpanstep   = 0,
@@ -841,7 +841,7 @@ static void atmel_lcdfb_task(struct work_struct *work)
atmel_lcdfb_reset(sinfo);
 }
 
-static int __init atmel_lcdfb_init_fbinfo(struct atmel_lcdfb_info *sinfo)
+static int atmel_lcdfb_init_fbinfo(struct atmel_lcdfb_info *sinfo)
 {
struct fb_info *info = sinfo->info;
int ret = 0;

Re: [PATCH 15/22] arch: vdso: consolidate gettime prototypes

2023-11-08 Thread Christophe Leroy

Hi Arnd,

Le 08/11/2023 à 13:58, Arnd Bergmann a écrit :
> From: Arnd Bergmann 
> 
> The VDSO functions are defined as globals in the kernel sources but intended
> to be called from userspace, so there is no need to declare them in a kernel
> side header.
> 
> Without a prototype, this now causes warnings such as
> 
> arch/mips/vdso/vgettimeofday.c:14:5: error: no previous prototype for 
> '__vdso_clock_gettime' [-Werror=missing-prototypes]
> arch/mips/vdso/vgettimeofday.c:28:5: error: no previous prototype for 
> '__vdso_gettimeofday' [-Werror=missing-prototypes]
> arch/mips/vdso/vgettimeofday.c:36:5: error: no previous prototype for 
> '__vdso_clock_getres' [-Werror=missing-prototypes]
> arch/mips/vdso/vgettimeofday.c:42:5: error: no previous prototype for 
> '__vdso_clock_gettime64' [-Werror=missing-prototypes]
> arch/sparc/vdso/vclock_gettime.c:254:1: error: no previous prototype for 
> '__vdso_clock_gettime' [-Werror=missing-prototypes]
> arch/sparc/vdso/vclock_gettime.c:282:1: error: no previous prototype for 
> '__vdso_clock_gettime_stick' [-Werror=missing-prototypes]
> arch/sparc/vdso/vclock_gettime.c:307:1: error: no previous prototype for 
> '__vdso_gettimeofday' [-Werror=missing-prototypes]
> arch/sparc/vdso/vclock_gettime.c:343:1: error: no previous prototype for 
> '__vdso_gettimeofday_stick' [-Werror=missing-prototypes]
> 
> Most architectures have already added workarounds for these by adding
> declarations somewhere, but since these are all compatible, we should
> really just have one copy, with an #ifdef check for the 32-bit vs
> 64-bit variant and use that everywhere.
> 
> Unfortunately, the sparc version is currently incompatible since
> that never added support for __vdso_clock_gettime64() in 32-bit
> userland. For the moment, I'm leaving this one out, as I can't
> easily test it and it requires a larger rework.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>   arch/arm/include/asm/vdso.h  |  5 -
>   arch/arm/vdso/vgettimeofday.c|  1 +
>   arch/arm64/kernel/vdso32/vgettimeofday.c |  1 +
>   arch/csky/kernel/vdso/vgettimeofday.c| 11 +--
>   arch/loongarch/vdso/vgettimeofday.c  |  7 +--
>   arch/mips/vdso/vgettimeofday.c   |  1 +
>   arch/riscv/kernel/vdso/vgettimeofday.c   |  7 +--
>   arch/x86/entry/vdso/vclock_gettime.c | 10 +-
>   arch/x86/include/asm/vdso/gettimeofday.h |  2 --
>   arch/x86/um/vdso/um_vdso.c   |  1 +
>   include/vdso/gettime.h   | 23 +++
>   11 files changed, 31 insertions(+), 38 deletions(-)
>   create mode 100644 include/vdso/gettime.h

powerpc has functions doing more or less the same; they are called
__c_kernel_clock_gettime() and the like, with their prototypes sitting in
arch/powerpc/include/asm/vdso/gettimeofday.h

Should those prototypes be moved to include/vdso/gettime.h too and
eventually renamed, or are they considered too powerpc-specific?

Christophe

Re: [PATCH v7 0/7] incorporate pm runtime framework and eDP clean up

2023-11-08 Thread Abhinav Kumar





On 11/8/2023 10:10 AM, Kuogee Hsieh wrote:


On 11/7/2023 1:23 PM, Dmitry Baryshkov wrote:
On Tue, 7 Nov 2023 at 23:01, Kuogee Hsieh  
wrote:


On 11/6/2023 5:55 PM, Dmitry Baryshkov wrote:
On Sat, 7 Oct 2023 at 01:55, Kuogee Hsieh  
wrote:
The purpose of this patch series is to incorporate the pm runtime framework
into the MSM eDP/DP driver so that the eDP panel can be detected by the DRM
eDP panel driver during system probe time. As part of this work, the original
custom pm-related functions, such as dp_pm_prepare(), dp_pm_suspend(),
dp_pm_resume() and dp_pm_prepare(), are removed and replaced with functions
provided by the pm runtime framework such as pm_runtime_force_suspend() and
pm_runtime_force_resume(). In addition, both the eDP aux-bus and the irq
handler are bound at system probe time too.

With this patchset in place I can crash the board using the following
sequence (SM8350-HDK):

- plug the USBC DP dongle
- run modetest at any mode, don't press Enter yet
- unplug the dongle
- press Enter to stop modetest

=> the board resets to Sahara.

Please ping me if you need any additional information from my side.

Questions:

1) Which dongle did you use?

I have used several Dell and Hama USB-C dongles.


2) Which code branch should I use to reproduce this problem?

I have pushed my kernel tree to
git.codelinaro.org/dmitry.baryshkov/linux.git, branch test-dp-rpm
I had several UCSI patches on top, but they should not be relevant.
git.codelinaro.org/dmitry.baryshkov/linux.git, branch test-dp-rpm <== I
synced this branch and it still works on my Chromebook Kodiak DUT.




Perhaps the gap in test results with the same tree is due to internal 
hpd vs hpd pin. We need to try this on a device which does not use 
internal hpd.



I cannot reproduce the system crash on my Kodiak (SM7325) Chromebook
setup with my PM runtime patch series.

My code base is Linux 6.6-rc2 + the pm runtime patch series (7 patches).

I did:

1) Plug in either an Apple dongle (DP-to-HDMI) + 1080p display, or a DP
Type-C cable connected directly to a 1080p display

2) Stop the UI

3) Run /usr/bin/modetest -M msm -s 34:1920x1080 (a test pattern shows on
the display)

4) Unplug the Apple dongle or DP Type-C cable

5) Hit the Enter key

6) Start the UI

7) The display returns to the Chromebook login page

Re: [Patch v2] drm/ttm: Schedule delayed_delete worker closer

2023-11-08 Thread Felix Kuehling


On 2023-11-08 09:49, Christian König wrote:

Am 08.11.23 um 13:58 schrieb Rajneesh Bhardwaj:

Try to allocate system memory on the NUMA node the device is closest to
and try to run delayed_delete workers on a CPU of this node as well.

To optimize the memory clearing operation when a TTM BO gets freed by
the delayed_delete worker, scheduling it closer to a NUMA node where the
memory was initially allocated helps avoid the cases where the worker
gets randomly scheduled on the CPU cores that are across interconnect
boundaries such as xGMI, PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 


Reviewed-by: Christian König 

Going to push this to drm-misc-next.


Hold on. Rajneesh just pointed out a WARN regression from testing. I 
think the problem is that the bdev->wq is not unbound.


Regards,
  Felix




Thanks,
Christian.


---

Changes in v2:
  - Absorbed the feedback provided by Christian in the commit message 
and

    the comment.

  drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
  drivers/gpu/drm/ttm/ttm_device.c | 3 ++-
  2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..6f28a77a565b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,13 @@ static void ttm_bo_release(struct kref *kref)
      spin_unlock(&bo->bdev->lru_lock);
      INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-     queue_work(bdev->wq, &bo->delayed_delete);
+
+     /* Schedule the worker on the closest NUMA node. This
+      * improves performance since system memory might be
+      * cleared on free and that is best done on a CPU core
+      * close to it.
+      */
+     queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);

  return;
  }
  diff --git a/drivers/gpu/drm/ttm/ttm_device.c 
b/drivers/gpu/drm/ttm/ttm_device.c

index 43e27ab77f95..72b81a2ee6c7 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -213,7 +213,8 @@ int ttm_device_init(struct ttm_device *bdev, 
struct ttm_device_funcs *funcs,

  bdev->funcs = funcs;
      ttm_sys_man_init(bdev);
-     ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);
+
+     ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);

      bdev->vma_manager = vma_manager;
      spin_lock_init(&bdev->lru_lock);

Re: [PATCH v7 0/7] incorporate pm runtime framework and eDP clean up

2023-11-08 Thread Kuogee Hsieh




On 11/7/2023 1:23 PM, Dmitry Baryshkov wrote:

On Tue, 7 Nov 2023 at 23:01, Kuogee Hsieh  wrote:


On 11/6/2023 5:55 PM, Dmitry Baryshkov wrote:

On Sat, 7 Oct 2023 at 01:55, Kuogee Hsieh  wrote:

The purpose of this patch series is to incorporate the pm runtime framework
into the MSM eDP/DP driver so that the eDP panel can be detected by the DRM
eDP panel driver during system probe time. As part of this work, the original
custom pm-related functions, such as dp_pm_prepare(), dp_pm_suspend(),
dp_pm_resume() and dp_pm_prepare(), are removed and replaced with functions
provided by the pm runtime framework such as pm_runtime_force_suspend() and
pm_runtime_force_resume(). In addition, both the eDP aux-bus and the irq
handler are bound at system probe time too.

With this patchset in place I can crash the board using the following
sequence (SM8350-HDK):

- plug the USBC DP dongle
- run modetest at any mode, don't press Enter yet
- unplug the dongle
- press Enter to stop modetest

=> the board resets to Sahara.

Please ping me if you need any additional information from my side.

Questions:

1) Which dongle did you use?

I have used several Dell and Hama USB-C dongles.


2) Which code branch should I use to reproduce this problem?

I have pushed my kernel tree to
git.codelinaro.org/dmitry.baryshkov/linux.git, branch test-dp-rpm
I had several UCSI patches on top, but they should not be relevant.
git.codelinaro.org/dmitry.baryshkov/linux.git, branch test-dp-rpm <== I
synced this branch and it still works on my Chromebook Kodiak DUT.



I cannot reproduce the system crash on my setup, a Kodiak (SM7325)
Chromebook, with my PM runtime patch series.

My code base is Linux 6.6-rc2 + the pm runtime patch series (7 patches).

I did:

1) plug in either an Apple dongle (DP-to-HDMI) + 1080p display, or a DP
Type-C cable directly to a 1080p display

2) stop ui

3) /usr/bin/modetest -M msm -s 34:1920x1080 (the test pattern shows on the
display)

4) unplug the Apple dongle or the DP Type-C cable

5) hit the Enter key

6) start ui

7) the display goes back to the Chromebook login page

Re: [Patch v2] drm/ttm: Schedule delayed_delete worker closer

2023-11-08 Thread Bhardwaj, Rajneesh




On 11/8/2023 9:49 AM, Christian König wrote:

Am 08.11.23 um 13:58 schrieb Rajneesh Bhardwaj:

Try to allocate system memory on the NUMA node the device is closest to
and try to run delayed_delete workers on a CPU of this node as well.

To optimize the memory clearing operation when a TTM BO gets freed by
the delayed_delete worker, scheduling it closer to a NUMA node where the
memory was initially allocated helps avoid the cases where the worker
gets randomly scheduled on the CPU cores that are across interconnect
boundaries such as xGMI, PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 


Reviewed-by: Christian König 

Going to push this to drm-misc-next.


Thanks Christian, there is a new regression reported and I am checking 
on that. Please don't submit it yet.





Thanks,
Christian.


---

Changes in v2:
  - Absorbed the feedback provided by Christian in the commit message and
    the comment.

  drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
  drivers/gpu/drm/ttm/ttm_device.c | 3 ++-
  2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..6f28a77a565b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,13 @@ static void ttm_bo_release(struct kref *kref)
  spin_unlock(&bo->bdev->lru_lock);
    INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-    queue_work(bdev->wq, &bo->delayed_delete);
+
+    /* Schedule the worker on the closest NUMA node. This
+     * improves performance since system memory might be
+     * cleared on free and that is best done on a CPU core
+     * close to it.
+     */
+    queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);

  return;
  }
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 43e27ab77f95..72b81a2ee6c7 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -213,7 +213,8 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,

  bdev->funcs = funcs;
    ttm_sys_man_init(bdev);
-    ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);
+
+    ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);

    bdev->vma_manager = vma_manager;
  spin_lock_init(&bdev->lru_lock);
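
The node selection described in the commit message can be modelled in user
space. What follows is only a sketch under stated assumptions:
pick_cpu_near_node(), MODEL_NO_NODE, MODEL_CPU_UNBOUND and the cpu_to_node
table are hypothetical names, not kernel API. It illustrates the intent of
passing a NUMA node when queueing work (prefer a CPU on that node, otherwise
leave the choice open) and is not the workqueue implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NO_NODE     (-1)  /* hypothetical stand-in for "no node known" */
#define MODEL_CPU_UNBOUND (-1)  /* hypothetical: any CPU may run the work */

/*
 * Hypothetical model: choose a CPU for a work item.
 * cpu_to_node[i] is the NUMA node of CPU i, cur_cpu is the queueing CPU.
 */
static int pick_cpu_near_node(int node, const int *cpu_to_node, size_t ncpus,
                              size_t cur_cpu)
{
    size_t cpu;

    assert(cpu_to_node != NULL || ncpus == 0);

    /* Without a node hint there is nothing to prefer. */
    if (node == MODEL_NO_NODE)
        return MODEL_CPU_UNBOUND;

    /* Stay on the current CPU when it already sits on the requested node. */
    if (cur_cpu < ncpus && cpu_to_node[cur_cpu] == node)
        return (int)cur_cpu;

    /* Otherwise take the first CPU that belongs to the requested node. */
    for (cpu = 0; cpu < ncpus; cpu++)
        if (cpu_to_node[cpu] == node)
            return (int)cpu;

    /* No CPU on that node: fall back to an unbound choice. */
    return MODEL_CPU_UNBOUND;
}
```

In this model, the change from NUMA_NO_NODE to dev_to_node(dev) corresponds to
moving from the first early return to the node-aware paths.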

RE: [PATCH 00/17] dt-bindings: samsung: add specific compatibles for existing SoC

2023-11-08 Thread Alim Akhtar

Hi Krzysztof

> -Original Message-
> From: Krzysztof Kozlowski 
> Sent: Wednesday, November 8, 2023 4:13 PM
> To: David Airlie ; Daniel Vetter ;
> Maarten Lankhorst ; Maxime Ripard
> ; Thomas Zimmermann ;
> Rob Herring ; Krzysztof Kozlowski
> ; Conor Dooley
> ; Alim Akhtar ; Andi
> Shyti ; Jonathan Cameron ; Lars-
> Peter Clausen ; Lee Jones ; Ulf
> Hansson ; Tomasz Figa ;
> Sylwester Nawrocki ; Linus Walleij
> ; Thierry Reding ; Uwe
> Kleine-König ; Alessandro Zummo
> ; Alexandre Belloni
> ; Greg Kroah-Hartman
> ; Jiri Slaby ; Liam
> Girdwood ; Mark Brown ;
> Jaehoon Chung ; Sam Protsenko
> ; dri-devel@lists.freedesktop.org;
> devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-samsung-...@vger.kernel.org; linux-
> i...@vger.kernel.org; linux-...@vger.kernel.org; linux-...@vger.kernel.org;
> linux-g...@vger.kernel.org; linux-...@vger.kernel.org; linux-
> r...@vger.kernel.org; linux-ser...@vger.kernel.org; alsa-devel@alsa-
> project.org; linux-so...@vger.kernel.org
> Cc: Krzysztof Kozlowski 
> Subject: [PATCH 00/17] dt-bindings: samsung: add specific compatibles for
> existing SoC
> 
> Hi,
> 
> Merging
> ===
> I propose to take entire patchset through my tree (Samsung SoC), because:
> 1. Next cycle two new SoCs will be coming (Google GS101 and
>    ExynosAutov920), so they will touch the same lines in some of the DT
>    bindings (not all, though). It is reasonable for me to take the
>    bindings for the new SoCs, to have clean `make dtbs_check` on the
>    new DTS.
> 2. Having it together helps me to have clean `make dtbs_check` within
>    my tree on the existing DTS.
> 3. No drivers are affected by this change.
> 4. I plan to do the same for Tesla FSD and Exynos ARM32 SoCs, thus expect
>    follow up patchsets.
> 
Sure, makes sense.

> If folks agree, please kindly Ack the patches.
> 
> Description
> ===
> Samsung Exynos SoCs reuse several devices from older designs, thus
> historically we kept the old (block's) compatible only.  This works fine and
> there is no bug here, however guidelines expressed in
> Documentation/devicetree/bindings/writing-bindings.rst state that:
> 1. Compatibles should be specific.
> 2. We should add new compatibles in case of bugs or features.
> 
> Add compatibles specific to each SoC in front of all old-SoC-like compatibles.
> This will also help reviews of new code using existing DTS as template.  No
> functional impact on Linux drivers behavior.
> 
> Future
> ==
> If reasonable, I will do similar work for Tesla FSD and ARMv7/ARM32 Exynos
> bindings and DTS.
> 
> Best regards,
> Krzysztof
> 
> Krzysztof Kozlowski (17):
>   dt-bindings: hwinfo: samsung,exynos-chipid: add specific compatibles
> for existing SoC
>   dt-bindings: i2c: exynos5: add specific compatibles for existing SoC
>   dt-bindings: i2c: samsung,s3c2410-i2c: add specific compatibles for
> existing SoC
>   dt-bindings: mmc: samsung,exynos-dw-mshc: add specific compatibles for
> existing SoC
>   dt-bindings: pinctrl: samsung: add specific compatibles for existing
> SoC
>   dt-bindings: rtc: s3c-rtc: add specific compatibles for existing SoC
>   dt-bindings: serial: samsung: add specific compatibles for existing
> SoC
>   dt-bindings: samsung: exynos-pmu: add specific compatibles for
> existing SoC
>   dt-bindings: gpu: arm,mali-midgard: add specific compatibles for
> existing Exynos SoC
>   dt-bindings: iio: samsung,exynos-adc: add specific compatibles for
> existing SoC
>   ASoC: dt-bindings: samsung-i2s: add specific compatibles for existing
> SoC
>   dt-bindings: pwm: samsung: add specific compatibles for existing SoC
>   arm64: dts: exynos5433: add specific compatibles to several blocks
>   arm64: dts: exynos7: add specific compatibles to several blocks
>   arm64: dts: exynos7885: add specific compatibles to several blocks
>   arm64: dts: exynos850: add specific compatibles to several blocks
>   arm64: dts: exynosautov9: add specific compatibles to several blocks
> 
>  .../bindings/gpu/arm,mali-midgard.yaml|  5 ++
>  .../hwinfo/samsung,exynos-chipid.yaml | 17 +-
>  .../devicetree/bindings/i2c/i2c-exynos5.yaml  | 10 +++-
>  .../bindings/i2c/samsung,s3c2410-i2c.yaml | 22 ---
>  .../bindings/iio/adc/samsung,exynos-adc.yaml  | 29 +
>  .../mfd/samsung,exynos5433-lpass.yaml |  2 +-
>  .../bindings/mmc/samsung,exynos-dw-mshc.yaml  | 25 +---
>  .../samsung,pinctrl-wakeup-interrupt.yaml | 24 +---
>  .../bindings/pinctrl/samsung,pinctrl.yaml |  3 +-
>  .../devicetree/bindings/pwm/pwm-samsung.yaml  |  2 +
>  .../devicetree/bindings/rtc/s3c-rtc.yaml  |  5 ++
>  .../bindings/serial/samsung_uart.yaml | 14 -
>  .../bindings/soc/samsung/exynos-pmu.yaml  |  6 ++
>  .../bindings/soc/samsung/exynos-usi.yaml  |  2 +-
>  .../bindings/sound/samsung-i2s.yaml   | 19 +++---
>  arch/arm64/boot/dts/exynos/exynos5433.dtsi|

[PATCH v3 2/2] drivers/tidss: Add support for AM62A7 DSS

2023-11-08 Thread Aradhya Bhatia

Add support for the DSS controller on TI's AM62A7 SoC in the tidss
driver.

This controller has 2 video pipelines that can render 2 video planes
over a screen, using the overlay managers. The output of the DSS comes
from video port 2 (VP2) in the form of RGB888 DPI signals, while the VP1
is tied off inside the SoC.

Also add and use a new type of VP, DISPC_VP_TIED_OFF, for the tied-off
VP1 of AM62A DSS.

Signed-off-by: Aradhya Bhatia 
---
Notes:
   Changes from V2:
   * Add new DISPC_VP_TIED_OFF for tied-off video-ports.

   Changes from V1:
   * Correctly sort DISPC_AM62A7 macro after DISPC_AM625.

 drivers/gpu/drm/tidss/tidss_dispc.c | 59 +
 drivers/gpu/drm/tidss/tidss_dispc.h |  3 ++
 drivers/gpu/drm/tidss/tidss_drv.c   |  1 +
 3 files changed, 63 insertions(+)

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index 9d9dee7abaef..7af416457c57 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -322,6 +322,60 @@ const struct dispc_features dispc_am625_feats = {
.vid_order = { 1, 0 },
 };
 
+const struct dispc_features dispc_am62a7_feats = {
+   /*
+* if the code reaches dispc_mode_valid with VP1,
+* it should return MODE_BAD.
+*/
+   .max_pclk_khz = {
+   [DISPC_VP_TIED_OFF] = 0,
+   [DISPC_VP_DPI] = 165000,
+   },
+
+   .scaling = {
+   .in_width_max_5tap_rgb = 1280,
+   .in_width_max_3tap_rgb = 2560,
+   .in_width_max_5tap_yuv = 2560,
+   .in_width_max_3tap_yuv = 4096,
+   .upscale_limit = 16,
+   .downscale_limit_5tap = 4,
+   .downscale_limit_3tap = 2,
+   /*
+* The max supported pixel inc value is 255. The value
+* of pixel inc is calculated like this: 1+(xinc-1)*bpp.
+* The maximum bpp of all formats supported by the HW
+* is 8. So the maximum supported xinc value is 32,
+* because 1+(32-1)*8 < 255 < 1+(33-1)*8.
+*/
+   .xinc_max = 32,
+   },
+
+   .subrev = DISPC_AM62A7,
+
+   .common = "common",
+   .common_regs = tidss_am65x_common_regs,
+
+   .num_vps = 2,
+   .vp_name = { "vp1", "vp2" },
+   .ovr_name = { "ovr1", "ovr2" },
+   .vpclk_name =  { "vp1", "vp2" },
+   /* VP1 of the DSS in AM62A7 SoC is tied off internally */
+   .vp_bus_type = { DISPC_VP_TIED_OFF, DISPC_VP_DPI },
+
+   .vp_feat = { .color = {
+   .has_ctm = true,
+   .gamma_size = 256,
+   .gamma_type = TIDSS_GAMMA_8BIT,
+   },
+   },
+
+   .num_planes = 2,
+   /* note: vid is plane_id 0 and vidl1 is plane_id 1 */
+   .vid_name = { "vid", "vidl1" },
+   .vid_lite = { false, true, },
+   .vid_order = { 1, 0 },
+};
+
 static const u16 *dispc_common_regmap;
 
 struct dss_vp_data {
@@ -824,6 +878,7 @@ dispc_irq_t dispc_read_and_clear_irqstatus(struct dispc_device *dispc)
case DISPC_K2G:
return dispc_k2g_read_and_clear_irqstatus(dispc);
case DISPC_AM625:
+   case DISPC_AM62A7:
case DISPC_AM65X:
case DISPC_J721E:
return dispc_k3_read_and_clear_irqstatus(dispc);
@@ -840,6 +895,7 @@ void dispc_set_irqenable(struct dispc_device *dispc, dispc_irq_t mask)
dispc_k2g_set_irqenable(dispc, mask);
break;
case DISPC_AM625:
+   case DISPC_AM62A7:
case DISPC_AM65X:
case DISPC_J721E:
dispc_k3_set_irqenable(dispc, mask);
@@ -1331,6 +1387,7 @@ void dispc_ovr_set_plane(struct dispc_device *dispc, u32 hw_plane,
x, y, layer);
break;
case DISPC_AM625:
+   case DISPC_AM62A7:
case DISPC_AM65X:
dispc_am65x_ovr_set_plane(dispc, hw_plane, hw_videoport,
  x, y, layer);
@@ -2250,6 +2307,7 @@ static void dispc_plane_init(struct dispc_device *dispc)
dispc_k2g_plane_init(dispc);
break;
case DISPC_AM625:
+   case DISPC_AM62A7:
case DISPC_AM65X:
case DISPC_J721E:
dispc_k3_plane_init(dispc);
@@ -2357,6 +2415,7 @@ static void dispc_vp_write_gamma_table(struct dispc_device *dispc,
dispc_k2g_vp_write_gamma_table(dispc, hw_videoport);
break;
case DISPC_AM625:
+   case DISPC_AM62A7:
case DISPC_AM65X:
dispc_am65x_vp_write_gamma_table(dispc, hw_videoport);
break;
diff --git a/drivers/gpu/drm/tidss/tidss_dispc.h b/drivers/gpu/drm/tidss/tidss_dispc.h
index 33ac5ad7a423..086327d51a90 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.h
+++ b/drivers/gpu/drm/tidss/tidss_dispc.h
@@ -54,12 +54,14 @@ enum
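
The xinc_max comment in the scaling block of this patch rests on a small piece
of arithmetic that is easy to check. The sketch below is a stand-alone
illustration, not driver code: pixel_inc() and max_xinc() are hypothetical
helper names, and bpp is taken in the same units the comment uses. It shows
that with a pixel inc limit of 255 and a bpp of 8 the largest usable xinc
is 32.

```c
#include <assert.h>

/* Pixel increment for a given xinc and bpp: 1 + (xinc - 1) * bpp. */
static unsigned int pixel_inc(unsigned int xinc, unsigned int bpp)
{
    assert(xinc >= 1);
    return 1 + (xinc - 1) * bpp;
}

/* Largest xinc whose pixel increment still fits within max_pixel_inc. */
static unsigned int max_xinc(unsigned int max_pixel_inc, unsigned int bpp)
{
    unsigned int xinc = 1;

    assert(bpp >= 1);
    assert(max_pixel_inc >= 1);

    /* Grow xinc while the next step stays inside the limit. */
    while (pixel_inc(xinc + 1, bpp) <= max_pixel_inc)
        xinc++;

    return xinc;
}
```

For bpp = 8 the two neighbouring values are pixel_inc(32, 8) = 249 and
pixel_inc(33, 8) = 257, which bracket the limit of 255.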

[PATCH v3 1/2] dt-bindings: display: ti: Add support for am62a7 dss

2023-11-08 Thread Aradhya Bhatia

The DSS controller on TI's AM62A7 SoC is an update from that on TI's
AM625 SoC. Like the DSS in AM625, the DSS in this SoC has 2 video
pipelines, but unlike the former, the latter only has one output port on
VP2 to service DPI display sinks.

Add the new controller's compatible.

Signed-off-by: Aradhya Bhatia 
Reviewed-by: Krzysztof Kozlowski 
---
 .../bindings/display/ti/ti,am65x-dss.yaml  | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
index ae09cd3cbce1..b6767ef0d24d 100644
--- a/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
+++ b/Documentation/devicetree/bindings/display/ti/ti,am65x-dss.yaml
@@ -23,6 +23,7 @@ properties:
   compatible:
 enum:
   - ti,am625-dss
+  - ti,am62a7-dss
   - ti,am65x-dss
 
   reg:
@@ -87,6 +88,7 @@ properties:
   For AM65x DSS, the OLDI output port node from video port 1.
   For AM625 DSS, the internal DPI output port node from video
   port 1.
+  For AM62A7 DSS, the port is tied off inside the SoC.
 
   port@1:
 $ref: /schemas/graph.yaml#/properties/port
@@ -108,6 +110,18 @@ properties:
   Input memory (from main memory to dispc) bandwidth limit in
   bytes per second
 
+allOf:
+  - if:
+  properties:
+compatible:
+  contains:
+const: ti,am62a7-dss
+then:
+  properties:
+ports:
+  properties:
+port@0: false
+
 required:
   - compatible
   - reg
-- 
2.42.0

[PATCH v3 0/2] Add DSS support for TI AM62A7 SoC

2023-11-08 Thread Aradhya Bhatia

This patch series adds a new compatible for the Display SubSystem (DSS)
controller on TI's AM62A7 SoC. It further adds the required support, for
the same, in the tidss driver.

The DSS controller is similar to the recently added AM625 DSS, with the
key difference being the absence of VP1 output on the SoC. The VP1 in
AM62A7 DSS is tied off and cannot be used, unlike in AM625, where the
VP1 was connected to 2 OLDI TXes. The video pipeline that corresponds to
VP1 still exists and can be used to overlay planes on the VP2's primary
plane. This can be done using the overlay managers inside the SoC.
Moreover, DSS VP2 can output Full-HD RGB888 DPI video signals.

I have tested these patches on AM62A7 SK-EVM, which converts DPI signals
to HDMI on the platform using the Sil9022A HDMI transmitter. All the
patches, required to enable display on AM62A7-SK, can be found on my
github fork[0] in the branch "next_am62a-v3".

Regards
Aradhya

[0]: https://github.com/aradhya07/linux-ab/tree/next_am62a-v3

Change Log:
V2 -> V3:
  - Add Krzysztof Kozlowski's R-b in patch 1/2.
  - Add new DISPC_VP_TIED_OFF for tied-off video-ports in patch 2/2.

V1 -> V2:
  - Correctly sort DISPC_AM62A7 macro after DISPC_AM625 in patch 2/2.

Previous Versions:
V1: https://lore.kernel.org/all/20230818131750.4779-1-a-bhat...@ti.com/
V2: https://lore.kernel.org/all/20230818142124.8561-1-a-bhat...@ti.com/

Aradhya Bhatia (2):
  dt-bindings: display: ti: Add support for am62a7 dss
  drivers/tidss: Add support for AM62A7 DSS

 .../bindings/display/ti/ti,am65x-dss.yaml | 14 +
 drivers/gpu/drm/tidss/tidss_dispc.c   | 59 +++
 drivers/gpu/drm/tidss/tidss_dispc.h   |  3 +
 drivers/gpu/drm/tidss/tidss_drv.c |  1 +
 4 files changed, 77 insertions(+)


base-commit: 2220f68f4504aa1ccce0fac721ccdb301e9da32f
-- 
2.42.0

Re: [PATCH v6 4/5] drm/panel-edp: Add override_edid_mode quirk for generic edp

2023-11-08 Thread Doug Anderson

Hi,

On Wed, Nov 8, 2023 at 7:45 AM Maxime Ripard  wrote:
>
> > @@ -575,9 +599,18 @@ static int panel_edp_get_modes(struct drm_panel *panel,
> >
> >   if (!p->edid)
> >   p->edid = drm_get_edid(connector, p->ddc);
> > -
> > - if (p->edid)
> > - num += drm_add_edid_modes(connector, p->edid);
> > + if (p->edid) {
> > + if (has_override_edid_mode) {
>
> It's not clear to me why the override mechanism is only there when
> there's a ddc bus?

I think you're confusing the two different (but related) issues
addressed by this series. One is when you're using the generic
"edp-panel" compatible string. In that case the mode comes from the
EDID and only the EDID since there's no hardcoded mode. We need a mode
override there since some EDIDs shipped with a bad mode. That's the
subject of ${SUBJECT} patch.

The second issue is what to do with a hardcoded mode. That's the
subject of the next patch in the series (patch #5). Previously we
merged the hardcoded and EDID modes. Now (in the next patch) we use
only the hardcoded mode. There's no need for a fixup because the mode
is hardcoded in the kernel.

> You mentioned before that you were following panel-simple,

As of the newest version of the patch, it's no longer following
panel-simple in response to your feedback on previous versions.

> but that's a
> clear deviation from what I can see. If there's a reason for that
> deviation, that's fine by me, but it should at least be documented in
> the commit log.

I think the commit log is OK. I suspect the confusion is only because
you've reviewed previous versions of the series. Please shout if
things still look confusing.

> > @@ -950,6 +983,19 @@ static const struct panel_desc auo_b101ean01 = {
> >   },
> >  };
> >
> > +static const struct drm_display_mode auo_b116xa3_mode = {
> > + .clock = 70589,
> > + .hdisplay = 1366,
> > + .hsync_start = 1366 + 40,
> > + .hsync_end = 1366 + 40 + 40,
> > + .htotal = 1366 + 40 + 40 + 32,
> > + .vdisplay = 768,
> > + .vsync_start = 768 + 10,
> > + .vsync_end = 768 + 10 + 12,
> > + .vtotal = 768 + 10 + 12 + 6,
> > + .flags = DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC,
> > +};
>
> That should be a separate patch

That's fair. I didn't think it was a huge deal, but I agree that it's
slightly cleaner.

-Doug
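
As a sanity check on the override mode quoted above, the refresh rate implied
by the auo_b116xa3 timings can be worked out by hand. The sketch below is a
user-space illustration only: struct mode_timing and the helper names are
hypothetical stand-ins, not the DRM structures. The numbers are the ones from
the quoted hunk, split into front porch, sync width and back porch.

```c
#include <assert.h>
#include <stdint.h>

/* Only the timing fields needed here; a hypothetical stand-in type. */
struct mode_timing {
    uint32_t clock_khz;
    uint32_t hdisplay, hfront, hsync, hback;
    uint32_t vdisplay, vfront, vsync, vback;
};

static uint32_t mode_htotal(const struct mode_timing *m)
{
    return m->hdisplay + m->hfront + m->hsync + m->hback;
}

static uint32_t mode_vtotal(const struct mode_timing *m)
{
    return m->vdisplay + m->vfront + m->vsync + m->vback;
}

/* Refresh rate in Hz, rounded to nearest: pixel clock / (htotal * vtotal). */
static uint32_t mode_refresh_hz(const struct mode_timing *m)
{
    uint64_t pixels = (uint64_t)mode_htotal(m) * mode_vtotal(m);
    uint64_t clock_hz = (uint64_t)m->clock_khz * 1000;

    assert(pixels != 0);
    return (uint32_t)((clock_hz + pixels / 2) / pixels);
}

/* Values copied from the quoted auo_b116xa3_mode definition. */
static const struct mode_timing auo_b116xa3_timing = {
    .clock_khz = 70589,
    .hdisplay = 1366, .hfront = 40, .hsync = 40, .hback = 32,
    .vdisplay = 768, .vfront = 10, .vsync = 12, .vback = 6,
};
```

With htotal = 1478 and vtotal = 796, a 70589 kHz pixel clock gives roughly
59.9998 Hz, which rounds to a 60 Hz mode.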

Re: [PATCH 11/17] ASoC: dt-bindings: samsung-i2s: add specific compatibles for existing SoC

2023-11-08 Thread Lee Jones

On Wed, 08 Nov 2023, Krzysztof Kozlowski wrote:

> Samsung Exynos SoC reuses several devices from older designs, thus
> historically we kept the old (block's) compatible only.  This works fine
> and there is no bug here, however guidelines expressed in
> Documentation/devicetree/bindings/writing-bindings.rst state that:
> 1. Compatibles should be specific.
> 2. We should add new compatibles in case of bugs or features.
> 
> Add compatibles specific to each SoC in front of all old-SoC-like
> compatibles.
> 
> Signed-off-by: Krzysztof Kozlowski 
> 
> ---
> 
> I propose to take the patch through Samsung SoC (me). See cover letter
> for explanation.
> ---
>  .../mfd/samsung,exynos5433-lpass.yaml |  2 +-
>  .../bindings/sound/samsung-i2s.yaml   | 19 ---
>  2 files changed, 13 insertions(+), 8 deletions(-)

Acked-by: Lee Jones 

-- 
Lee Jones [李琼斯]

[RFC PATCH v3 17/23] drm/colorop: Introduce DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE

2023-11-08 Thread Harry Wentland

With the introduction of the pre-blending color pipeline we
can no longer have color operations that don't have a clear
position in the color pipeline. We deprecate all existing
plane properties. For upstream drivers those are:
 - COLOR_ENCODING
 - COLOR_RANGE

Userspace that registers with the
DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE will be permitted to set
the COLOR_PIPELINE plane property and drm_colorop properties.
But it won't be allowed to set COLOR_ENCODING and
COLOR_RANGE. Userspace that does not set this client cap
will not be permitted to touch the color pipeline.

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic_uapi.c | 23 ++-
 drivers/gpu/drm/drm_ioctl.c   |  7 +++
 include/drm/drm_file.h|  7 +++
 include/uapi/drm/drm.h| 15 +++
 4 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index c6629fdaa114..69c56982e2d0 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -610,10 +610,26 @@ static int drm_atomic_plane_set_property(struct drm_plane *plane,
} else if (property == plane->zpos_property) {
state->zpos = val;
} else if (property == plane->color_encoding_property) {
+   if (file_priv->plane_color_pipeline) {
+   drm_dbg_atomic(dev,
+  "Setting COLOR_ENCODING plane property not permitted when DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE is set\n");
+   return -EINVAL;
+   }
state->color_encoding = val;
} else if (property == plane->color_range_property) {
+   if (file_priv->plane_color_pipeline) {
+   drm_dbg_atomic(dev,
+  "Setting COLOR_RANGE plane property not permitted when DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE is set\n");
+   return -EINVAL;
+   }
state->color_range = val;
} else if (property == plane->color_pipeline_property) {
+   if (!file_priv->plane_color_pipeline) {
+   drm_dbg_atomic(dev,
+  "Setting COLOR_PIPELINE plane property not permitted unless DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE is set\n");
+   return -EINVAL;
+   }
+
/* find DRM colorop object */
struct drm_colorop *colorop = NULL;
colorop = drm_colorop_find(dev, file_priv, val);
@@ -1158,6 +1174,12 @@ int drm_atomic_set_property(struct drm_atomic_state *state,
break;
}
case DRM_MODE_OBJECT_COLOROP: {
+   if (!file_priv->plane_color_pipeline) {
+   drm_dbg_atomic(prop->dev,
+  "[OBJECT:%d] is a colorop but DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE not set\n",
+  obj->id);
+   ret = -EINVAL;
+   }
struct drm_colorop *colorop = obj_to_colorop(obj);
struct drm_colorop_state *colorop_state;
 
@@ -1170,7 +1192,6 @@ int drm_atomic_set_property(struct drm_atomic_state *state,
ret = drm_atomic_colorop_set_property(colorop,
colorop_state, file_priv,
prop, prop_value);
-
break;
}
default:
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 44fda68c28ae..0d869658e13e 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -361,6 +361,13 @@ drm_setclientcap(struct drm_device *dev, void *data, struct drm_file *file_priv)
return -EINVAL;
file_priv->writeback_connectors = req->value;
break;
+   case DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE:
+   if (!file_priv->atomic)
+   return -EINVAL;
+   if (req->value > 1)
+   return -EINVAL;
+   file_priv->plane_color_pipeline = req->value;
+   break;
default:
return -EINVAL;
}
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index e1b5b4282f75..bf11b646c898 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -204,6 +204,13 @@ struct drm_file {
 */
bool writeback_connectors;
 
+   /**
+* @plane_color_pipeline:
+*
+* True if client understands plane color pipelines
+*/
+   bool plane_color_pipeline;
+
/**
 * @was_master:
 *
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 841d393fb84e..2576b170e8d0 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -842,6 +842,21 @@ struct drm_get_cap {
  */
 #define
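
The permission rules from the commit message can be condensed into a small
decision table. The sketch below is a user-space model under stated
assumptions: enum plane_prop and check_plane_prop() are hypothetical names
invented for illustration and do not exist in the DRM core. It only restates
which plane properties a client may set depending on whether it has opted in
to the color pipeline cap.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for the plane properties discussed in the patch. */
enum plane_prop {
    PROP_COLOR_ENCODING,
    PROP_COLOR_RANGE,
    PROP_COLOR_PIPELINE,
    PROP_ZPOS,
};

/* Returns 0 if the client may set the property, -EINVAL otherwise. */
static int check_plane_prop(bool has_color_pipeline_cap, enum plane_prop prop)
{
    switch (prop) {
    case PROP_COLOR_ENCODING:
    case PROP_COLOR_RANGE:
        /* Deprecated properties: rejected once the client has opted in. */
        return has_color_pipeline_cap ? -EINVAL : 0;
    case PROP_COLOR_PIPELINE:
        /* New property: only available to clients that opted in. */
        return has_color_pipeline_cap ? 0 : -EINVAL;
    default:
        /* Other properties are not affected by the cap. */
        return 0;
    }
}
```

The two groups are mutually exclusive by design, so a client never mixes the
legacy properties with an explicit pipeline.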

[RFC PATCH v3 19/23] drm/vkms: Pull apply_colorop out of pre_blend_color_transform

2023-11-08 Thread Harry Wentland

The if/switch statement is bound to grow with more types and
subtypes. Pull this out into its own function to make things more
manageable and readable.

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/vkms/vkms_composer.c | 48 
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index be42756e300a..9010415e4bd6 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -164,6 +164,31 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
}
 }
 
+static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop *colorop)
+{
+   /* TODO is this right? */
+   struct drm_colorop_state *colorop_state = colorop->state;
+
+   if (colorop->type == DRM_COLOROP_1D_CURVE) {
+   switch (colorop_state->curve_1d_type) {
+   case DRM_COLOROP_1D_CURVE_SRGB_INV_EOTF:
+   pixel->r = apply_lut_to_channel_value(&srgb_inv_eotf, pixel->r, LUT_RED);
+   pixel->g = apply_lut_to_channel_value(&srgb_inv_eotf, pixel->g, LUT_GREEN);
+   pixel->b = apply_lut_to_channel_value(&srgb_inv_eotf, pixel->b, LUT_BLUE);
+   break;
+   case DRM_COLOROP_1D_CURVE_SRGB_EOTF:
+   pixel->r = apply_lut_to_channel_value(&srgb_eotf, pixel->r, LUT_RED);
+   pixel->g = apply_lut_to_channel_value(&srgb_eotf, pixel->g, LUT_GREEN);
+   pixel->b = apply_lut_to_channel_value(&srgb_eotf, pixel->b, LUT_BLUE);
+   break;
+   default:
+   DRM_DEBUG_DRIVER("unknown colorop 1D curve type %d\n", colorop_state->curve_1d_type);
+   break;
+   }
+   }
+
+}
+
 static void pre_blend_color_transform(const struct vkms_plane_state *plane_state, struct line_buffer *output_buffer)
 {
struct drm_colorop *colorop = plane_state->base.base.color_pipeline;
@@ -180,26 +205,9 @@ static void pre_blend_color_transform(const struct vkms_plane_state *plane_state
if (!colorop_state)
return;
 
-   for (size_t x = 0; x < output_buffer->n_pixels; x++) {
-   struct pixel_argb_u16 *pixel = &output_buffer->pixels[x];
-
-   if (colorop->type == DRM_COLOROP_1D_CURVE &&
-   colorop_state->bypass == false) {
-   switch (colorop_state->curve_1d_type) {
-   case DRM_COLOROP_1D_CURVE_SRGB_INV_EOTF:
-   pixel->r = apply_lut_to_channel_value(&srgb_inv_eotf, pixel->r, LUT_RED);
-   pixel->g = apply_lut_to_channel_value(&srgb_inv_eotf, pixel->g, LUT_GREEN);
-   pixel->b = apply_lut_to_channel_value(&srgb_inv_eotf, pixel->b, LUT_BLUE);
-   break;
-   case DRM_COLOROP_1D_CURVE_SRGB_EOTF:
-   default:
-   pixel->r = apply_lut_to_channel_value(&srgb_eotf, pixel->r, LUT_RED);
-   pixel->g = apply_lut_to_channel_value(&srgb_eotf, pixel->g, LUT_GREEN);
-   pixel->b = apply_lut_to_channel_value(&srgb_eotf, pixel->b, LUT_BLUE);
-   break;
-   }
-   }
-   }
+   for (size_t x = 0; x < output_buffer->n_pixels; x++)
+   if (!colorop_state->bypass)
+   apply_colorop(&output_buffer->pixels[x], colorop);
 
colorop = colorop->next;
}
-- 
2.42.1
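
The shape of this refactoring can be shown outside the kernel. The sketch
below is a simplified user-space model, assuming hypothetical types
(struct pixel_u16, struct color_op) and a toy invert_u16() curve in place of
the sRGB LUTs; none of these names come from vkms. It keeps all knowledge of
operation types in one per-pixel helper, so the loop that walks the pipeline
stays short as more types are added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pixel_u16 { uint16_t a, r, g, b; };

enum op_type { OP_1D_CURVE };

/* Hypothetical pipeline element: a typed operation linked to the next one. */
struct color_op {
    enum op_type type;
    bool bypass;
    uint16_t (*curve)(uint16_t);   /* per-channel transfer function */
    const struct color_op *next;
};

/* Per-pixel helper: the only place that needs to know about op types. */
static void apply_op(struct pixel_u16 *px, const struct color_op *op)
{
    if (op->type == OP_1D_CURVE && op->curve) {
        px->r = op->curve(px->r);
        px->g = op->curve(px->g);
        px->b = op->curve(px->b);
    }
}

/* Walk the pipeline; every op that is not bypassed transforms the line. */
static void transform_line(struct pixel_u16 *px, size_t n,
                           const struct color_op *op)
{
    assert(px != NULL || n == 0);

    for (; op; op = op->next) {
        if (op->bypass)
            continue;
        for (size_t x = 0; x < n; x++)
            apply_op(&px[x], op);
    }
}

/* Toy curve used only for demonstration: invert the channel value. */
static uint16_t invert_u16(uint16_t v)
{
    return (uint16_t)(0xffff - v);
}
```

Alpha is left untouched in this model, matching the quoted code, which only
rewrites the r, g and b channels.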

[RFC PATCH v3 08/23] drm/colorop: Add TYPE property

2023-11-08 Thread Harry Wentland

Add a read-only TYPE property. The TYPE specifies the colorop
type, such as enumerated curve, 1D LUT, CTM, 3D LUT, PWL LUT,
etc.

v3:
 - Make TYPE a range property
 - Move enum drm_colorop_type to uapi header
 - Fix drm_get_colorop_type_name description

For now we're only introducing an enumerated 1D LUT type to
illustrate the concept.

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic.c  |  4 +--
 drivers/gpu/drm/drm_atomic_uapi.c |  8 +-
 drivers/gpu/drm/drm_colorop.c | 43 ++-
 include/drm/drm_colorop.h | 17 +++-
 include/uapi/drm/drm_mode.h   |  4 +++
 5 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 6390443f1819..59e1797d1ca8 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -634,8 +634,8 @@ drm_atomic_get_colorop_state(struct drm_atomic_state *state,
state->colorops[index].new_state = colorop_state;
colorop_state->state = state;
 
-   drm_dbg_atomic(colorop->dev, "Added [COLOROP:%d] %p state to %p\n",
-  colorop->base.id, colorop_state, state);
+   drm_dbg_atomic(colorop->dev, "Added [COLOROP:%d:%d] %p state to %p\n",
+  colorop->base.id, colorop->type, colorop_state, state);
 
return colorop_state;
 }
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index 21da1b327ee9..f22bd8671236 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -682,7 +682,13 @@ drm_atomic_colorop_get_property(struct drm_colorop *colorop,
const struct drm_colorop_state *state,
struct drm_property *property, uint64_t *val)
 {
-   return -EINVAL;
+   if (property == colorop->type_property) {
+   *val = colorop->type;
+   } else {
+   return -EINVAL;
+   }
+
+   return 0;
 }
 
 static int drm_atomic_set_writeback_fb_for_connector(
diff --git a/drivers/gpu/drm/drm_colorop.c b/drivers/gpu/drm/drm_colorop.c
index a295ab96aee1..902b13e300fd 100644
--- a/drivers/gpu/drm/drm_colorop.c
+++ b/drivers/gpu/drm/drm_colorop.c
@@ -32,12 +32,17 @@
 
 /* TODO big colorop doc, including properties, etc. */
 
+static const struct drm_prop_enum_list drm_colorop_type_enum_list[] = {
+   { DRM_COLOROP_1D_CURVE, "1D Curve" },
+};
+
 /* Init Helpers */
 
 int drm_colorop_init(struct drm_device *dev, struct drm_colorop *colorop,
-struct drm_plane *plane)
+struct drm_plane *plane, enum drm_colorop_type type)
 {
struct drm_mode_config *config = &dev->mode_config;
+   struct drm_property *prop;
int ret = 0;
 
ret = drm_mode_object_add(dev, &colorop->base, DRM_MODE_OBJECT_COLOROP);
@@ -46,12 +51,28 @@ int drm_colorop_init(struct drm_device *dev, struct drm_colorop *colorop,
 
colorop->base.properties = &colorop->properties;
colorop->dev = dev;
+   colorop->type = type;
colorop->plane = plane;
 
list_add_tail(&colorop->head, &config->colorop_list);
colorop->index = config->num_colorop++;
 
/* add properties */
+
+   /* type */
+   prop = drm_property_create_range(dev,
+   DRM_MODE_PROP_IMMUTABLE,
+   "TYPE", type, type);
+
+   if (!prop)
+   return -ENOMEM;
+
+   colorop->type_property = prop;
+
+   drm_object_attach_property(&colorop->base,
+  colorop->type_property,
+  colorop->type);
+
return ret;
 }
 EXPORT_SYMBOL(drm_colorop_init);
@@ -150,3 +171,23 @@ void drm_colorop_reset(struct drm_colorop *colorop)
__drm_colorop_reset(colorop, colorop->state);
 }
 EXPORT_SYMBOL(drm_colorop_reset);
+
+
+static const char * const colorop_type_name[] = {
+   [DRM_COLOROP_1D_CURVE] = "1D Curve",
+};
+
+/**
+ * drm_get_colorop_type_name - return a string for colorop type
+ * @type: colorop type to compute name of
+ *
+ * In contrast to the other drm_get_*_name functions this one here returns a
+ * const pointer and hence is threadsafe.
+ */
+const char *drm_get_colorop_type_name(enum drm_colorop_type type)
+{
+   if (WARN_ON(type >= ARRAY_SIZE(colorop_type_name)))
+   return "unknown";
+
+   return colorop_type_name[type];
+}
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index 3dd169b0317d..0386440afe40 100644
--- a/include/drm/drm_colorop.h
+++ b/include/drm/drm_colorop.h
@@ -103,6 +103,21 @@ struct drm_colorop {
/** @properties: property tracking for this plane */
struct drm_object_properties properties;
 
+   /**
+* @type:
+*
+* Read-only
+* Type of color operation
+*/
+   enum drm_colorop_type type;
+
+   /**
+* @type_property:
+*
+* Read-only "TYPE" property for

[RFC PATCH v3 20/23] drm/vkms: Use s32 for internal color pipeline precision

2023-11-08 Thread Harry Wentland

Certain operations require us to preserve values below 0.0 and
above 1.0 (0x0 and 0xffff respectively in 16 bpc unorm). One
such operation is a BT709 encoding operation followed by its
decoding operation, or the reverse.

We'll use s32 values as intermediate inputs and outputs of our
color operations, for the operations where it matters.

For now this won't apply to LUT operations. We might want to
update those to work on s32 as well, but it's unclear how
that should work for unorm LUT definitions. We'll revisit
that once we add LUT + CTM tests.

In order to allow for this we'll also invert the nesting of our
colorop processing loops. We now use the pixel iteration loop
on the outside and the colorop iteration on the inside.
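
A minimal userspace sketch of why the clamp has to wait until the end of
the pipeline. It assumes two toy operations that are exact inverses of each
other; toy_encode(), toy_decode() and every other name below are
hypothetical and not part of this patch:

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a signed intermediate value into the 16 bpc unorm range. */
static int32_t clamp_u16_range(int32_t v)
{
	if (v < 0x0)
		return 0x0;
	if (v > 0xffff)
		return 0xffff;
	return v;
}

/* Two toy operations that are exact inverses of each other (assumption). */
static int32_t toy_encode(int32_t v) { return v - 0x4000; }
static int32_t toy_decode(int32_t v) { return v + 0x4000; }

/* Clamp only once, after every operation in the pipeline has run. */
static uint16_t roundtrip_clamp_at_end(uint16_t in)
{
	int32_t v = in;

	v = toy_encode(v);	/* may go below 0x0 */
	v = toy_decode(v);	/* comes back into range */
	return (uint16_t)clamp_u16_range(v);
}

/* Clamp after each step, which is what a u16-only pipeline amounts to. */
static uint16_t roundtrip_clamp_each_step(uint16_t in)
{
	int32_t v = in;

	v = clamp_u16_range(toy_encode(v));	/* negative value is lost here */
	v = clamp_u16_range(toy_decode(v));
	return (uint16_t)v;
}
```

With a signed intermediate the encode/decode pair is lossless for inputs
that dip below zero in between; clamping per step is not.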

v3:
 - Use new colorop->next pointer

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/vkms/vkms_composer.c | 55 +---
 drivers/gpu/drm/vkms/vkms_drv.h  |  4 ++
 2 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index 9010415e4bd6..d04a235b9fcd 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -164,7 +164,7 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
}
 }
 
-static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop *colorop)
+static void apply_colorop(struct pixel_argb_s32 *pixel, struct drm_colorop *colorop)
 {
/* TODO is this right? */
struct drm_colorop_state *colorop_state = colorop->state;
@@ -191,25 +191,54 @@ static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop *colo
 
 static void pre_blend_color_transform(const struct vkms_plane_state *plane_state, struct line_buffer *output_buffer)
 {
-   struct drm_colorop *colorop = plane_state->base.base.color_pipeline;
+   struct drm_colorop *colorop;
+   struct pixel_argb_s32 pixel;
 
-   while (colorop) {
-   struct drm_colorop_state *colorop_state;
+   for (size_t x = 0; x < output_buffer->n_pixels; x++) {
+
+   /*
+* Some operations, such as applying a BT709 encoding matrix,
+* followed by a decoding matrix, require that we preserve
+* values above 1.0 and below 0.0 until the end of the pipeline.
+*
+* Convert values to s32 for our internal pipeline and go back
+* to u16 values at the end.
+*/
+   pixel.a = output_buffer->pixels[x].a;
+   pixel.r = output_buffer->pixels[x].r;
+   pixel.g = output_buffer->pixels[x].g;
+   pixel.b = output_buffer->pixels[x].b;
+
+   colorop = plane_state->base.base.color_pipeline;
+   while (colorop) {
+   struct drm_colorop_state *colorop_state;
 
-   if (!colorop)
-   return;
+   if (!colorop)
+   return;
 
-   /* TODO this is probably wrong */
-   colorop_state = colorop->state;
+   /* TODO this is probably wrong */
+   colorop_state = colorop->state;
 
-   if (!colorop_state)
-   return;
+   if (!colorop_state)
+   return;
 
-   for (size_t x = 0; x < output_buffer->n_pixels; x++)
if (!colorop_state->bypass)
-   apply_colorop(&output_buffer->pixels[x], colorop);
+   apply_colorop(&pixel, colorop);
+
+   colorop = colorop->next;
+   }
 
-   colorop = colorop->next;
+   /* clamp pixel */
+   pixel.a = max(min(pixel.a, 0xffff), 0x0);
+   pixel.r = max(min(pixel.r, 0xffff), 0x0);
+   pixel.g = max(min(pixel.g, 0xffff), 0x0);
+   pixel.b = max(min(pixel.b, 0xffff), 0x0);
+
+   /* put back to output_buffer */
+   output_buffer->pixels[x].a = pixel.a;
+   output_buffer->pixels[x].r = pixel.r;
+   output_buffer->pixels[x].g = pixel.g;
+   output_buffer->pixels[x].b = pixel.b;
}
 }
 
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 2bcc24c196a2..fadb7685a360 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -36,6 +36,10 @@ struct vkms_frame_info {
unsigned int cpp;
 };
 
+struct pixel_argb_s32 {
+   s32 a, r, g, b;
+};
+
 struct pixel_argb_u16 {
u16 a, r, g, b;
 };
-- 
2.42.1

[RFC PATCH v3 22/23] drm/tests: Add a few tests around drm_fixed.h

2023-11-08 Thread Harry Wentland

While working on the CTM implementation of VKMS I had to verify
a few assumptions. One of those is whether drm_fixed.h represents
its numbers in sign-magnitude or two's complement. It is
two's complement.

In order to make someone else's day easier I am adding the
drm_test_int2fixp test that validates the above assumption.

I am also adding a test for the new sm2fixp function that converts
from a sign-magnitude fixed point to the two's-complement fixed
point.
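
As an illustration of the two representations, here is a userspace sketch
of the conversion. It assumes a 32.32 layout with bit 63 as the sign bit in
the sign-magnitude form; int_to_fixp() and sm_to_fixp() are hypothetical
stand-ins, not the kernel helpers from drm_fixed.h:

```c
#include <assert.h>
#include <stdint.h>

#define FIXP_FRAC_BITS 32	/* assumed 32.32 fixed-point layout */

/* Convert an integer to 32.32 two's-complement fixed point. */
static int64_t int_to_fixp(int32_t v)
{
	return (int64_t)v * ((int64_t)1 << FIXP_FRAC_BITS);
}

/*
 * Convert a sign-magnitude value (bit 63 = sign, bits 62..0 = magnitude)
 * to two's-complement fixed point.
 */
static int64_t sm_to_fixp(uint64_t sm)
{
	const uint64_t sign_bit = (uint64_t)1 << 63;
	int64_t magnitude = (int64_t)(sm & ~sign_bit);

	return (sm & sign_bit) ? -magnitude : magnitude;
}
```

In two's complement, 1 + (-1) is 0 with ordinary integer addition, which is
the property the int2fixp test checks; adding the raw sign-magnitude
encodings does not give that result.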

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/tests/Makefile|  3 +-
 drivers/gpu/drm/tests/drm_fixp_test.c | 69 +++
 2 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/tests/drm_fixp_test.c

diff --git a/drivers/gpu/drm/tests/Makefile b/drivers/gpu/drm/tests/Makefile
index ba7baa622675..61f44ad0e862 100644
--- a/drivers/gpu/drm/tests/Makefile
+++ b/drivers/gpu/drm/tests/Makefile
@@ -18,6 +18,7 @@ obj-$(CONFIG_DRM_KUNIT_TEST) += \
drm_plane_helper_test.o \
drm_probe_helper_test.o \
drm_rect_test.o \
-   drm_exec_test.o
+   drm_exec_test.o \
+   drm_fixp_test.o
 
 CFLAGS_drm_mm_test.o := $(DISABLE_STRUCTLEAK_PLUGIN)
diff --git a/drivers/gpu/drm/tests/drm_fixp_test.c b/drivers/gpu/drm/tests/drm_fixp_test.c
new file mode 100644
index ..f420f173ff66
--- /dev/null
+++ b/drivers/gpu/drm/tests/drm_fixp_test.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ */
+
+#include 
+#include 
+
+static void drm_test_sm2fixp(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, 0x7fffffffffffffffll, ((1LL << 63) - 1));
+
+   /* 1 */
+   KUNIT_EXPECT_EQ(test, drm_int2fixp(1), drm_sm2fixp(1ull << DRM_FIXED_POINT));
+
+   /* -1 */
+   KUNIT_EXPECT_EQ(test, drm_int2fixp(-1), drm_sm2fixp((1ull << 63) | (1ull << DRM_FIXED_POINT)));
+
+   /* 0.5 */
+   KUNIT_EXPECT_EQ(test, drm_fixp_from_fraction(1, 2), drm_sm2fixp(1ull << (DRM_FIXED_POINT - 1)));
+
+   /* -0.5 */
+   KUNIT_EXPECT_EQ(test, drm_fixp_from_fraction(-1, 2), drm_sm2fixp((1ull << 63) | (1ull << (DRM_FIXED_POINT - 1))));
+
+}
+
+static void drm_test_int2fixp(struct kunit *test)
+{
+   /* 1 */
+   KUNIT_EXPECT_EQ(test, 1ll << 32, drm_int2fixp(1));
+
+   /* -1 */
+   KUNIT_EXPECT_EQ(test, -(1ll << 32), drm_int2fixp(-1));
+
+   /* 1 + (-1) = 0 */
+   KUNIT_EXPECT_EQ(test, 0, drm_int2fixp(1) + drm_int2fixp(-1));
+
+   /* 1 / 2 */
+   KUNIT_EXPECT_EQ(test, 1ll << 31, drm_fixp_from_fraction(1, 2));
+
+   /* -0.5 */
+   KUNIT_EXPECT_EQ(test, -(1ll << 31), drm_fixp_from_fraction(-1, 2));
+
+   /* (-1 / 2) + 1 = 0.5 */
+   KUNIT_EXPECT_EQ(test, 1ll << 31, drm_fixp_from_fraction(-1, 2) + drm_int2fixp(1));
+
+   /* (1 / 2) + (-1) = -0.5 */
+   KUNIT_EXPECT_EQ(test, -(1ll << 31), drm_fixp_from_fraction(1, 2) + drm_int2fixp(-1));
+
+   /* (1 / 2) - 1 = -0.5 */
+   KUNIT_EXPECT_EQ(test, -(1ll << 31), drm_fixp_from_fraction(1, 2) - drm_int2fixp(1));
+
+}
+
+static struct kunit_case drm_fixp_tests[] = {
+   KUNIT_CASE(drm_test_int2fixp),
+   KUNIT_CASE(drm_test_sm2fixp),
+   { }
+};
+
+static struct kunit_suite drm_rect_test_suite = {
+   .name = "drm_fixp",
+   .test_cases = drm_fixp_tests,
+};
+
+kunit_test_suite(drm_rect_test_suite);
+
+MODULE_AUTHOR("AMD");
+MODULE_LICENSE("GPL and additional rights");
\ No newline at end of file
-- 
2.42.1

[RFC PATCH v3 07/23] drm/colorop: Introduce new drm_colorop mode object

2023-11-08 Thread Harry Wentland

This patch introduces a new drm_colorop mode object. This
object represents color transformations and can be used to
define color pipelines.

We also introduce the drm_colorop_state here, as well as
various helpers and state tracking bits.

v3:
 - Drop TODO for lock (it's handled in drm_modeset_drop_locks)
   (Melissa)
 - Don't get plane state when getting colorop state
 - Make some functions static (kernel test robot)

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/drm_atomic.c|  70 +
 drivers/gpu/drm/drm_atomic_helper.c |  12 +++
 drivers/gpu/drm/drm_atomic_uapi.c   |  48 +
 drivers/gpu/drm/drm_colorop.c   | 152 +++
 drivers/gpu/drm/drm_mode_config.c   |   7 ++
 drivers/gpu/drm/drm_plane_helper.c  |   2 +-
 include/drm/drm_atomic.h|  82 +++
 include/drm/drm_atomic_uapi.h   |   1 +
 include/drm/drm_colorop.h   | 157 
 include/drm/drm_mode_config.h   |  18 
 include/drm/drm_plane.h |   2 +
 include/uapi/drm/drm.h  |   3 +
 include/uapi/drm/drm_mode.h |   1 +
 14 files changed, 555 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/drm_colorop.c
 create mode 100644 include/drm/drm_colorop.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 8e1bde059170..7ba67f9775e7 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -16,6 +16,7 @@ drm-y := \
drm_client.o \
drm_client_modeset.o \
drm_color_mgmt.o \
+   drm_colorop.o \
drm_connector.o \
drm_crtc.o \
drm_displayid.o \
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index f1a503aafe5a..6390443f1819 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "drm_crtc_internal.h"
 #include "drm_internal.h"
@@ -108,6 +109,7 @@ void drm_atomic_state_default_release(struct drm_atomic_state *state)
kfree(state->connectors);
kfree(state->crtcs);
kfree(state->planes);
+   kfree(state->colorops);
kfree(state->private_objs);
 }
 EXPORT_SYMBOL(drm_atomic_state_default_release);
@@ -139,6 +141,10 @@ drm_atomic_state_init(struct drm_device *dev, struct drm_atomic_state *state)
sizeof(*state->planes), GFP_KERNEL);
if (!state->planes)
goto fail;
+   state->colorops = kcalloc(dev->mode_config.num_colorop,
+ sizeof(*state->colorops), GFP_KERNEL);
+   if (!state->colorops)
+   goto fail;
 
/*
 * Because drm_atomic_state can be committed asynchronously we need our
@@ -250,6 +256,20 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
state->planes[i].new_state = NULL;
}
 
+   for (i = 0; i < config->num_colorop; i++) {
+   struct drm_colorop *colorop = state->colorops[i].ptr;
+
+   if (!colorop)
+   continue;
+
+   drm_colorop_atomic_destroy_state(colorop,
+state->colorops[i].state);
+   state->colorops[i].ptr = NULL;
+   state->colorops[i].state = NULL;
+   state->colorops[i].old_state = NULL;
+   state->colorops[i].new_state = NULL;
+   }
+
for (i = 0; i < state->num_private_objs; i++) {
struct drm_private_obj *obj = state->private_objs[i].ptr;
 
@@ -571,6 +591,56 @@ drm_atomic_get_plane_state(struct drm_atomic_state *state,
 }
 EXPORT_SYMBOL(drm_atomic_get_plane_state);
 
+
+/**
+ * drm_atomic_get_colorop_state - get colorop state
+ * @state: global atomic state object
+ * @colorop: colorop to get state object for
+ *
+ * This function returns the colorop state for the given colorop, allocating it
+ * if needed. It will also grab the relevant plane lock to make sure that the
+ * state is consistent.
+ *
+ * Returns:
+ *
+ * Either the allocated state or the error code encoded into the pointer. When
+ * the error is EDEADLK then the w/w mutex code has detected a deadlock and the
+ * entire atomic sequence must be restarted. All other errors are fatal.
+ */
+struct drm_colorop_state *
+drm_atomic_get_colorop_state(struct drm_atomic_state *state,
+struct drm_colorop *colorop)
+{
+   int ret, index = drm_colorop_index(colorop);
+   struct drm_colorop_state *colorop_state;
+
+   WARN_ON(!state->acquire_ctx);
+
+   colorop_state = drm_atomic_get_existing_colorop_state(state, colorop);
+   if (colorop_state)
+   return colorop_state;
+
+   ret = drm_modeset_lock(&colorop->plane->mutex, state->acquire_ctx);
+   if (ret)
+   return ERR_PTR(ret);
+
+   colorop_state =

[RFC PATCH v3 13/23] drm/plane: Add COLOR PIPELINE property

2023-11-08 Thread Harry Wentland

We're adding a new enum COLOR PIPELINE property. This
property will have entries for each COLOR PIPELINE by
referencing the DRM object ID of the first drm_colorop
of the pipeline. 0 disables the entire COLOR PIPELINE.

Userspace can use this to discover the available color
pipelines, as well as set the desired one. The color
pipelines are programmed via properties on the actual
drm_colorop objects.

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic.c  | 46 +++
 drivers/gpu/drm/drm_atomic_state_helper.c |  5 +++
 drivers/gpu/drm/drm_atomic_uapi.c | 44 ++
 include/drm/drm_atomic_uapi.h |  2 +
 include/drm/drm_plane.h   |  8 
 5 files changed, 105 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index ccf26b034433..cf3cb6d1239f 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -1463,6 +1463,52 @@ drm_atomic_add_affected_planes(struct drm_atomic_state *state,
 }
 EXPORT_SYMBOL(drm_atomic_add_affected_planes);
 
+/**
+ * drm_atomic_add_affected_colorops - add colorops for plane
+ * @state: atomic state
+ * @plane: DRM plane
+ *
+ * This function walks the current configuration and adds all colorops
+ * currently used by @plane to the atomic configuration @state. This is useful
+ * when an atomic commit also needs to check all currently enabled colorops on
+ * @plane, e.g. when changing the mode. It's also useful when re-enabling a
+ * plane to avoid special code to force-enable all colorops.
+ *
+ * Since acquiring a colorop state will always also acquire the w/w mutex of
+ * the current plane for that colorop (if there is any), adding all the colorop
+ * states for a plane will not reduce parallelism of atomic updates.
+ *
+ * Returns:
+ * 0 on success or can fail with -EDEADLK or -ENOMEM. When the error is EDEADLK
+ * then the w/w mutex code has detected a deadlock and the entire atomic
+ * sequence must be restarted. All other errors are fatal.
+ */
+int
+drm_atomic_add_affected_colorops(struct drm_atomic_state *state,
+struct drm_plane *plane)
+{
+   struct drm_colorop *colorop;
+   struct drm_colorop_state *colorop_state;
+
+   WARN_ON(!drm_atomic_get_new_plane_state(state, plane));
+
+   drm_dbg_atomic(plane->dev,
+  "Adding all current colorops for [plane:%d:%s] to %p\n",
+  plane->base.id, plane->name, state);
+
+   drm_for_each_colorop(colorop, plane->dev) {
+   if (colorop->plane != plane)
+   continue;
+
+   colorop_state = drm_atomic_get_colorop_state(state, colorop);
+   if (IS_ERR(colorop_state))
+   return PTR_ERR(colorop_state);
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_atomic_add_affected_colorops);
+
 /**
  * drm_atomic_check_only - check whether a given config would work
  * @state: atomic configuration to check
diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c
index 784e63d70a42..3c5f2c8e33d0 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -267,6 +267,11 @@ void __drm_atomic_helper_plane_state_reset(struct drm_plane_state *plane_state,
plane_state->color_range = val;
}
 
+   if (plane->color_pipeline_property) {
+   /* default is always NULL, i.e., bypass */
+   plane_state->color_pipeline = NULL;
+   }
+
if (plane->zpos_property) {
if (!drm_object_property_get_default_value(&plane->base,
   plane->zpos_property,
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index a8f7a8a6639a..c6629fdaa114 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -256,6 +256,38 @@ drm_atomic_set_fb_for_plane(struct drm_plane_state *plane_state,
 }
 EXPORT_SYMBOL(drm_atomic_set_fb_for_plane);
 
+
+/**
+ * drm_atomic_set_colorop_for_plane - set colorop for plane
+ * @plane_state: atomic state object for the plane
+ * @colorop: colorop to use for the plane
+ *
+ * Changing the assigned framebuffer for a plane requires us to grab a
+ * reference to the new fb and drop the reference to the old fb, if there is
+ * one. This function takes care of all these details besides updating the
+ * pointer in the state object itself.
+ */
+void
+drm_atomic_set_colorop_for_plane(struct drm_plane_state *plane_state,
+struct drm_colorop *colorop)
+{
+   struct drm_plane *plane = plane_state->plane;
+
+   if (colorop)
+   drm_dbg_atomic(plane->dev,
+  "Set [COLOROP:%d] for [PLANE:%d:%s] state %p\n",
+  colorop->base.id, plane->base.id, plane->name,
+

[RFC PATCH v3 18/23] drm/colorop: Add 3x4 CTM type

2023-11-08 Thread Harry Wentland

This type is used to support a 3x4 matrix in colorops. A 3x4
matrix uses the last column as a "bias" column. Some HW exposes
support for 3x4. The calculation looks like:

 out   matrix          in
 |R|   |0  1  2  3 |   | R |
 |G| = |4  5  6  7 | x | G |
 |B|   |8  9  10 11|   | B |
                       |1.0|

This is also the first colorop where we need a blob property to
program the operation. For that we'll introduce a new DATA
property that can be used by all colorop TYPEs requiring a
blob. The way a DATA blob is read depends on the TYPE of
the colorop.

We only create the DATA property for colorop types that
need it.
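
The 3x4 calculation described in this commit message can be sketched in
userspace as follows. This is an illustration only: it uses double instead
of the kernel's fixed-point values for readability, and struct ctm_3x4,
struct rgb and apply_ctm_3x4() are hypothetical names, not the uAPI types:

```c
#include <assert.h>

/* Row-major 3x4 matrix, entries 0..11; column 3 of each row is the bias. */
struct ctm_3x4 {
	double m[12];
};

struct rgb {
	double r, g, b;
};

/* out = matrix x (R, G, B, 1.0), so the fourth column is simply added. */
static struct rgb apply_ctm_3x4(const struct ctm_3x4 *c, struct rgb in)
{
	struct rgb out;

	out.r = c->m[0] * in.r + c->m[1] * in.g + c->m[2]  * in.b + c->m[3];
	out.g = c->m[4] * in.r + c->m[5] * in.g + c->m[6]  * in.b + c->m[7];
	out.b = c->m[8] * in.r + c->m[9] * in.g + c->m[10] * in.b + c->m[11];

	return out;
}
```

Because the input vector carries a constant 1.0 as its fourth component,
entries 3, 7 and 11 act as per-channel offsets.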

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic_uapi.c | 30 ++
 drivers/gpu/drm/drm_colorop.c | 16 
 include/drm/drm_colorop.h | 19 +++
 include/uapi/drm/drm_mode.h   | 19 ++-
 4 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index 69c56982e2d0..564bca68f652 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -725,6 +725,31 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
return 0;
 }
 
+static int drm_atomic_color_set_data_property(struct drm_colorop *colorop,
+   struct drm_colorop_state *state,
+   struct drm_property *property, uint64_t val)
+{
+   ssize_t elem_size = -1;
+   ssize_t size = -1;
+   bool replaced;
+
+
+   switch (colorop->type) {
+   case DRM_COLOROP_CTM_3X4:
+   size = sizeof(struct drm_color_ctm_3x4);
+   break;
+   default:
+   /* should never get here */
+   return -EINVAL;
+   }
+
+   return drm_atomic_replace_property_blob_from_id(colorop->dev,
+   &state->data,
+   val,
+   size,
+   elem_size,
+   &replaced);
+}
 
 static int drm_atomic_colorop_set_property(struct drm_colorop *colorop,
struct drm_colorop_state *state, struct drm_file *file_priv,
@@ -734,6 +759,9 @@ static int drm_atomic_colorop_set_property(struct drm_colorop *colorop,
state->bypass = val;
} else if (property == colorop->curve_1d_type_property) {
state->curve_1d_type = val;
+   } else if (property == colorop->data_property) {
+   return drm_atomic_color_set_data_property(colorop,
+   state, property, val);
} else {
drm_dbg_atomic(colorop->dev,
   "[COLOROP:%d:%d] unknown property 
[PROP:%d:%s]]\n",
@@ -756,6 +784,8 @@ drm_atomic_colorop_get_property(struct drm_colorop *colorop,
*val = state->bypass;
} else if (property == colorop->curve_1d_type_property) {
*val = state->curve_1d_type;
+   } else if (property == colorop->data_property) {
+   *val = (state->data) ? state->data->base.id : 0;
} else {
return -EINVAL;
}
diff --git a/drivers/gpu/drm/drm_colorop.c b/drivers/gpu/drm/drm_colorop.c
index e62acf68bf9e..67e6efc90803 100644
--- a/drivers/gpu/drm/drm_colorop.c
+++ b/drivers/gpu/drm/drm_colorop.c
@@ -34,6 +34,7 @@
 
 static const struct drm_prop_enum_list drm_colorop_type_enum_list[] = {
{ DRM_COLOROP_1D_CURVE, "1D Curve" },
+   { DRM_COLOROP_CTM_3X4, "3x4 Matrix"}
 };
 
 static const struct drm_prop_enum_list drm_colorop_curve_1d_type_enum_list[] = {
@@ -105,6 +106,20 @@ int drm_colorop_init(struct drm_device *dev, struct drm_colorop *colorop,
   colorop->curve_1d_type_property,
   0);
 
+   /* data */
+   if (type == DRM_COLOROP_CTM_3X4) {
+   prop = drm_property_create(dev, DRM_MODE_PROP_ATOMIC | DRM_MODE_PROP_BLOB,
+  "DATA", 0);
+   if (!prop)
+   return -ENOMEM;
+
+   colorop->data_property = prop;
+   drm_object_attach_property(&colorop->base,
+  colorop->data_property,
+  0);
+   }
+
+   /* next */
prop = drm_property_create_object(dev, DRM_MODE_PROP_IMMUTABLE | DRM_MODE_PROP_ATOMIC,
"NEXT", DRM_MODE_OBJECT_COLOROP);
if (!prop)
@@ -219,6 +234,7 @@ EXPORT_SYMBOL(drm_colorop_reset);
 
 static const char * const colorop_type_name[] = {
[DRM_COLOROP_1D_CURVE] = "1D Curve",
+   [DRM_COLOROP_CTM_3X4] = "3x4 Matrix"
 };
 
 static const char * const colorop_curve_1d_type_name[] = {
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index 13acc9a6ac38..faca6eba10e1 100644
--- a/include/drm/drm_colorop.h
+++

[RFC PATCH v3 23/23] drm/vkms: Add tests for CTM handling

2023-11-08 Thread Harry Wentland

A whole slew of tests for CTM handling that greatly helped in
debugging the CTM code. The extent of tests might seem a bit
silly but they're fast and might someday help save someone
else's day when debugging this.

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 258 ++
 drivers/gpu/drm/vkms/vkms_composer.c  |   2 +-
 2 files changed, 259 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
index ad4c2f72fd1e..3eaa2233afbb 100644
--- a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
+++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
@@ -3,6 +3,7 @@
 #include 
 
 #include 
+#include 
 
 #define TEST_LUT_SIZE 16
 
@@ -80,11 +81,268 @@ static void vkms_color_srgb_inv_srgb(struct kunit *test)
}
 }
 
+#define FIXPT_HALF    (DRM_FIXED_ONE >> 1)
+#define FIXPT_QUARTER (DRM_FIXED_ONE >> 2)
+
+const struct drm_color_ctm_3x4 test_matrix_3x4_50_desat = { {
+   FIXPT_HALF, FIXPT_QUARTER, FIXPT_QUARTER, 0,
+   FIXPT_QUARTER, FIXPT_HALF, FIXPT_QUARTER, 0,
+   FIXPT_QUARTER, FIXPT_QUARTER, FIXPT_HALF, 0
+} };
+
+static void vkms_color_ctm_3x4_50_desat(struct kunit *test)
+{
+   struct pixel_argb_s32 ref, out;
+
+   /* full white */
+   ref.a = 0x0;
+   ref.r = 0xffff;
+   ref.g = 0xffff;
+   ref.b = 0xffff;
+
+   memcpy(&out, &ref, sizeof(out));
+   apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
+
+   KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
+
+   /* full black */
+   ref.a = 0x0;
+   ref.r = 0x0;
+   ref.g = 0x0;
+   ref.b = 0x0;
+
+   memcpy(&out, &ref, sizeof(out));
+   apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
+
+   KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
+
+   /* 50% grey */
+   ref.a = 0x0;
+   ref.r = 0x8000;
+   ref.g = 0x8000;
+   ref.b = 0x8000;
+
+   memcpy(&out, &ref, sizeof(out));
+   apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
+
+   KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
+
+   /* full red to 50% desat */
+   ref.a = 0x0;
+   ref.r = 0x7fff;
+   ref.g = 0x3fff;
+   ref.b = 0x3fff;
+
+   out.a = 0x0;
+   out.r = 0xffff;
+   out.g = 0x0;
+   out.b = 0x0;
+
+   apply_3x4_matrix(&out, &test_matrix_3x4_50_desat);
+
+   KUNIT_EXPECT_MEMEQ(test, &ref, &out, sizeof(out));
+}
+
+const struct drm_color_ctm_3x4 test_matrix_3x4_bt709_enc = { {
+   0x366cf400ull, 0xb7175900ull, 0x000127bb300ull, 0,
+   0x80001993b3a0ull, 0x80005609fe80ull, 0x6f9db200ull, 0,
+   0x9d70a400ull, 0x80008f011100ull, 0x8e6f9330ull, 0
+} };
+
+static void vkms_color_ctm_3x4_bt709(struct kunit *test)
+{
+   struct pixel_argb_s32 ref, out;
+
+   /* full white to bt709 */
+   ref.a = 0x0;
+   ref.r = 0xfffe; /* off by one in 16bpc not a big deal */
+   ref.g = 0x0;
+   ref.b = 0x0;
+
+   out.a = 0x0;
+   out.r = 0xffff;
+   out.g = 0xffff;
+   out.b = 0xffff;
+
+   apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc);
+
+   /* red 255 */
+   KUNIT_EXPECT_GT(test, out.r, 0xfe00);
+
+   /* green 0 */
+   KUNIT_EXPECT_LT(test, out.g, 0x0100);
+
+   /* blue 0 */
+   KUNIT_EXPECT_LT(test, out.b, 0x0100);
+
+   /* full black to bt709 */
+   ref.a = 0x0;
+   ref.r = 0x0; /* off by one in 16bpc not a big deal */
+   ref.g = 0x0;
+   ref.b = 0x0;
+
+   out.a = 0x0;
+   out.r = 0x0;
+   out.g = 0x0;
+   out.b = 0x0;
+
+   apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc);
+
+   /* red 0 */
+   KUNIT_EXPECT_LT(test, out.r, 0x100);
+
+   /* green 0 */
+   KUNIT_EXPECT_LT(test, out.g, 0x0100);
+
+   /* blue 0 */
+   KUNIT_EXPECT_LT(test, out.b, 0x0100);
+
+   /* gray to bt709 */
+   ref.a = 0x0;
+   ref.r = 0x7fff; /* off by one in 16bpc not a big deal */
+   ref.g = 0x0;
+   ref.b = 0x0;
+
+   out.a = 0x0;
+   out.r = 0x7fff;
+   out.g = 0x7fff;
+   out.b = 0x7fff;
+
+   apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc);
+
+   /* red 127 */
+   KUNIT_EXPECT_GT(test, out.r, 0x7e00);
+   KUNIT_EXPECT_LT(test, out.r, 0x8000);
+
+   /* green 0 */
+   KUNIT_EXPECT_LT(test, out.g, 0x0100);
+
+   /* blue 0 */
+   KUNIT_EXPECT_LT(test, out.b, 0x0100);
+
+   /* == red 255 - bt709 enc == */
+   out.a = 0x0;
+   out.r = 0xffff;
+   out.g = 0x0;
+   out.b = 0x0;
+
+   apply_3x4_matrix(&out, &test_matrix_3x4_bt709_enc);
+
+   /* red 54 */
+   KUNIT_EXPECT_GT(test, out.r, 0x3500);
+   KUNIT_EXPECT_LT(test, out.r, 0x3700);
+
+   /* green 0 */
+   KUNIT_EXPECT_LT(test, out.g, 0x0100);
+
+   /* blue 157 */
+   KUNIT_EXPECT_GT(test, out.b, 0x9C00);
+   KUNIT_EXPECT_LT(test, out.b, 0x9E00);
+
+
+   /* == green 255 - bt709 enc == */
+   out.a = 0x0;
+   out.r = 0x0;
+   out.g = 0xffff;
+   out.b = 0x0;
+

[RFC PATCH v3 15/23] drm/vkms: Add enumerated 1D curve colorop

2023-11-08 Thread Harry Wentland

This patch introduces a VKMS color pipeline that includes two
drm_colorops for named transfer functions. For now the only ones
supported are sRGB EOTF, sRGB Inverse EOTF, and a Linear TF.
We will expand this in the future but I don't want to do so
without accompanying IGT tests.

We introduce a new vkms_luts.c file that hard-codes sRGB EOTF,
sRGB Inverse EOTF, and a linear EOTF LUT. These have been
generated with 256 entries each as IGT is currently testing
only 8 bpc surfaces. We will likely need higher precision
but I'm reluctant to make that change without clear indication
that we need it. We'll revisit and, if necessary, regenerate
the LUTs when we have IGT tests for higher precision buffers.

v2:
 - Add commit description
 - Fix sRGB EOTF LUT definition
 - Add linear and sRGB inverse EOTF LUTs

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/vkms/Makefile|   4 +-
 drivers/gpu/drm/vkms/vkms_colorop.c  |  85 +++
 drivers/gpu/drm/vkms/vkms_composer.c |  45 ++
 drivers/gpu/drm/vkms/vkms_drv.h  |   4 +
 drivers/gpu/drm/vkms/vkms_luts.c | 802 +++
 drivers/gpu/drm/vkms/vkms_luts.h |  12 +
 drivers/gpu/drm/vkms/vkms_plane.c|   2 +
 7 files changed, 953 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/vkms/vkms_colorop.c
 create mode 100644 drivers/gpu/drm/vkms/vkms_luts.c
 create mode 100644 drivers/gpu/drm/vkms/vkms_luts.h

diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
index 1b28a6a32948..c38455c46be4 100644
--- a/drivers/gpu/drm/vkms/Makefile
+++ b/drivers/gpu/drm/vkms/Makefile
@@ -6,6 +6,8 @@ vkms-y := \
vkms_formats.o \
vkms_crtc.o \
vkms_composer.o \
-   vkms_writeback.o
+   vkms_writeback.o \
+   vkms_colorop.o \
+   vkms_luts.o
 
 obj-$(CONFIG_DRM_VKMS) += vkms.o
diff --git a/drivers/gpu/drm/vkms/vkms_colorop.c b/drivers/gpu/drm/vkms/vkms_colorop.c
new file mode 100644
index ..9a26b9fdc4a2
--- /dev/null
+++ b/drivers/gpu/drm/vkms/vkms_colorop.c
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MAX_COLOR_PIPELINES 5
+
+const int vkms_initialize_tf_pipeline(struct drm_plane *plane, struct drm_prop_enum_list *list)
+{
+
+   struct drm_colorop *op, *prev_op;
+   struct drm_device *dev = plane->dev;
+   int ret;
+
+   /* 1st op: 1d curve */
+   op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
+   if (!op) {
+   DRM_ERROR("KMS: Failed to allocate colorop\n");
+   return -ENOMEM;
+   }
+
+   ret = drm_colorop_init(dev, op, plane, DRM_COLOROP_1D_CURVE);
+   if (ret)
+   return ret;
+
+   list->type = op->base.id;
+   list->name = kasprintf(GFP_KERNEL, "Color Pipeline %d", op->base.id);
+
+   prev_op = op;
+
+   /* 2nd op: 1d curve */
+   op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
+   if (!op) {
+   DRM_ERROR("KMS: Failed to allocate colorop\n");
+   return -ENOMEM;
+   }
+
+   ret = drm_colorop_init(dev, op, plane, DRM_COLOROP_1D_CURVE);
+   if (ret)
+   return ret;
+
+   drm_colorop_set_next_property(prev_op, op);
+
+   return 0;
+}
+
+int vkms_initialize_colorops(struct drm_plane *plane)
+{
+   struct drm_device *dev = plane->dev;
+   struct drm_property *prop;
+   struct drm_prop_enum_list pipelines[MAX_COLOR_PIPELINES];
+   int len = 0;
+   int ret;
+
+   /* Add "Bypass" (i.e. NULL) pipeline */
+   pipelines[len].type = 0;
+   pipelines[len].name = "Bypass";
+   len++;
+
+   /* Add pipeline consisting of transfer functions */
+   ret = vkms_initialize_tf_pipeline(plane, &(pipelines[len]));
+   if (ret)
+   return ret;
+   len++;
+
+   /* Create COLOR_PIPELINE property and attach */
+   prop = drm_property_create_enum(dev, DRM_MODE_PROP_ATOMIC,
+   "COLOR_PIPELINE",
+   pipelines, len);
+   if (!prop)
+   return -ENOMEM;
+
+   plane->color_pipeline_property = prop;
+
+   drm_object_attach_property(&plane->base, prop, 0);
+
+   /* TODO do we even need this? */
+   if (plane->state)
+   plane->state->color_pipeline = NULL;
+
+   return 0;
+}
diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index 25b6b73bece8..be42756e300a 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -12,6 +12,7 @@
 #include 
 
 #include "vkms_drv.h"
+#include "vkms_luts.h"
 
 static u16 pre_mul_blend_channel(u16 src, u16 dst, u16 alpha)
 {
@@ -163,6 +164,47 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff
}
 }
 
+static void pre_blend_color_transform(const struct vkms_plane_state *plane_state, struct line_buffer *output_buffer)
+{
+

[RFC PATCH v3 09/23] drm/color: Add 1D Curve subtype

2023-11-08 Thread Harry Wentland

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic_uapi.c | 18 ++
 drivers/gpu/drm/drm_colorop.c | 39 +++
 include/drm/drm_colorop.h | 20 
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index f22bd8671236..52b9b48e5757 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -670,11 +670,17 @@ static int drm_atomic_colorop_set_property(struct drm_colorop *colorop,
struct drm_colorop_state *state, struct drm_file *file_priv,
struct drm_property *property, uint64_t val)
 {
-   drm_dbg_atomic(colorop->dev,
-   "[COLOROP:%d] unknown property [PROP:%d:%s]]\n",
-   colorop->base.id,
-   property->base.id, property->name);
-   return -EINVAL;
+   if (property == colorop->curve_1d_type_property) {
+   state->curve_1d_type = val;
+   } else {
+   drm_dbg_atomic(colorop->dev,
+  "[COLOROP:%d:%d] unknown property [PROP:%d:%s]]\n",
+  colorop->base.id, colorop->type,
+  property->base.id, property->name);
+   return -EINVAL;
+   }
+
+   return 0;
 }
 
 static int
@@ -684,6 +690,8 @@ drm_atomic_colorop_get_property(struct drm_colorop *colorop,
 {
if (property == colorop->type_property) {
*val = colorop->type;
+   } else if (property == colorop->curve_1d_type_property) {
+   *val = state->curve_1d_type;
} else {
return -EINVAL;
}
diff --git a/drivers/gpu/drm/drm_colorop.c b/drivers/gpu/drm/drm_colorop.c
index 902b13e300fd..b1c271f90a16 100644
--- a/drivers/gpu/drm/drm_colorop.c
+++ b/drivers/gpu/drm/drm_colorop.c
@@ -36,6 +36,11 @@ static const struct drm_prop_enum_list drm_colorop_type_enum_list[] = {
{ DRM_COLOROP_1D_CURVE, "1D Curve" },
 };
 
+static const struct drm_prop_enum_list drm_colorop_curve_1d_type_enum_list[] = {
+   { DRM_COLOROP_1D_CURVE_SRGB_EOTF, "sRGB EOTF" },
+   { DRM_COLOROP_1D_CURVE_SRGB_INV_EOTF, "sRGB Inverse EOTF" },
+};
+
 /* Init Helpers */
 
 int drm_colorop_init(struct drm_device *dev, struct drm_colorop *colorop,
@@ -73,6 +78,20 @@ int drm_colorop_init(struct drm_device *dev, struct drm_colorop *colorop,
   colorop->type_property,
   colorop->type);
 
+   /* curve_1d_type */
+   /* TODO move to mode_config? */
+   prop = drm_property_create_enum(dev, DRM_MODE_PROP_ATOMIC,
+   "CURVE_1D_TYPE",
+   drm_colorop_curve_1d_type_enum_list,
+   ARRAY_SIZE(drm_colorop_curve_1d_type_enum_list));
+   if (!prop)
+   return -ENOMEM;
+
+   colorop->curve_1d_type_property = prop;
+   drm_object_attach_property(&colorop->base,
+  colorop->curve_1d_type_property,
+  0);
+
return ret;
 }
 EXPORT_SYMBOL(drm_colorop_init);
@@ -177,6 +196,11 @@ static const char * const colorop_type_name[] = {
[DRM_COLOROP_1D_CURVE] = "1D Curve",
 };
 
+static const char * const colorop_curve_1d_type_name[] = {
+   [DRM_COLOROP_1D_CURVE_SRGB_EOTF] = "sRGB EOTF",
+   [DRM_COLOROP_1D_CURVE_SRGB_INV_EOTF] = "sRGB Inverse EOTF",
+};
+
 /**
  * drm_get_colorop_type_name - return a string for colorop type
  * @type: colorop type to compute name of
@@ -191,3 +215,18 @@ const char *drm_get_colorop_type_name(enum 
drm_colorop_type type)
 
return colorop_type_name[type];
 }
+
+/**
+ * drm_get_colorop_curve_1d_type_name - return a string for 1D curve type
+ * @type: 1D curve type to compute name of
+ *
+ * In contrast to the other drm_get_*_name functions this one here returns a
+ * const pointer and hence is threadsafe.
+ */
+const char *drm_get_colorop_curve_1d_type_name(enum drm_colorop_curve_1d_type type)
+{
+   if (WARN_ON(type >= ARRAY_SIZE(colorop_curve_1d_type_name)))
+   return "unknown";
+
+   return colorop_curve_1d_type_name[type];
+}
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index 0386440afe40..965db7ec488f 100644
--- a/include/drm/drm_colorop.h
+++ b/include/drm/drm_colorop.h
@@ -30,6 +30,11 @@
 #include 
 #include 
 
+enum drm_colorop_curve_1d_type {
+   DRM_COLOROP_1D_CURVE_SRGB_EOTF,
+   DRM_COLOROP_1D_CURVE_SRGB_INV_EOTF
+};
+
 /**
  * struct drm_colorop_state - mutable colorop state
  */
@@ -39,6 +44,13 @@ struct drm_colorop_state {
 
/* colorop properties */
 
+   /**
+* @curve_1d_type:
+*
+* Type of 1D curve.
+*/
+   enum drm_colorop_curve_1d_type curve_1d_type;
+
/** @state: backpointer to global

[RFC PATCH v3 12/23] drm/colorop: Add atomic state print for drm_colorop

2023-11-08 Thread Harry Wentland

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic.c | 29 +
 include/drm/drm_colorop.h|  5 +
 2 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 59e1797d1ca8..ccf26b034433 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -783,6 +783,19 @@ static int drm_atomic_plane_check(const struct 
drm_plane_state *old_plane_state,
return 0;
 }
 
+
+
+static void drm_atomic_colorop_print_state(struct drm_printer *p,
+   const struct drm_colorop_state *state)
+{
+   struct drm_colorop *colorop = state->colorop;
+
+   drm_printf(p, "colorop[%u]:\n", colorop->base.id);
+   drm_printf(p, "\ttype=%s\n", drm_get_colorop_type_name(colorop->type));
+   drm_printf(p, "\tbypass=%u\n", state->bypass);
+   drm_printf(p, "\tcurve_1d_type=%s\n", drm_get_colorop_curve_1d_type_name(state->curve_1d_type));
+}
+
 static void drm_atomic_plane_print_state(struct drm_printer *p,
const struct drm_plane_state *state)
 {
@@ -803,6 +816,13 @@ static void drm_atomic_plane_print_state(struct 
drm_printer *p,
   drm_get_color_encoding_name(state->color_encoding));
drm_printf(p, "\tcolor-range=%s\n",
   drm_get_color_range_name(state->color_range));
+#if 0
+   drm_printf(p, "\tcolor-pipeline=%s\n",
+  drm_get_color_pipeline_name(state->color_pipeline));
+#else
+   drm_printf(p, "\tcolor-pipeline=%d\n",
+  state->color_pipeline ? state->color_pipeline->base.id : 0);
+#endif
 
if (plane->funcs->atomic_print_state)
plane->funcs->atomic_print_state(p, state);
@@ -1839,6 +1859,7 @@ static void __drm_state_dump(struct drm_device *dev, 
struct drm_printer *p,
 bool take_locks)
 {
struct drm_mode_config *config = &dev->mode_config;
+   struct drm_colorop *colorop;
struct drm_plane *plane;
struct drm_crtc *crtc;
struct drm_connector *connector;
@@ -1847,6 +1868,14 @@ static void __drm_state_dump(struct drm_device *dev, 
struct drm_printer *p,
if (!drm_drv_uses_atomic_modeset(dev))
return;
 
+   list_for_each_entry(colorop, &config->colorop_list, head) {
+   if (take_locks)
+   drm_modeset_lock(&colorop->plane->mutex, NULL);
+   drm_atomic_colorop_print_state(p, colorop->state);
+   if (take_locks)
+   drm_modeset_unlock(&colorop->plane->mutex);
+   }
+
list_for_each_entry(plane, &config->plane_list, head) {
if (take_locks)
drm_modeset_lock(&plane->mutex, NULL);
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index c44f076a0606..fbf7c0489fc8 100644
--- a/include/drm/drm_colorop.h
+++ b/include/drm/drm_colorop.h
@@ -226,6 +226,11 @@ static inline unsigned int drm_colorop_index(const struct 
drm_colorop *colorop)
 #define drm_for_each_colorop(colorop, dev) \
list_for_each_entry(colorop, &(dev)->mode_config.colorop_list, head)
 
+const char *drm_get_color_pipeline_name(struct drm_colorop *colorop);
+
+const char *drm_get_colorop_type_name(enum drm_colorop_type type);
+const char *drm_get_colorop_curve_1d_type_name(enum drm_colorop_curve_1d_type type);
+
 void drm_colorop_set_next_property(struct drm_colorop *colorop, struct 
drm_colorop *next);
 
 
-- 
2.42.1

[RFC PATCH v3 16/23] drm/vkms: Add kunit tests for linear and sRGB LUTs

2023-11-08 Thread Harry Wentland

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 37 ++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c 
b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
index b995114cf6b8..ad4c2f72fd1e 100644
--- a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
+++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
@@ -31,7 +31,6 @@ const struct vkms_color_lut test_linear_lut = {
.channel_value2index_ratio = 0xf000fll
 };
 
-
 static void vkms_color_test_get_lut_index(struct kunit *test)
 {
int i;
@@ -40,6 +39,19 @@ static void vkms_color_test_get_lut_index(struct kunit *test)
 
for (i = 0; i < TEST_LUT_SIZE; i++)
KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&test_linear_lut, test_linear_array[i].red)), i);
+
+   KUNIT_EXPECT_EQ(test, drm_fixp2int(get_lut_index(&srgb_eotf, 0x0)), 0x0);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_eotf, 0x0)), 0x0);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_eotf, 0x101)), 0x1);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_eotf, 0x202)), 0x2);
+
+   KUNIT_EXPECT_EQ(test, drm_fixp2int(get_lut_index(&srgb_inv_eotf, 0x0)), 0x0);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_inv_eotf, 0x0)), 0x0);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_inv_eotf, 0x101)), 0x1);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_inv_eotf, 0x202)), 0x2);
+
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_eotf, 0xfefe)), 0xfe);
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&srgb_eotf, 0xffff)), 0xff);
 }
 
 static void vkms_color_test_lerp(struct kunit *test)
@@ -47,9 +59,32 @@ static void vkms_color_test_lerp(struct kunit *test)
KUNIT_EXPECT_EQ(test, lerp_u16(0x0, 0x10, 0x8000), 0x8);
 }
 
+static void vkms_color_test_linear(struct kunit *test)
+{
+   for (int i = 0; i < LUT_SIZE; i++) {
+   int linear = apply_lut_to_channel_value(&srgb_eotf, i * 0x101, LUT_RED);
+   KUNIT_EXPECT_EQ(test, DIV_ROUND_CLOSEST(linear, 0x101), i);
+   }
+}
+
+static void vkms_color_srgb_inv_srgb(struct kunit *test)
+{
+   u16 srgb, final;
+
+   for (int i = 0; i < LUT_SIZE; i++) {
+   srgb = apply_lut_to_channel_value(&srgb_eotf, i * 0x101, LUT_RED);
+   final = apply_lut_to_channel_value(&srgb_inv_eotf, srgb, LUT_RED);
+
+   KUNIT_EXPECT_GE(test, final / 0x101, i-1);
+   KUNIT_EXPECT_LE(test, final / 0x101, i+1);
+   }
+}
+
 static struct kunit_case vkms_color_test_cases[] = {
KUNIT_CASE(vkms_color_test_get_lut_index),
KUNIT_CASE(vkms_color_test_lerp),
+   KUNIT_CASE(vkms_color_test_linear),
+   KUNIT_CASE(vkms_color_srgb_inv_srgb),
{}
 };
 
-- 
2.42.1

[RFC PATCH v3 21/23] drm/vkms: add 3x4 matrix in color pipeline

2023-11-08 Thread Harry Wentland

We add two 3x4 matrices into the VKMS color pipeline. The reason
we're adding matrices is so that we can test that application
of a matrix and its inverse yields an output equal to the input
image.

One complication with the matrix implementation has to do with
the fact that the matrix entries are in signed-magnitude fixed
point, whereas the drm_fixed.h implementation uses 2s-complement.
The latter one is the one that we want for easy addition and
subtraction, so we convert all entries to 2s-complement.
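
As a rough illustration of the round trip described above (a standalone
sketch, not the VKMS code): the helper below applies a row-major 3x4 matrix
of 32.32 fixed-point coefficients to an integer pixel. The names
pixel_sketch and apply_3x4_sketch are made up for this example, and the
coefficients are assumed to be already converted to 2s-complement.

```c
#include <stdint.h>

#define FP_SHIFT 32
#define FP_ONE   (1LL << FP_SHIFT)

/* Hypothetical stand-in for a signed 32-bit RGB pixel. */
struct pixel_sketch {
	int32_t r, g, b;
};

/*
 * m holds 12 twos-complement 32.32 coefficients, row-major:
 * three rows of (r, g, b, offset).
 * Assumes arithmetic right shift for negative intermediate values.
 */
static void apply_3x4_sketch(struct pixel_sketch *px, const int64_t m[12])
{
	const int64_t in[3] = { px->r, px->g, px->b };
	int32_t out[3];

	for (int row = 0; row < 3; row++) {
		const int64_t *c = &m[row * 4];
		/* fixed-point coefficient times integer channel is fixed-point */
		int64_t acc = c[0] * in[0] + c[1] * in[1] + c[2] * in[2] + c[3];

		out[row] = (int32_t)(acc >> FP_SHIFT);
	}

	px->r = out[0];
	px->g = out[1];
	px->b = out[2];
}
```

With a scale-by-2 matrix followed by a scale-by-0.5 matrix, a pixel of
(100, 200, 300) comes back unchanged, which is the kind of matrix/inverse
check the two colorops are meant to enable.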

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/vkms/vkms_colorop.c  | 32 +++-
 drivers/gpu/drm/vkms/vkms_composer.c | 27 +++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/vkms_colorop.c 
b/drivers/gpu/drm/vkms/vkms_colorop.c
index 9a26b9fdc4a2..4e37e805c443 100644
--- a/drivers/gpu/drm/vkms/vkms_colorop.c
+++ b/drivers/gpu/drm/vkms/vkms_colorop.c
@@ -31,7 +31,37 @@ const int vkms_initialize_tf_pipeline(struct drm_plane 
*plane, struct drm_prop_e
 
prev_op = op;
 
-   /* 2nd op: 1d curve */
+   /* 2nd op: 3x4 matrix */
+   op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
+   if (!op) {
+   DRM_ERROR("KMS: Failed to allocate colorop\n");
+   return -ENOMEM;
+   }
+
+   ret = drm_colorop_init(dev, op, plane, DRM_COLOROP_CTM_3X4);
+   if (ret)
+   return ret;
+
+   drm_colorop_set_next_property(prev_op, op);
+
+   prev_op = op;
+
+   /* 3rd op: 3x4 matrix */
+   op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
+   if (!op) {
+   DRM_ERROR("KMS: Failed to allocate colorop\n");
+   return -ENOMEM;
+   }
+
+   ret = drm_colorop_init(dev, op, plane, DRM_COLOROP_CTM_3X4);
+   if (ret)
+   return ret;
+
+   drm_colorop_set_next_property(prev_op, op);
+
+   prev_op = op;
+
+   /* 4th op: 1d curve */
op = kzalloc(sizeof(struct drm_colorop), GFP_KERNEL);
if (!op) {
DRM_ERROR("KMS: Failed to allocate colorop\n");
diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index d04a235b9fcd..c278fb223188 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -164,6 +164,30 @@ static void apply_lut(const struct vkms_crtc_state 
*crtc_state, struct line_buff
}
 }
 
+static void apply_3x4_matrix(struct pixel_argb_s32 *pixel, const struct drm_color_ctm_3x4 *matrix)
+{
+   s64 rf, gf, bf;
+
+   rf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[0]), drm_int2fixp(pixel->r)) +
+        drm_fixp_mul(drm_sm2fixp(matrix->matrix[1]), drm_int2fixp(pixel->g)) +
+        drm_fixp_mul(drm_sm2fixp(matrix->matrix[2]), drm_int2fixp(pixel->b)) +
+        drm_sm2fixp(matrix->matrix[3]);
+
+   gf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[4]), drm_int2fixp(pixel->r)) +
+        drm_fixp_mul(drm_sm2fixp(matrix->matrix[5]), drm_int2fixp(pixel->g)) +
+        drm_fixp_mul(drm_sm2fixp(matrix->matrix[6]), drm_int2fixp(pixel->b)) +
+        drm_sm2fixp(matrix->matrix[7]);
+
+   bf = drm_fixp_mul(drm_sm2fixp(matrix->matrix[8]), drm_int2fixp(pixel->r)) +
+        drm_fixp_mul(drm_sm2fixp(matrix->matrix[9]), drm_int2fixp(pixel->g)) +
+        drm_fixp_mul(drm_sm2fixp(matrix->matrix[10]), drm_int2fixp(pixel->b)) +
+        drm_sm2fixp(matrix->matrix[11]);
+
+   pixel->r = drm_fixp2int(rf);
+   pixel->g = drm_fixp2int(gf);
+   pixel->b = drm_fixp2int(bf);
+}
+
 static void apply_colorop(struct pixel_argb_s32 *pixel, struct drm_colorop 
*colorop)
 {
/* TODO is this right? */
@@ -185,6 +209,9 @@ static void apply_colorop(struct pixel_argb_s32 *pixel, 
struct drm_colorop *colo
DRM_DEBUG_DRIVER("unkown colorop 1D curve type 
%d\n", colorop_state->curve_1d_type);
break;
}
+   } else if (colorop->type == DRM_COLOROP_CTM_3X4) {
+   if (colorop_state->data)
+   apply_3x4_matrix(pixel, (struct drm_color_ctm_3x4 *) colorop_state->data->data);
}
 
 }
-- 
2.42.1

[RFC PATCH v3 14/23] drm/colorop: Add NEXT to colorop state print

2023-11-08 Thread Harry Wentland

v3:
 - Read NEXT ID from drm_colorop's next pointer

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic.c | 1 +
 include/drm/drm_colorop.h| 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index cf3cb6d1239f..02bb071f735c 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -794,6 +794,7 @@ static void drm_atomic_colorop_print_state(struct 
drm_printer *p,
drm_printf(p, "\ttype=%s\n", drm_get_colorop_type_name(colorop->type));
drm_printf(p, "\tbypass=%u\n", state->bypass);
drm_printf(p, "\tcurve_1d_type=%s\n", 
drm_get_colorop_curve_1d_type_name(state->curve_1d_type));
+   drm_printf(p, "\tnext=%d\n", colorop->next ? colorop->next->base.id : 0);
 }
 
 static void drm_atomic_plane_print_state(struct drm_printer *p,
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index fbf7c0489fc8..13acc9a6ac38 100644
--- a/include/drm/drm_colorop.h
+++ b/include/drm/drm_colorop.h
@@ -232,6 +232,8 @@ const char *drm_get_colorop_type_name(enum drm_colorop_type 
type);
 const char *drm_get_colorop_curve_1d_type_name(enum drm_colorop_curve_1d_type 
type);
 
 void drm_colorop_set_next_property(struct drm_colorop *colorop, struct 
drm_colorop *next);
+uint32_t drm_colorop_get_next_property(struct drm_colorop *colorop);
+struct drm_colorop *drm_colorop_get_next(struct drm_colorop *colorop);
 
 
 #endif /* __DRM_COLOROP_H__ */
-- 
2.42.1

[RFC PATCH v3 04/23] drm/vkms: Add kunit tests for VKMS LUT handling

2023-11-08 Thread Harry Wentland

Debugging LUT math is much easier when we can unit test
it. Add kunit functionality to VKMS and add tests for
 - get_lut_index
 - lerp_u16
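
To show why the index test uses the ceil conversion, here is a standalone
sketch of the index math. It assumes a 16-entry linear LUT whose entries
are i * 0x1111 and a value-to-index ratio of (lut_length - 1) / 0xffff in
32.32 fixed point (0xf000f). lut_index_sketch, fixp_floor and fixp_ceil
are illustrative names, and the plain multiply is a simplification of
what the driver does with drm_fixp_mul().

```c
#include <stdint.h>

#define FP_SHIFT      32
#define FP_ALMOST_ONE ((1LL << FP_SHIFT) - 1)

/* (16 - 1) / 0xffff in 32.32 fixed point, rounded down */
#define RATIO_16_ENTRIES 0xf000fLL

/* Integer channel value times fixed-point ratio gives a fixed-point index. */
static int64_t lut_index_sketch(uint16_t channel_value, int64_t ratio)
{
	return (int64_t)channel_value * ratio;
}

/* Truncating conversion, non-negative input only. */
static int fixp_floor(int64_t a)
{
	return (int)(a >> FP_SHIFT);
}

/* Rounding-up conversion, non-negative input only. */
static int fixp_ceil(int64_t a)
{
	return (int)((a + FP_ALMOST_ONE) >> FP_SHIFT);
}
```

Because the ratio is rounded down, entry i maps to an index slightly below
i, so the floor of the index for 0xffff is 14 while the ceil is 15.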

v3:
 - Use include way of testing static functions (Arthur)

Signed-off-by: Harry Wentland 
Cc: Arthur Grillo 
---
 drivers/gpu/drm/vkms/Kconfig  |  5 ++
 drivers/gpu/drm/vkms/tests/.kunitconfig   |  4 ++
 drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 62 +++
 drivers/gpu/drm/vkms/vkms_composer.c  |  8 ++-
 4 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig
 create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c

diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig
index b9ecdebecb0b..c1f8b343ff0e 100644
--- a/drivers/gpu/drm/vkms/Kconfig
+++ b/drivers/gpu/drm/vkms/Kconfig
@@ -13,3 +13,8 @@ config DRM_VKMS
  a VKMS.
 
  If M is selected the module will be called vkms.
+
+config DRM_VKMS_KUNIT_TESTS
+   tristate "Tests for VKMS" if !KUNIT_ALL_TESTS
+   depends on DRM_VKMS && KUNIT
+   default KUNIT_ALL_TESTS
diff --git a/drivers/gpu/drm/vkms/tests/.kunitconfig 
b/drivers/gpu/drm/vkms/tests/.kunitconfig
new file mode 100644
index 000000000000..70e378228cbd
--- /dev/null
+++ b/drivers/gpu/drm/vkms/tests/.kunitconfig
@@ -0,0 +1,4 @@
+CONFIG_KUNIT=y
+CONFIG_DRM=y
+CONFIG_DRM_VKMS=y
+CONFIG_DRM_VKMS_KUNIT_TESTS=y
diff --git a/drivers/gpu/drm/vkms/tests/vkms_color_tests.c 
b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
new file mode 100644
index 000000000000..b995114cf6b8
--- /dev/null
+++ b/drivers/gpu/drm/vkms/tests/vkms_color_tests.c
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#include <kunit/test.h>
+
+#include <drm/drm_fixed.h>
+
+#define TEST_LUT_SIZE 16
+
+static struct drm_color_lut test_linear_array[TEST_LUT_SIZE] = {
+   { 0x0, 0x0, 0x0, 0 },
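+   { 0x1111, 0x1111, 0x1111, 0 },
+   { 0x2222, 0x2222, 0x2222, 0 },
+   { 0x3333, 0x3333, 0x3333, 0 },
+   { 0x4444, 0x4444, 0x4444, 0 },
+   { 0x5555, 0x5555, 0x5555, 0 },
+   { 0x6666, 0x6666, 0x6666, 0 },
+   { 0x7777, 0x7777, 0x7777, 0 },
+   { 0x8888, 0x8888, 0x8888, 0 },
+   { 0x9999, 0x9999, 0x9999, 0 },
+   { 0xaaaa, 0xaaaa, 0xaaaa, 0 },
+   { 0xbbbb, 0xbbbb, 0xbbbb, 0 },
+   { 0xcccc, 0xcccc, 0xcccc, 0 },
+   { 0xdddd, 0xdddd, 0xdddd, 0 },
+   { 0xeeee, 0xeeee, 0xeeee, 0 },
+   { 0xffff, 0xffff, 0xffff, 0 },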
+};
+
+const struct vkms_color_lut test_linear_lut = {
+   .base = test_linear_array,
+   .lut_length = TEST_LUT_SIZE,
+   .channel_value2index_ratio = 0xf000fll
+};
+
+
+static void vkms_color_test_get_lut_index(struct kunit *test)
+{
+   int i;
+
+   KUNIT_EXPECT_EQ(test, drm_fixp2int(get_lut_index(&test_linear_lut, test_linear_array[0].red)), 0);
+
+   for (i = 0; i < TEST_LUT_SIZE; i++)
+   KUNIT_EXPECT_EQ(test, drm_fixp2int_ceil(get_lut_index(&test_linear_lut, test_linear_array[i].red)), i);
+}
+
+static void vkms_color_test_lerp(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, lerp_u16(0x0, 0x10, 0x8000), 0x8);
+}
+
+static struct kunit_case vkms_color_test_cases[] = {
+   KUNIT_CASE(vkms_color_test_get_lut_index),
+   KUNIT_CASE(vkms_color_test_lerp),
+   {}
+};
+
+static struct kunit_suite vkms_color_test_suite = {
+   .name = "vkms-color",
+   .test_cases = vkms_color_test_cases,
+};
+kunit_test_suite(vkms_color_test_suite);
+
+MODULE_LICENSE("GPL");
\ No newline at end of file
diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index 3c99fb8b54e2..6f942896036e 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -91,7 +91,7 @@ static void fill_background(const struct pixel_argb_u16 
*background_color,
 }
 
 // lerp(a, b, t) = a + (b - a) * t
-static u16 lerp_u16(u16 a, u16 b, s64 t)
+u16 lerp_u16(u16 a, u16 b, s64 t)
 {
s64 a_fp = drm_int2fixp(a);
s64 b_fp = drm_int2fixp(b);
@@ -101,7 +101,7 @@ static u16 lerp_u16(u16 a, u16 b, s64 t)
return drm_fixp2int(a_fp + delta);
 }
 
-static s64 get_lut_index(const struct vkms_color_lut *lut, u16 channel_value)
+s64 get_lut_index(const struct vkms_color_lut *lut, u16 channel_value)
 {
s64 color_channel_fp = drm_int2fixp(channel_value);
 
@@ -429,3 +429,7 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char 
*src_name)
 
return ret;
 }
+
+#ifdef CONFIG_DRM_VKMS_KUNIT_TESTS
+#include "tests/vkms_color_tests.c"
+#endif
-- 
2.42.1

[RFC PATCH v3 03/23] drm/vkms: Create separate Kconfig file for VKMS

2023-11-08 Thread Harry Wentland

This aligns with most other DRM drivers and will allow
us to add new VKMS config options without polluting
the DRM Kconfig.

v3:
 - Change SPDX to GPL-2.0-only to match DRM KConfig
   SPDX (Simon)

Signed-off-by: Harry Wentland 
Reviewed-by: Simon Ser 
---
 drivers/gpu/drm/Kconfig  | 14 +-
 drivers/gpu/drm/vkms/Kconfig | 15 +++
 2 files changed, 16 insertions(+), 13 deletions(-)
 create mode 100644 drivers/gpu/drm/vkms/Kconfig

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 48ca28a2e4ff..61ebd682c9b0 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -286,19 +286,7 @@ config DRM_VGEM
  as used by Mesa's software renderer for enhanced performance.
  If M is selected the module will be called vgem.
 
-config DRM_VKMS
-   tristate "Virtual KMS (EXPERIMENTAL)"
-   depends on DRM && MMU
-   select DRM_KMS_HELPER
-   select DRM_GEM_SHMEM_HELPER
-   select CRC32
-   default n
-   help
- Virtual Kernel Mode-Setting (VKMS) is used for testing or for
- running GPU in a headless machines. Choose this option to get
- a VKMS.
-
- If M is selected the module will be called vkms.
+source "drivers/gpu/drm/vkms/Kconfig"
 
 source "drivers/gpu/drm/exynos/Kconfig"
 
diff --git a/drivers/gpu/drm/vkms/Kconfig b/drivers/gpu/drm/vkms/Kconfig
new file mode 100644
index 000000000000..b9ecdebecb0b
--- /dev/null
+++ b/drivers/gpu/drm/vkms/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config DRM_VKMS
+   tristate "Virtual KMS (EXPERIMENTAL)"
+   depends on DRM && MMU
+   select DRM_KMS_HELPER
+   select DRM_GEM_SHMEM_HELPER
+   select CRC32
+   default n
+   help
+ Virtual Kernel Mode-Setting (VKMS) is used for testing or for
+ running GPU in a headless machines. Choose this option to get
+ a VKMS.
+
+ If M is selected the module will be called vkms.
-- 
2.42.1

[RFC PATCH v3 11/23] drm/colorop: Add NEXT property

2023-11-08 Thread Harry Wentland

We'll construct color pipelines out of drm_colorop by
chaining them via the NEXT pointer. NEXT will point to
the next drm_colorop in the pipeline, or be 0 if we're
at the end of the pipeline.
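
A minimal standalone sketch of the chaining idea, with a made-up
colorop_sketch struct standing in for drm_colorop: each op carries an ID
and a next pointer, the value reported for NEXT is the next op's ID or 0,
and a pipeline is walked from its first op until next is NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for drm_colorop: just an object ID and a link. */
struct colorop_sketch {
	uint32_t id;
	struct colorop_sketch *next;
};

/* Value a NEXT property would report: the next op's ID, or 0 at the end. */
static uint32_t next_id(const struct colorop_sketch *op)
{
	return op->next ? op->next->id : 0;
}

/* Walk from the first op and count the ops in the pipeline. */
static size_t pipeline_length(const struct colorop_sketch *first)
{
	size_t n = 0;

	for (const struct colorop_sketch *op = first; op; op = op->next)
		n++;

	return n;
}
```

Using 0 as the terminator assumes that no real object ever gets ID 0.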

v3:
 - Add next pointer to colorop to be used by drivers
   and in DRM core

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_colorop.c | 29 +
 include/drm/drm_colorop.h | 20 
 2 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/drm_colorop.c b/drivers/gpu/drm/drm_colorop.c
index 17ba11ae205b..e62acf68bf9e 100644
--- a/drivers/gpu/drm/drm_colorop.c
+++ b/drivers/gpu/drm/drm_colorop.c
@@ -58,6 +58,7 @@ int drm_colorop_init(struct drm_device *dev, struct 
drm_colorop *colorop,
colorop->dev = dev;
colorop->type = type;
colorop->plane = plane;
+   colorop->next = NULL;
 
list_add_tail(&colorop->head, &config->colorop_list);
colorop->index = config->num_colorop++;
@@ -104,6 +105,15 @@ int drm_colorop_init(struct drm_device *dev, struct 
drm_colorop *colorop,
   colorop->curve_1d_type_property,
   0);
 
+   prop = drm_property_create_object(dev, DRM_MODE_PROP_IMMUTABLE | DRM_MODE_PROP_ATOMIC,
+   "NEXT", DRM_MODE_OBJECT_COLOROP);
+   if (!prop)
+   return -ENOMEM;
+   colorop->next_property = prop;
+   drm_object_attach_property(&colorop->base,
+  colorop->next_property,
+  0);
+
return ret;
 }
 EXPORT_SYMBOL(drm_colorop_init);
@@ -245,3 +255,22 @@ const char *drm_get_colorop_curve_1d_type_name(enum 
drm_colorop_curve_1d_type ty
 
return colorop_curve_1d_type_name[type];
 }
+
+/**
+ * drm_colorop_set_next_property - sets the next pointer
+ * @colorop: drm colorop
+ * @next: next colorop
+ *
+ * Should be used when constructing the color pipeline
+ */
+void drm_colorop_set_next_property(struct drm_colorop *colorop, struct drm_colorop *next)
+{
+   if (!colorop->next_property)
+   return;
+
+   drm_object_property_set_value(&colorop->base,
+ colorop->next_property,
+ next->base.id);
+   colorop->next = next;
+}
+EXPORT_SYMBOL(drm_colorop_set_next_property);
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index 053d1aa72e1b..c44f076a0606 100644
--- a/include/drm/drm_colorop.h
+++ b/include/drm/drm_colorop.h
@@ -131,6 +131,14 @@ struct drm_colorop {
 */
enum drm_colorop_type type;
 
+   /**
+* @next:
+*
+* Read-only
+* Pointer to next drm_colorop in pipeline
+*/
+   struct drm_colorop *next;
+
/**
 * @type_property:
 *
@@ -158,10 +166,20 @@ struct drm_colorop {
 */
struct drm_property *curve_1d_type_property;
 
+   /**
+* @next_property:
+*
+* Read-only property to next colorop in the pipeline
+*/
+   struct drm_property *next_property;
+
 };
 
 #define obj_to_colorop(x) container_of(x, struct drm_colorop, base)
 
+
+
+
 /**
  * drm_crtc_find - look up a Colorop object from its ID
  * @dev: DRM device
@@ -208,5 +226,7 @@ static inline unsigned int drm_colorop_index(const struct 
drm_colorop *colorop)
 #define drm_for_each_colorop(colorop, dev) \
list_for_each_entry(colorop, &(dev)->mode_config.colorop_list, head)
 
+void drm_colorop_set_next_property(struct drm_colorop *colorop, struct drm_colorop *next);
+
 
 #endif /* __DRM_COLOROP_H__ */
-- 
2.42.1

[RFC PATCH v3 02/23] drm: Add helper for conversion from signed-magnitude

2023-11-08 Thread Harry Wentland

CTM values are defined as signed-magnitude values. Add
a helper that converts from CTM signed-magnitude fixed
point value to the twos-complement value used by
drm_fixed.
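
For reference, a standalone sketch of the conversion, written with
stdint types instead of the kernel's s64/__u64. It assumes bit 63 is the
sign bit and the low 63 bits hold the magnitude; sm2fixp is an
illustrative name, not the helper added by this patch.

```c
#include <stdint.h>

/*
 * Convert a 64-bit signed-magnitude value (sign in bit 63, magnitude in
 * bits 62..0) to a twos-complement int64_t.
 */
static int64_t sm2fixp(uint64_t a)
{
	const uint64_t sign = 1ULL << 63;

	if (a & sign)
		return -(int64_t)(a & ~sign);

	return (int64_t)a;
}
```

Note that signed-magnitude has a negative zero (only the sign bit set),
which this sketch maps to plain 0.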

Signed-off-by: Harry Wentland 
---
 include/drm/drm_fixed.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/include/drm/drm_fixed.h b/include/drm/drm_fixed.h
index 0c9f917a4d4b..cb842ba80ddd 100644
--- a/include/drm/drm_fixed.h
+++ b/include/drm/drm_fixed.h
@@ -78,6 +78,24 @@ static inline u32 dfixed_div(fixed20_12 A, fixed20_12 B)
 #define DRM_FIXED_EPSILON  1LL
 #define DRM_FIXED_ALMOST_ONE   (DRM_FIXED_ONE - DRM_FIXED_EPSILON)
 
+/**
+ * @drm_sm2fixp
+ *
+ * Convert a 1.31.32 signed-magnitude fixed point to 32.32
+ * 2s-complement fixed point
+ *
+ * @return s64 2s-complement fixed point
+ */
+static inline s64 drm_sm2fixp(__u64 a)
+{
+   if ((a & (1LL << 63))) {
+   return -(a & 0x7fffffffffffffffll);
+   } else {
+   return a;
+   }
+
+}
+
 static inline s64 drm_int2fixp(int a)
 {
return ((s64)a) << DRM_FIXED_POINT;
-- 
2.42.1

[RFC PATCH v3 06/23] drm/doc/rfc: Describe why prescriptive color pipeline is needed

2023-11-08 Thread Harry Wentland

v3:
 - Describe DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE (Sebastian)
 - Ask for clear documentation of colorop behavior (Sebastian)

v2:
 - Update colorop visualizations to match reality (Sebastian, Alex Hung)
 - Updated wording (Pekka)
 - Change BYPASS wording to make it non-mandatory (Sebastian)
 - Drop cover-letter-like paragraph from COLOR_PIPELINE Plane Property
   section (Pekka)
 - Use PQ EOTF instead of its inverse in Pipeline Programming example (Melissa)
 - Add "Driver Implementer's Guide" section (Pekka)
 - Add "Driver Forward/Backward Compatibility" section (Sebastian, Pekka)

Signed-off-by: Harry Wentland 
---
 Documentation/gpu/rfc/color_pipeline.rst | 352 +++
 1 file changed, 352 insertions(+)
 create mode 100644 Documentation/gpu/rfc/color_pipeline.rst

diff --git a/Documentation/gpu/rfc/color_pipeline.rst 
b/Documentation/gpu/rfc/color_pipeline.rst
new file mode 100644
index 000000000000..efc70570a592
--- /dev/null
+++ b/Documentation/gpu/rfc/color_pipeline.rst
@@ -0,0 +1,352 @@
+========================
+Linux Color Pipeline API
+========================
+
+What problem are we solving?
+============================
+
+We would like to support pre-, and post-blending complex color
+transformations in display controller hardware in order to allow for
+HW-supported HDR use-cases, as well as to provide support to
+color-managed applications, such as video or image editors.
+
+It is possible to support an HDR output on HW supporting the Colorspace
+and HDR Metadata drm_connector properties, but that requires the
+compositor or application to render and compose the content into one
+final buffer intended for display. Doing so is costly.
+
+Most modern display HW offers various 1D LUTs, 3D LUTs, matrices, and other
+operations to support color transformations. These operations are often
+implemented in fixed-function HW and therefore much more power efficient than
+performing similar operations via shaders or CPU.
+
+We would like to make use of this HW functionality to support complex color
+transformations with no, or minimal CPU or shader load.
+
+
+How are other OSes solving this problem?
+========================================
+
+The most widely supported use-cases regard HDR content, whether video or
+gaming.
+
+Most OSes will specify the source content format (color gamut, encoding transfer
+function, and other metadata, such as max and average light levels) to a driver.
+Drivers will then program their fixed-function HW accordingly to map from a
+source content buffer's space to a display's space.
+
+When fixed-function HW is not available the compositor will assemble a shader to
+ask the GPU to perform the transformation from the source content format to the
+display's format.
+
+A compositor's mapping function and a driver's mapping function are usually
+entirely separate concepts. On OSes where a HW vendor has no insight into
+closed-source compositor code such a vendor will tune their color management
+code to visually match the compositor's. On other OSes, where both mapping
+functions are open to an implementer they will ensure both mappings match.
+
+This results in mapping algorithm lock-in, meaning that no-one alone can
+experiment with or introduce new mapping algorithms and achieve
+consistent results regardless of which implementation path is taken.
+
+Why is Linux different?
+=======================
+
+Unlike other OSes, where there is one compositor for one or more drivers, on
+Linux we have a many-to-many relationship. Many compositors; many drivers.
+In addition each compositor vendor or community has their own view of how
+color management should be done. This is what makes Linux so beautiful.
+
+This means that a HW vendor can now no longer tune their driver to one
+compositor, as tuning it to one could make it look fairly different from
+another compositor's color mapping.
+
+We need a better solution.
+
+
+Descriptive API
+===============
+
+An API that describes the source and destination colorspaces is a descriptive
+API. It describes the input and output color spaces but does not describe
+how precisely they should be mapped. Such a mapping includes many minute
+design decisions that can greatly affect the look of the final result.
+
+It is not feasible to describe such mapping with enough detail to ensure the
+same result from each implementation. In fact, these mappings are a very active
+research area.
+
+
+Prescriptive API
+================
+
+A prescriptive API describes not the source and destination colorspaces. It
+instead prescribes a recipe for how to manipulate pixel values to arrive at the
+desired outcome.
+
+This recipe is generally an ordered list of straight-forward operations,
+with clear mathematical definitions, such as 1D LUTs, 3D LUTs, matrices,
+or other operations that can be described in a precise manner.
+
+
+The Color Pipeline API
+======================
+
+HW color management pipelines can significantly differ between

[RFC PATCH v3 10/23] drm/colorop: Add BYPASS property

2023-11-08 Thread Harry Wentland

We want to be able to bypass each colorop at all times.
Introduce a new BYPASS boolean property for this.
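
A toy standalone sketch of the intended semantics, using made-up names
(colorop_state_sketch, state_reset, apply_op) and a stand-in operation
that halves the value: a freshly reset state starts with bypass set, and
the operation only takes effect once bypass is cleared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the mutable colorop state. */
struct colorop_state_sketch {
	bool bypass;
};

/* New states start out bypassed, mirroring the reset default in this patch. */
static void state_reset(struct colorop_state_sketch *s)
{
	s->bypass = true;
}

/* Stand-in operation: halve the value unless the op is bypassed. */
static uint16_t apply_op(const struct colorop_state_sketch *s, uint16_t v)
{
	if (s->bypass)
		return v;

	return (uint16_t)(v / 2);
}
```

A compositor that hits a failing configuration can set bypass back to
true and get the unmodified value again.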

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/drm_atomic_uapi.c |  6 +-
 drivers/gpu/drm/drm_colorop.c | 15 +++
 include/drm/drm_colorop.h | 20 
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 52b9b48e5757..a8f7a8a6639a 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -670,7 +670,9 @@ static int drm_atomic_colorop_set_property(struct 
drm_colorop *colorop,
struct drm_colorop_state *state, struct drm_file *file_priv,
struct drm_property *property, uint64_t val)
 {
-   if (property == colorop->curve_1d_type_property) {
+   if (property == colorop->bypass_property) {
+   state->bypass = val;
+   } else if (property == colorop->curve_1d_type_property) {
state->curve_1d_type = val;
} else {
drm_dbg_atomic(colorop->dev,
@@ -690,6 +692,8 @@ drm_atomic_colorop_get_property(struct drm_colorop *colorop,
 {
if (property == colorop->type_property) {
*val = colorop->type;
+   } else if (property == colorop->bypass_property) {
+   *val = state->bypass;
} else if (property == colorop->curve_1d_type_property) {
*val = state->curve_1d_type;
} else {
diff --git a/drivers/gpu/drm/drm_colorop.c b/drivers/gpu/drm/drm_colorop.c
index b1c271f90a16..17ba11ae205b 100644
--- a/drivers/gpu/drm/drm_colorop.c
+++ b/drivers/gpu/drm/drm_colorop.c
@@ -78,6 +78,18 @@ int drm_colorop_init(struct drm_device *dev, struct 
drm_colorop *colorop,
   colorop->type_property,
   colorop->type);
 
+   /* bypass */
+   /* TODO can we reuse the mode_config->active_prop? */
+   prop = drm_property_create_bool(dev, DRM_MODE_PROP_ATOMIC,
+   "BYPASS");
+   if (!prop)
+   return -ENOMEM;
+
+   colorop->bypass_property = prop;
+   drm_object_attach_property(&colorop->base,
+  colorop->bypass_property,
+  1);
+
/* curve_1d_type */
/* TODO move to mode_config? */
prop = drm_property_create_enum(dev, DRM_MODE_PROP_ATOMIC,
@@ -100,6 +112,8 @@ static void 
__drm_atomic_helper_colorop_duplicate_state(struct drm_colorop *colo
struct 
drm_colorop_state *state)
 {
memcpy(state, colorop->state, sizeof(*state));
+
+   state->bypass = true;
 }
 
 struct drm_colorop_state *
@@ -151,6 +165,7 @@ static void __drm_colorop_state_reset(struct 
drm_colorop_state *colorop_state,
  struct drm_colorop *colorop)
 {
colorop_state->colorop = colorop;
+   colorop_state->bypass = true;
 }
 
 /**
diff --git a/include/drm/drm_colorop.h b/include/drm/drm_colorop.h
index 965db7ec488f..053d1aa72e1b 100644
--- a/include/drm/drm_colorop.h
+++ b/include/drm/drm_colorop.h
@@ -44,6 +44,14 @@ struct drm_colorop_state {
 
/* colorop properties */
 
+   /**
+* @bypass:
+*
+* True if colorop shall be bypassed. False if colorop is
+* enabled.
+*/
+   bool bypass;
+
/**
 * @curve_1d_type:
 *
@@ -131,6 +139,18 @@ struct drm_colorop {
 */
struct drm_property *type_property;
 
+   /**
+* @bypass_property:
+*
+* Boolean property to control enablement of the color
+* operation. Setting bypass to "true" shall always be supported
+* in order to allow compositors to quickly fall back to
+* alternate methods of color processing. This is important
+* since setting color operations can fail due to unique
+* HW constraints.
+*/
+   struct drm_property *bypass_property;
+
/**
 * @curve_1d_type:
 *
-- 
2.42.1

[RFC PATCH v3 05/23] drm/vkms: Avoid reading beyond LUT array

2023-11-08 Thread Harry Wentland

When the floor LUT index (drm_fixp2int(lut_index)) is the last
index of the array the ceil LUT index will point to an entry
beyond the array. Make sure we guard against it and use the
value of the floor LUT index.
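
A standalone sketch of the guard on a single-channel LUT. sample_lut and
lerp_u16_sketch are illustrative names, the index is assumed to be a
non-negative 32.32 fixed-point value within the table, and floor + 1 is
used as a simplified ceil (the results match, since a zero fraction makes
the interpolation return the floor entry anyway).

```c
#include <stddef.h>
#include <stdint.h>

#define FP_SHIFT     32
#define FP_FRAC_MASK ((1LL << FP_SHIFT) - 1)

/*
 * lerp(a, b, t) = a + (b - a) * t, with t a 32.32 fraction in [0, 1).
 * Assumes arithmetic right shift when b < a.
 */
static uint16_t lerp_u16_sketch(uint16_t a, uint16_t b, int64_t t)
{
	int64_t delta = ((int64_t)b - (int64_t)a) * t;

	return (uint16_t)(a + (delta >> FP_SHIFT));
}

/* Interpolate between the floor and ceil entries without reading past the end. */
static uint16_t sample_lut(const uint16_t *lut, size_t len, int64_t index_fp)
{
	size_t floor_idx = (size_t)(index_fp >> FP_SHIFT);
	size_t ceil_idx;

	if (floor_idx == len - 1)
		ceil_idx = floor_idx;	/* end of array: reuse the floor entry */
	else
		ceil_idx = floor_idx + 1;

	return lerp_u16_sketch(lut[floor_idx], lut[ceil_idx],
			       index_fp & FP_FRAC_MASK);
}
```

With a 4-entry table, an index of 3.5 would read entry 4 without the
guard; with it, the last entry is returned.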

v3:
 - Drop bits from commit description that didn't contribute
   anything of value

Signed-off-by: Harry Wentland 
Cc: Arthur Grillo 
---
 drivers/gpu/drm/vkms/vkms_composer.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index 6f942896036e..25b6b73bece8 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -123,6 +123,8 @@ static u16 apply_lut_to_channel_value(const struct 
vkms_color_lut *lut, u16 chan
  enum lut_channel channel)
 {
s64 lut_index = get_lut_index(lut, channel_value);
+   u16 *floor_lut_value, *ceil_lut_value;
+   u16 floor_channel_value, ceil_channel_value;
 
/*
 * This checks if `struct drm_color_lut` has any gap added by the 
compiler
@@ -130,11 +132,15 @@ static u16 apply_lut_to_channel_value(const struct 
vkms_color_lut *lut, u16 chan
 */
static_assert(sizeof(struct drm_color_lut) == sizeof(__u16) * 4);
 
-   u16 *floor_lut_value = (__u16 *)&lut->base[drm_fixp2int(lut_index)];
-   u16 *ceil_lut_value = (__u16 *)&lut->base[drm_fixp2int_ceil(lut_index)];
+   floor_lut_value = (__u16 *)&lut->base[drm_fixp2int(lut_index)];
+   if (drm_fixp2int(lut_index) == (lut->lut_length - 1))
+   /* We're at the end of the LUT array, use same value for ceil and floor */
+   ceil_lut_value = floor_lut_value;
+   else
+   ceil_lut_value = (__u16 *)&lut->base[drm_fixp2int_ceil(lut_index)];
 
-   u16 floor_channel_value = floor_lut_value[channel];
-   u16 ceil_channel_value = ceil_lut_value[channel];
+   floor_channel_value = floor_lut_value[channel];
+   ceil_channel_value = ceil_lut_value[channel];
 
return lerp_u16(floor_channel_value, ceil_channel_value,
lut_index & DRM_FIXED_DECIMAL_MASK);
-- 
2.42.1
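
The guard in this patch can be modelled outside the kernel. The sketch below is a standalone approximation: `toy_lut`, `toy_lerp_u16()` and the 16.16 fixed-point index are assumptions made for the example and do not match the real `vkms_color_lut` or `drm_fixed.h` layout. Unlike the patch, which only handles the case where the floor index equals the last index, the sketch also clamps indices that lie past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FRAC_BITS 16
#define TOY_FRAC_ONE  (1u << TOY_FRAC_BITS)
#define TOY_FRAC_MASK (TOY_FRAC_ONE - 1)

/* Hypothetical single-channel LUT; not struct vkms_color_lut. */
struct toy_lut {
	const uint16_t *values;
	size_t length;
};

/* Linear interpolation between a and b; frac is a 0.16 fixed-point weight. */
static uint16_t toy_lerp_u16(uint16_t a, uint16_t b, uint32_t frac)
{
	int64_t delta = (int64_t)b - (int64_t)a;

	return (uint16_t)(a + (delta * (int64_t)frac) / (int64_t)TOY_FRAC_ONE);
}

/* Look up a 16.16 fixed-point index without reading past the last entry. */
static uint16_t toy_lut_lookup(const struct toy_lut *lut, uint32_t index_fp)
{
	size_t floor_idx = index_fp >> TOY_FRAC_BITS;
	size_t ceil_idx;

	assert(lut->length > 0);

	if (floor_idx >= lut->length - 1) {
		/* At the end of the array: use the last entry for floor and ceil. */
		floor_idx = lut->length - 1;
		ceil_idx = floor_idx;
	} else {
		ceil_idx = floor_idx + 1;
	}

	return toy_lerp_u16(lut->values[floor_idx], lut->values[ceil_idx],
			    index_fp & TOY_FRAC_MASK);
}
```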

[RFC PATCH v3 01/23] drm: Don't treat 0 as -1 in drm_fixp2int_ceil

2023-11-08 Thread Harry Wentland

Unit testing this in VKMS shows that passing 0 into
this function returns -1, which is highly counter-
intuitive. Fix it by checking whether the input is
>= 0 instead of > 0.

Signed-off-by: Harry Wentland 
Reviewed-by: Simon Ser 
---
 include/drm/drm_fixed.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/drm/drm_fixed.h b/include/drm/drm_fixed.h
index 6ea339d5de08..0c9f917a4d4b 100644
--- a/include/drm/drm_fixed.h
+++ b/include/drm/drm_fixed.h
@@ -95,7 +95,7 @@ static inline int drm_fixp2int_round(s64 a)
 
 static inline int drm_fixp2int_ceil(s64 a)
 {
-   if (a > 0)
+   if (a >= 0)
return drm_fixp2int(a + DRM_FIXED_ALMOST_ONE);
else
return drm_fixp2int(a - DRM_FIXED_ALMOST_ONE);
-- 
2.42.1
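
The off-by-one at zero is easy to reproduce in userspace. The following is a sketch only: the `TOY_*` macros and `toy_*` functions are stand-ins written for this example, assuming a 32.32 fixed-point format and a floor-style integer conversion, and only the zero and positive cases are exercised.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 32.32 fixed-point layout, modelled on the names used in the patch. */
#define TOY_FIXED_POINT      32
#define TOY_FIXED_ONE        (INT64_C(1) << TOY_FIXED_POINT)
#define TOY_FIXED_ALMOST_ONE (TOY_FIXED_ONE - 1)

/* Floor conversion, written with division so negative inputs are portable. */
static int toy_fixp2int(int64_t a)
{
	int64_t q = a / TOY_FIXED_ONE;

	if (a % TOY_FIXED_ONE < 0)
		q--;
	return (int)q;
}

/* Behaviour before the patch: zero takes the "negative" branch. */
static int toy_fixp2int_ceil_old(int64_t a)
{
	if (a > 0)
		return toy_fixp2int(a + TOY_FIXED_ALMOST_ONE);
	else
		return toy_fixp2int(a - TOY_FIXED_ALMOST_ONE);
}

/* Behaviour after the patch: zero takes the "non-negative" branch. */
static int toy_fixp2int_ceil_new(int64_t a)
{
	if (a >= 0)
		return toy_fixp2int(a + TOY_FIXED_ALMOST_ONE);
	else
		return toy_fixp2int(a - TOY_FIXED_ALMOST_ONE);
}

/* Worked example: the same input, before and after the comparison change. */
static void toy_ceil_demo(void)
{
	assert(toy_fixp2int_ceil_old(0) == -1);
	assert(toy_fixp2int_ceil_new(0) == 0);
}
```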

[RFC PATCH v3 00/23] Color Pipeline API w/ VKMS

2023-11-08 Thread Harry Wentland

This is an RFC set for a color pipeline API, along with a sample
implementation in VKMS. All the key API bits are here. VKMS now
supports two named transfer function colorops and two matrix
colorops. We have IGT tests that check all four of these colorops
with a pixel-by-pixel comparison that checks that these colorops
do what we expect them to do with a +/- 1 8 bpc code point margin.



This patchset is grouped as follows:
 - Patches 1-2: couple general patches/fixes
 - Patches 3-5: introduce kunit to VKMS
 - Patch 6: description of motivation and details behind the
Color Pipeline API. If you're reading nothing else
but are interested in the topic I highly recommend
you take a look at this.
 - Patches 7-23: DRM core and VKMS changes for color pipeline API

There are plenty of things that I would like to see here but
haven't had a chance to look at. These will (hopefully) be
addressed in future iterations, either in VKMS or amdgpu:
 - PQ transfer function
 - Add custom LUT colorops to VKMS
 - Add pre-blending 3DLUT
 - How to support HW which can't bypass entire pipeline?
 - Add ability to create colorops that don't have BYPASS
 - Can we do a LOAD / COMMIT model for LUTs (and other properties)?
 - read-only scaling colorop which defines scaling taps and position
 - read-only color format colorop to define supported color formats
   for a pipeline
 - named matrices, for things like converting YUV to RGB

IGT tests can be found at
https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_requests/1

IGT patches are also being sent to the igt-dev mailing list.

If you prefer a gitlab MR for review you can find it at
https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5

v3:
 - Abandon IOCTLs and discover colorops as clients iterate the pipeline
 - Remove need for libdrm
 - Add color_pipeline client cap and make mutually exclusive with
   COLOR_RANGE and COLOR_ENCODING properties
 - add CTM colorop to VKMS
 - Use include way for kunit testing static functions (Arthur)
 - Make TYPE a range property
 - Move enum drm_colorop_type to uapi header
 - and a bunch of smaller bits that are highlighted in the relevant commit
   description

v2:
 - Rebased on drm-misc-next
 - Introduce a VKMS Kunit so we can test LUT functionality in vkms_composer
 - Incorporate feedback in color_pipeline.rst doc
 - Add support for sRGB inverse EOTF
 - Add 2nd enumerated TF colorop to VKMS
 - Fix LUTs and some issues with applying LUTs in VKMS

Cc: Ville Syrjala 
Cc: Pekka Paalanen 
Cc: Simon Ser 
Cc: Harry Wentland 
Cc: Melissa Wen 
Cc: Jonas Ådahl 
Cc: Sebastian Wick 
Cc: Shashank Sharma 
Cc: Alexander Goins 
Cc: Joshua Ashton 
Cc: Michel Dänzer 
Cc: Aleix Pol 
Cc: Xaver Hugl 
Cc: Victoria Brekenfeld 
Cc: Sima 
Cc: Uma Shankar 
Cc: Naseer Ahmed 
Cc: Christopher Braga 
Cc: Abhinav Kumar 
Cc: Arthur Grillo 
Cc: Hector Martin 
Cc: Liviu Dudau 
Cc: Sasha McIntosh 

Harry Wentland (23):
  drm: Don't treat 0 as -1 in drm_fixp2int_ceil
  drm: Add helper for conversion from signed-magnitude
  drm/vkms: Create separate Kconfig file for VKMS
  drm/vkms: Add kunit tests for VKMS LUT handling
  drm/vkms: Avoid reading beyond LUT array
  drm/doc/rfc: Describe why prescriptive color pipeline is needed
  drm/colorop: Introduce new drm_colorop mode object
  drm/colorop: Add TYPE property
  drm/color: Add 1D Curve subtype
  drm/colorop: Add BYPASS property
  drm/colorop: Add NEXT property
  drm/colorop: Add atomic state print for drm_colorop
  drm/plane: Add COLOR PIPELINE property
  drm/colorop: Add NEXT to colorop state print
  drm/vkms: Add enumerated 1D curve colorop
  drm/vkms: Add kunit tests for linear and sRGB LUTs
  drm/colorop: Introduce DRM_CLIENT_CAP_PLANE_COLOR_PIPELINE
  drm/colorop: Add 3x4 CTM type
  drm/vkms: Pull apply_colorop out of pre_blend_color_transform
  drm/vkms: Use s32 for internal color pipeline precision
  drm/vkms: add 3x4 matrix in color pipeline
  drm/tests: Add a few tests around drm_fixed.h
  drm/vkms: Add tests for CTM handling

 Documentation/gpu/rfc/color_pipeline.rst  | 352 
 drivers/gpu/drm/Kconfig   |  14 +-
 drivers/gpu/drm/Makefile  |   1 +
 drivers/gpu/drm/drm_atomic.c  | 146 
 drivers/gpu/drm/drm_atomic_helper.c   |  12 +
 drivers/gpu/drm/drm_atomic_state_helper.c |   5 +
 drivers/gpu/drm/drm_atomic_uapi.c | 161 
 drivers/gpu/drm/drm_colorop.c | 292 +++
 drivers/gpu/drm/drm_ioctl.c   |   7 +
 drivers/gpu/drm/drm_mode_config.c |   7 +
 drivers/gpu/drm/drm_plane_helper.c|   2 +-
 drivers/gpu/drm/tests/Makefile|   3 +-
 drivers/gpu/drm/tests/drm_fixp_test.c |  69 ++
 drivers/gpu/drm/vkms/Kconfig  |  20 +
 drivers/gpu/drm/vkms/Makefile |   4 +-
 drivers/gpu/drm/vkms/tests/.kunitconfig   |   4 +
 drivers/gpu/drm/vkms/tests/vkms_color_tests.c |

Re: [PATCH] drm/amd/display: remove duplicated argument

2023-11-08 Thread Aurabindo Pillai





On 2023-10-29 05:39, José Pekkarinen wrote:

Spotted by coccicheck, there is a redundant check for
v->SourcePixelFormat[k] != dm_444_16. This patch will
remove it. The corresponding output follows.

drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c:5130:86-122: duplicated 
argument to && or ||

Signed-off-by: José Pekkarinen 
---
  drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index ad741a723c0e..3686f1e7de3a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -5128,7 +5128,7 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
ViewportExceedsSurface = true;
  
  		if (v->SourcePixelFormat[k] != dm_444_64 && v->SourcePixelFormat[k] != dm_444_32 && v->SourcePixelFormat[k] != dm_444_16

-   && v->SourcePixelFormat[k] != dm_444_16 && 
v->SourcePixelFormat[k] != dm_444_8 && v->SourcePixelFormat[k] != dm_rgbe) {
+   && v->SourcePixelFormat[k] != dm_444_8 && 
v->SourcePixelFormat[k] != dm_rgbe) {
if (v->ViewportWidthChroma[k] > v->SurfaceWidthC[k] || 
v->ViewportHeightChroma[k] > v->SurfaceHeightC[k]) {
ViewportExceedsSurface = true;
}


Hi José,

Sorry, I've just queued it. Should be merged to amd-staging-drm-next soon.

[PATCH v2] drm/i915/vma: Fix potential UAF on multi-tile platforms

2023-11-08 Thread Janusz Krzysztofik

Object debugging tools were sporadically reporting illegal attempts to
free a still active i915 VMA object when parking a GPU tile believed
to be idle.

[161.359441] ODEBUG: free active (active state 0) object: 88811643b958 
object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 
debug_print_object+0x80/0xb0
...
[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 
6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
[161.360314] Hardware name: Intel Corporation Rocket Lake Client 
Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 
04/21/2022
[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
...
[161.361347] debug_object_free+0xeb/0x110
[161.361362] i915_active_fini+0x14/0x130 [i915]
[161.361866] release_references+0xfe/0x1f0 [i915]
[161.362543] i915_vma_parked+0x1db/0x380 [i915]
[161.363129] __gt_park+0x121/0x230 [i915]
[161.363515] intel_wakeref_put_last+0x1f/0x70 [i915]

That has been tracked down to be happening when another thread was
deactivating the VMA inside __active_retire() helper, after the VMA's
active counter was already decremented to 0, but before deactivation of
the VMA's object was reported to the object debugging tools.  Root cause
has been identified as premature release of last wakeref for the GPU tile
to which the active VMA belonged.

In case of single-tile platforms, an engine associated with a request that
uses the VMA usually keeps the tile's wakeref long enough for that VMA to
be deactivated in time, before it is freed on the last put of that wakeref.
However, on multi-tile platforms, a request may use a VMA from a tile other
than the one that hosts the request's engine, in which case the VMA is not
protected by the engine's wakeref.

Get an extra wakeref for the VMA's tile when activating it, and put that
wakeref only after the VMA is deactivated.  However, exclude GGTT from
that processing path, otherwise the GPU never goes idle.  Since
__i915_vma_retire() may be called from atomic contexts, use async variant
of wakeref put.
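
The pairing described above can be sketched as a toy reference-count model. All names here (`toy_tile`, `toy_vma`, `toy_vma_activate()`, `toy_vma_retire()`) are hypothetical and exist only for illustration; they are not i915 structures or helpers, and locking and the async put are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a GPU tile and a VMA; not i915 data structures. */
struct toy_tile {
	int wakerefs;	/* the tile may be parked only when this is 0 */
};

struct toy_vma {
	struct toy_tile *tile;
	bool is_ggtt;
	bool active;
};

/* Activation: keep the VMA's own tile awake, except for GGTT VMAs. */
static void toy_vma_activate(struct toy_vma *vma)
{
	assert(!vma->active);
	if (!vma->is_ggtt)
		vma->tile->wakerefs++;
	vma->active = true;
}

/* Retirement: mark the VMA idle first, then drop the tile wakeref. */
static void toy_vma_retire(struct toy_vma *vma)
{
	assert(vma->active);
	vma->active = false;
	if (!vma->is_ggtt) {
		assert(vma->tile->wakerefs > 0);
		vma->tile->wakerefs--;
	}
}

/* Parking (and freeing idle VMAs) is allowed only with no wakerefs held. */
static bool toy_tile_can_park(const struct toy_tile *tile)
{
	return tile->wakerefs == 0;
}
```

In this model a VMA on a tile other than the engine's tile still blocks parking of its own tile while it is active, and a GGTT VMA never does.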

CI reports indicate that single-tile platforms also suffer sporadically
from the same race. However, unlike in the multi-tile case, the exact
scenario in which that happens hasn't been discovered yet. So, while I
submit this patch as a fix for multi-tile cases, in the hope that it also
addresses single-tile, I'm not able to blame any particular commit for that
issue. However, I'm going to ask i915 maintainers to include this fix, if
accepted, in the current rc cycle (6.7-rc) as important for the first
supported multi-tile platform -- Meteor Lake.

v2: Get the wakeref before vm mutex to avoid circular locking dependency,
  - drop questionable Fixes: tag.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_vma.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index d09aad34ba37f..f56907f9bc756 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -34,6 +34,7 @@
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
 #include "gt/intel_gt_requests.h"
 #include "gt/intel_tlb.h"
 
@@ -103,12 +104,25 @@ static inline struct i915_vma *active_to_vma(struct 
i915_active *ref)
 
 static int __i915_vma_active(struct i915_active *ref)
 {
-   return i915_vma_tryget(active_to_vma(ref)) ? 0 : -ENOENT;
+   struct i915_vma *vma = active_to_vma(ref);
+
+   if (!i915_vma_tryget(vma))
+   return -ENOENT;
+
+   if (!i915_vma_is_ggtt(vma))
+   intel_gt_pm_get(vma->vm->gt);
+
+   return 0;
 }
 
 static void __i915_vma_retire(struct i915_active *ref)
 {
-   i915_vma_put(active_to_vma(ref));
+   struct i915_vma *vma = active_to_vma(ref);
+
+   if (!i915_vma_is_ggtt(vma))
+   intel_gt_pm_put_async(vma->vm->gt);
+
+   i915_vma_put(vma);
 }
 
 static struct i915_vma *
@@ -1404,7 +1418,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
struct i915_vma_work *work = NULL;
struct dma_fence *moving = NULL;
struct i915_vma_resource *vma_res = NULL;
-   intel_wakeref_t wakeref = 0;
+   intel_wakeref_t wakeref;
unsigned int bound;
int err;
 
@@ -1424,8 +1438,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
if (err)
return err;
 
-   if (flags & PIN_GLOBAL)
-   wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
+   wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
 
if (flags & vma->vm->bind_async_flags) {
/* lock VM */
@@ -1561,8 +1574,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct

Re: [RFC PATCH 01/10] drm/doc/rfc: Describe why prescriptive color pipeline is needed

2023-11-08 Thread Harry Wentland

On 2023-11-08 11:19, Pekka Paalanen wrote:
> On Wed, 8 Nov 2023 09:31:17 -0500
> Harry Wentland  wrote:
> 
>> On 2023-11-08 06:40, Sebastian Wick wrote:
>>> On Wed, Nov 8, 2023 at 11:16 AM Pekka Paalanen  wrote: 
>>>  

>>>> On Tue, 7 Nov 2023 11:58:26 -0500
>>>> Harry Wentland  wrote:
>>>>
>>>>> On 2023-11-07 04:55, Pekka Paalanen wrote:
>>>>>> On Mon, 6 Nov 2023 11:19:27 -0500
>>>>>> Harry Wentland  wrote:
>>>>>>
>>>>>>> On 2023-10-20 06:36, Pekka Paalanen wrote:
>>>>>>>> On Thu, 19 Oct 2023 10:56:40 -0400
>>>>>>>> Harry Wentland  wrote:
>>>>>>>>
>>>>>>>>> On 2023-10-10 12:13, Melissa Wen wrote:
>>>>>>>>>> O 09/08, Harry Wentland wrote:
>>>>>>>>>>> Signed-off-by: Harry Wentland
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>>>> Also, with this new plane API in place, I understand that we will
>>>>>>>>>> already need think on how to deal with the mixing between old drm color
>>>>>>>>>> properties (color encoding and color range) and these new way of setting
>>>>>>>>>> plane color properties. IIUC, Pekka asked a related question about it
>>>>>>>>>> when talking about CRTC automatic RGB->YUV (?)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We'll still need to confirm whether we'll want to deprecate these
>>>>>>>>> existing properties. If we do that we'd want a client prop. Things
>>>>>>>>> should still work without deprecating them, if drivers just pick up
>>>>>>>>> after the initial encoding and range CSC.
>>>>>>>>>
>>>>>>>>> But realistically it might be better to deprecate them and turn them
>>>>>>>>> into explicit colorops.
>>>>>>>>
>>>>>>>> The existing properties would need to be explicitly reflected in the
>>>>>>>> new pipelines anyway, otherwise there would always be doubt at which
>>>>>>>> point of a pipeline the old properties apply, and they might even
>>>>>>>> need to change positions between pipelines.
>>>>>>>>
>>>>>>>> I think it is simply easier to just hide all old color related
>>>>>>>> properties when userspace sets the client-cap to enable pipelines. The
>>>>>>>> problem is to make sure to hide all old properties on all drivers that
>>>>>>>> support the client-cap.
>>>>>>>>
>>>>>>>> As a pipeline must be complete (describe everything that happens to
>>>>>>>> pixel values), it's going to be a flag day per driver.
>>>>>>>>
>>>>>>>> Btw. the plane FB YUV->RGB conversion needs a colorop in every pipeline
>>>>>>>> as well. Maybe it's purely informative and non-configurable, keyed by
>>>>>>>> FB pixel format, but still.
>>>>>>>>
>>>>>>>> We also need a colorop to represent sample filtering, e.g. bilinear,
>>>>>>>> like I think Sebastian may have mentioned in the past. Everything
>>>>>>>> before the sample filter happens "per tap" as Joshua Ashton put it, and
>>>>>>>> everything after it happens on the sample that was computed as a
>>>>>>>> weighted average of the filter tap inputs (texels).
>>>>>>>>
>>>>>>>> There could be colorops other than sample filtering that operate on
>>>>>>>> more than one sample at a time, like blur or sharpness. There could
>>>>>>>> even be colorops that change the image size like adding padding that
>>>>>>>> the following colorop hardware requires, and then yet another colorop
>>>>>>>> that clips that padding away. For an example, see
>>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2023-October/427015.html
>>>>>>>>
>>>>>>>> If that padding and its color can affect the pipeline results of the
>>>>>>>> pixels near the padding (e.g. some convolution is applied with them,
>>>>>>>> which may be the reason why padding is necessary to begin with), then
>>>>>>>> it would be best to model it.
>>>>>>>>
>>>>>>>
>>>>>>> I hear you but I'm also somewhat shying away from defining this at this point.
>>>>>>
>>>>>> Would you define them before the new UAPI is released though?
>>>>>>
>>>>>> I agree there is no need to have them in this patch series, but I think
>>>>>> we'd hit the below problems if the UAPI is released without them.
>>>>>>
>>>>>>> There are already too many things that need to happen and I will focus on the
>>>>>>> actual color blocks (LUTs, matrices) first. We'll always be able to add a new
>>>>>>> (read-only) colorop type to define scaling and tap behavior at any point and
>>>>>>> a client is free to ignore a color pipeline if it doesn't find any tap/scale
>>>>>>> info in it.
>>>>>>
>>>>>> How would userspace know to look for tap/scale info, if there is no
>>>>>> upstream definition even on paper?
>>>>>>
>>>>>
>>>>> So far OSes did not care about this. Whether that's good or bad is
>>>>> something everyone can answer for themselves.
>>>>>
>>>>> If you write a compositor and really need this you can just ignore
>>>>> color pipelines that don't have this, i.e., you'll probably want
>>>>> to wait with implementing color pipeline support until you have what
>>>>> you need from DRM/KMS.
>>>>>
>>>>> This is not to say I don't want to have support for Weston. But I'm
>>>>>

Re: [RFC PATCH 01/10] drm/doc/rfc: Describe why prescriptive color pipeline is needed

2023-11-08 Thread Pekka Paalanen

On Wed, 8 Nov 2023 09:31:17 -0500
Harry Wentland  wrote:

> On 2023-11-08 06:40, Sebastian Wick wrote:
> > On Wed, Nov 8, 2023 at 11:16 AM Pekka Paalanen  wrote: 
> >  
> >>
> >> On Tue, 7 Nov 2023 11:58:26 -0500
> >> Harry Wentland  wrote:
> >>  
> >>> On 2023-11-07 04:55, Pekka Paalanen wrote:  
> >>>> On Mon, 6 Nov 2023 11:19:27 -0500
> >>>> Harry Wentland  wrote:
> >>>>
> >>>>> On 2023-10-20 06:36, Pekka Paalanen wrote:
> >>>>>> On Thu, 19 Oct 2023 10:56:40 -0400
> >>>>>> Harry Wentland  wrote:
> >>>>>>
> >>>>>>> On 2023-10-10 12:13, Melissa Wen wrote:
> >>>>>>>> O 09/08, Harry Wentland wrote:
> >>>>>>>>> Signed-off-by: Harry Wentland
> >>>>>>
> >>>>>> ...
> >>>>>>
> >>>>>>>> Also, with this new plane API in place, I understand that we will
> >>>>>>>> already need think on how to deal with the mixing between old drm color
> >>>>>>>> properties (color encoding and color range) and these new way of setting
> >>>>>>>> plane color properties. IIUC, Pekka asked a related question about it
> >>>>>>>> when talking about CRTC automatic RGB->YUV (?)
> >>>>>>>>
> >>>>>>>
> >>>>>>> We'll still need to confirm whether we'll want to deprecate these
> >>>>>>> existing properties. If we do that we'd want a client prop. Things
> >>>>>>> should still work without deprecating them, if drivers just pick up
> >>>>>>> after the initial encoding and range CSC.
> >>>>>>>
> >>>>>>> But realistically it might be better to deprecate them and turn them
> >>>>>>> into explicit colorops.
> >>>>>>
> >>>>>> The existing properties would need to be explicitly reflected in the
> >>>>>> new pipelines anyway, otherwise there would always be doubt at which
> >>>>>> point of a pipeline the old properties apply, and they might even
> >>>>>> need to change positions between pipelines.
> >>>>>>
> >>>>>> I think it is simply easier to just hide all old color related
> >>>>>> properties when userspace sets the client-cap to enable pipelines. The
> >>>>>> problem is to make sure to hide all old properties on all drivers that
> >>>>>> support the client-cap.
> >>>>>>
> >>>>>> As a pipeline must be complete (describe everything that happens to
> >>>>>> pixel values), it's going to be a flag day per driver.
> >>>>>>
> >>>>>> Btw. the plane FB YUV->RGB conversion needs a colorop in every pipeline
> >>>>>> as well. Maybe it's purely informative and non-configurable, keyed by
> >>>>>> FB pixel format, but still.
> >>>>>>
> >>>>>> We also need a colorop to represent sample filtering, e.g. bilinear,
> >>>>>> like I think Sebastian may have mentioned in the past. Everything
> >>>>>> before the sample filter happens "per tap" as Joshua Ashton put it, and
> >>>>>> everything after it happens on the sample that was computed as a
> >>>>>> weighted average of the filter tap inputs (texels).
> >>>>>>
> >>>>>> There could be colorops other than sample filtering that operate on
> >>>>>> more than one sample at a time, like blur or sharpness. There could
> >>>>>> even be colorops that change the image size like adding padding that
> >>>>>> the following colorop hardware requires, and then yet another colorop
> >>>>>> that clips that padding away. For an example, see
> >>>>>> https://lists.freedesktop.org/archives/dri-devel/2023-October/427015.html
> >>>>>>
> >>>>>> If that padding and its color can affect the pipeline results of the
> >>>>>> pixels near the padding (e.g. some convolution is applied with them,
> >>>>>> which may be the reason why padding is necessary to begin with), then
> >>>>>> it would be best to model it.
> >>>>>>
> >>>>>
> >>>>> I hear you but I'm also somewhat shying away from defining this at this point.
> >>>>
> >>>> Would you define them before the new UAPI is released though?
> >>>>
> >>>> I agree there is no need to have them in this patch series, but I think
> >>>> we'd hit the below problems if the UAPI is released without them.
> >>>>
> >>>>> There are already too many things that need to happen and I will focus on the
> >>>>> actual color blocks (LUTs, matrices) first. We'll always be able to add a new
> >>>>> (read-only) colorop type to define scaling and tap behavior at any point and
> >>>>> a client is free to ignore a color pipeline if it doesn't find any tap/scale
> >>>>> info in it.
> >>>>
> >>>> How would userspace know to look for tap/scale info, if there is no
> >>>> upstream definition even on paper?
> >>>>
> >>>
> >>> So far OSes did not care about this. Whether that's good or bad is
> >>> something everyone can answer for themselves.
> >>>
> >>> If you write a compositor and really need this you can just ignore
> >>> color pipelines that don't have this, i.e., you'll probably want
> >>> to wait with implementing color pipeline support until you have what
> >>> you need from DRM/KMS.
> >>>
> >>> This is not to say I don't want to have support for Weston. But I'm
> >>> wondering if we place too much importance on getting

Re: [RFC PATCH 03/10] drm/mipi-dsi: add API for manual control over the DSI link power state

2023-11-08 Thread Laurent Pinchart

On Wed, Nov 08, 2023 at 04:34:39PM +0100, Maxime Ripard wrote:
> On Tue, Nov 07, 2023 at 04:26:34PM +0100, Greg Kroah-Hartman wrote:
> > On Tue, Nov 07, 2023 at 01:18:14PM +0100, Maxime Ripard wrote:
> > > On Tue, Nov 07, 2023 at 12:22:21PM +0100, Greg Kroah-Hartman wrote:
> > > > On Tue, Nov 07, 2023 at 11:57:49AM +0100, Maxime Ripard wrote:
> > > > > +GKH
> > > > 
> > > > Why?  I don't see a question for me here, sorry.
> > > 
> > > I guess the question is: we have a bus with various power states
> > > (powered off, low power, high speed)
> > 
> > Great, have fun!  And is this per-device or per-bus-instance?
> 
> Per bus instance

To be precise, those power states are link states. They don't
necessarily translate directly to device power states, and they're not
so much about power management as about speed (and bus turn-around for
reads) management.

Also, while DSI allows for multiple peripherals on a bus, the link is
point-to-point, with the peripherals being all behind a single DSI RX.

> > > low power is typically used to send commands to a device, high speed to
> > > transmit pixels, but still allows to send commands.

Low power (LP) is a link state where commands can be transmitted at a
low speed, as opposed to the high speed (HS) link state that is used to
transmit both video data and commands at high speed. Any device-to-host
data transfer (in response to read commands) occurs exclusively in LP
mode (at least with DSI v1.3, I don't have access to newer
specifications).
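
As a rough illustration of the rules in the paragraph above, the link states can be written down as a small C model. The enum and function names are hypothetical and do not correspond to any existing drm_mipi_dsi API; peripheral-specific requirements (such as panels that need every command sent in LP) are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical link states, following the description above. */
enum toy_dsi_link_state {
	TOY_DSI_LINK_OFF,
	TOY_DSI_LINK_LP,	/* low power: commands at low speed */
	TOY_DSI_LINK_HS,	/* high speed: video data and commands */
};

enum toy_dsi_transfer {
	TOY_DSI_WRITE_COMMAND,	/* host-to-device command */
	TOY_DSI_READ_RESPONSE,	/* device-to-host data after a read command */
	TOY_DSI_VIDEO_DATA,
};

/* Can this kind of transfer take place in the given link state? */
static bool toy_dsi_transfer_allowed(enum toy_dsi_link_state state,
				     enum toy_dsi_transfer xfer)
{
	switch (xfer) {
	case TOY_DSI_WRITE_COMMAND:
		return state == TOY_DSI_LINK_LP || state == TOY_DSI_LINK_HS;
	case TOY_DSI_READ_RESPONSE:
		return state == TOY_DSI_LINK_LP;
	case TOY_DSI_VIDEO_DATA:
		return state == TOY_DSI_LINK_HS;
	}

	assert(!"unknown transfer type");
	return false;
}
```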

> > > Depending on the devices, there's different requirements about the state
> > > devices expect the bus to be in to send commands. Some will need to send
> > > all the commands in the low power state, some don't care, etc. See
> > > the mail I was replying to for more details.
> > > 
> > > We've tried so far to model that in KMS itself, so the framework the
> > > drivers would register to, but we're kind of reaching the limits of
> > > what we can do there. It also feels to me that "the driver can't access
> > > its device" is more of a problem for the bus to solve rather than the
> > > framework.
> > 
> > This is up to the specific bus to resolve, there's nothing special
> > needed in the driver core for it, right?
> 
> Yeah, we weren't really looking to handle this into the driver core, but
> rather if there was a set of guidelines or feedback on implementing
> those kind of features for a bus.
> 
> > > Do you agree? Are you aware of any other bus in Linux with similar
> > > requirements we could look at? Or any suggestion on how to solve it?
> > 
> > There might be others, yes, look at how the dynamic power management
> > works for different devices on most busses, that might help you out
> > here.
> 
> Thanks for the pointers, we'll have a look

-- 
Regards,

Laurent Pinchart

Re: [PATCH v6 4/5] drm/panel-edp: Add override_edid_mode quirk for generic edp

2023-11-08 Thread Maxime Ripard

On Tue, Nov 07, 2023 at 12:41:54PM -0800, Hsin-Yi Wang wrote:
> Generic edp gets mode from edid. However, some panels report incorrect
> mode in this way, resulting in glitches on panel. Introduce a new quirk
> additional_mode to the generic edid to pick a correct hardcoded mode.
> 
> Signed-off-by: Hsin-Yi Wang 
> Reviewed-by: Douglas Anderson 
> ---
> no change.
> ---
>  drivers/gpu/drm/panel/panel-edp.c | 67 ---
>  1 file changed, 62 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panel/panel-edp.c 
> b/drivers/gpu/drm/panel/panel-edp.c
> index 599a949d74d1..c0c24d94c3a0 100644
> --- a/drivers/gpu/drm/panel/panel-edp.c
> +++ b/drivers/gpu/drm/panel/panel-edp.c
> @@ -203,6 +203,9 @@ struct edp_panel_entry {
>  
>   /** @name: Name of this panel (for printing to logs). */
>   const char *name;
> +
> + /** @override_edid_mode: Override the mode obtained by edid. */
> + const struct drm_display_mode *override_edid_mode;
>  };
>  
>  struct panel_edp {
> @@ -301,6 +304,24 @@ static unsigned int panel_edp_get_display_modes(struct 
> panel_edp *panel,
>   return num;
>  }
>  
> +static int panel_edp_override_edid_mode(struct panel_edp *panel,
> + struct drm_connector *connector,
> + const struct drm_display_mode 
> *override_mode)
> +{
> + struct drm_display_mode *mode;
> +
> + mode = drm_mode_duplicate(connector->dev, override_mode);
> + if (!mode) {
> + dev_err(panel->base.dev, "failed to add additional mode\n");
> + return 0;
> + }
> +
> + mode->type |= DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED;
> + drm_mode_set_name(mode);
> + drm_mode_probed_add(connector, mode);
> + return 1;
> +}
> +
>  static int panel_edp_get_non_edid_modes(struct panel_edp *panel,
>   struct drm_connector *connector)
>  {
> @@ -568,6 +589,9 @@ static int panel_edp_get_modes(struct drm_panel *panel,
>  {
>   struct panel_edp *p = to_panel_edp(panel);
>   int num = 0;
> + bool has_override_edid_mode = p->detected_panel &&
> +   p->detected_panel != ERR_PTR(-EINVAL) &&
> +   p->detected_panel->override_edid_mode;
>  
>   /* probe EDID if a DDC bus is available */
>   if (p->ddc) {
> @@ -575,9 +599,18 @@ static int panel_edp_get_modes(struct drm_panel *panel,
>  
>   if (!p->edid)
>   p->edid = drm_get_edid(connector, p->ddc);
> -
> - if (p->edid)
> - num += drm_add_edid_modes(connector, p->edid);
> + if (p->edid) {
> + if (has_override_edid_mode) {

It's not clear to me why the override mechanism is only there when
there's a ddc bus?

You mentioned before that you were following panel-simple, but that's a
clear deviation from what I can see. If there's a reason for that
deviation, that's fine by me, but it should at least be documented in
the commit log.

> + /*
> +  * override_edid_mode is specified. Use
> +  * override_edid_mode instead of from edid.
> +  */
> + num += panel_edp_override_edid_mode(p, 
> connector,
> + 
> p->detected_panel->override_edid_mode);
> + } else {
> + num += drm_add_edid_modes(connector, p->edid);
> + }
> + }
>  
>   pm_runtime_mark_last_busy(panel->dev);
>   pm_runtime_put_autosuspend(panel->dev);
> @@ -950,6 +983,19 @@ static const struct panel_desc auo_b101ean01 = {
>   },
>  };
>  
> +static const struct drm_display_mode auo_b116xa3_mode = {
> + .clock = 70589,
> + .hdisplay = 1366,
> + .hsync_start = 1366 + 40,
> + .hsync_end = 1366 + 40 + 40,
> + .htotal = 1366 + 40 + 40 + 32,
> + .vdisplay = 768,
> + .vsync_start = 768 + 10,
> + .vsync_end = 768 + 10 + 12,
> + .vtotal = 768 + 10 + 12 + 6,
> + .flags = DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC,
> +};

That should be a separate patch
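
For reference, the timings of the quoted mode can be sanity-checked with a few lines of standalone C. This is only a sketch: `struct toy_mode` and the helper names are made up for the example and are not `struct drm_display_mode`, and the porch and sync widths are read off the sums in the quoted initializer.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for the timing fields quoted above. */
struct toy_mode {
	uint32_t clock_khz;
	uint32_t hdisplay, hfront, hsync, hback;
	uint32_t vdisplay, vfront, vsync, vback;
};

static uint32_t toy_htotal(const struct toy_mode *m)
{
	return m->hdisplay + m->hfront + m->hsync + m->hback;
}

static uint32_t toy_vtotal(const struct toy_mode *m)
{
	return m->vdisplay + m->vfront + m->vsync + m->vback;
}

/* Refresh rate in millihertz: pixel clock over pixels per frame, rounded down. */
static uint32_t toy_refresh_mhz(const struct toy_mode *m)
{
	uint64_t pixels = (uint64_t)toy_htotal(m) * toy_vtotal(m);

	assert(pixels != 0);
	return (uint32_t)((uint64_t)m->clock_khz * 1000u * 1000u / pixels);
}

/* Values copied from the auo_b116xa3_mode initializer in the quoted patch. */
static const struct toy_mode toy_auo_b116xa3 = {
	.clock_khz = 70589,
	.hdisplay = 1366, .hfront = 40, .hsync = 40, .hback = 32,
	.vdisplay = 768, .vfront = 10, .vsync = 12, .vback = 6,
};
```

With these numbers htotal is 1478 and vtotal is 796, so the 70589 kHz clock gives a refresh rate just under 60 Hz.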

Maxime



Re: [PATCH v2 5/6] drm/panfrost: Implement ability to turn on/off regulators in suspend

2023-11-08 Thread Steven Price

On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote:
> Some platforms/SoCs can power off the GPU entirely by completely cutting
> off power, greatly enhancing battery time during system suspend: add a
> new pm_feature GPU_PM_VREG_OFF to allow turning off the GPU regulators
> during full suspend only on selected platforms.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.c | 19 ++-
>  drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
> b/drivers/gpu/drm/panfrost/panfrost_device.c
> index 2022ed76a620..51b22eb0971d 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -431,10 +431,21 @@ static int panfrost_device_resume(struct device *dev)
>   struct panfrost_device *pfdev = dev_get_drvdata(dev);
>   int ret;
>  
> + if (pfdev->comp->pm_features & BIT(GPU_PM_VREG_OFF)) {
> + unsigned long freq = pfdev->pfdevfreq.fast_rate;
> + struct dev_pm_opp *opp;
> +
> + opp = dev_pm_opp_find_freq_ceil(dev, &freq);
> + if (IS_ERR(opp))
> + return PTR_ERR(opp);
> + dev_pm_opp_put(opp);
> + dev_pm_opp_set_opp(dev, opp);

These lines are still in the wrong order - the put should be after the set.
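
The ordering concern can be shown with a toy reference-count model. Everything below is hypothetical (`toy_opp`, `toy_opp_find()`, `toy_set_opp()`, `toy_opp_put()`); it is not the kernel OPP API, and it assumes for simplicity that the caller's reference is the only one keeping the object usable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted object standing in for an OPP; not struct dev_pm_opp. */
struct toy_opp {
	int refcount;
	unsigned long freq;
};

static struct toy_opp *toy_current_opp;

/* Lookup returns the object with a reference held for the caller. */
static struct toy_opp *toy_opp_find(struct toy_opp *table_entry)
{
	table_entry->refcount++;
	return table_entry;
}

static void toy_opp_put(struct toy_opp *opp)
{
	assert(opp->refcount > 0);
	opp->refcount--;
}

/* Using the object is only valid while the caller still holds its reference. */
static int toy_set_opp(struct toy_opp *opp)
{
	if (opp->refcount <= 0)
		return -1;	/* the reference was already dropped */
	toy_current_opp = opp;
	return 0;
}

/* Order suggested in the review: find, set, and only then put. */
static int toy_resume(struct toy_opp *entry)
{
	struct toy_opp *opp = toy_opp_find(entry);
	int ret = toy_set_opp(opp);

	toy_opp_put(opp);
	return ret;
}
```

Swapping the last two calls in `toy_resume()` makes `toy_set_opp()` operate on an object whose reference has already been released, which this model reports as an error.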

Steve

> + }
> +
>   if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) {
>   ret = clk_enable(pfdev->clock);
>   if (ret)
> - return ret;
> + goto err_clk;
>  
>   if (pfdev->bus_clock) {
>   ret = clk_enable(pfdev->bus_clock);
> @@ -455,6 +466,9 @@ static int panfrost_device_resume(struct device *dev)
>  err_bus_clk:
>   if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS))
>   clk_disable(pfdev->clock);
> +err_clk:
> + if (pfdev->comp->pm_features & BIT(GPU_PM_VREG_OFF))
> + dev_pm_opp_set_opp(dev, NULL);
>   return ret;
>  }
>  
> @@ -474,6 +488,9 @@ static int panfrost_device_suspend(struct device *dev)
>   clk_disable(pfdev->bus_clock);
>   }
>  
> + if (pfdev->comp->pm_features & BIT(GPU_PM_VREG_OFF))
> + dev_pm_opp_set_opp(dev, NULL);
> +
>   return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> b/drivers/gpu/drm/panfrost/panfrost_device.h
> index d7f179eb8ea3..0fc558db6bfd 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -28,9 +28,11 @@ struct panfrost_perfcnt;
>  /**
>   * enum panfrost_gpu_pm - Supported kernel power management features
>   * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
> + * @GPU_PM_VREG_OFF: Allow turning off regulators during system suspend
>   */
>  enum panfrost_gpu_pm {
>   GPU_PM_CLK_DIS,
> + GPU_PM_VREG_OFF,
>  };
>  
>  struct panfrost_features {

Re: [PATCH v2 6/6] drm/panfrost: Set regulators on/off during system sleep on MediaTek SoCs

2023-11-08 Thread Steven Price

On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote:
> All of the MediaTek SoCs supported by Panfrost can completely cut power
> to the GPU during full system sleep without any user-noticeable delay
> in the resume operation, as shown by measurements taken on multiple
> MediaTek SoCs.
> 
> As an example, for MT8195 - a "before" with only runtime PM operations
> (so, without turning on/off regulators), and an "after" executing both
> the system sleep .resume() handler and .runtime_resume() (so the time
> refers to T_Resume + T_Runtime_Resume):
> 
> Average Panfrost-only system sleep resume time, before: ~33500ns
> Average Panfrost-only system sleep resume time, after:  ~336200ns
> 
> Keep in mind that this additional ~308200 nanoseconds delay happens only
> in resume from a full system suspend, and not in runtime PM operations,
> hence it is acceptable.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 

As with patch 4, please can you confirm which SoCs you've tested and
what level of testing. I'm more interested in the correctness (i.e. not
hanging) rather than performance because as you point out it's only the
full system suspend path that takes the performance hit.

Steve

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 82f3c5fe9c58..f63382d9ab04 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -734,7 +734,7 @@ static const struct panfrost_compatible 
> mediatek_mt8183_b_data = {
>   .supply_names = mediatek_mt8183_b_supplies,
>   .num_pm_domains = ARRAY_SIZE(mediatek_mt8183_pm_domains),
>   .pm_domain_names = mediatek_mt8183_pm_domains,
> - .pm_features = BIT(GPU_PM_CLK_DIS),
> + .pm_features = BIT(GPU_PM_CLK_DIS) | BIT(GPU_PM_VREG_OFF),
>  };
>  
>  static const char * const mediatek_mt8186_pm_domains[] = { "core0", "core1" 
> };
> @@ -743,7 +743,7 @@ static const struct panfrost_compatible 
> mediatek_mt8186_data = {
>   .supply_names = mediatek_mt8183_b_supplies,
>   .num_pm_domains = ARRAY_SIZE(mediatek_mt8186_pm_domains),
>   .pm_domain_names = mediatek_mt8186_pm_domains,
> - .pm_features = BIT(GPU_PM_CLK_DIS),
> + .pm_features = BIT(GPU_PM_CLK_DIS) | BIT(GPU_PM_VREG_OFF),
>  };
>  
>  static const char * const mediatek_mt8192_supplies[] = { "mali", NULL };
> @@ -754,7 +754,7 @@ static const struct panfrost_compatible 
> mediatek_mt8192_data = {
>   .supply_names = mediatek_mt8192_supplies,
>   .num_pm_domains = ARRAY_SIZE(mediatek_mt8192_pm_domains),
>   .pm_domain_names = mediatek_mt8192_pm_domains,
> - .pm_features = BIT(GPU_PM_CLK_DIS),
> + .pm_features = BIT(GPU_PM_CLK_DIS) | BIT(GPU_PM_VREG_OFF),
>  };
>  
>  static const struct of_device_id dt_match[] = {

Re: [PATCH v2 4/6] drm/panfrost: Set clocks on/off during system sleep on MediaTek SoCs

2023-11-08 Thread Steven Price

On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote:
> All of the MediaTek SoCs supported by Panfrost can switch the clocks
> off and on during system sleep to save some power without any user
> experience penalty.
> 
> Measurements taken on multiple MediaTek SoCs show that adding this
> will not prolong the time that is required to resume the system in
> any meaningful way.
> 
> As an example, for MT8195 - a "before" with only runtime PM operations
> (so, without turning on/off GPU clocks), and an "after" executing both
> the system sleep .resume() handler and .runtime_resume() (so the time
> refers to T_Resume + T_Runtime_Resume):
> 
> Average Panfrost-only system sleep resume time, before: ~28000ns
> Average Panfrost-only system sleep resume time, after:  ~33500ns
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 

The change looks good:

Reviewed-by: Steven Price 

However it would be good to explicitly state (in the commit message)
which SoCs you personally have tested (for correctness), just in case we
find there are problems in the future with this on a particular SoC.

Steve

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 7cabf4e3d1f2..82f3c5fe9c58 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -734,6 +734,7 @@ static const struct panfrost_compatible 
> mediatek_mt8183_b_data = {
>   .supply_names = mediatek_mt8183_b_supplies,
>   .num_pm_domains = ARRAY_SIZE(mediatek_mt8183_pm_domains),
>   .pm_domain_names = mediatek_mt8183_pm_domains,
> + .pm_features = BIT(GPU_PM_CLK_DIS),
>  };
>  
>  static const char * const mediatek_mt8186_pm_domains[] = { "core0", "core1" 
> };
> @@ -742,6 +743,7 @@ static const struct panfrost_compatible 
> mediatek_mt8186_data = {
>   .supply_names = mediatek_mt8183_b_supplies,
>   .num_pm_domains = ARRAY_SIZE(mediatek_mt8186_pm_domains),
>   .pm_domain_names = mediatek_mt8186_pm_domains,
> + .pm_features = BIT(GPU_PM_CLK_DIS),
>  };
>  
>  static const char * const mediatek_mt8192_supplies[] = { "mali", NULL };
> @@ -752,6 +754,7 @@ static const struct panfrost_compatible 
> mediatek_mt8192_data = {
>   .supply_names = mediatek_mt8192_supplies,
>   .num_pm_domains = ARRAY_SIZE(mediatek_mt8192_pm_domains),
>   .pm_domain_names = mediatek_mt8192_pm_domains,
> + .pm_features = BIT(GPU_PM_CLK_DIS),
>  };
>  
>  static const struct of_device_id dt_match[] = {

Re: [PATCH v2 3/6] drm/panfrost: Implement ability to turn on/off GPU clocks in suspend

2023-11-08 Thread Steven Price

On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote:
> Currently, the GPU is being internally powered off for runtime suspend
> and turned back on for runtime resume through commands sent to it, but
> note that the GPU doesn't need to be clocked during the poweroff state,
> hence it is possible to save some power on selected platforms.
> 
> Add suspend and resume handlers for full system sleep and then add
> a new panfrost_gpu_pm enumeration and a pm_features variable in the
> panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to
> enable this power saving technique only on SoCs that are able to
> safely use it.
> 
> Note that this was implemented only for the system sleep case and not
> for runtime PM because testing on one of my MediaTek platforms showed
> issues when turning on and off clocks aggressively (in PM runtime)
> resulting in a full system lockup.
> 
> Doing this only for full system sleep never showed issues during my
> testing by suspending and resuming the system continuously for more
> than 100 cycles.
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
> 
> Note: Even after fixing the panfrost_power_off() function, I'm still
> getting issues with turning off the clocks at .runtime_suspend() but
> this time, instead of getting a GPU lockup, the entire SoC will deadlock
> bringing down the entire system with it (so it's even worse!) :-)

Ouch! Hopefully that's a SoC issue as I can't see anything that should
cause problems. But note that if the GPU is powered down during a bus
transaction that can lock up the entire bus.
> 
> 
>  drivers/gpu/drm/panfrost/panfrost_device.c | 61 --
>  drivers/gpu/drm/panfrost/panfrost_device.h | 11 
>  2 files changed, 68 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
> b/drivers/gpu/drm/panfrost/panfrost_device.c
> index 28f7046e1b1a..2022ed76a620 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev)
>   panfrost_job_enable_interrupts(pfdev);
>  }
>  
> -static int panfrost_device_resume(struct device *dev)
> +static int panfrost_device_runtime_resume(struct device *dev)
>  {
>   struct panfrost_device *pfdev = dev_get_drvdata(dev);
>  
> @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev)
>   return 0;
>  }
>  
> -static int panfrost_device_suspend(struct device *dev)
> +static int panfrost_device_runtime_suspend(struct device *dev)
>  {
>   struct panfrost_device *pfdev = dev_get_drvdata(dev);
>  
> @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev)
>   return 0;
>  }
>  
> -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend,
> -   panfrost_device_resume, NULL);
> +static int panfrost_device_resume(struct device *dev)
> +{
> + struct panfrost_device *pfdev = dev_get_drvdata(dev);
> + int ret;
> +
> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) {
> + ret = clk_enable(pfdev->clock);
> + if (ret)
> + return ret;
> +
> + if (pfdev->bus_clock) {
> + ret = clk_enable(pfdev->bus_clock);
> + if (ret)
> + goto err_bus_clk;
> + }
> + }
> +
> + ret = pm_runtime_force_resume(dev);
> + if (ret)
> + goto err_resume;
> +
> + return 0;
> +
> +err_resume:
> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock)
> + clk_disable(pfdev->bus_clock);
> +err_bus_clk:
> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS))
> + clk_disable(pfdev->clock);
> + return ret;
> +}
> +
> +static int panfrost_device_suspend(struct device *dev)
> +{
> + struct panfrost_device *pfdev = dev_get_drvdata(dev);
> + int ret;
> +
> + ret = pm_runtime_force_suspend(dev);
> + if (ret)
> + return ret;
> +
> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) {
> + clk_disable(pfdev->clock);
> +
> + if (pfdev->bus_clock)
> + clk_disable(pfdev->bus_clock);

NIT: I would normally expect panfrost_device_resume() to have the
opposite order. I'm not sure if there's an expected order here but I
feel like the bus should be enabled before core - so _resume() would
need to be swapped round.

Other than that:

Reviewed-by: Steven Price 

Thanks,

Steve
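
A rough user-space sketch of the mirrored ordering described in the NIT,
with the bus clock enabled before the core clock on resume and the reverse
on suspend. The mock_clk type and helpers are hypothetical, and whether the
hardware actually requires this order is not established in this thread.

```c
#include <stddef.h>

/* Hypothetical mock clock; tag is 'b' for bus or 'c' for core. */
struct mock_clk {
	char tag;
	int fail;	/* non-zero makes mock_clk_enable() fail */
	int enabled;
};

/* Event log: lowercase tag on enable, uppercase tag on disable. */
char order_log[16];
size_t order_len;

void log_event(char c)
{
	if (order_len < sizeof(order_log) - 1) {
		order_log[order_len++] = c;
		order_log[order_len] = '\0';
	}
}

int mock_clk_enable(struct mock_clk *clk)
{
	if (clk->fail)
		return -1;
	clk->enabled = 1;
	log_event(clk->tag);
	return 0;
}

void mock_clk_disable(struct mock_clk *clk)
{
	clk->enabled = 0;
	log_event((char)(clk->tag - 'a' + 'A'));
}

/* Resume: bus first, then core; unwind the bus clock if core fails. */
int sketch_resume(struct mock_clk *bus, struct mock_clk *core)
{
	int ret;

	ret = mock_clk_enable(bus);
	if (ret)
		return ret;

	ret = mock_clk_enable(core);
	if (ret)
		goto err_core;

	return 0;

err_core:
	mock_clk_disable(bus);
	return ret;
}

/* Suspend: exact reverse of resume, core first, then bus. */
void sketch_suspend(struct mock_clk *bus, struct mock_clk *core)
{
	mock_clk_disable(core);
	mock_clk_disable(bus);
}
```

The error path in sketch_resume() undoes only what was already enabled,
which is the same goto-unwinding shape the quoted patch uses.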

> + }
> +
> + return 0;
> +}
> +
> +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = {
> + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, 
> panfrost_device_runtime_resume, NULL)
> + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume)
> +};
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 1ef38f60d5dc..d7f179eb8ea3 100644
> ---

Re: [PATCH v2 2/6] drm/panfrost: Tighten polling for soft reset and power on

2023-11-08 Thread Steven Price

On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote:
> In many cases, soft reset takes more than 1 microsecond, but definitely
> less than 10; moreover in the poweron flow, tilers, shaders and l2 will
> become ready (each) in less than 10 microseconds as well.
> 
> Even in the cases (at least on my platforms, rarely) in which those take
> more than 10 microseconds, it's very unlikely to see both soft reset and
> poweron to take more than 70 microseconds.
> 
> Shorten the polling delay to 10 microseconds to consistently reduce the
> runtime resume time of the GPU.
> 
> As an indicative example, measurements taken on a MediaTek MT8195 SoC:
> 
> Average runtime resume time in nanoseconds before this commit:
> 
> GDM, user selection up/down:            88435ns
> GDM, Text Entry (typing user/password): 91489ns
> GNOME Desktop, idling, GKRELLM running: 73200ns
> 
> After this commit:
> 
> GDM, user selection up/down:            26690ns
> GDM, Text Entry (typing user/password): 27917ns
> GNOME Desktop, idling, GKRELLM running: 25304ns
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 

Reviewed-by: Steven Price 
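
For readers wondering why a shorter polling delay helps, here is a small
simulation with a virtual clock. It is only a model: poll_observed_at() is
a hypothetical helper, it assumes every sleep lasts exactly sleep_us, and
the timeout used with it is an arbitrary value, not one from the driver.

```c
/*
 * The condition becomes true at ready_us. It is checked at t = 0 and
 * again after every sleep of sleep_us microseconds. Returns the virtual
 * time at which the poller notices, or -1 on timeout or a bad sleep_us.
 */
long poll_observed_at(long ready_us, long sleep_us, long timeout_us)
{
	long now = 0;

	if (sleep_us <= 0)
		return -1;	/* a non-positive step would never advance */

	for (;;) {
		if (now >= ready_us)
			return now;	/* condition seen on this check */
		if (now >= timeout_us)
			return -1;	/* gave up */
		now += sleep_us;	/* assumption: sleep is exactly sleep_us */
	}
}
```

With an event that completes after 3us, a 100us delay only notices it at
100us, while a 10us delay notices it at 10us. That matches the direction of
the measurements above, though not their exact numbers.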

> ---
>  drivers/gpu/drm/panfrost/panfrost_gpu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 7e9e2cf26e4d..e264e8c2252d 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -63,7 +63,7 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
>  
>   gpu_write(pfdev, GPU_CMD, GPU_CMD_SOFT_RESET);
>   ret = readl_relaxed_poll_timeout(pfdev->iomem + GPU_INT_RAWSTAT,
> - val, val & GPU_IRQ_RESET_COMPLETED, 100, 1);
> + val, val & GPU_IRQ_RESET_COMPLETED, 10, 1);
>   if (ret) {
>   dev_err(pfdev->dev, "gpu soft reset timed out, attempting hard 
> reset\n");
>  
> @@ -403,7 +403,7 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>   gpu_write(pfdev, L2_PWRON_LO, pfdev->features.l2_present & core_mask);
>   ret = readl_relaxed_poll_timeout(pfdev->iomem + L2_READY_LO,
>   val, val == (pfdev->features.l2_present & core_mask),
> - 100, 2);
> + 10, 2);
>   if (ret)
>   dev_err(pfdev->dev, "error powering up gpu L2");
>  
> @@ -411,13 +411,13 @@ void panfrost_gpu_power_on(struct panfrost_device 
> *pfdev)
> pfdev->features.shader_present & core_mask);
>   ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_READY_LO,
>   val, val == (pfdev->features.shader_present & core_mask),
> - 100, 2);
> + 10, 2);
>   if (ret)
>   dev_err(pfdev->dev, "error powering up gpu shader");
>  
>   gpu_write(pfdev, TILER_PWRON_LO, pfdev->features.tiler_present);
>   ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_READY_LO,
> - val, val == pfdev->features.tiler_present, 100, 1000);
> + val, val == pfdev->features.tiler_present, 10, 1000);
>   if (ret)
>   dev_err(pfdev->dev, "error powering up gpu tiler");
>  }

Re: [PATCH v2 1/6] drm/panfrost: Perform hard reset to recover GPU if soft reset fails

2023-11-08 Thread Steven Price

On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote:
> Even though soft reset should ideally never fail, during development of
> some power management features I managed to get some bits wrong: this
> resulted in GPU soft reset failures, where the GPU was never able to
> recover, not even after suspend/resume cycles, meaning that the only
> way to get functionality back was to reboot the machine.
> 
> Perform a hard reset after a soft reset failure to be able to recover
> the GPU during runtime (so, without any machine reboot).
> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/gpu/drm/panfrost/panfrost_gpu.c  | 14 ++
>  drivers/gpu/drm/panfrost/panfrost_regs.h |  1 +
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index fad75b6e543e..7e9e2cf26e4d 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -60,14 +60,20 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
>  
>   gpu_write(pfdev, GPU_INT_MASK, 0);
>   gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_RESET_COMPLETED);
> - gpu_write(pfdev, GPU_CMD, GPU_CMD_SOFT_RESET);
>  
> + gpu_write(pfdev, GPU_CMD, GPU_CMD_SOFT_RESET);
>   ret = readl_relaxed_poll_timeout(pfdev->iomem + GPU_INT_RAWSTAT,
>   val, val & GPU_IRQ_RESET_COMPLETED, 100, 1);
> -

I'm not sure what's going on with blank lines above - AFAICT there's no
actual change just a blank line being moved. It's best to avoid blank
line changes to keep the diff readable.

>   if (ret) {
> - dev_err(pfdev->dev, "gpu soft reset timed out\n");
> - return ret;
> + dev_err(pfdev->dev, "gpu soft reset timed out, attempting hard 
> reset\n");
> +
> + gpu_write(pfdev, GPU_CMD, GPU_CMD_HARD_RESET);
> + ret = readl_relaxed_poll_timeout(pfdev->iomem + GPU_INT_RAWSTAT,
> + val, val & GPU_IRQ_RESET_COMPLETED, 100, 1);

NIT: checkpatch complains about the alignment here.

Other than the minor comments this looks fine. Hard reset isn't
something we want to use (there's a possibility of locking up the system
if it occurs during a bus transaction) but it can sometimes recover an
otherwise completely locked up GPU.

Steve
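
The control flow being reviewed, soft reset first and hard reset only as
a fallback, can be sketched in user space as follows. The mock_gpu type and
mock_reset() are hypothetical, and -1 stands in for whatever error code the
real polling helper returns.

```c
enum reset_kind { RESET_SOFT, RESET_HARD };

/* Hypothetical mock GPU: each flag says whether that reset kind completes. */
struct mock_gpu {
	int soft_ok;
	int hard_ok;
	int hard_attempts;	/* counts how often the fallback was used */
};

/* Issues one reset and reports 0 on completion, -1 on timeout. */
int mock_reset(struct mock_gpu *gpu, enum reset_kind kind)
{
	if (kind == RESET_HARD) {
		gpu->hard_attempts++;
		return gpu->hard_ok ? 0 : -1;
	}
	return gpu->soft_ok ? 0 : -1;
}

/* Try the soft reset; escalate to a hard reset only if it times out. */
int reset_with_fallback(struct mock_gpu *gpu)
{
	int ret;

	ret = mock_reset(gpu, RESET_SOFT);
	if (ret) {
		ret = mock_reset(gpu, RESET_HARD);
		if (ret)
			return ret;	/* both failed, report the error */
	}

	return 0;
}
```

The hard reset is never attempted when the soft reset succeeds, which keeps
the riskier operation off the normal path.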

> + if (ret) {
> + dev_err(pfdev->dev, "gpu hard reset timed out\n");
> + return ret;
> + }
>   }
>  
>   gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h 
> b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index 55ec807550b3..c25743b05c55 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -44,6 +44,7 @@
>GPU_IRQ_MULTIPLE_FAULT)
>  #define GPU_CMD  0x30
>  #define   GPU_CMD_SOFT_RESET 0x01
> +#define   GPU_CMD_HARD_RESET 0x02
>  #define   GPU_CMD_PERFCNT_CLEAR  0x03
>  #define   GPU_CMD_PERFCNT_SAMPLE 0x04
>  #define   GPU_CMD_CYCLE_COUNT_START  0x05

Re: [PATCH] fbdev: hyperv_fb: fix uninitialized local variable use

2023-11-08 Thread Helge Deller


On 11/8/23 15:58, Arnd Bergmann wrote:

From: Arnd Bergmann 

When CONFIG_SYSFB is disabled, the hyperv_fb driver can now run into
undefined behavior on a gen2 VM, as indicated by this smatch warning:

drivers/video/fbdev/hyperv_fb.c:1077 hvfb_getmem() error: uninitialized symbol 'base'.
drivers/video/fbdev/hyperv_fb.c:1077 hvfb_getmem() error: uninitialized symbol 'size'.

Since there is no way to know the actual framebuffer in this configuration,
just return an allocation failure here, which should avoid the build
warning and the undefined behavior.

Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Closes: https://lore.kernel.org/r/202311070802.ycpvehaz-...@intel.com/
Fixes: a07b50d80ab6 ("hyperv: avoid dependency on screen_info")
Signed-off-by: Arnd Bergmann 


applied.

Thanks!
Helge
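
The pattern behind the fix can be shown with a reduced user-space sketch:
an if / else-if chain that assigns locals needs a final else that bails
out, otherwise one path reads them uninitialized. The function name, the
first branch and the constants are all made up for illustration and are not
taken from hyperv_fb.c.

```c
struct fb_range {
	unsigned long base;
	unsigned long size;
};

/*
 * Returns 0 and fills *out when a range is known, -1 otherwise.
 * Without the final else, base and size would be read uninitialized
 * when neither condition holds.
 */
int pick_range(int has_pci_bar, int sysfb_enabled, struct fb_range *out)
{
	unsigned long base, size;

	if (has_pci_bar) {
		base = 0x1000;	/* made-up value */
		size = 0x800;	/* made-up value */
	} else if (sysfb_enabled) {
		base = 0x2000;	/* made-up value */
		size = 0x400;	/* made-up value */
	} else {
		return -1;	/* no way to know the range: fail early */
	}

	out->base = base;
	out->size = size;
	return 0;
}
```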



---
  drivers/video/fbdev/hyperv_fb.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
index bf59daf862fc..a80939fe2ee6 100644
--- a/drivers/video/fbdev/hyperv_fb.c
+++ b/drivers/video/fbdev/hyperv_fb.c
@@ -1013,6 +1013,8 @@ static int hvfb_getmem(struct hv_device *hdev, struct 
fb_info *info)
} else if (IS_ENABLED(CONFIG_SYSFB)) {
base = screen_info.lfb_base;
size = screen_info.lfb_size;
+   } else {
+   goto err1;
}

/*

Re: [PATCH v6 3/5] drm/panel-edp: drm/panel-edp: Add several generic edp panels

2023-11-08 Thread Maxime Ripard

On Tue, 7 Nov 2023 12:41:53 -0800, Hsin-Yi Wang wrote:
> Add a few generic edp panels used by mt8186 chromebooks.
> 
> Signed-off-by: Hsin-Yi Wang 
> Reviewed-by: Douglas Anderson 

Acked-by: Maxime Ripard 

Thanks!
Maxime

Re: [RFC PATCH v3 10/12] tcp: RX path for devmem TCP

2023-11-08 Thread Edward Cree

On 06/11/2023 21:17, Stanislav Fomichev wrote:
> I guess I'm just wondering whether other people have any suggestions
> here. Not sure Jonathan's way was better, but we fundamentally
> have two queues between the kernel and the userspace:
> - userspace receiving tokens (recvmsg + magical flag)
> - userspace refilling tokens (setsockopt + magical flag)
> 
> So having some kind of shared memory producer-consumer queue feels natural.
> And using 'classic' socket api here feels like a stretch, idk.

Do 'refilled tokens' (returned memory areas) get used for anything other
 than subsequent RX?  If not then surely the way to return a memory area
 in an io_uring idiom is just to post a new read sqe ('RX descriptor')
 pointing into it, rather than explicitly returning it with setsockopt.
(Being async means you can post lots of these, unlike recvmsg(), so you
 don't need any kernel management to keep the RX queue filled; it can
 just be all handled by the userland thus simplifying APIs overall.)
Or am I misunderstanding something?

-e

Re: [PATCH v6 2/5] drm/panel-edp: drm/panel-edp: Fix AUO B116XTN02 name

2023-11-08 Thread Maxime Ripard

On Tue, 7 Nov 2023 12:41:52 -0800, Hsin-Yi Wang wrote:
> Rename AUO 0x235c B116XTN02 to B116XTN02.3, matching the name decoded from the EDID.
> 
> Fixes: 3db2420422a5 ("drm/panel-edp: Add AUO B116XTN02, BOE 
> NT116WHM-N21,836X2, NV116WHM-N49 V8.0")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Hsin-Yi Wang 
> 
> [ ... ]

Acked-by: Maxime Ripard 

Thanks!
Maxime
