Re: [PATCH 5/5] drm/i915: Implement fdinfo memory stats printing

2023-08-08 Thread Iddamsetty, Aravind



On 03-08-2023 14:19, Tvrtko Ursulin wrote:
> 
> On 03/08/2023 06:15, Iddamsetty, Aravind wrote:
>> On 27-07-2023 15:43, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin 
>>>
>>> Use the newly added drm_print_memory_stats helper to show memory
>>> utilisation of our objects in drm/driver specific fdinfo output.
>>>
>>> To collect the stats we walk the per memory regions object lists
>>> and accumulate object size into the respective drm_memory_stats
>>> categories.
>>>
>>> Objects with multiple possible placements are reported in multiple
>>> regions for total and shared sizes, while other categories are
>>> counted only for the currently active region.
>>>
>>> Signed-off-by: Tvrtko Ursulin 
>>> Cc: Aravind Iddamsetty 
>>> Cc: Rob Clark
>>> ---
>>>   drivers/gpu/drm/i915/i915_drm_client.c | 85 ++
>>>   1 file changed, 85 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drm_client.c
>>> b/drivers/gpu/drm/i915/i915_drm_client.c
>>> index a61356012df8..9e7a6075ee25 100644
>>> --- a/drivers/gpu/drm/i915/i915_drm_client.c
>>> +++ b/drivers/gpu/drm/i915/i915_drm_client.c
>>> @@ -45,6 +45,89 @@ void __i915_drm_client_free(struct kref *kref)
>>>   }
>>>     #ifdef CONFIG_PROC_FS
>>> +static void
>>> +obj_meminfo(struct drm_i915_gem_object *obj,
>>> +    struct drm_memory_stats stats[INTEL_REGION_UNKNOWN])
>>> +{
>>> +    struct intel_memory_region *mr;
>>> +    u64 sz = obj->base.size;
>>> +    enum intel_region_id id;
>>> +    unsigned int i;
>>> +
>>> +    /* Attribute size and shared to all possible memory regions. */
>>> +    for (i = 0; i < obj->mm.n_placements; i++) {
>>> +    mr = obj->mm.placements[i];
>>> +    id = mr->id;
>>> +
>>> +    if (obj->base.handle_count > 1)
>>> +    stats[id].shared += sz;
>>> +    else
>>> +    stats[id].private += sz;
>>> +    }
>>> +
>>> +    /* Attribute other categories to only the current region. */
>>> +    mr = obj->mm.region;
>>> +    if (mr)
>>> +    id = mr->id;
>>> +    else
>>> +    id = INTEL_REGION_SMEM;
>>> +
>>> +    if (!obj->mm.n_placements) {
>>
>> I guess we do not expect to have n_placements set to public objects, is
>> that right?
> 
> I think they are the only ones which can have placements. It is via
> I915_GEM_CREATE_EXT_MEMORY_REGIONS userspace is able to create them.
> 
> My main conundrum in this patch is a few lines above, the loop which
> adds shared and private.
> 
> Question is, if an object can be either smem or lmem, how do we want to
> report it? This patch adds the size for all possible regions and
> resident and active only to the currently active. But perhaps that is
> wrong. Maybe I should change it so it is only counted against the active
> region and multiple regions are just ignored. Then if the object is
> migrated due to access patterns or memory pressure, the total size would
> migrate too.
> 
> I think I was trying to achieve something here (have more visibility on
> what kind of backing store clients are allocating) which maybe does not
> work too well with the current categories.
> 
> Namely if userspace allocates say one 1MiB object with placement in
> either smem or lmem, and it is currently resident in lmem, I wanted it
> to show as:
> 
>  total-smem: 1 MiB
>  resident-smem: 0
>  total-lmem: 1 MiB
>  resident-lmem: 1 MiB
> 
> To constantly show how in theory client could be using memory from
> either region. Maybe that is misleading and should instead be:
> 
>  total-smem: 0
>  resident-smem: 0
>  total-lmem: 1 MiB
>  resident-lmem: 1 MiB
> 
> ?

I think the current implementation will not match the memregion info in
the query ioctl either. While what you say is true, I'm not sure whether
there can be a client which tracks allocations, say for an object which
has two placements, LMEM and SMEM, and which might assume that, since it
had made a reservation in SMEM, migrating there later shall not fail.

Thanks,
Aravind.
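
The two reporting options discussed in this thread can be compared with a small userspace model. This is only a sketch under stated assumptions: the types and function names below (region_stats, model_obj, account_all_placements, account_current_region) are hypothetical and are not i915 code; they only mirror the idea of charging "total" either to every possible placement or only to the region the object currently lives in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in i915. */
enum region { REGION_SMEM, REGION_LMEM, REGION_COUNT };

struct region_stats {
	uint64_t total;
	uint64_t resident;
};

struct model_obj {
	uint64_t size;
	enum region placements[REGION_COUNT]; /* regions the object may live in */
	size_t n_placements;
	enum region current;                  /* region it lives in right now */
	int has_pages;                        /* non-zero if backing store is present */
};

/* Option A: charge "total" to every possible placement. */
static void account_all_placements(const struct model_obj *obj,
				   struct region_stats stats[REGION_COUNT])
{
	size_t i;

	assert(obj->n_placements <= REGION_COUNT);

	for (i = 0; i < obj->n_placements; i++)
		stats[obj->placements[i]].total += obj->size;

	/* Objects without an explicit placement list count against the current region. */
	if (!obj->n_placements)
		stats[obj->current].total += obj->size;

	if (obj->has_pages)
		stats[obj->current].resident += obj->size;
}

/* Option B: charge everything to the current region only. */
static void account_current_region(const struct model_obj *obj,
				   struct region_stats stats[REGION_COUNT])
{
	stats[obj->current].total += obj->size;

	if (obj->has_pages)
		stats[obj->current].resident += obj->size;
}
```

For a 1 MiB object that may be placed in smem or lmem and is currently resident in lmem, option A reports 1 MiB of total in both regions, while option B reports it only under lmem; resident is reported under lmem in both cases.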

> 
> And then if/when the same object gets migrated to smem it changes to
> (let's assume it is also not resident any more but got swapped out):
> 
>  total-smem: 1 MiB
>  resident-smem: 0
>  total-lmem: 0
>  resident-lmem: 0
> 
> Regards,
> 
> Tvrtko
> 
>>> +    if (obj->base.handle_count > 1)
>>> +    stats[id].shared += sz;
>>> +    else
>>> +    stats[id].private += sz;
>>> +    }
>>> +
>>> +    if (i915_gem_object_has_pages(obj)) {
>>> +    stats[id].resident += sz;
>>> +
>>> +    if (!dma_resv_test_signaled(obj->base.resv,
>>> +    dma_resv_usage_rw(true)))
>>> +    stats[id].active += sz;
>>> +    else if (i915_gem_object_is_shrinkable(obj) &&
>>> + obj->mm.madv == I915_MADV_DONTNEED)
>>> +    stats[id].purgeable += sz;
>>> +    }
>>> +}
>>> +
>>> +static void show_meminfo(struct drm_printer *p, struct drm_file *file)
>>> +{
>>> +    struct drm_memory_stats stats[INTEL_REGION_UNKNOWN] = {};
>>> +    struct drm_i915_file_private *fpriv = file->driver_priv;
>>> +    struct 

[PATCH -next 7/7] drm: Remove unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by kzalloc()
is not necessary: if kzalloc() fails, the pointer is set to NULL, and
otherwise it points to the allocated memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 drivers/gpu/drm/drm_agpsupport.c  | 2 +-
 drivers/gpu/drm/drm_atomic_uapi.c | 2 +-
 drivers/gpu/drm/exynos/exynos_drm_ipp.c   | 2 +-
 drivers/gpu/drm/nouveau/dispnv04/tvnv17.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_agpsupport.c b/drivers/gpu/drm/drm_agpsupport.c
index a4ad6fd13abc..158709849481 100644
--- a/drivers/gpu/drm/drm_agpsupport.c
+++ b/drivers/gpu/drm/drm_agpsupport.c
@@ -384,7 +384,7 @@ int drm_legacy_agp_free_ioctl(struct drm_device *dev, void 
*data,
 struct drm_agp_head *drm_legacy_agp_init(struct drm_device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev->dev);
-   struct drm_agp_head *head = NULL;
+   struct drm_agp_head *head;
 
head = kzalloc(sizeof(*head), GFP_KERNEL);
if (!head)
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 98d3b10c08ae..5a433af75132 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -942,7 +942,7 @@ int drm_atomic_get_property(struct drm_mode_object *obj,
 static struct drm_pending_vblank_event *create_vblank_event(
struct drm_crtc *crtc, uint64_t user_data)
 {
-   struct drm_pending_vblank_event *e = NULL;
+   struct drm_pending_vblank_event *e;
 
e = kzalloc(sizeof *e, GFP_KERNEL);
if (!e)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_ipp.c 
b/drivers/gpu/drm/exynos/exynos_drm_ipp.c
index ea9f66037600..419d0afccdb9 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_ipp.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_ipp.c
@@ -695,7 +695,7 @@ static int exynos_drm_ipp_task_setup_buffers(struct 
exynos_drm_ipp_task *task,
 static int exynos_drm_ipp_event_create(struct exynos_drm_ipp_task *task,
 struct drm_file *file_priv, uint64_t user_data)
 {
-   struct drm_pending_exynos_ipp_event *e = NULL;
+   struct drm_pending_exynos_ipp_event *e;
int ret;
 
e = kzalloc(sizeof(*e), GFP_KERNEL);
diff --git a/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c 
b/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
index 670c9739e5e1..9accb2a12719 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
@@ -789,7 +789,7 @@ nv17_tv_create(struct drm_connector *connector, struct 
dcb_output *entry)
 {
struct drm_device *dev = connector->dev;
struct drm_encoder *encoder;
-   struct nv17_tv_encoder *tv_enc = NULL;
+   struct nv17_tv_encoder *tv_enc;
 
tv_enc = kzalloc(sizeof(*tv_enc), GFP_KERNEL);
if (!tv_enc)
-- 
2.34.1



[PATCH -next 5/7] drm/virtio: Remove an unnecessary NULL value

2023-08-08 Thread Ruan Jinjie
The NULL initialization of a pointer that is first assigned by kzalloc()
is not necessary: if kzalloc() fails, the pointer is set to NULL, and
otherwise it points to the allocated memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 drivers/gpu/drm/virtio/virtgpu_submit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c 
b/drivers/gpu/drm/virtio/virtgpu_submit.c
index 3c00135ead45..82563dbec2ab 100644
--- a/drivers/gpu/drm/virtio/virtgpu_submit.c
+++ b/drivers/gpu/drm/virtio/virtgpu_submit.c
@@ -274,7 +274,7 @@ static int virtio_gpu_fence_event_create(struct drm_device 
*dev,
 struct virtio_gpu_fence *fence,
 u32 ring_idx)
 {
-   struct virtio_gpu_fence_event *e = NULL;
+   struct virtio_gpu_fence_event *e;
int ret;
 
e = kzalloc(sizeof(*e), GFP_KERNEL);
-- 
2.34.1



[PATCH -next 6/7] drm/format-helper: Remove unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by
kunit_kzalloc() is not necessary: if kunit_kzalloc() fails, the
pointer is set to NULL, and otherwise it points to the allocated
memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 .../gpu/drm/tests/drm_format_helper_test.c| 28 +--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/tests/drm_format_helper_test.c 
b/drivers/gpu/drm/tests/drm_format_helper_test.c
index 474bb7a1c4ee..1db12d8ed23c 100644
--- a/drivers/gpu/drm/tests/drm_format_helper_test.c
+++ b/drivers/gpu/drm/tests/drm_format_helper_test.c
@@ -452,7 +452,7 @@ static size_t conversion_buf_size(u32 dst_format, unsigned 
int dst_pitch,
 
 static u16 *le16buf_to_cpu(struct kunit *test, const __le16 *buf, size_t 
buf_size)
 {
-   u16 *dst = NULL;
+   u16 *dst;
int n;
 
dst = kunit_kzalloc(test, sizeof(*dst) * buf_size, GFP_KERNEL);
@@ -467,7 +467,7 @@ static u16 *le16buf_to_cpu(struct kunit *test, const __le16 
*buf, size_t buf_siz
 
 static u32 *le32buf_to_cpu(struct kunit *test, const __le32 *buf, size_t 
buf_size)
 {
-   u32 *dst = NULL;
+   u32 *dst;
int n;
 
dst = kunit_kzalloc(test, sizeof(*dst) * buf_size, GFP_KERNEL);
@@ -482,7 +482,7 @@ static u32 *le32buf_to_cpu(struct kunit *test, const __le32 
*buf, size_t buf_siz
 
 static __le32 *cpubuf_to_le32(struct kunit *test, const u32 *buf, size_t 
buf_size)
 {
-   __le32 *dst = NULL;
+   __le32 *dst;
int n;
 
dst = kunit_kzalloc(test, sizeof(*dst) * buf_size, GFP_KERNEL);
@@ -509,7 +509,7 @@ static void drm_test_fb_xrgb_to_gray8(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_gray8_result *result = &params->gray8_result;
size_t dst_size;
-   u8 *buf = NULL;
+   u8 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -539,7 +539,7 @@ static void drm_test_fb_xrgb_to_rgb332(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_rgb332_result *result = &params->rgb332_result;
size_t dst_size;
-   u8 *buf = NULL;
+   u8 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -569,7 +569,7 @@ static void drm_test_fb_xrgb_to_rgb565(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_rgb565_result *result = &params->rgb565_result;
size_t dst_size;
-   u16 *buf = NULL;
+   u16 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -605,7 +605,7 @@ static void drm_test_fb_xrgb_to_xrgb1555(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_xrgb1555_result *result = &params->xrgb1555_result;
size_t dst_size;
-   u16 *buf = NULL;
+   u16 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -636,7 +636,7 @@ static void drm_test_fb_xrgb_to_argb1555(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_argb1555_result *result = &params->argb1555_result;
size_t dst_size;
-   u16 *buf = NULL;
+   u16 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -667,7 +667,7 @@ static void drm_test_fb_xrgb_to_rgba5551(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_rgba5551_result *result = &params->rgba5551_result;
size_t dst_size;
-   u16 *buf = NULL;
+   u16 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -698,7 +698,7 @@ static void drm_test_fb_xrgb_to_rgb888(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_rgb888_result *result = &params->rgb888_result;
size_t dst_size;
-   u8 *buf = NULL;
+   u8 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -732,7 +732,7 @@ static void drm_test_fb_xrgb_to_argb(struct kunit 
*test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_argb_result *result = &params->argb_result;
size_t dst_size;
-   u32 *buf = NULL;
+   u32 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -763,7 +763,7 @@ static void drm_test_fb_xrgb_to_xrgb2101010(struct 
kunit *test)
const struct convert_xrgb_case *params = test->param_value;
const struct convert_to_xrgb2101010_result *result = &params->xrgb2101010_result;
size_t dst_size;
-   u32 *buf = NULL;
+   u32 *buf;
__le32 *xrgb = NULL;
struct iosys_map dst, src;
 
@@ -794,7 +794,7 @@ static void drm_test_fb_xrgb_to_argb2101010(struct 

[PATCH -next 3/7] drm/msm: Remove unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by kzalloc()
is not necessary: if kzalloc() fails, the pointer is set to NULL, and
otherwise it points to the allocated memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 8ce7586e2ddf..3c475f8042b0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -1466,7 +1466,7 @@ struct drm_crtc *dpu_crtc_init(struct drm_device *dev, 
struct drm_plane *plane,
struct msm_drm_private *priv = dev->dev_private;
struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
struct drm_crtc *crtc = NULL;
-   struct dpu_crtc *dpu_crtc = NULL;
+   struct dpu_crtc *dpu_crtc;
int i, ret;
 
dpu_crtc = kzalloc(sizeof(*dpu_crtc), GFP_KERNEL);
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
index 56a3063545ec..b68682c1b5bc 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
@@ -379,7 +379,7 @@ struct mdp5_smp *mdp5_smp_init(struct mdp5_kms *mdp5_kms, 
const struct mdp5_smp_
 {
struct mdp5_smp_state *state;
struct mdp5_global_state *global_state;
-   struct mdp5_smp *smp = NULL;
+   struct mdp5_smp *smp;
int ret;
 
smp = kzalloc(sizeof(*smp), GFP_KERNEL);
-- 
2.34.1



[PATCH -next 4/7] drm/radeon: Remove unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by kzalloc()
is not necessary: if kzalloc() fails, the pointer is set to NULL, and
otherwise it points to the allocated memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 drivers/gpu/drm/radeon/radeon_agp.c | 2 +-
 drivers/gpu/drm/radeon/radeon_combios.c | 6 +++---
 drivers/gpu/drm/radeon/radeon_legacy_encoders.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_agp.c 
b/drivers/gpu/drm/radeon/radeon_agp.c
index d124600b5f58..a3d749e350f9 100644
--- a/drivers/gpu/drm/radeon/radeon_agp.c
+++ b/drivers/gpu/drm/radeon/radeon_agp.c
@@ -130,7 +130,7 @@ static struct radeon_agpmode_quirk 
radeon_agpmode_quirk_list[] = {
 struct radeon_agp_head *radeon_agp_head_init(struct drm_device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev->dev);
-   struct radeon_agp_head *head = NULL;
+   struct radeon_agp_head *head;
 
head = kzalloc(sizeof(*head), GFP_KERNEL);
if (!head)
diff --git a/drivers/gpu/drm/radeon/radeon_combios.c 
b/drivers/gpu/drm/radeon/radeon_combios.c
index 795c3667f6d6..2620efc7c675 100644
--- a/drivers/gpu/drm/radeon/radeon_combios.c
+++ b/drivers/gpu/drm/radeon/radeon_combios.c
@@ -863,7 +863,7 @@ struct radeon_encoder_primary_dac 
*radeon_combios_get_primary_dac_info(struct
struct radeon_device *rdev = dev->dev_private;
uint16_t dac_info;
uint8_t rev, bg, dac;
-   struct radeon_encoder_primary_dac *p_dac = NULL;
+   struct radeon_encoder_primary_dac *p_dac;
int found = 0;
 
p_dac = kzalloc(sizeof(struct radeon_encoder_primary_dac),
@@ -1014,7 +1014,7 @@ struct radeon_encoder_tv_dac 
*radeon_combios_get_tv_dac_info(struct
struct radeon_device *rdev = dev->dev_private;
uint16_t dac_info;
uint8_t rev, bg, dac;
-   struct radeon_encoder_tv_dac *tv_dac = NULL;
+   struct radeon_encoder_tv_dac *tv_dac;
int found = 0;
 
tv_dac = kzalloc(sizeof(struct radeon_encoder_tv_dac), GFP_KERNEL);
@@ -1100,7 +1100,7 @@ static struct radeon_encoder_lvds 
*radeon_legacy_get_lvds_info_from_regs(struct
 
radeon_device
 *rdev)
 {
-   struct radeon_encoder_lvds *lvds = NULL;
+   struct radeon_encoder_lvds *lvds;
uint32_t fp_vert_stretch, fp_horz_stretch;
uint32_t ppll_div_sel, ppll_val;
uint32_t lvds_ss_gen_cntl = RREG32(RADEON_LVDS_SS_GEN_CNTL);
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c 
b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
index 601d35d34eab..c4350ac2b3d2 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
@@ -1692,7 +1692,7 @@ static struct radeon_encoder_int_tmds 
*radeon_legacy_get_tmds_info(struct radeon
 {
struct drm_device *dev = encoder->base.dev;
struct radeon_device *rdev = dev->dev_private;
-   struct radeon_encoder_int_tmds *tmds = NULL;
+   struct radeon_encoder_int_tmds *tmds;
bool ret;
 
tmds = kzalloc(sizeof(struct radeon_encoder_int_tmds), GFP_KERNEL);
@@ -1715,7 +1715,7 @@ static struct radeon_encoder_ext_tmds 
*radeon_legacy_get_ext_tmds_info(struct ra
 {
struct drm_device *dev = encoder->base.dev;
struct radeon_device *rdev = dev->dev_private;
-   struct radeon_encoder_ext_tmds *tmds = NULL;
+   struct radeon_encoder_ext_tmds *tmds;
bool ret;
 
if (rdev->is_atom_bios)
-- 
2.34.1



[PATCH -next 2/7] drm/amd/display: Remove unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by kzalloc()
is not necessary: if kzalloc() fails, the pointer is set to NULL, and
otherwise it points to the allocated memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 drivers/gpu/drm/amd/display/dc/bios/bios_parser.c  | 4 ++--
 drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c 
b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
index 4f005ae1516c..6b3190447581 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
@@ -96,7 +96,7 @@ struct dc_bios *bios_parser_create(
struct bp_init_data *init,
enum dce_version dce_version)
 {
-   struct bios_parser *bp = NULL;
+   struct bios_parser *bp;
 
bp = kzalloc(sizeof(struct bios_parser), GFP_KERNEL);
if (!bp)
@@ -2576,7 +2576,7 @@ static struct integrated_info 
*bios_parser_create_integrated_info(
struct dc_bios *dcb)
 {
struct bios_parser *bp = BP_FROM_DCB(dcb);
-   struct integrated_info *info = NULL;
+   struct integrated_info *info;
 
info = kzalloc(sizeof(struct integrated_info), GFP_KERNEL);
 
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c 
b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index 540d19efad8f..c7b3359f1e1d 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -3086,7 +3086,7 @@ static struct integrated_info 
*bios_parser_create_integrated_info(
struct dc_bios *dcb)
 {
struct bios_parser *bp = BP_FROM_DCB(dcb);
-   struct integrated_info *info = NULL;
+   struct integrated_info *info;
 
info = kzalloc(sizeof(struct integrated_info), GFP_KERNEL);
 
@@ -3675,7 +3675,7 @@ struct dc_bios *firmware_parser_create(
struct bp_init_data *init,
enum dce_version dce_version)
 {
-   struct bios_parser *bp = NULL;
+   struct bios_parser *bp;
 
bp = kzalloc(sizeof(struct bios_parser), GFP_KERNEL);
if (!bp)
-- 
2.34.1



[PATCH -next 1/7] drm/amdkfd: Remove unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by kzalloc()
is not necessary: if kzalloc() fails, the pointer is set to NULL, and
otherwise it points to the allocated memory. So remove it.

Signed-off-by: Ruan Jinjie 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index 863cf060af48..d01bb57733b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -48,7 +48,7 @@ int pipe_priority_map[] = {
 
 struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, struct 
queue_properties *q)
 {
-   struct kfd_mem_obj *mqd_mem_obj = NULL;
+   struct kfd_mem_obj *mqd_mem_obj;
 
mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
if (!mqd_mem_obj)
@@ -64,7 +64,7 @@ struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, 
struct queue_properti
 struct kfd_mem_obj *allocate_sdma_mqd(struct kfd_node *dev,
struct queue_properties *q)
 {
-   struct kfd_mem_obj *mqd_mem_obj = NULL;
+   struct kfd_mem_obj *mqd_mem_obj;
uint64_t offset;
 
mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
-- 
2.34.1



[PATCH -next 0/7] drm: Remove many unnecessary NULL values

2023-08-08 Thread Ruan Jinjie
The NULL initialization of pointers that are first assigned by kzalloc()
or kunit_kzalloc() is not necessary: if kzalloc() or kunit_kzalloc()
fails, the pointer is set to NULL, and otherwise it points to the
allocated memory. So remove it.
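
The pattern this series touches can be illustrated with a minimal userspace sketch. It assumes calloc() as a stand-in for kzalloc(), and the struct widget and widget_create() names are hypothetical, not taken from any of the patched drivers. The point it shows is that the declaration needs no "= NULL" because the pointer is unconditionally assigned by the allocator before its first use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type, used only for illustration. */
struct widget {
	int id;
};

static struct widget *widget_create(int id)
{
	struct widget *w;	/* no "= NULL": assigned below before any use */

	assert(id >= 0);	/* sketch-only precondition */

	/* calloc() stands in for kzalloc(): zeroed memory, or NULL on failure. */
	w = calloc(1, sizeof(*w));
	if (!w)
		return NULL;

	w->id = id;
	return w;
}
```

The initializer would only matter if some path read the pointer before the allocation call, which is not the case in the functions changed here.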

Ruan Jinjie (7):
  drm/amdkfd: Remove unnecessary NULL values
  drm/amd/display: Remove unnecessary NULL values
  drm/msm: Remove unnecessary NULL values
  drm/radeon: Remove unnecessary NULL values
  drm/virtio: Remove an unnecessary NULL value
  drm/format-helper: Remove unnecessary NULL values
  drm: Remove unnecessary NULL values

 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  |  4 +--
 .../gpu/drm/amd/display/dc/bios/bios_parser.c |  4 +--
 .../drm/amd/display/dc/bios/bios_parser2.c|  4 +--
 drivers/gpu/drm/drm_agpsupport.c  |  2 +-
 drivers/gpu/drm/drm_atomic_uapi.c |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_ipp.c   |  2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  |  2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c  |  2 +-
 drivers/gpu/drm/nouveau/dispnv04/tvnv17.c |  2 +-
 drivers/gpu/drm/radeon/radeon_agp.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_combios.c   |  6 ++--
 .../gpu/drm/radeon/radeon_legacy_encoders.c   |  4 +--
 .../gpu/drm/tests/drm_format_helper_test.c| 28 +--
 drivers/gpu/drm/virtio/virtgpu_submit.c   |  2 +-
 14 files changed, 33 insertions(+), 33 deletions(-)

-- 
2.34.1



[PATCH -next] drm/tegra: Remove two unused function declarations

2023-08-08 Thread Yue Haibing
Commit 776dc3840367 ("drm/tegra: Move subdevice infrastructure to host1x")
removed the implementation but not the declaration.

Signed-off-by: Yue Haibing 
---
 drivers/gpu/drm/tegra/drm.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index f9d18e8cf6ab..ccb5d74fa227 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -120,9 +120,6 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
 int host1x_client_iommu_attach(struct host1x_client *client);
 void host1x_client_iommu_detach(struct host1x_client *client);
 
-int tegra_drm_init(struct tegra_drm *tegra, struct drm_device *drm);
-int tegra_drm_exit(struct tegra_drm *tegra);
-
 void *tegra_drm_alloc(struct tegra_drm *tegra, size_t size, dma_addr_t *iova);
 void tegra_drm_free(struct tegra_drm *tegra, size_t size, void *virt,
dma_addr_t iova);
-- 
2.34.1



Re: [PATCH 4/6] accel/ivpu: Add param ioctl to identify capabilities

2023-08-08 Thread Jeffrey Hugo

On 8/8/2023 2:52 AM, Stanislaw Gruszka wrote:

On Thu, Aug 03, 2023 at 10:37:37AM +0200, Stanislaw Gruszka wrote:

Seems like we might want to decide this now, because if we define an iVPU
specific ioctl as proposed here, but then switch to an Accel-wide mechanism
later, iVPU is going to be stuck supporting both.


For the record, we do not add new ioctl in this patch, we just extend
existing DRM_IOCTL_IVPU_GET_PARAM one.


To avoid confusion, I'll change the topic and commit message
before applying:

accel/ivpu: Extend get_param ioctl to identify capabilities

Add DRM_IVPU_PARAM_CAPABILITIES parameter to get_param ioctl to query
driver capabilities. For now use it to identify metric streamer and
new dma memory range features. Currently upstream version of intel_vpu
does not have those, they will be added in the future.


This is perhaps slightly better.  I didn't find the original one confusing.

Seems like no opinions on pushing this up to the framework.  You did 
point out DRM drivers have driver level ones, so carry-on I guess.


Seems ok to me.  I'd prefer to see some comments in the uapi header 
describing what the DRM_IVPU_CAP_* values mean.  A bit more than "device 
has metric streamer support" - what is metric streamer, and why might 
userspace care?


However, as a uAPI change, is Oded's Ack not required?  I thought that 
was the rule.


-Jeff


linux-next: manual merge of the fbdev tree with the drm-misc tree

2023-08-08 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the fbdev tree got a conflict in:

  drivers/video/fbdev/Kconfig

between commit:

  8c47895b70a2 ("fbdev/mx3fb: Use fbdev I/O helpers")

from the drm-misc tree and commit:

  87ac8777d424 ("fbdev: mx3fb: Remove the driver")

from the fbdev tree.

I fixed it up (the latter removed the lines modified by the former,
so I just used the latter) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell




[PATCH] drm/amd/display: dmub_replay: don't use kernel-doc markers

2023-08-08 Thread Randy Dunlap
These functions don't use kernel-doc notation for comments so
don't begin each comment block with the "/**" kernel-doc marker.

This prevents a bunch of kernel-doc warnings:

dmub_replay.c:37: warning: This comment starts with '/**', but isn't a 
kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
dmub_replay.c:37: warning: missing initial short description on line:
 * Get Replay state from firmware.
dmub_replay.c:66: warning: This comment starts with '/**', but isn't a 
kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
dmub_replay.c:66: warning: missing initial short description on line:
 * Enable/Disable Replay.
dmub_replay.c:116: warning: This comment starts with '/**', but isn't a 
kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
dmub_replay.c:116: warning: missing initial short description on line:
 * Set REPLAY power optimization flags.
dmub_replay.c:134: warning: This comment starts with '/**', but isn't a 
kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
dmub_replay.c:134: warning: missing initial short description on line:
 * Setup Replay by programming phy registers and sending replay hw context 
values to firmware.
and 10 more similar warnings.

Fixes: c7ddc0a800bc ("drm/amd/display: Add Functions to enable Freesync Panel 
Replay")
Signed-off-by: Randy Dunlap 
Reported-by: kernel test robot 
Link: lore.kernel.org/r/202308081459.us5rlyay-...@intel.com
Cc: Bhawanpreet Lakha 
Cc: Harry Wentland 
Cc: Alex Deucher 
Cc: Leo Li 
Cc: Rodrigo Siqueira 
Cc: amd-...@lists.freedesktop.org
Cc: Christian König 
Cc: "Pan, Xinhui" 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/amd/display/dc/dce/dmub_replay.c |   18 ++---
 1 file changed, 9 insertions(+), 9 deletions(-)

diff -- a/drivers/gpu/drm/amd/display/dc/dce/dmub_replay.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_replay.c
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_replay.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_replay.c
@@ -33,7 +33,7 @@
 
 #define MAX_PIPES 6
 
-/**
+/*
  * Get Replay state from firmware.
  */
 static void dmub_replay_get_state(struct dmub_replay *dmub, enum replay_state 
*state, uint8_t panel_inst)
@@ -62,7 +62,7 @@ static void dmub_replay_get_state(struct
}
 }
 
-/**
+/*
  * Enable/Disable Replay.
  */
 static void dmub_replay_enable(struct dmub_replay *dmub, bool enable, bool 
wait, uint8_t panel_inst)
@@ -112,7 +112,7 @@ static void dmub_replay_enable(struct dm
 
 }
 
-/**
+/*
  * Set REPLAY power optimization flags.
  */
 static void dmub_replay_set_power_opt(struct dmub_replay *dmub, unsigned int 
power_opt, uint8_t panel_inst)
@@ -130,7 +130,7 @@ static void dmub_replay_set_power_opt(st
dm_execute_dmub_cmd(dc, &cmd, DM_DMUB_WAIT_TYPE_WAIT);
 }
 
-/**
+/*
  * Setup Replay by programming phy registers and sending replay hw context 
values to firmware.
  */
 static bool dmub_replay_copy_settings(struct dmub_replay *dmub,
@@ -215,7 +215,7 @@ static bool dmub_replay_copy_settings(st
return true;
 }
 
-/**
+/*
  * Set coasting vtotal.
  */
 static void dmub_replay_set_coasting_vtotal(struct dmub_replay *dmub,
@@ -234,7 +234,7 @@ static void dmub_replay_set_coasting_vto
dm_execute_dmub_cmd(dc, &cmd, DM_DMUB_WAIT_TYPE_WAIT);
 }
 
-/**
+/*
  * Get Replay residency from firmware.
  */
 static void dmub_replay_residency(struct dmub_replay *dmub, uint8_t panel_inst,
@@ -267,7 +267,7 @@ static const struct dmub_replay_funcs re
.replay_residency   = dmub_replay_residency,
 };
 
-/**
+/*
  * Construct Replay object.
  */
 static void dmub_replay_construct(struct dmub_replay *replay, struct 
dc_context *ctx)
@@ -276,7 +276,7 @@ static void dmub_replay_construct(struct
replay->funcs = &replay_funcs;
 }
 
-/**
+/*
  * Allocate and initialize Replay object.
  */
 struct dmub_replay *dmub_replay_create(struct dc_context *ctx)
@@ -293,7 +293,7 @@ struct dmub_replay *dmub_replay_create(s
return replay;
 }
 
-/**
+/*
  * Deallocate Replay object.
  */
 void dmub_replay_destroy(struct dmub_replay **dmub)


Re: [PATCH RFC v5 02/10] drm: Introduce solid fill DRM plane property

2023-08-08 Thread Jessica Zhang




On 8/7/2023 6:07 PM, Dmitry Baryshkov wrote:



On 8 August 2023 00:41:07 GMT+03:00, Jessica Zhang  
wrote:



On 8/4/2023 6:27 AM, Dmitry Baryshkov wrote:

On Fri, 28 Jul 2023 at 20:03, Jessica Zhang  wrote:


Document and add support for solid_fill property to drm_plane. In
addition, add support for setting and getting the values for solid_fill.

To enable solid fill planes, userspace must assign a property blob to
the "solid_fill" plane property containing the following information:

struct drm_mode_solid_fill {
  u32 version;
  u32 r, g, b;
};
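
As a rough illustration of how userspace might fill such a blob, here is a minimal sketch. It assumes the four-field layout quoted above, uses uint32_t in place of the kernel's u32/__u32, and assumes that the 32-bit channels span the full 0..0xffffffff range. The names solid_fill_blob, widen8() and solid_fill_from_rgb8() are hypothetical helpers and are not part of the proposed uAPI; creating the property blob and attaching it to the plane is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the struct proposed in this RFC. */
struct solid_fill_blob {
	uint32_t version;
	uint32_t r;
	uint32_t g;
	uint32_t b;
};

/* Hypothetical helper: widen an 8-bit channel to 32 bits by byte replication. */
static uint32_t widen8(uint8_t c)
{
	return (uint32_t)c * 0x01010101u;
}

/* Hypothetical helper: build a version 1 blob from 8-bit RGB values. */
static struct solid_fill_blob solid_fill_from_rgb8(uint8_t r, uint8_t g, uint8_t b)
{
	struct solid_fill_blob blob;

	/* Four 32-bit fields, so no padding is expected. */
	assert(sizeof(blob) == 4 * sizeof(uint32_t));

	blob.version = 1;	/* the only version mentioned in the RFC */
	blob.r = widen8(r);
	blob.g = widen8(g);
	blob.b = widen8(b);
	return blob;
}
```

Byte replication maps 0x00 to 0 and 0xff to 0xffffffff, so full-intensity 8-bit values stay full intensity after widening.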

Signed-off-by: Jessica Zhang 
---
   drivers/gpu/drm/drm_atomic_state_helper.c |  9 +
   drivers/gpu/drm/drm_atomic_uapi.c | 55 
+++
   drivers/gpu/drm/drm_blend.c   | 30 +
   include/drm/drm_blend.h   |  1 +
   include/drm/drm_plane.h   | 35 
   include/uapi/drm/drm_mode.h   | 24 ++
   6 files changed, 154 insertions(+)



[skipped most of the patch]


diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
index 43691058d28f..53c8efa5ad7f 100644
--- a/include/uapi/drm/drm_mode.h
+++ b/include/uapi/drm/drm_mode.h
@@ -259,6 +259,30 @@ struct drm_mode_modeinfo {
  char name[DRM_DISPLAY_MODE_LEN];
   };

+/**
+ * struct drm_mode_solid_fill - User info for solid fill planes
+ *
+ * This is the userspace API solid fill information structure.
+ *
+ * Userspace can enable solid fill planes by assigning the plane "solid_fill"
+ * property to a blob containing a single drm_mode_solid_fill struct populated with an RGB323232
+ * color and setting the pixel source to "SOLID_FILL".
+ *
+ * For information on the plane property, see drm_plane_create_solid_fill_property()
+ *
+ * @version: Version of the blob. Currently, there is only support for version == 1
+ * @r: Red color value of single pixel
+ * @g: Green color value of single pixel
+ * @b: Blue color value of single pixel
+ */
+struct drm_mode_solid_fill {
+   __u32 version;
+   __u32 r;
+   __u32 g;
+   __u32 b;


Another thought about the drm_mode_solid_fill uABI. I still think we
should add alpha here. The reason is the following:

It is true that we have  drm_plane_state::alpha and the plane's
"alpha" property. However it is documented as "the plane-wide opacity
[...] It can be combined with pixel alpha. The pixel values in the
framebuffers are expected to not be pre-multiplied by the global alpha
associated to the plane.".

I can imagine a use case, when a user might want to enable plane-wide
opacity, set "pixel blend mode" to "Coverage" and then switch between
partially opaque framebuffer and partially opaque solid-fill without
touching the plane's alpha value.


Hi Dmitry,

I don't really agree that adding a solid fill alpha would be a good idea. Since 
the intent behind solid fill is to have a single color for the entire plane, I 
think it makes more sense to have solid fill rely on the global plane alpha.

As stated in earlier discussions, I think having both a solid_fill.alpha and a 
plane_state.alpha would be redundant and serve to confuse the user as to which 
one to set.


That depends on the blending mode: in Coverage mode one has independent plane 
and contents alpha values. And I consider alpha value to be a part of the 
colour in the rgba/bgra modes.


Acked -- taking Sebastian's concern into consideration, I think I'll 
have "PIXEL_SOURCE_SOLID_FILL_RGB" and add a separate 
"PIXEL_SOURCE_SOLID_FILL_RGBA".


Thanks,

Jessica Zhang









--
With best wishes
Dmitry




[PATCH v2 09/11] PCI/VGA: Fix a typo to the comments in vga_str_to_iostate() function

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

s/chekcing/checking

While at it, convert the comments to the conventional multi-line style,
and rewrap to fill 78 columns.

Fixes: deb2d2ecd43d ("PCI/GPU: implement VGA arbitration on Linux")
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index d80d92e8012b..9f5cf6a6e3a2 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -79,14 +79,16 @@ static const char *vga_iostate_to_str(unsigned int iostate)
 
 static int vga_str_to_iostate(char *buf, int str_size, unsigned int *io_state)
 {
-   /* we could in theory hand out locks on IO and mem
-* separately to userspace but it can cause deadlocks */
+   /*
+* In theory, we could hand out locks on IO and MEM separately to
+* userspace, but this can cause deadlocks.
+*/
if (strncmp(buf, "none", 4) == 0) {
*io_state = VGA_RSRC_NONE;
return 1;
}
 
-   /* XXX We're not chekcing the str_size! */
+   /* XXX We're not checking the str_size! */
if (strncmp(buf, "io+mem", 6) == 0)
goto both;
else if (strncmp(buf, "io", 2) == 0)
-- 
2.34.1



[PATCH v2 11/11] PCI/VGA: Replace full MIT license text with SPDX identifier

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

Per Documentation/process/license-rules.rst, the SPDX MIT identifier is
equivalent to including the entire MIT license text from
LICENSES/preferred/MIT.

Replace the MIT license text with the equivalent SPDX identifier.

Signed-off-by: Sui Jingfeng 
Reviewed-by: Andi Shyti 
---
 include/linux/vgaarb.h | 23 ++-
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/include/linux/vgaarb.h b/include/linux/vgaarb.h
index 6d5465f8c3f2..97129a1bbb7d 100644
--- a/include/linux/vgaarb.h
+++ b/include/linux/vgaarb.h
@@ -1,3 +1,5 @@
+/* SPDX-License-Identifier: MIT */
+
 /*
  * The VGA aribiter manages VGA space routing and VGA resource decode to
  * allow multiple VGA devices to be used in a system in a safe way.
@@ -5,27 +7,6 @@
  * (C) Copyright 2005 Benjamin Herrenschmidt 
  * (C) Copyright 2007 Paulo R. Zanoni 
  * (C) Copyright 2007, 2009 Tiago Vignatti 
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS
- * IN THE SOFTWARE.
- *
  */
 
 #ifndef LINUX_VGA_H
-- 
2.34.1



[PATCH v2 10/11] PCI/VGA: Tidy up the code and comment format

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

This patch replaces leading spaces with tabs, removes double blank lines,
and fixes various typos. No functional change.

Reviewed-by: Andi Shyti 
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c   | 90 --
 include/linux/vgaarb.h |  4 +-
 2 files changed, 53 insertions(+), 41 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 9f5cf6a6e3a2..a2f6e0e6b634 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -30,14 +30,13 @@
 #include 
 #include 
 #include 
-
 #include 
-
 #include 
 
 static void vga_arbiter_notify_clients(void);
+
 /*
- * We keep a list of all vga devices in the system to speed
+ * We keep a list of all VGA devices in the system to speed
  * up the various operations of the arbiter
  */
 struct vga_device {
@@ -61,7 +60,6 @@ static bool vga_arbiter_used;
 static DEFINE_SPINLOCK(vga_lock);
 static DECLARE_WAIT_QUEUE_HEAD(vga_wait_queue);
 
-
 static const char *vga_iostate_to_str(unsigned int iostate)
 {
/* Ignore VGA_RSRC_IO and VGA_RSRC_MEM */
@@ -195,14 +193,16 @@ int vga_remove_vgacon(struct pci_dev *pdev)
 #endif
 EXPORT_SYMBOL(vga_remove_vgacon);
 
-/* If we don't ever use VGA arb we should avoid
-   turning off anything anywhere due to old X servers getting
-   confused about the boot device not being VGA */
+/*
+ * If we don't ever use vgaarb, we should avoid turning off anything anywhere.
+ * Due to old X servers getting confused about the boot device not being VGA.
+ */
 static void vga_check_first_use(void)
 {
-   /* we should inform all GPUs in the system that
-* VGA arb has occurred and to try and disable resources
-* if they can */
+   /*
+* We should inform all GPUs in the system that
+* vgaarb has occurred and to try and disable resources if they can
+*/
if (!vga_arbiter_used) {
vga_arbiter_used = true;
vga_arbiter_notify_clients();
@@ -218,7 +218,8 @@ static struct vga_device *__vga_tryget(struct vga_device 
*vgadev,
unsigned int pci_bits;
u32 flags = 0;
 
-   /* Account for "normal" resources to lock. If we decode the legacy,
+   /*
+* Account for "normal" resources to lock. If we decode the legacy,
 * counterpart, we need to request it as well
 */
if ((rsrc & VGA_RSRC_NORMAL_IO) &&
@@ -238,7 +239,8 @@ static struct vga_device *__vga_tryget(struct vga_device 
*vgadev,
if (wants == 0)
goto lock_them;
 
-   /* We don't need to request a legacy resource, we just enable
+   /*
+* We don't need to request a legacy resource, we just enable
 * appropriate decoding and go
 */
legacy_wants = wants & VGA_RSRC_LEGACY_MASK;
@@ -254,7 +256,8 @@ static struct vga_device *__vga_tryget(struct vga_device 
*vgadev,
if (vgadev == conflict)
continue;
 
-   /* We have a possible conflict. before we go further, we must
+   /*
+* We have a possible conflict. before we go further, we must
 * check if we sit on the same bus as the conflicting device.
 * if we don't, then we must tie both IO and MEM resources
 * together since there is only a single bit controlling
@@ -265,13 +268,15 @@ static struct vga_device *__vga_tryget(struct vga_device 
*vgadev,
lwants = VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM;
}
 
-   /* Check if the guy has a lock on the resource. If he does,
+   /*
+* Check if the guy has a lock on the resource. If he does,
 * return the conflicting entry
 */
if (conflict->locks & lwants)
return conflict;
 
-   /* Ok, now check if it owns the resource we want.  We can
+   /*
+* Ok, now check if it owns the resource we want.  We can
 * lock resources that are not decoded, therefore a device
 * can own resources it doesn't decode.
 */
@@ -279,14 +284,16 @@ static struct vga_device *__vga_tryget(struct vga_device 
*vgadev,
if (!match)
continue;
 
-   /* looks like he doesn't have a lock, we can steal
+   /*
+* Looks like he doesn't have a lock, we can steal
 * them from him
 */
 
flags = 0;
pci_bits = 0;
 
-   /* If we can't control legacy resources via the bridge, we
+   /*
+* If we can't control legacy resources via the bridge, we
 * also need to disable normal decoding.
 */
if (!conflict->bridge_has_one_vga) {
@@ -313,7 +320,8 @@ static struct vga_device *__vga_tryget(struct 

[PATCH v2 05/11] PCI/VGA: Move the new_state assignment out of the loop

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

In the vga_arbiter_notify_clients() function, the value of the 'new_state'
variable will be 'false' on systems that have more than one VGA device.
The value will be 'true' if there is only one VGA device or no VGA device
at all. Hence, its value is not relevant to the iteration of the loop.

So move the assignment out of the loop. On a system with multiple video
cards, this avoids repeating the same assignment on every iteration.

Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index dc10a262fb5e..6883067a802a 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -1468,22 +1468,20 @@ static void vga_arbiter_notify_clients(void)
 {
struct vga_device *vgadev;
unsigned long flags;
-   uint32_t new_decodes;
-   bool new_state;
+   bool state;
 
if (!vga_arbiter_used)
return;
 
+   state = (vga_count > 1) ? false : true;
+
spin_lock_irqsave(&vga_lock, flags);
list_for_each_entry(vgadev, &vga_list, list) {
-   if (vga_count > 1)
-   new_state = false;
-   else
-   new_state = true;
if (vgadev->set_decode) {
-   new_decodes = vgadev->set_decode(vgadev->pdev,
-new_state);
-   vga_update_device_decodes(vgadev, new_decodes);
+   unsigned int decodes;
+
+   decodes = vgadev->set_decode(vgadev->pdev, state);
+   vga_update_device_decodes(vgadev, decodes);
}
}
spin_unlock_irqrestore(&vga_lock, flags);
-- 
2.34.1



[PATCH v2 08/11] PCI/VGA: Fix a typo to the comment of vga_default

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

Fixes: deb2d2ecd43d ("PCI/GPU: implement VGA arbitration on Linux")
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index a6b8c0def35d..d80d92e8012b 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -99,7 +99,7 @@ static int vga_str_to_iostate(char *buf, int str_size, 
unsigned int *io_state)
return 1;
 }
 
-/* this is only used a cookie - it should not be dereferenced */
+/* This is only used as a cookie, it should not be dereferenced */
 static struct pci_dev *vga_default;
 
 /* Find somebody in our list */
-- 
2.34.1



[PATCH v2 06/11] PCI/VGA: Fix two typos in the comments of pci_notify()

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

1) s/intereted/interested
2) s/hotplugable/hot-pluggable

While at it, convert the comments to the conventional multi-line style,
and rewrap to fill 78 columns.

Fixes: deb2d2ecd43d ("PCI/GPU: implement VGA arbitration on Linux")
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 6883067a802a..811510253553 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -1535,9 +1535,11 @@ static int pci_notify(struct notifier_block *nb, 
unsigned long action,
if (!pci_dev_is_vga(pdev))
return 0;
 
-   /* For now we're only intereted in devices added and removed. I didn't
-* test this thing here, so someone needs to double check for the
-* cases of hotplugable vga cards. */
+   /*
+* For now, we're only interested in devices added and removed.
+* I didn't test this thing here, so someone needs to double check
+* for the cases of hot-pluggable VGA cards.
+*/
if (action == BUS_NOTIFY_ADD_DEVICE)
notify = vga_arbiter_add_pci_device(pdev);
else if (action == BUS_NOTIFY_DEL_DEVICE)
-- 
2.34.1



[PATCH v2 07/11] PCI/VGA: vga_client_register() return -ENODEV on failure, not -1

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

Fixes: 934f992c763a ("drm/i915: Recognise non-VGA display devices")
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 811510253553..a6b8c0def35d 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -964,7 +964,7 @@ EXPORT_SYMBOL(vga_set_legacy_decoding);
  *
  * To unregister just call vga_client_unregister().
  *
- * Returns: 0 on success, -1 on failure
+ * Returns: 0 on success, -ENODEV on failure
  */
 int vga_client_register(struct pci_dev *pdev,
unsigned int (*set_decode)(struct pci_dev *pdev, bool decode))
@@ -975,16 +975,13 @@ int vga_client_register(struct pci_dev *pdev,
 
spin_lock_irqsave(&vga_lock, flags);
vgadev = vgadev_find(pdev);
-   if (!vgadev)
-   goto bail;
-
-   vgadev->set_decode = set_decode;
-   ret = 0;
-
-bail:
+   if (vgadev) {
+   vgadev->set_decode = set_decode;
+   ret = 0;
+   }
spin_unlock_irqrestore(&vga_lock, flags);
-   return ret;
 
+   return ret;
 }
 EXPORT_SYMBOL(vga_client_register);
 
-- 
2.34.1



[PATCH v2 04/11] PCI/VGA: Drop the inline in the vga_update_device_decodes() function.

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

The vga_update_device_decodes() function is not performance-critical, so
drop the inline. This patch also makes the parameter type consistent with
the argument passed in, by changing the second parameter of
vga_update_device_decodes() from 'int' to 'unsigned int' to store the
decodes.

Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 8742a51d450f..dc10a262fb5e 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -860,24 +860,24 @@ static bool vga_arbiter_del_pci_device(struct pci_dev 
*pdev)
return ret;
 }
 
-/* this is called with the lock */
-static inline void vga_update_device_decodes(struct vga_device *vgadev,
-int new_decodes)
+/* This is called with the lock */
+static void vga_update_device_decodes(struct vga_device *vgadev,
+ unsigned int new_decodes)
 {
struct device *dev = &vgadev->pdev->dev;
-   int old_decodes, decodes_removed, decodes_unlocked;
+   unsigned int old_decodes = vgadev->decodes;
+   unsigned int decodes_removed = ~new_decodes & old_decodes;
+   unsigned int decodes_unlocked = vgadev->locks & decodes_removed;
 
-   old_decodes = vgadev->decodes;
-   decodes_removed = ~new_decodes & old_decodes;
-   decodes_unlocked = vgadev->locks & decodes_removed;
vgadev->decodes = new_decodes;
 
-   vgaarb_info(dev, "changed VGA decodes: 
olddecodes=%s,decodes=%s:owns=%s\n",
-   vga_iostate_to_str(old_decodes),
-   vga_iostate_to_str(vgadev->decodes),
-   vga_iostate_to_str(vgadev->owns));
+   vgaarb_info(dev,
+   "VGA decodes changed: olddecodes=%s,decodes=%s:owns=%s\n",
+   vga_iostate_to_str(old_decodes),
+   vga_iostate_to_str(vgadev->decodes),
+   vga_iostate_to_str(vgadev->owns));
 
-   /* if we removed locked decodes, lock count goes to zero, and release */
+   /* If we removed locked decodes, lock count goes to zero, and release */
if (decodes_unlocked) {
if (decodes_unlocked & VGA_RSRC_LEGACY_IO)
vgadev->io_lock_cnt = 0;
-- 
2.34.1



[PATCH v2 03/11] PCI/VGA: Deal with VGA class devices

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

Currently, vgaarb only cares about PCI(e) VGA devices (pdev->class ==
0x0300XX), so we only need to add devices whose class code equals 0x0300
to the arbiter. To stay aligned with the previous behavior, we
intentionally ignore the programming interface byte (the least significant
8 bits).

With this patch applied, unqualified devices are filtered out in the
vga_arb_device_init() function. The current implementation searches all
PCI devices in the system, which is not efficient. This also means that
deleting a PCI device no longer needs to walk the list.

Note that the major contribution of this patch is optimization.
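
As an illustration of the class-code check described above, here is a minimal
userspace sketch. It assumes the 24-bit class layout (base class, sub-class,
programming interface) and the PCI_CLASS_* values from the kernel's pci_ids.h;
class_is_vga() is a hypothetical name used only for this example, not a
function from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/linux/pci_ids.h (base class << 8 | sub-class). */
#define PCI_CLASS_NOT_DEFINED_VGA 0x0001
#define PCI_CLASS_DISPLAY_VGA     0x0300

/*
 * Hypothetical helper: take a 24-bit class code and decide whether it
 * names a VGA device, ignoring the programming interface byte.
 */
static bool class_is_vga(uint32_t class_code)
{
	/* Shifting right by 8 drops the programming interface byte. */
	uint32_t base_sub = class_code >> 8;

	return base_sub == PCI_CLASS_DISPLAY_VGA ||
	       base_sub == PCI_CLASS_NOT_DEFINED_VGA;
}
```

Because the low byte is shifted away, 0x030000 and 0x030001 are treated the
same, while a 3D controller (0x030200) is not considered a VGA device.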

Reviewed-by: Mario Limonciello 
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 68 
 1 file changed, 56 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index c1bc6c983932..8742a51d450f 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -754,10 +754,6 @@ static bool vga_arbiter_add_pci_device(struct pci_dev 
*pdev)
struct pci_dev *bridge;
u16 cmd;
 
-   /* Only deal with VGA class devices */
-   if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
-   return false;
-
/* Allocate structure */
vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
if (vgadev == NULL) {
@@ -1493,6 +1489,42 @@ static void vga_arbiter_notify_clients(void)
spin_unlock_irqrestore(&vga_lock, flags);
 }
 
+/*
+ * The PCI Class Code spec implies that only VGA devices with programming
+ * interface 0x00 can depend on the legacy VGA address range. VGA devices
+ * with programming interface 0x01 are 8514-compatible controllers. Since
+ * VGA devices with programming interface 0x00 is VGA compatible, the 'vga'
+ * suffix here should refer to the VGA-compatible devices after a strict
+ * reading of that specification. But considering the fact that there
+ * probably don't has a 8514-compatible controller that could be used with
+ * upstream kernel anymore, we would like to just ignore the programming
+ * interface byte.
+ *
+ * Besides, there do exist non VGA-compatible display controllers in the
+ * world and hardware vendors may abandon the old VGA standard someday.
+ * The meaning of 'vga' suffix here may change to evolve with time.
+ *
+ * A strict understanding of 'vga' certainly should be VGA-compatible, While
+ * a relaxed understanding of 'vga' would be PCI devices that are able to
+ * display. Currently, we just keep aligned to the previous behavior.
+ * Deal with VGA class devices.
+ */
+static bool pci_dev_is_vga(struct pci_dev *pdev)
+{
+   if ((pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
+   return true;
+
+   /*
+* The PCI_CLASS_NOT_DEFINED_VGA is defined to provide backward
+* compatibility for devices that were built before the class code
+* field was defined.
+*/
+   if ((pdev->class >> 8) == PCI_CLASS_NOT_DEFINED_VGA)
+   return true;
+
+   return false;
+}
+
 static int pci_notify(struct notifier_block *nb, unsigned long action,
  void *data)
 {
@@ -1502,6 +1534,9 @@ static int pci_notify(struct notifier_block *nb, unsigned 
long action,
 
vgaarb_dbg(dev, "%s\n", __func__);
 
+   if (!pci_dev_is_vga(pdev))
+   return 0;
+
/* For now we're only intereted in devices added and removed. I didn't
 * test this thing here, so someone needs to double check for the
 * cases of hotplugable vga cards. */
@@ -1534,8 +1569,8 @@ static struct miscdevice vga_arb_device = {
 
 static int __init vga_arb_device_init(void)
 {
+   struct pci_dev *pdev = NULL;
int rc;
-   struct pci_dev *pdev;
 
rc = misc_register(&vga_arb_device);
if (rc < 0)
@@ -1543,13 +1578,22 @@ static int __init vga_arb_device_init(void)
 
bus_register_notifier(&pci_bus_type, &pci_notifier);
 
-   /* We add all PCI devices satisfying VGA class in the arbiter by
-* default */
-   pdev = NULL;
-   while ((pdev =
-   pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
-  PCI_ANY_ID, pdev)) != NULL)
-   vga_arbiter_add_pci_device(pdev);
+   /*
+* We add all PCI devices satisfying VGA class in the arbiter
+* by default, but we ignore the programming interface byte
+* intentionally.
+*/
+   do {
+   pdev = pci_get_class_masked(PCI_CLASS_DISPLAY_VGA << 8, 
0x00, pdev);
+   if (pdev && pci_dev_is_vga(pdev))
+   vga_arbiter_add_pci_device(pdev);
+   } while (pdev);
+
+   do {
+   pdev = pci_get_class_masked(PCI_CLASS_NOT_DEFINED_VGA << 8, 
0x00, pdev);
+   if (pdev && pci_dev_is_vga(pdev))
+   vga_arbiter_add_pci_device(pdev);
+   } while (pdev);
 

[PATCH v2 02/11] PCI: Add the pci_get_class_masked() helper

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

Because there is no good way to specify the mask used when searching for
devices that conform to a specific PCI class code, add the
pci_get_class_masked() helper. An application that needs to process all
PCI display devices can then achieve its goal as follows:

	pdev = NULL;
	do {
		pdev = pci_get_class_masked(PCI_BASE_CLASS_DISPLAY << 16, 0xFF0000, pdev);
		if (pdev)
			do_something_for_pci_display_device(pdev);
	} while (pdev);

Previously, we could not ignore the Sub-Class code and the Programming
Interface byte when searching.
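
To make the role of the mask concrete, here is a small userspace sketch of the
comparison. It assumes the mask is applied the way the PCI core compares
pci_device_id entries, i.e. only the bits set in the mask take part in the
comparison; class_matches() is a hypothetical name for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the masked class comparison:
 * a device matches when every bit selected by mask is equal in
 * the wanted class and the device's class.
 */
static bool class_matches(uint32_t dev_class, uint32_t want_class,
			  uint32_t mask)
{
	return ((want_class ^ dev_class) & mask) == 0;
}
```

Under this rule, a mask of 0xFF0000 selects on the base class only, 0xFFFF00
selects on base class plus sub-class, and a mask of zero matches every device.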

Signed-off-by: Sui Jingfeng 
---
 drivers/pci/search.c | 30 ++
 include/linux/pci.h  |  7 +++
 2 files changed, 37 insertions(+)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index b4c138a6ec02..f1c15aea868b 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -334,6 +334,36 @@ struct pci_dev *pci_get_device(unsigned int vendor, 
unsigned int device,
 }
 EXPORT_SYMBOL(pci_get_device);
 
+/**
+ * pci_get_class_masked - begin or continue searching for a PCI device by 
class and mask
+ * @class: search for a PCI device with this class designation
+ * @from: Previous PCI device found in search, or %NULL for new search.
+ *
+ * Iterates through the list of known PCI devices.  If a PCI device is
+ * found with a matching @class, the reference count to the device is
+ * incremented and a pointer to its device structure is returned.
+ * Otherwise, %NULL is returned.
+ * A new search is initiated by passing %NULL as the @from argument.
+ * Otherwise if @from is not %NULL, searches continue from next device
+ * on the global list.  The reference count for @from is always decremented
+ * if it is not %NULL.
+ */
+struct pci_dev *pci_get_class_masked(unsigned int class, unsigned int mask,
+struct pci_dev *from)
+{
+   struct pci_device_id id = {
+   .vendor = PCI_ANY_ID,
+   .device = PCI_ANY_ID,
+   .subvendor = PCI_ANY_ID,
+   .subdevice = PCI_ANY_ID,
+   .class_mask = mask,
+   .class = class,
+   };
+
+   return pci_get_dev_by_id(&id, from);
+}
+EXPORT_SYMBOL(pci_get_class_masked);
+
 /**
  * pci_get_class - begin or continue searching for a PCI device by class
  * @class: search for a PCI device with this class designation
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0ff7500772e6..b20e7ba844bf 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1180,6 +1180,9 @@ struct pci_dev *pci_get_slot(struct pci_bus *bus, 
unsigned int devfn);
 struct pci_dev *pci_get_domain_bus_and_slot(int domain, unsigned int bus,
unsigned int devfn);
 struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from);
+struct pci_dev *pci_get_class_masked(unsigned int class, unsigned int mask,
+struct pci_dev *from);
+
 int pci_dev_present(const struct pci_device_id *ids);
 
 int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn,
@@ -1895,6 +1898,10 @@ static inline struct pci_dev *pci_get_class(unsigned int 
class,
struct pci_dev *from)
 { return NULL; }
 
+static inline struct pci_dev *pci_get_class_masked(unsigned int class,
+  unsigned int mask,
+  struct pci_dev *from)
+{ return NULL; }
 
 static inline int pci_dev_present(const struct pci_device_id *ids)
 { return 0; }
-- 
2.34.1



[PATCH v2 01/11] PCI/VGA: Use unsigned type for the io_state variable

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

The io_state variable in the vga_arb_write() function is declared with
unsigned int type, while the vga_str_to_iostate() function takes 'int *'
type. To keep them consistent, this patch changes the type of the third
argument of vga_str_to_iostate() to 'unsigned int *'.

Reviewed-by: Andi Shyti 
Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 5a696078b382..c1bc6c983932 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -77,7 +77,7 @@ static const char *vga_iostate_to_str(unsigned int iostate)
return "none";
 }
 
-static int vga_str_to_iostate(char *buf, int str_size, int *io_state)
+static int vga_str_to_iostate(char *buf, int str_size, unsigned int *io_state)
 {
/* we could in theory hand out locks on IO and mem
 * separately to userspace but it can cause deadlocks */
-- 
2.34.1



[PATCH v2 00/11] Fix typos, comments and copyright

2023-08-08 Thread Sui Jingfeng
From: Sui Jingfeng 

v1:
* Various improvements.
v2:
* More fixes, optimizations and improvements.

Sui Jingfeng (11):
  PCI/VGA: Use unsigned type for the io_state variable
  PCI: Add the pci_get_class_masked() helper
  PCI/VGA: Deal with VGA class devices
  PCI/VGA: Drop the inline in the vga_update_device_decodes() function.
  PCI/VGA: Move the new_state assignment out of the loop
  PCI/VGA: Fix two typos in the comments of pci_notify()
  PCI/VGA: vga_client_register() return -ENODEV on failure, not -1
  PCI/VGA: Fix a typo to the comment of vga_default
  PCI/VGA: Fix a typo to the comments in vga_str_to_iostate() function
  PCI/VGA: Tidy up the code and comment format
  PCI/VGA: Replace full MIT license text with SPDX identifier

 drivers/pci/search.c   |  30 ++
 drivers/pci/vgaarb.c   | 233 +
 include/linux/pci.h|   7 ++
 include/linux/vgaarb.h |  27 +
 4 files changed, 185 insertions(+), 112 deletions(-)


base-commit: 69286072664490a366f3331f9496fe78efaca603
-- 
2.34.1



[PATCH 1/2] drm/panfrost: Add fdinfo support to Panfrost

2023-08-08 Thread Adrián Larumbe
We calculate the amount of time the GPU spends on a job with ktime samples,
and then add it to the cumulative total for the open DRM file, which is
what will be eventually exposed through the 'fdinfo' DRM file descriptor.
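
For illustration only, one way to turn accumulated nanoseconds into an
estimated cycle count is to scale by the clock rate and divide by the number
of nanoseconds per second. The sketch below is not the patch's code: it
assumes a constant clock rate over the measured interval, and ns_to_cycles()
is a hypothetical name.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: estimate cycles from elapsed time at a fixed rate.
 * Whole seconds and the sub-second remainder are scaled separately so the
 * intermediate products stay small for realistic clock rates.
 */
static uint64_t ns_to_cycles(uint64_t elapsed_ns, uint64_t rate_hz)
{
	uint64_t secs = elapsed_ns / NSEC_PER_SEC;
	uint64_t rem_ns = elapsed_ns % NSEC_PER_SEC;

	return secs * rate_hz + (rem_ns * rate_hz) / NSEC_PER_SEC;
}
```

For example, 2 seconds of busy time at 500 MHz corresponds to 10^9 cycles.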

Signed-off-by: Adrián Larumbe 
---
 drivers/gpu/drm/panfrost/panfrost_device.c | 12 
 drivers/gpu/drm/panfrost/panfrost_device.h | 10 +++
 drivers/gpu/drm/panfrost/panfrost_drv.c| 32 +-
 drivers/gpu/drm/panfrost/panfrost_job.c|  6 
 drivers/gpu/drm/panfrost/panfrost_job.h|  3 ++
 5 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
b/drivers/gpu/drm/panfrost/panfrost_device.c
index fa1a086a862b..67a5e894d037 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -401,6 +401,18 @@ void panfrost_device_reset(struct panfrost_device *pfdev)
panfrost_job_enable_interrupts(pfdev);
 }
 
+struct drm_info_gpu panfrost_device_get_counters(struct panfrost_device *pfdev,
+struct panfrost_file_priv 
*panfrost_priv)
+{
+   struct drm_info_gpu gpu_info;
+
+   gpu_info.engine =  panfrost_priv->elapsed_ns;
+   gpu_info.cycles =  panfrost_priv->elapsed_ns * 
clk_get_rate(pfdev->clock);
+   gpu_info.maxfreq =  clk_get_rate(pfdev->clock);
+
+   return gpu_info;
+}
+
 static int panfrost_device_resume(struct device *dev)
 {
struct panfrost_device *pfdev = dev_get_drvdata(dev);
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index b0126b9fbadc..4621a2ece1bb 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -141,6 +141,14 @@ struct panfrost_file_priv {
struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
 
struct panfrost_mmu *mmu;
+
+   uint64_t elapsed_ns;
+};
+
+struct drm_info_gpu {
+   unsigned long long engine;
+   unsigned long long cycles;
+   unsigned int maxfreq;
 };
 
 static inline struct panfrost_device *to_panfrost_device(struct drm_device 
*ddev)
@@ -172,6 +180,8 @@ int panfrost_unstable_ioctl_check(void);
 int panfrost_device_init(struct panfrost_device *pfdev);
 void panfrost_device_fini(struct panfrost_device *pfdev);
 void panfrost_device_reset(struct panfrost_device *pfdev);
+struct drm_info_gpu panfrost_device_get_counters(struct panfrost_device *pfdev,
+struct panfrost_file_priv 
*panfrost_priv);
 
 extern const struct dev_pm_ops panfrost_pm_ops;
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index a2ab99698ca8..65fdc0e4c7cb 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -3,6 +3,7 @@
 /* Copyright 2019 Linaro, Ltd., Rob Herring  */
 /* Copyright 2019 Collabora ltd. */
 
+#include "drm/drm_file.h"
 #include 
 #include 
 #include 
@@ -267,6 +268,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
job->requirements = args->requirements;
job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
job->mmu = file_priv->mmu;
+   job->priv = file_priv;
 
slot = panfrost_job_get_slot(job);
 
@@ -523,7 +525,34 @@ static const struct drm_ioctl_desc 
panfrost_drm_driver_ioctls[] = {
PANFROST_IOCTL(MADVISE, madvise,DRM_RENDER_ALLOW),
 };
 
-DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
+
+static void panfrost_gpu_show_fdinfo(struct panfrost_device *pfdev,
+struct panfrost_file_priv *panfrost_priv,
+struct drm_printer *p)
+{
+   struct drm_info_gpu gpu_info;
+
+   gpu_info = panfrost_device_get_counters(pfdev, panfrost_priv);
+
+   drm_printf(p, "drm-engine-gpu:\t%llu ns\n", gpu_info.engine);
+   drm_printf(p, "drm-cycles-gpu:\t%llu\n", gpu_info.cycles);
+   drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu_info.maxfreq);
+}
+
+static void panfrost_show_fdinfo(struct drm_printer *p, struct drm_file *file)
+{
+   struct drm_device *dev = file->minor->dev;
+   struct panfrost_device *pfdev = dev->dev_private;
+
+   panfrost_gpu_show_fdinfo(pfdev, file->driver_priv, p);
+
+}
+
+static const struct file_operations panfrost_drm_driver_fops = {
+   .owner = THIS_MODULE,
+   DRM_GEM_FOPS,
+   .show_fdinfo = drm_show_fdinfo,
+};
 
 /*
  * Panfrost driver version:
@@ -535,6 +564,7 @@ static const struct drm_driver panfrost_drm_driver = {
.driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
.open   = panfrost_open,
.postclose  = panfrost_postclose,
+   .show_fdinfo= panfrost_show_fdinfo,
.ioctls = panfrost_drm_driver_ioctls,
.num_ioctls = 

[PATCH 2/2] drm/panfrost: Add drm memory stats display through fdinfo

2023-08-08 Thread Adrián Larumbe
For drm_show_memory_stats to produce a more accurate report, provide a new
Panfrost DRM object callback that decides whether an object is resident in
memory or eligible for purging.
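
The idea of reporting residency and purgeability as independent status bits
can be sketched in plain C as follows. The OBJ_* flags and obj_status() are
hypothetical stand-ins for this example and are not the DRM definitions.

```c
#include <stdbool.h>

/* Hypothetical status bits, one per independent property. */
enum obj_status_bits {
	OBJ_RESIDENT  = 1u << 0,
	OBJ_PURGEABLE = 1u << 1,
};

/*
 * Hypothetical helper: build the status mask from two independent facts
 * about an object, whether it has backing pages and whether userspace
 * has marked it as no longer needed.
 */
static unsigned int obj_status(bool has_pages, bool is_purgeable)
{
	unsigned int res = 0;

	if (has_pages)
		res |= OBJ_RESIDENT;
	if (is_purgeable)
		res |= OBJ_PURGEABLE;

	return res;
}
```

An object can be both resident and purgeable at once, which is why the two
properties are combined with a bitwise OR rather than returned as a single
enumerated state.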

Signed-off-by: Adrián Larumbe 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c |  8 ++--
 drivers/gpu/drm/panfrost/panfrost_gem.c | 16 
 drivers/gpu/drm/panfrost/panfrost_gem.h |  1 +
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 65fdc0e4c7cb..46e8e69479c0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -441,11 +441,14 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, 
void *data,
args->retained = drm_gem_shmem_madvise(&bo->base, args->madv);
 
if (args->retained) {
-   if (args->madv == PANFROST_MADV_DONTNEED)
+   if (args->madv == PANFROST_MADV_DONTNEED) {
list_move_tail(&bo->base.madv_list,
   &pfdev->shrinker_list);
-   else if (args->madv == PANFROST_MADV_WILLNEED)
+   bo->is_purgable = true;
+   } else if (args->madv == PANFROST_MADV_WILLNEED) {
list_del_init(&bo->base.madv_list);
+   bo->is_purgable = false;
+   }
}
 
 out_unlock_mappings:
@@ -546,6 +549,7 @@ static void panfrost_show_fdinfo(struct drm_printer *p, 
struct drm_file *file)
 
panfrost_gpu_show_fdinfo(pfdev, file->driver_priv, p);
 
+   drm_show_memory_stats(p, file);
 }
 
 static const struct file_operations panfrost_drm_driver_fops = {
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 3c812fbd126f..80ab1521a14e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -195,6 +195,21 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
return drm_gem_shmem_pin(&bo->base);
 }
 
+static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object 
*obj)
+{
+   struct panfrost_gem_object *bo = to_panfrost_bo(obj);
+   struct panfrost_device *pfdev = obj->dev->dev_private;
+   unsigned int res = 0;
+
+   mutex_lock(&pfdev->shrinker_lock);
+   res |= (bo->is_purgable) ? DRM_GEM_OBJECT_PURGEABLE : 0;
+   mutex_unlock(&pfdev->shrinker_lock);
+
+   res |= (bo->base.pages) ? DRM_GEM_OBJECT_RESIDENT : 0;
+
+   return res;
+}
+
 static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.free = panfrost_gem_free_object,
.open = panfrost_gem_open,
@@ -206,6 +221,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs 
= {
.vmap = drm_gem_shmem_object_vmap,
.vunmap = drm_gem_shmem_object_vunmap,
.mmap = drm_gem_shmem_object_mmap,
+   .status = panfrost_gem_status,
.vm_ops = &drm_gem_shmem_vm_ops,
 };
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h 
b/drivers/gpu/drm/panfrost/panfrost_gem.h
index ad2877eeeccd..e06f7ceb8f73 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -38,6 +38,7 @@ struct panfrost_gem_object {
 
bool noexec :1;
bool is_heap:1;
+   bool is_purgable:1;
 };
 
 struct panfrost_gem_mapping {
-- 
2.41.0

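The status callback in the patch above combines two independent facts
about a buffer object into one bitmask. A minimal user-space sketch of
that idea follows. The enum values, struct sketch_bo and
sketch_bo_status() are hypothetical stand-ins for illustration, not the
DRM or Panfrost definitions, and the locking from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM status bits; values are illustrative. */
enum sketch_obj_status {
	SKETCH_OBJ_RESIDENT  = 1u << 0,
	SKETCH_OBJ_PURGEABLE = 1u << 1,
};

/* Hypothetical, trimmed-down buffer object. */
struct sketch_bo {
	const void *pages;  /* non-NULL once backing pages exist */
	bool is_purgeable;  /* set by a DONTNEED-style madvise */
};

/* Build the status mask from the two independent conditions. */
static unsigned int sketch_bo_status(const struct sketch_bo *bo)
{
	unsigned int res = 0;

	assert(bo != NULL);

	if (bo->is_purgeable)
		res |= SKETCH_OBJ_PURGEABLE;
	if (bo->pages)
		res |= SKETCH_OBJ_RESIDENT;

	return res;
}
```

An object can be both resident and purgeable at once, which is why the
two bits are OR-ed together rather than returned as exclusive states.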


[PATCH 0/2] Add fdinfo support to Panfrost

2023-08-08 Thread Adrián Larumbe
This patch series adds basic fdinfo support to the Panfrost DRM driver.
It displays a series of key:value pairs under /proc/<pid>/fdinfo/<fd>
for render processes that open the Panfrost DRM file.

The pairs contain basic DRM GPU engine and memory region information that
can either be read with cat by a privileged user or accessed with IGT's
gputop utility.

Adrián Larumbe (2):
  drm/panfrost: Add fdinfo support to Panfrost
  drm/panfrost: Add drm memory stats display through fdinfo

 drivers/gpu/drm/panfrost/panfrost_device.c | 12 +++
 drivers/gpu/drm/panfrost/panfrost_device.h | 10 ++
 drivers/gpu/drm/panfrost/panfrost_drv.c| 40 --
 drivers/gpu/drm/panfrost/panfrost_gem.c| 16 +
 drivers/gpu/drm/panfrost/panfrost_gem.h|  1 +
 drivers/gpu/drm/panfrost/panfrost_job.c|  6 
 drivers/gpu/drm/panfrost/panfrost_job.h|  3 ++
 7 files changed, 85 insertions(+), 3 deletions(-)

-- 
2.41.0

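To illustrate how a user-space tool might consume the key:value pairs
described in the cover letter, here is a minimal sketch that pulls the
numeric value out of one fdinfo line. The helper name
sketch_fdinfo_value() is hypothetical, and the line format is assumed to
be "key:<tab>value [unit]" as printed by the drm_printf() calls in patch
1/2. Real tools such as gputop do considerably more.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the numeric value from one "key:\tvalue [unit]" line.
 * Returns 0 on success, -1 if the key does not match or no number follows.
 */
static int sketch_fdinfo_value(const char *line, const char *key,
			       unsigned long long *out)
{
	const char *start;
	char *end;
	size_t klen;
	unsigned long long val;

	assert(line != NULL && key != NULL && out != NULL);

	klen = strlen(key);
	if (strncmp(line, key, klen) != 0 || line[klen] != ':')
		return -1;

	/* strtoull() skips the leading tab or spaces before the digits. */
	start = line + klen + 1;
	val = strtoull(start, &end, 10);
	if (end == start)
		return -1;

	*out = val;
	return 0;
}
```

Checking for the ':' right after the key avoids matching a longer key
that merely shares a prefix with the requested one.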


[PATCH v1] drm/msm/dp: do not reinitialize phy unless retry during link training

2023-08-08 Thread Kuogee Hsieh
Re-initializing the DP PHY with dp_ctrl_reinitialize_mainlink() unlocks
the PLL initially and locks it again at the end of initialization. A
PLL_UNLOCKED interrupt fires during this window if the interrupt mask is
enabled.
However, the DP driver's link training implementation currently
re-initializes the PHY unconditionally during link training, even though
the PHY was already configured in dp_ctrl_enable_mainlink_clocks().

Fix this by re-initializing the PHY only if the previous link training
failed.

[drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x0100 when not busy

Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/30
Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/dp/dp_ctrl.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index a7a5c7e..77a8d93 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1774,13 +1774,6 @@ int dp_ctrl_on_link(struct dp_ctrl *dp_ctrl)
return rc;
 
while (--link_train_max_retries) {
-   rc = dp_ctrl_reinitialize_mainlink(ctrl);
-   if (rc) {
-   DRM_ERROR("Failed to reinitialize mainlink. rc=%d\n",
-   rc);
-   break;
-   }
-
training_step = DP_TRAINING_NONE;
rc = dp_ctrl_setup_main_link(ctrl, &training_step);
if (rc == 0) {
@@ -1832,6 +1825,12 @@ int dp_ctrl_on_link(struct dp_ctrl *dp_ctrl)
/* stop link training before start re training  */
dp_ctrl_clear_training_pattern(ctrl);
}
+
+   rc = dp_ctrl_reinitialize_mainlink(ctrl);
+   if (rc) {
+   DRM_ERROR("Failed to reinitialize mainlink. rc=%d\n", 
rc);
+   break;
+   }
}
 
if (ctrl->link->sink_request & DP_TEST_LINK_PHY_TEST_PATTERN)
-- 
2.7.4

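The control-flow change in the patch above can be sketched in isolation:
training is attempted first, and the PHY is re-initialized only after an
attempt has failed. Everything below (struct sketch_link and the
sketch_* helpers) is a hypothetical model that only counts calls. It is
not the msm DP driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link model that records what the loop did. */
struct sketch_link {
	int fail_first;  /* number of training attempts that should fail */
	int trains;      /* training attempts made so far */
	int reinits;     /* PHY re-initializations performed so far */
};

static int sketch_train(struct sketch_link *l)
{
	l->trains++;
	return l->trains <= l->fail_first ? -1 : 0;
}

static int sketch_reinit(struct sketch_link *l)
{
	l->reinits++;
	return 0;
}

/* Train up to max_attempts times; only a failed attempt triggers a reinit. */
static int sketch_link_on(struct sketch_link *l, int max_attempts)
{
	int rc = -1;

	assert(l != NULL);

	for (int i = 0; i < max_attempts; i++) {
		rc = sketch_train(l);
		if (rc == 0)
			break;

		/* Previous attempt failed: reset the PHY before retrying. */
		if (sketch_reinit(l) != 0)
			break;
	}

	return rc;
}
```

With this ordering, a link that trains on the first attempt never pays
for a re-initialization, so the unlock window described in the commit
message is not opened on the common path.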


Re: [PATCH v4] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-08-08 Thread Dong, Zhanjun

Hi Daniel,


On 2023-08-03 9:03 a.m., Daniel Vetter wrote:

On Thu, 27 Jul 2023 at 22:13, Zhanjun Dong  wrote:


This attempts to avoid circular locking dependency between flush delayed work 
and intel_gt_reset.
Switched from cancel_delayed_work_sync to cancel_delayed_work, the non-sync 
version for reset path, it is safe as the worker has the trylock code to handle 
the lock; Meanwhile keep the sync version for park/fini to ensure the worker is 
not still running during suspend or shutdown.

WARNING: possible circular locking dependency detected
6.4.0-rc1-drmtip_1340-g31e3463b0edb+ #1 Not tainted
--
kms_pipe_crc_ba/6415 is trying to acquire lock:
88813e6cc640 
((work_completion)(&(>timestamp.work)->work)){+.+.}-{0:0}, at: 
__flush_work+0x42/0x530

but task is already holding lock:
88813e6cce90 (>reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 
[i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (>reset.mutex){+.+.}-{3:3}:
 lock_acquire+0xd8/0x2d0
 i915_gem_shrinker_taints_mutex+0x31/0x50 [i915]
 intel_gt_init_reset+0x65/0x80 [i915]
 intel_gt_common_init_early+0xe1/0x170 [i915]
 intel_root_gt_init_early+0x48/0x60 [i915]
 i915_driver_probe+0x671/0xcb0 [i915]
 i915_pci_probe+0xdc/0x210 [i915]
 pci_device_probe+0x95/0x120
 really_probe+0x164/0x3c0
 __driver_probe_device+0x73/0x160
 driver_probe_device+0x19/0xa0
 __driver_attach+0xb6/0x180
 bus_for_each_dev+0x77/0xd0
 bus_add_driver+0x114/0x210
 driver_register+0x5b/0x110
 __pfx_vgem_open+0x3/0x10 [vgem]
 do_one_initcall+0x57/0x270
 do_init_module+0x5f/0x220
 load_module+0x1ca4/0x1f00
 __do_sys_finit_module+0xb4/0x130
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

-> #2 (fs_reclaim){+.+.}-{0:0}:
 lock_acquire+0xd8/0x2d0
 fs_reclaim_acquire+0xac/0xe0
 kmem_cache_alloc+0x32/0x260
 i915_vma_instance+0xb2/0xc60 [i915]
 i915_gem_object_ggtt_pin_ww+0x175/0x370 [i915]
 vm_fault_gtt+0x22d/0xf60 [i915]
 __do_fault+0x2f/0x1d0
 do_pte_missing+0x4a/0xd20
 __handle_mm_fault+0x5b0/0x790
 handle_mm_fault+0xa2/0x230
 do_user_addr_fault+0x3ea/0xa10
 exc_page_fault+0x68/0x1a0
 asm_exc_page_fault+0x26/0x30

-> #1 (>reset.backoff_srcu){}-{0:0}:
 lock_acquire+0xd8/0x2d0
 _intel_gt_reset_lock+0x57/0x330 [i915]
 guc_timestamp_ping+0x35/0x130 [i915]
 process_one_work+0x250/0x510
 worker_thread+0x4f/0x3a0
 kthread+0xff/0x130
 ret_from_fork+0x29/0x50

-> #0 ((work_completion)(&(>timestamp.work)->work)){+.+.}-{0:0}:
 check_prev_add+0x90/0xc60
 __lock_acquire+0x1998/0x2590
 lock_acquire+0xd8/0x2d0
 __flush_work+0x74/0x530
 __cancel_work_timer+0x14f/0x1f0
 intel_guc_submission_reset_prepare+0x81/0x4b0 [i915]
 intel_uc_reset_prepare+0x9c/0x120 [i915]
 reset_prepare+0x21/0x60 [i915]
 intel_gt_reset+0x1dd/0x470 [i915]
 intel_gt_reset_global+0xfb/0x170 [i915]
 intel_gt_handle_error+0x368/0x420 [i915]
 intel_gt_debugfs_reset_store+0x5c/0xc0 [i915]
 i915_wedged_set+0x29/0x40 [i915]
 simple_attr_write_xsigned.constprop.0+0xb4/0x110
 full_proxy_write+0x52/0x80
 vfs_write+0xc5/0x4f0
 ksys_write+0x64/0xe0
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

other info that might help us debug this:
  Chain exists of:
   (work_completion)(&(>timestamp.work)->work) --> fs_reclaim --> 
>reset.mutex
   Possible unsafe locking scenario:
 CPU0CPU1
 
lock(>reset.mutex);
 lock(fs_reclaim);
 lock(>reset.mutex);
lock((work_completion)(&(>timestamp.work)->work));

  *** DEADLOCK ***
  3 locks held by kms_pipe_crc_ba/6415:
   #0: 888101541430 (sb_writers#15){.+.+}-{0:0}, at: ksys_write+0x64/0xe0
   #1: 888136c7eab8 (>mutex){+.+.}-{3:3}, at: 
simple_attr_write_xsigned.constprop.0+0x47/0x110
   #2: 88813e6cce90 (>reset.mutex){+.+.}-{3:3}, at: 
intel_gt_reset+0x19e/0x470 [i915]

v2: Add sync flag to guc_cancel_busyness_worker to ensure reset path calls 
asynchronous cancel.
v3: Add sync flag to intel_guc_submission_disable to ensure reset path calls 
asynchronous cancel.
v4: Set to always sync from __uc_fini_hw path.

Signed-off-by: Zhanjun Dong 
Cc: John Harrison 
Cc: Andi Shyti 
---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c   | 17 ++---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.h   |  2 +-
  drivers/gpu/drm/i915/gt/uc/intel_uc.c   |  4 ++--
  3 files changed, 13 insertions(+), 10 

Re: [PATCH v4] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-08-08 Thread Dong, Zhanjun

Hi Andi,


On 2023-08-03 8:36 a.m., Andi Shyti wrote:

Hi Zhanjun,

On Thu, Jul 27, 2023 at 01:13:23PM -0700, Zhanjun Dong wrote:

This attempts to avoid circular locking dependency between flush delayed work 
and intel_gt_reset.
Switched from cancel_delayed_work_sync to cancel_delayed_work, the non-sync 
version for reset path, it is safe as the worker has the trylock code to handle 
the lock; Meanwhile keep the sync version for park/fini to ensure the worker is 
not still running during suspend or shutdown.


Next time, please wrap the sentences to 65 characters (per the
e-mail netiquette, RFC 1855[1]) or 70-75 characters (per the
kernel guidelines[2]).

[1] https://www.ietf.org/rfc/rfc1855.txt
 chapter "2.1.1 For mail", page 3
[2] https://docs.kernel.org/process/submitting-patches.html
 chapter "The canonical patch format"



Thanks, will be fixed in next revision.



[PATCH v2 14/14] drm/msm/a6xx: Poll for GBIF unhalt status in hw_init

2023-08-08 Thread Konrad Dybcio
Some GPUs - particularly A7xx ones - are really really stubborn and
sometimes take a longer-than-expected time to finish unhalting GBIF.

Note that this is not caused by the request a few lines above.

Poll for the unhalt ack to make sure we're not trying to write bits to
an essentially dead GPU that can't receive data on its end of the bus.
Failing to do this will result in inexplicable GMU timeouts or worse.

This is a rather ugly hack which introduces a whole lot of latency.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 2313620084b6..11cb410e0ac7 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1629,6 +1629,10 @@ static int hw_init(struct msm_gpu *gpu)
mb();
}
 
+   /* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */
+   if (adreno_is_a7xx(adreno_gpu) && a6xx_has_gbif(adreno_gpu))
+   spin_until(!gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK));
+
gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
 
if (adreno_is_a619_holi(adreno_gpu))

-- 
2.41.0



[PATCH v2 13/14] drm/msm/a6xx: Vastly increase HFI timeout

2023-08-08 Thread Konrad Dybcio
A7xx GMUs can be slow as molasses at times.
Increase the timeout to 1 second to match the vendor driver.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c 
b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index cdb3f6e74d3e..e25ddb82a087 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -108,7 +108,7 @@ static int a6xx_hfi_wait_for_ack(struct a6xx_gmu *gmu, u32 
id, u32 seqnum,
 
/* Wait for a response */
ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_GMU2HOST_INTR_INFO, val,
-   val & A6XX_GMU_GMU2HOST_INTR_INFO_MSGQ, 100, 5000);
+   val & A6XX_GMU_GMU2HOST_INTR_INFO_MSGQ, 100, 1000000);
 
if (ret) {
DRM_DEV_ERROR(gmu->dev,

-- 
2.41.0

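To show why the timeout budget matters, here is a small simulation of a
poll-with-timeout loop. It assumes that the last two arguments of the
poll call are a polling interval and a total timeout in microseconds.
struct sketch_dev and sketch_poll_timeout() are hypothetical, and the
"clock" is just a counter advanced by the loop, not a real delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: the ack appears once now_us reaches ready_at_us. */
struct sketch_dev {
	uint64_t now_us;
	uint64_t ready_at_us;
};

static bool sketch_ready(const struct sketch_dev *d)
{
	return d->now_us >= d->ready_at_us;
}

/* Poll every interval_us until ready or until timeout_us has elapsed. */
static int sketch_poll_timeout(struct sketch_dev *d, uint64_t interval_us,
			       uint64_t timeout_us)
{
	uint64_t deadline;

	assert(d != NULL && interval_us > 0);

	deadline = d->now_us + timeout_us;
	for (;;) {
		if (sketch_ready(d))
			return 0;
		if (d->now_us >= deadline)
			return -1;
		d->now_us += interval_us;  /* stands in for sleeping */
	}
}
```

A device that needs 20 ms to respond fails with a 5000 us budget but
succeeds with a 1 second budget, at the cost of a longer worst-case wait
when the device is genuinely dead.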


[PATCH v2 12/14] drm/msm/a6xx: Add A740 support

2023-08-08 Thread Konrad Dybcio
A740 builds upon the A730 IP, shuffling some values and registers
around. More differences will appear when things like BCL are
implemented.

adreno_is_a740_family is added in preparation for more A7xx GPUs; the
logic checks will remain valid for them, resulting in smaller diffs.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 88 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 82 +---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c  | 27 +
 drivers/gpu/drm/msm/adreno/adreno_device.c | 17 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c|  6 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h| 18 +-
 6 files changed, 200 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 17e1e72f5d7d..14ba407e7fe0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -516,6 +516,7 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
struct platform_device *pdev = to_platform_device(gmu->dev);
void __iomem *pdcptr = a6xx_gmu_get_mmio(pdev, "gmu_pdc");
+   u32 seqmem0_drv0_reg = REG_A6XX_RSCC_SEQ_MEM_0_DRV0;
void __iomem *seqptr = NULL;
uint32_t pdc_address_offset;
bool pdc_in_aop = false;
@@ -549,21 +550,26 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
gmu_write_rscc(gmu, REG_A6XX_RSCC_HIDDEN_TCS_CMD0_ADDR, 0);
gmu_write_rscc(gmu, REG_A6XX_RSCC_HIDDEN_TCS_CMD0_DATA + 2, 0);
gmu_write_rscc(gmu, REG_A6XX_RSCC_HIDDEN_TCS_CMD0_ADDR + 2, 0);
-   gmu_write_rscc(gmu, REG_A6XX_RSCC_HIDDEN_TCS_CMD0_DATA + 4, 0x8000);
+   gmu_write_rscc(gmu, REG_A6XX_RSCC_HIDDEN_TCS_CMD0_DATA + 4,
+  adreno_is_a740_family(adreno_gpu) ? 0x8021 : 
0x8000);
gmu_write_rscc(gmu, REG_A6XX_RSCC_HIDDEN_TCS_CMD0_ADDR + 4, 0);
gmu_write_rscc(gmu, REG_A6XX_RSCC_OVERRIDE_START_ADDR, 0);
gmu_write_rscc(gmu, REG_A6XX_RSCC_PDC_SEQ_START_ADDR, 0x4520);
gmu_write_rscc(gmu, REG_A6XX_RSCC_PDC_MATCH_VALUE_LO, 0x4510);
gmu_write_rscc(gmu, REG_A6XX_RSCC_PDC_MATCH_VALUE_HI, 0x4514);
 
+   /* The second spin of A7xx GPUs messed with some register offsets.. */
+   if (adreno_is_a740_family(adreno_gpu))
+   seqmem0_drv0_reg = REG_A7XX_RSCC_SEQ_MEM_0_DRV0_A740;
+
/* Load RSC sequencer uCode for sleep and wakeup */
if (adreno_is_a650_family(adreno_gpu) ||
adreno_is_a7xx(adreno_gpu)) {
-   gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0, 0xeaaae5a0);
-   gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 1, 
0xe1a1ebab);
-   gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 2, 
0xa2e0a581);
-   gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 3, 
0xecac82e2);
-   gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 4, 
0x0020edad);
+   gmu_write_rscc(gmu, seqmem0_drv0_reg, 0xeaaae5a0);
+   gmu_write_rscc(gmu, seqmem0_drv0_reg + 1, 0xe1a1ebab);
+   gmu_write_rscc(gmu, seqmem0_drv0_reg + 2, 0xa2e0a581);
+   gmu_write_rscc(gmu, seqmem0_drv0_reg + 3, 0xecac82e2);
+   gmu_write_rscc(gmu, seqmem0_drv0_reg + 4, 0x0020edad);
} else {
gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0, 0xa7a506a0);
gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 1, 
0xa1e6a6e7);
@@ -767,8 +773,8 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned 
int state)
struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
u32 fence_range_lower, fence_range_upper;
+   u32 chipid, chipid_min = 0;
int ret;
-   u32 chipid;
 
/* Vote veto for FAL10 */
if (adreno_is_a650_family(adreno_gpu) || adreno_is_a7xx(adreno_gpu)) {
@@ -827,16 +833,37 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
unsigned int state)
 */
gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
 
-   /*
-* Note that the GMU has a slightly different layout for
-* chip_id, for whatever reason, so a bit of massaging
-* is needed.  The upper 16b are the same, but minor and
-* patchid are packed in four bits each with the lower
-* 8b unused:
-*/
-   chipid  = adreno_gpu->chip_id & 0x;
-   chipid |= (adreno_gpu->chip_id << 4) & 0xf000; /* minor */
-   chipid |= (adreno_gpu->chip_id << 8) & 0x0f00; /* patchid */
+   /* NOTE: A730 may also fall in this if-condition with a future GMU fw 
update. */
+   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
+   /* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 */
+

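The comment removed by the hunk above describes how the chip ID is
repacked for the GMU: the upper 16 bits are kept, while minor and
patchid are squeezed into four bits each and the lower 8 bits are left
unused. A minimal sketch of that packing follows. The 0xffff0000 mask
and the byte-wise input layout (minor in bits 15:8, patchid in bits 7:0)
are assumptions inferred from the comment and the shifts, since the
constants are damaged in this archive copy. sketch_gmu_chipid() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Repack a chip ID into the GMU layout: upper 16 bits unchanged,
 * minor in bits 15:12, patchid in bits 11:8, bits 7:0 unused.
 */
static uint32_t sketch_gmu_chipid(uint32_t chip_id)
{
	uint32_t out;

	/*
	 * Sketch-only precondition: minor and patchid must each fit in a
	 * nibble. The quoted code masks silently instead of asserting.
	 */
	assert((chip_id & 0x0000f0f0u) == 0);

	out  = chip_id & 0xffff0000u;
	out |= (chip_id << 4) & 0x0000f000u;  /* minor */
	out |= (chip_id << 8) & 0x00000f00u;  /* patchid */

	return out;
}
```

For example, an input of 0x06050102 (minor 1, patchid 2) packs to
0x06051200 under these assumptions.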
[PATCH v2 11/14] drm/msm/a6xx: Add A730 support

2023-08-08 Thread Konrad Dybcio
Add support for Adreno 730, also known as GEN7_0_x, found on SM8450.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 126 -
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c  |  61 ++
 drivers/gpu/drm/msm/adreno/adreno_device.c |  13 +++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|   2 +-
 4 files changed, 198 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 61ce8d053355..522043883290 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -837,6 +837,63 @@ const struct adreno_reglist a690_hwcg[] = {
{}
 };
 
+const struct adreno_reglist a730_hwcg[] = {
+   { REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x0222 },
+   { REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0202 },
+   { REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xf3cf },
+   { REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0080 },
+   { REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x2220 },
+   { REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0022 },
+   { REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007 },
+   { REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x },
+   { REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001 },
+   { REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x0004 },
+   { REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002 },
+   { REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x },
+   { REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x0100 },
+   { REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x2220 },
+   { REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x44000f00 },
+   { REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x25222022 },
+   { REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x0055 },
+   { REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x0011 },
+   { REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00440044 },
+   { REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x0422 },
+   { REG_A7XX_RBBM_CLOCK_MODE2_GRAS, 0x0222 },
+   { REG_A7XX_RBBM_CLOCK_MODE_BV_GRAS, 0x0022 },
+   { REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x0223 },
+   { REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x },
+   { REG_A7XX_RBBM_CLOCK_MODE_BV_GPC, 0x0022 },
+   { REG_A7XX_RBBM_CLOCK_MODE_BV_VFD, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004 },
+   { REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x },
+   { REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x4000 },
+   { REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x0200 },
+   { REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x },
+   { REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x },
+   { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x },
+   { REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x },
+   { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x0002 },
+   { REG_A7XX_RBBM_CLOCK_MODE_BV_LRZ, 0x5552 },
+   { REG_A7XX_RBBM_CLOCK_MODE_CP, 0x0223 },
+   { REG_A6XX_RBBM_CLOCK_CNTL, 0x8aa8aa82 },
+   { REG_A6XX_RBBM_ISDB_CNT, 0x0182 },
+   { REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x },
+   { REG_A6XX_RBBM_SP_HYST_CNT, 0x },
+   { REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x0222 },
+   { REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x0111 },
+   { REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x0555 },
+   {},
+};
+
 static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -1048,6 +1105,59 @@ static const u32 a690_protect[] = {
A6XX_PROTECT_NORDWR(0x11c00, 0x0), /*note: infiite range */
 };
 
+static const u32 a730_protect[] = {
+   A6XX_PROTECT_RDONLY(0x0, 0x04ff),
+   A6XX_PROTECT_RDONLY(0x0050b, 0x0058),
+   A6XX_PROTECT_NORDWR(0x0050e, 0x),
+   A6XX_PROTECT_NORDWR(0x00510, 0x),
+   A6XX_PROTECT_NORDWR(0x00534, 0x),
+   A6XX_PROTECT_RDONLY(0x005fb, 0x009d),
+   A6XX_PROTECT_NORDWR(0x00699, 0x01e9),
+   A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
+   A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
+   /* 0x008d0-0x008dd are unprotected on purpose for tools like perfetto */
+   A6XX_PROTECT_RDONLY(0x008de, 0x0154),
+   A6XX_PROTECT_NORDWR(0x00900, 0x004d),
+   A6XX_PROTECT_NORDWR(0x0098d, 0x00b2),
+   A6XX_PROTECT_NORDWR(0x00a41, 0x01be),
+   A6XX_PROTECT_NORDWR(0x00df0, 0x0001),
+   A6XX_PROTECT_NORDWR(0x00e01, 0x),
+   A6XX_PROTECT_NORDWR(0x00e07, 0x0008),
+   A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
+   A6XX_PROTECT_RDONLY(0x03cc4, 

[PATCH v2 10/14] drm/msm/a6xx: Mostly implement A7xx gpu_state

2023-08-08 Thread Konrad Dybcio
Provide the necessary alterations to mostly support state dumping on
A7xx. Newer GPUs will probably require more changes here. Crashdumper
and debugbus remain untested.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 52 +++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 61 -
 2 files changed, 110 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 4e5d650578c6..18be2d3bde09 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -948,6 +948,18 @@ static u32 a6xx_get_cp_roq_size(struct msm_gpu *gpu)
return gpu_read(gpu, REG_A6XX_CP_ROQ_THRESHOLDS_2) >> 14;
 }
 
+static u32 a7xx_get_cp_roq_size(struct msm_gpu *gpu)
+{
+   /*
+* The value at CP_ROQ_THRESHOLDS_2[20:31] is in 4dword units.
+* That register however is not directly accessible from APSS on A7xx.
+* Program the SQE_UCODE_DBG_ADDR with offset=0x70d3 and read the value.
+*/
+   gpu_write(gpu, REG_A6XX_CP_SQE_UCODE_DBG_ADDR, 0x70d3);
+
+   return 4 * (gpu_read(gpu, REG_A6XX_CP_SQE_UCODE_DBG_DATA) >> 20);
+}
+
 /* Read a block of data from an indexed register pair */
 static void a6xx_get_indexed_regs(struct msm_gpu *gpu,
struct a6xx_gpu_state *a6xx_state,
@@ -1019,8 +1031,40 @@ static void a6xx_get_indexed_registers(struct msm_gpu 
*gpu,
 
/* Restore the size in the hardware */
gpu_write(gpu, REG_A6XX_CP_MEM_POOL_SIZE, mempool_size);
+}
+
+static void a7xx_get_indexed_registers(struct msm_gpu *gpu,
+   struct a6xx_gpu_state *a6xx_state)
+{
+   int i, indexed_count, mempool_count;
+
+   indexed_count = ARRAY_SIZE(a7xx_indexed_reglist);
+   mempool_count = ARRAY_SIZE(a7xx_cp_bv_mempool_indexed);
 
-   a6xx_state->nr_indexed_regs = count;
+   a6xx_state->indexed_regs = state_kcalloc(a6xx_state,
+   indexed_count + mempool_count,
+   sizeof(*a6xx_state->indexed_regs));
+   if (!a6xx_state->indexed_regs)
+   return;
+
+   a6xx_state->nr_indexed_regs = indexed_count + mempool_count;
+
+   /* First read the common regs */
+   for (i = 0; i < indexed_count; i++)
+   a6xx_get_indexed_regs(gpu, a6xx_state, &a7xx_indexed_reglist[i],
+   &a6xx_state->indexed_regs[i]);
+
+   gpu_rmw(gpu, REG_A6XX_CP_CHICKEN_DBG, 0, BIT(2));
+   gpu_rmw(gpu, REG_A7XX_CP_BV_CHICKEN_DBG, 0, BIT(2));
+
+   /* Get the contents of the CP_BV mempool */
+   for (i = 0; i < mempool_count; i++)
+   a6xx_get_indexed_regs(gpu, a6xx_state, &a7xx_cp_bv_mempool_indexed[i],
+   &a6xx_state->indexed_regs[indexed_count + i]);
+
+   gpu_rmw(gpu, REG_A6XX_CP_CHICKEN_DBG, BIT(2), 0);
+   gpu_rmw(gpu, REG_A7XX_CP_BV_CHICKEN_DBG, BIT(2), 0);
+   return;
 }
 
 struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu)
@@ -1056,6 +1100,12 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu 
*gpu)
return &a6xx_state->base;
 
/* Get the banks of indexed registers */
+   if (adreno_is_a7xx(adreno_gpu)) {
+   a7xx_get_indexed_registers(gpu, a6xx_state);
+   /* Further codeflow is untested on A7xx. */
+   return &a6xx_state->base;
+   }
+
a6xx_get_indexed_registers(gpu, a6xx_state);
 
/*
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
index e788ed72eb0d..8d7e6f26480a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
@@ -338,6 +338,28 @@ static const struct a6xx_registers a6xx_vbif_reglist =
 static const struct a6xx_registers a6xx_gbif_reglist =
REGS(a6xx_gbif_registers, 0, 0);
 
+static const u32 a7xx_ahb_registers[] = {
+   /* RBBM_STATUS */
+   0x210, 0x210,
+   /* RBBM_STATUS2-3 */
+   0x212, 0x213,
+};
+
+static const u32 a7xx_gbif_registers[] = {
+   0x3c00, 0x3c0b,
+   0x3c40, 0x3c42,
+   0x3c45, 0x3c47,
+   0x3c49, 0x3c4a,
+   0x3cc0, 0x3cd1,
+};
+
+static const struct a6xx_registers a7xx_ahb_reglist[] = {
+   REGS(a7xx_ahb_registers, 0, 0),
+};
+
+static const struct a6xx_registers a7xx_gbif_reglist =
+   REGS(a7xx_gbif_registers, 0, 0);
+
 static const u32 a6xx_gmu_gx_registers[] = {
/* GMU GX */
0x, 0x, 0x0010, 0x0013, 0x0016, 0x0016, 0x0018, 0x001b,
@@ -384,14 +406,17 @@ static const struct a6xx_registers a6xx_gmu_reglist[] = {
 };
 
 static u32 a6xx_get_cp_roq_size(struct msm_gpu *gpu);
+static u32 a7xx_get_cp_roq_size(struct msm_gpu *gpu);
 
-static struct a6xx_indexed_registers {
+struct a6xx_indexed_registers {

[PATCH v2 09/14] drm/msm/a6xx: Send ACD state to QMP at GMU resume

2023-08-08 Thread Konrad Dybcio
The QMP mailbox expects to be notified of the ACD (Adaptive Clock
Distribution) state. Get a handle to the mailbox at probe time and
poke it at GMU resume.

Since we don't fully support ACD yet, hardcode the message to "val: 0"
(state = disabled).

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 21 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  3 +++
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 75984260898e..17e1e72f5d7d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -980,11 +980,13 @@ static void a6xx_gmu_set_initial_bw(struct msm_gpu *gpu, 
struct a6xx_gmu *gmu)
dev_pm_opp_put(gpu_opp);
 }
 
+#define GMU_ACD_STATE_MSG_LEN  36
 int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 {
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
struct msm_gpu *gpu = &adreno_gpu->base;
struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+   char buf[GMU_ACD_STATE_MSG_LEN];
int status, ret;
 
if (WARN(!gmu->initialized, "The GMU is not set up yet\n"))
@@ -992,6 +994,18 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 
gmu->hung = false;
 
+   /* Notify AOSS about the ACD state (unimplemented for now => disable 
it) */
+   if (!IS_ERR(gmu->qmp)) {
+   ret = snprintf(buf, sizeof(buf),
+  "{class: gpu, res: acd, val: %d}",
+  0 /* Hardcode ACD to be disabled for now */);
+   WARN_ON(ret >= GMU_ACD_STATE_MSG_LEN);
+
+   ret = qmp_send(gmu->qmp, buf, sizeof(buf));
+   if (ret)
+   dev_err(gmu->dev, "failed to send GPU ACD state\n");
+   }
+
/* Turn on the resources */
pm_runtime_get_sync(gmu->dev);
 
@@ -1744,6 +1758,10 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
goto detach_cxpd;
}
 
+   gmu->qmp = qmp_get(gmu->dev);
+   if (IS_ERR(gmu->qmp) && adreno_is_a7xx(adreno_gpu))
+   return PTR_ERR(gmu->qmp);
+
init_completion(&gmu->pd_gate);
complete_all(&gmu->pd_gate);
gmu->pd_nb.notifier_call = cxpd_notifier_cb;
@@ -1767,6 +1785,9 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
return 0;
 
+   if (!IS_ERR_OR_NULL(gmu->qmp))
+   qmp_put(gmu->qmp);
+
 detach_cxpd:
dev_pm_domain_detach(gmu->cxpd, false);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 236f81a43caa..592b296aab22 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "msm_drv.h"
 #include "a6xx_hfi.h"
 
@@ -96,6 +97,8 @@ struct a6xx_gmu {
/* For power domain callback */
struct notifier_block pd_nb;
struct completion pd_gate;
+
+   struct qmp *qmp;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)

-- 
2.41.0
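
A minimal userspace sketch of the message formatting used above, assuming only standard C. The helper name format_acd_msg() is hypothetical and not part of the patch; it mirrors the snprintf() call and the truncation check that the patch expresses with WARN_ON(). The message "{class: gpu, res: acd, val: 0}" is 30 characters long, so it fits in the 36-byte buffer together with its terminating NUL.

```c
#include <stdio.h>

#define GMU_ACD_STATE_MSG_LEN 36

/*
 * Hypothetical helper (not from the patch): format the ACD state message
 * into buf. Returns 0 on success, -1 if the output was truncated or
 * snprintf() reported an error.
 */
static int format_acd_msg(char *buf, size_t len, int acd_enabled)
{
	int ret = snprintf(buf, len, "{class: gpu, res: acd, val: %d}",
			   acd_enabled);

	/* snprintf() returns the length it wanted to write, excluding NUL */
	if (ret < 0 || (size_t)ret >= len)
		return -1;

	return 0;
}
```

Calling format_acd_msg(buf, sizeof(buf), 0) with a GMU_ACD_STATE_MSG_LEN-sized buffer succeeds, while a deliberately small buffer reports truncation instead of silently sending a cut-off message.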



[PATCH v2 08/14] drm/msm/a6xx: Add skeleton A7xx support

2023-08-08 Thread Konrad Dybcio
A7xx GPUs are - from kernel's POV anyway - basically another generation
of A6xx. They build upon the A650/A660_family advancements, skipping some
writes (presumably more values are preset correctly on reset), adding
some new ones and changing others.

One notable difference is the introduction of a second shadow, called BV.
To handle this with the current code, allocate it right after the current
RPTR shadow.

BV handling and .submit are mostly based on Jonathan Marek's work.

All A7xx GPUs are assumed to have a GMU.
A702 is not an A7xx-class GPU, it's a weird forked A610.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  95 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 451 
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  12 +
 drivers/gpu/drm/msm/msm_ringbuffer.h|   2 +
 5 files changed, 481 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 03fa89bf3e4b..75984260898e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -200,9 +200,10 @@ int a6xx_gmu_wait_for_idle(struct a6xx_gmu *gmu)
 
 static int a6xx_gmu_start(struct a6xx_gmu *gmu)
 {
+   struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
+   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+   u32 mask, reset_val, val;
int ret;
-   u32 val;
-   u32 mask, reset_val;
 
val = gmu_read(gmu, REG_A6XX_GMU_CM3_DTCM_START + 0xff8);
if (val <= 0x20010004) {
@@ -218,7 +219,11 @@ static int a6xx_gmu_start(struct a6xx_gmu *gmu)
/* Set the log wptr index
 * note: downstream saves the value in poweroff and restores it here
 */
-   gmu_write(gmu, REG_A6XX_GPU_GMU_CX_GMU_PWR_COL_CP_RESP, 0);
+   if (adreno_is_a7xx(adreno_gpu))
+   gmu_write(gmu, REG_A6XX_GMU_GENERAL_9, 0);
+   else
+   gmu_write(gmu, REG_A6XX_GPU_GMU_CX_GMU_PWR_COL_CP_RESP, 0);
+
 
gmu_write(gmu, REG_A6XX_GMU_CM3_SYSRESET, 0);
 
@@ -518,7 +523,9 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
if (IS_ERR(pdcptr))
goto err;
 
-   if (adreno_is_a650(adreno_gpu) || adreno_is_a660_family(adreno_gpu))
+   if (adreno_is_a650(adreno_gpu) ||
+   adreno_is_a660_family(adreno_gpu) ||
+   adreno_is_a7xx(adreno_gpu))
pdc_in_aop = true;
else if (adreno_is_a618(adreno_gpu) || 
adreno_is_a640_family(adreno_gpu))
pdc_address_offset = 0x30090;
@@ -550,7 +557,8 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
gmu_write_rscc(gmu, REG_A6XX_RSCC_PDC_MATCH_VALUE_HI, 0x4514);
 
/* Load RSC sequencer uCode for sleep and wakeup */
-   if (adreno_is_a650_family(adreno_gpu)) {
+   if (adreno_is_a650_family(adreno_gpu) ||
+   adreno_is_a7xx(adreno_gpu)) {
gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0, 0xeaaae5a0);
gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 1, 
0xe1a1ebab);
gmu_write_rscc(gmu, REG_A6XX_RSCC_SEQ_MEM_0_DRV0 + 2, 
0xa2e0a581);
@@ -635,11 +643,18 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
 /* Set up the idle state for the GMU */
 static void a6xx_gmu_power_config(struct a6xx_gmu *gmu)
 {
+   struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
+   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+
/* Disable GMU WB/RB buffer */
gmu_write(gmu, REG_A6XX_GMU_SYS_BUS_CONFIG, 0x1);
gmu_write(gmu, REG_A6XX_GMU_ICACHE_CONFIG, 0x1);
gmu_write(gmu, REG_A6XX_GMU_DCACHE_CONFIG, 0x1);
 
+   /* A7xx knows better by default! */
+   if (adreno_is_a7xx(adreno_gpu))
+   return;
+
gmu_write(gmu, REG_A6XX_GMU_PWR_COL_INTER_FRAME_CTRL, 0x9c40400);
 
switch (gmu->idle_level) {
@@ -702,7 +717,7 @@ static int a6xx_gmu_fw_load(struct a6xx_gmu *gmu)
u32 itcm_base = 0x;
u32 dtcm_base = 0x0004;
 
-   if (adreno_is_a650_family(adreno_gpu))
+   if (adreno_is_a650_family(adreno_gpu) || adreno_is_a7xx(adreno_gpu))
dtcm_base = 0x10004000;
 
if (gmu->legacy) {
@@ -751,14 +766,22 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
unsigned int state)
 {
struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+   u32 fence_range_lower, fence_range_upper;
int ret;
u32 chipid;
 
-   if (adreno_is_a650_family(adreno_gpu)) {
+   /* Vote veto for FAL10 */
+   if (adreno_is_a650_family(adreno_gpu) || adreno_is_a7xx(adreno_gpu)) {
gmu_write(gmu, REG_A6XX_GPU_GMU_CX_GMU_CX_FALNEXT_INTF, 1);
gmu_write(gmu, 

[PATCH v2 07/14] drm/msm/a6xx: Bail out early if setting GPU OOB fails

2023-08-08 Thread Konrad Dybcio
If the GMU can't guarantee the required resources are up, trying to
bring up the GPU is a lost cause. Return early if setting GPU OOB
fails.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 6dd6d72bcd86..d4e85e24002f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1201,7 +1201,9 @@ static int hw_init(struct msm_gpu *gpu)
 
if (!adreno_has_gmu_wrapper(adreno_gpu)) {
/* Make sure the GMU keeps the GPU on while we set it up */
-   a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
+   ret = a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
+   if (ret)
+   return ret;
}
 
/* Clear GBIF halt in case GX domain was not collapsed */

-- 
2.41.0



[PATCH v2 05/14] drm/msm/a6xx: Introduce a6xx_llc_read

2023-08-08 Thread Konrad Dybcio
Add a helper that does exactly what it says on the can; it'll be
required for A7xx.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 1ed202c4e497..0fef92f71c4e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1740,6 +1740,11 @@ static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 
reg, u32 mask, u32 or)
return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
 }
 
+static u32 a6xx_llc_read(struct a6xx_gpu *a6xx_gpu, u32 reg)
+{
+   return msm_readl(a6xx_gpu->llc_mmio + (reg << 2));
+}
+
 static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)
 {
msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));

-- 
2.41.0
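
A small userspace sketch of the addressing convention the accessors rely on: `reg` is a dword (32-bit) register index, so `reg << 2` turns it into a byte offset from the mapped base. The `fake_llc` type and both function names are hypothetical stand-ins, and a plain byte array replaces the real MMIO mapping, so this only illustrates the offset arithmetic, not actual device access.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the mapped LLC register window */
struct fake_llc {
	uint8_t mmio[64];
};

/* Read the 32-bit register at dword index reg (byte offset reg * 4) */
static uint32_t fake_llc_read(const struct fake_llc *llc, uint32_t reg)
{
	uint32_t val;

	memcpy(&val, llc->mmio + ((size_t)reg << 2), sizeof(val));
	return val;
}

/* Write value to the 32-bit register at dword index reg */
static void fake_llc_write(struct fake_llc *llc, uint32_t reg, uint32_t value)
{
	memcpy(llc->mmio + ((size_t)reg << 2), &value, sizeof(value));
}
```

Writing to register index 3 lands at byte offset 12 of the window, which is the same mapping the in-kernel helpers apply to llc_mmio.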



[PATCH v2 06/14] drm/msm/a6xx: Move LLC accessors to the common header

2023-08-08 Thread Konrad Dybcio
Move these wrappers in preparation for use in a6xx_gmu.c

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 15 ---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 15 +++
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 0fef92f71c4e..6dd6d72bcd86 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1735,21 +1735,6 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
return IRQ_HANDLED;
 }
 
-static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or)
-{
-   return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
-}
-
-static u32 a6xx_llc_read(struct a6xx_gpu *a6xx_gpu, u32 reg)
-{
-   return msm_readl(a6xx_gpu->llc_mmio + (reg << 2));
-}
-
-static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)
-{
-   msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
-}
-
 static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
 {
llcc_slice_deactivate(a6xx_gpu->llc_slice);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index ab66d281828c..34822b080759 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -62,6 +62,21 @@ static inline bool a6xx_has_gbif(struct adreno_gpu *gpu)
return true;
 }
 
+static inline void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, 
u32 or)
+{
+   return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
+}
+
+static inline u32 a6xx_llc_read(struct a6xx_gpu *a6xx_gpu, u32 reg)
+{
+   return msm_readl(a6xx_gpu->llc_mmio + (reg << 2));
+}
+
+static inline void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 
value)
+{
+   msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
+}
+
 #define shadowptr(_a6xx_gpu, _ring) ((_a6xx_gpu)->shadow_iova + \
((_ring)->id * sizeof(uint32_t)))
 

-- 
2.41.0



[PATCH v2 04/14] drm/msm/a6xx: Add missing regs for A7XX

2023-08-08 Thread Konrad Dybcio
Add some missing definitions required for A7 support.

This may be substituted with a mesa header sync.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx.xml.h | 9 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h | 8 
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx.xml.h 
b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
index 1c051535fd4a..863b5e3b0e67 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx.xml.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
@@ -1114,6 +1114,12 @@ enum a6xx_tex_type {
 #define REG_A6XX_CP_MISC_CNTL  0x0840
 
 #define REG_A6XX_CP_APRIV_CNTL 0x0844
+#define A6XX_CP_APRIV_CNTL_CDWRITE 0x0040
+#define A6XX_CP_APRIV_CNTL_CDREAD  0x0020
+#define A6XX_CP_APRIV_CNTL_RBRPWB  0x0008
+#define A6XX_CP_APRIV_CNTL_RBPRIVLEVEL 0x0004
+#define A6XX_CP_APRIV_CNTL_RBFETCH 0x0002
+#define A6XX_CP_APRIV_CNTL_ICACHE  0x0001
 
 #define REG_A6XX_CP_PREEMPT_THRESHOLD  0x08c0
 
@@ -1939,6 +1945,8 @@ static inline uint32_t 
REG_A6XX_RBBM_PERFCTR_RBBM_SEL(uint32_t i0) { return 0x00
 
 #define REG_A6XX_RBBM_CLOCK_HYST_TEX_FCHE  0x0122
 
+#define REG_A7XX_RBBM_CLOCK_HYST2_VFD  0x012f
+
 #define REG_A6XX_RBBM_LPAC_GBIF_CLIENT_QOS_CNTL 0x05ff
 
 #define REG_A6XX_DBGC_CFG_DBGBUS_SEL_A 0x0600
@@ -8252,5 +8260,6 @@ static inline uint32_t 
A6XX_CX_DBGC_CFG_DBGBUS_BYTEL_1_BYTEL15(uint32_t val)
 
 #define REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1   0x0002
 
+#define REG_A7XX_CX_MISC_TCM_RET_CNTL  0x0039
 
 #endif /* A6XX_XML */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h
index fcd9eb53baf8..5b66efafc901 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h
@@ -360,6 +360,12 @@ static inline uint32_t A6XX_GMU_GPU_NAP_CTRL_SID(uint32_t 
val)
 
 #define REG_A6XX_GMU_GENERAL_7 0x51cc
 
+#define REG_A6XX_GMU_GENERAL_8 0x51cd
+
+#define REG_A6XX_GMU_GENERAL_9 0x51ce
+
+#define REG_A6XX_GMU_GENERAL_10 0x51cf
+
 #define REG_A6XX_GMU_ISENSE_CTRL   0x515d
 
 #define REG_A6XX_GPU_CS_ENABLE_REG 0x8920
@@ -471,6 +477,8 @@ static inline uint32_t A6XX_GMU_GPU_NAP_CTRL_SID(uint32_t 
val)
 
 #define REG_A6XX_RSCC_SEQ_BUSY_DRV0 0x0101
 
+#define REG_A7XX_RSCC_SEQ_MEM_0_DRV0_A740  0x0154
+
 #define REG_A6XX_RSCC_SEQ_MEM_0_DRV0   0x0180
 
 #define REG_A6XX_RSCC_TCS0_DRV0_STATUS 0x0346

-- 
2.41.0



[PATCH v2 02/14] dt-bindings: display/msm/gmu: Allow passing QMP handle

2023-08-08 Thread Konrad Dybcio
When booting the GMU, the QMP mailbox should be pinged about some tunables
(e.g. adaptive clock distribution state). To achieve that, a reference to
it is necessary. Allow it and require it with A730.

Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 Documentation/devicetree/bindings/display/msm/gmu.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml 
b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index 20ddb89a4500..e132dbff3c4a 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -64,6 +64,10 @@ properties:
   iommus:
 maxItems: 1
 
+  qcom,qmp:
+$ref: /schemas/types.yaml#/definitions/phandle
+description: Reference to the AOSS side-channel message RAM
+
   operating-points-v2: true
 
   opp-table:
@@ -251,6 +255,9 @@ allOf:
 - const: hub
 - const: demet
 
+  required:
+- qcom,qmp
+
   - if:
   properties:
 compatible:

-- 
2.41.0



[PATCH v2 03/14] dt-bindings: display/msm/gpu: Allow A7xx SKUs

2023-08-08 Thread Konrad Dybcio
Allow A7xx SKUs, such as the A730 GPU found on SM8450 and friends.
They use GMU for all things DVFS, just like most A6xx GPUs.

Reviewed-by: Krzysztof Kozlowski 
Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 Documentation/devicetree/bindings/display/msm/gpu.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/gpu.yaml 
b/Documentation/devicetree/bindings/display/msm/gpu.yaml
index 56b9b247e8c2..b019db954793 100644
--- a/Documentation/devicetree/bindings/display/msm/gpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gpu.yaml
@@ -23,7 +23,7 @@ properties:
   The driver is parsing the compat string for Adreno to
   figure out the gpu-id and patch level.
 items:
-  - pattern: '^qcom,adreno-[3-6][0-9][0-9]\.[0-9]$'
+  - pattern: '^qcom,adreno-[3-7][0-9][0-9]\.[0-9]$'
   - const: qcom,adreno
   - description: |
   The driver is parsing the compat string for Imageon to
@@ -203,7 +203,7 @@ allOf:
 properties:
   compatible:
 contains:
-  pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
+  pattern: '^qcom,adreno-[67][0-9][0-9]\.[0-9]$'
 
   then: # Starting with A6xx, the clocks are usually defined in the GMU 
node
 properties:

-- 
2.41.0
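
A quick way to see what the widened pattern accepts, sketched with POSIX regcomp()/regexec(). This is only an approximation: the binding schema is evaluated by the dt-schema tooling, not by POSIX regex, but this particular pattern uses nothing beyond anchors, bracket expressions and an escaped dot, so it should behave the same as an extended regular expression. The helper name compat_matches() is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if compat matches pattern when the
 * pattern is compiled as a POSIX extended regular expression.
 */
static bool compat_matches(const char *pattern, const char *compat)
{
	regex_t re;
	bool ok;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	ok = regexec(&re, compat, 0, NULL, 0) == 0;
	regfree(&re);

	return ok;
}
```

With the old pattern ("[3-6]" as the leading digit) the string "qcom,adreno-730.1" is rejected; with "[3-7]" it is accepted, while something like "qcom,adreno-830.0" still fails.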



[PATCH v2 01/14] dt-bindings: display/msm/gmu: Add Adreno 7[34]0 GMU

2023-08-08 Thread Konrad Dybcio
The GMU on the A7xx series is pretty much the same as on the A6xx parts.
It's now "smarter", needs slightly fewer register writes and controls more
things (like inter-frame power collapse) mostly internally (instead of
us having to write to G[PM]U_[CG]X registers from APPS)

The only difference worth mentioning is the now-required DEMET clock,
which is strictly required for things like asserting reset lines. Not
turning it on results in the GMU not being fully functional (all OOB requests
would fail and HFI would hang after the first submitted OOB).

Describe the A730 and A740 GMU.

Reviewed-by: Krzysztof Kozlowski 
Tested-by: Neil Armstrong  # on SM8550-QRD
Tested-by: Dmitry Baryshkov  # sm8450
Signed-off-by: Konrad Dybcio 
---
 .../devicetree/bindings/display/msm/gmu.yaml   | 40 +-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml 
b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index 5fc4106110ad..20ddb89a4500 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -21,7 +21,7 @@ properties:
   compatible:
 oneOf:
   - items:
-  - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
+  - pattern: '^qcom,adreno-gmu-[67][0-9][0-9]\.[0-9]$'
   - const: qcom,adreno-gmu
   - const: qcom,adreno-gmu-wrapper
 
@@ -213,6 +213,44 @@ allOf:
 - const: axi
 - const: memnoc
 
+  - if:
+  properties:
+compatible:
+  contains:
+enum:
+  - qcom,adreno-gmu-730.1
+  - qcom,adreno-gmu-740.1
+then:
+  properties:
+reg:
+  items:
+- description: Core GMU registers
+- description: Resource controller registers
+- description: GMU PDC registers
+reg-names:
+  items:
+- const: gmu
+- const: rscc
+- const: gmu_pdc
+clocks:
+  items:
+- description: GPU AHB clock
+- description: GMU clock
+- description: GPU CX clock
+- description: GPU AXI clock
+- description: GPU MEMNOC clock
+- description: GMU HUB clock
+- description: GPUSS DEMET clock
+clock-names:
+  items:
+- const: ahb
+- const: gmu
+- const: cxo
+- const: axi
+- const: memnoc
+- const: hub
+- const: demet
+
   - if:
   properties:
 compatible:

-- 
2.41.0



[PATCH v2 00/14] A7xx support

2023-08-08 Thread Konrad Dybcio
This series attempts to introduce Adreno 700 support (with A730 and A740
found on SM8450 and SM8550 respectively), reusing much of the existing
A6xx code. This submission largely lays the groundwork for expansion and
more or less gives us feature parity (on the kernel side, that is) with
existing A6xx parts.

On top of introducing a very messy set of three (!) separate and
obfuscated device identifiers for each 7xx part, this generation
introduces very sophisticated hardware multi-threading and (on some SKUs)
hardware ray-tracing (not supported yet).

After this series, a long-overdue cleanup of drm/msm/adreno is planned
in preparation for adding more features and removing some hardcoding.

The last patch is a hack that may or may not be necessary depending
on your board's humour.. eh.. :/

Developed atop (and hence depends on) [1]

The corresponding devicetree patches are initially available at [2] and
will be posted after this series gets merged. To test it, you'll also need
firmware that you need to obtain from your board (there's none with a
redistributable license, sorry..). Most likely it will be in one of
these directories on your stock android installation:

* /vendor/firmware
* /vendor/firmware_mnt
* /system

..but some vendors make it hard and you have to do some grepping ;)

Requires [3] to work on the userspace side. You'll almost certainly want
to test it alongside Zink with a lot of debug flags (early impl), like:

TU_DEBUG=sysmem,nolrz,flushall,noubwc MESA_LOADER_DRIVER_OVERRIDE=zink kmscube

[1] 
https://lore.kernel.org/linux-arm-msm/20230517-topic-a7xx_prep-v4-0-b16f273a9...@linaro.org/
[2] https://github.com/SoMainline/linux/commits/topic/a7xx_dt
[3] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217

Signed-off-by: Konrad Dybcio 
---
Changes in v2:
- Rebase on chipid changes
- Reuse existing description for qcom,aoss in patch 2
- Pick up tags
- Link to v1: 
https://lore.kernel.org/r/20230628-topic-a7xx_drmmsm-v1-0-a7f4496e0...@linaro.org

---
Konrad Dybcio (14):
  dt-bindings: display/msm/gmu: Add Adreno 7[34]0 GMU
  dt-bindings: display/msm/gmu: Allow passing QMP handle
  dt-bindings: display/msm/gpu: Allow A7xx SKUs
  drm/msm/a6xx: Add missing regs for A7XX
  drm/msm/a6xx: Introduce a6xx_llc_read
  drm/msm/a6xx: Move LLC accessors to the common header
  drm/msm/a6xx: Bail out early if setting GPU OOB fails
  drm/msm/a6xx: Add skeleton A7xx support
  drm/msm/a6xx: Send ACD state to QMP at GMU resume
  drm/msm/a6xx: Mostly implement A7xx gpu_state
  drm/msm/a6xx: Add A730 support
  drm/msm/a6xx: Add A740 support
  drm/msm/a6xx: Vastly increase HFI timeout
  drm/msm/a6xx: Poll for GBIF unhalt status in hw_init

 .../devicetree/bindings/display/msm/gmu.yaml   |  47 +-
 .../devicetree/bindings/display/msm/gpu.yaml   |   4 +-
 drivers/gpu/drm/msm/adreno/a6xx.xml.h  |   9 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 204 +--
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h  |   3 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h  |   8 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 667 ++---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h  |  15 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c|  52 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h|  61 +-
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c  |  90 ++-
 drivers/gpu/drm/msm/adreno/adreno_device.c |  30 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c|   7 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|  28 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h   |   2 +
 15 files changed, 1094 insertions(+), 133 deletions(-)
---
base-commit: b30de2c05cf2166f4e2c68850efc8dcea1c89780
change-id: 20230628-topic-a7xx_drmmsm-123f30d76cf7

Best regards,
-- 
Konrad Dybcio 



Re: [PATCH v3 3/3] usb: typec: nb7vpq904m: switch to DRM_SIMPLE_BRIDGE

2023-08-08 Thread kernel test robot
Hi Dmitry,

kernel test robot noticed the following build errors:

[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on usb/usb-testing usb/usb-next usb/usb-linus 
drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip 
linus/master v6.5-rc5 next-20230808]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Dmitry-Baryshkov/drm-display-add-transparent-bridge-helper/20230802-091932
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:
https://lore.kernel.org/r/20230802011845.4176631-4-dmitry.baryshkov%40linaro.org
patch subject: [PATCH v3 3/3] usb: typec: nb7vpq904m: switch to 
DRM_SIMPLE_BRIDGE
config: s390-randconfig-r033-20230808 
(https://download.01.org/0day-ci/archive/20230809/202308090347.sztwmcub-...@intel.com/config)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project.git 
8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce: 
(https://download.01.org/0day-ci/archive/20230809/202308090347.sztwmcub-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202308090347.sztwmcub-...@intel.com/

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/bridge/simple-bridge.c:212:18: error: no member named 
>> 'of_node' in 'struct drm_bridge'
   sbridge->bridge.of_node = pdev->dev.of_node;
   ~~~ ^
   1 error generated.


vim +212 drivers/gpu/drm/bridge/simple-bridge.c

56fe8b6f499167 drivers/gpu/drm/bridge/dumb-vga-dac.c  Maxime Ripard
2016-09-30  168  
94ded532ffdb42 drivers/gpu/drm/bridge/dumb-vga-dac.c  Laurent Pinchart 
2020-02-26  169  static int simple_bridge_probe(struct platform_device *pdev)
56fe8b6f499167 drivers/gpu/drm/bridge/dumb-vga-dac.c  Maxime Ripard
2016-09-30  170  {
94ded532ffdb42 drivers/gpu/drm/bridge/dumb-vga-dac.c  Laurent Pinchart 
2020-02-26  171  struct simple_bridge *sbridge;
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  172  struct device_node *remote;
56fe8b6f499167 drivers/gpu/drm/bridge/dumb-vga-dac.c  Maxime Ripard
2016-09-30  173  
94ded532ffdb42 drivers/gpu/drm/bridge/dumb-vga-dac.c  Laurent Pinchart 
2020-02-26  174  sbridge = devm_kzalloc(&pdev->dev, sizeof(*sbridge), 
GFP_KERNEL);
94ded532ffdb42 drivers/gpu/drm/bridge/dumb-vga-dac.c  Laurent Pinchart 
2020-02-26  175  if (!sbridge)
56fe8b6f499167 drivers/gpu/drm/bridge/dumb-vga-dac.c  Maxime Ripard
2016-09-30  176  return -ENOMEM;
94ded532ffdb42 drivers/gpu/drm/bridge/dumb-vga-dac.c  Laurent Pinchart 
2020-02-26  177  platform_set_drvdata(pdev, sbridge);
56fe8b6f499167 drivers/gpu/drm/bridge/dumb-vga-dac.c  Maxime Ripard
2016-09-30  178  
272378ec0eb972 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-02-26  179  sbridge->info = of_device_get_match_data(&pdev->dev);
272378ec0eb972 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-02-26  180  
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  181  /* Get the next bridge in the pipeline. */
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  182  remote = of_graph_get_remote_node(pdev->dev.of_node, 
1, -1);
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  183  if (!remote)
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  184  return -EINVAL;
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  185  
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  186  sbridge->next_bridge = of_drm_find_bridge(remote);
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  187  of_node_put(remote);
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  188  
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  189  if (!sbridge->next_bridge) {
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  190  dev_dbg(&pdev->dev, "Next bridge not found, 
deferring probe\n");
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  191  return -EPROBE_DEFER;
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  192  }
00686ac55d0a21 drivers/gpu/drm/bridge/simple-bridge.c Laurent Pinchart 
2020-05-26  193  
00686ac55d0a21 drivers/gpu/drm/b

[linux-next:master] BUILD REGRESSION 71cd4fc492ec41e4acd85e98bbf7a13753fc1e03

2023-08-08 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 71cd4fc492ec41e4acd85e98bbf7a13753fc1e03  Add linux-next specific 
files for 20230808

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202307251531.p8zlftmz-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202308081459.us5rlyay-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

../lib/gcc/loongarch64-linux/12.3.0/plugin/include/config/loongarch/loongarch-opts.h:31:10:
 fatal error: loongarch-def.h: No such file or directory
drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_replay.c:37: warning: This 
comment starts with '/**', but isn't a kernel-doc comment. Refer 
Documentation/doc-guide/kernel-doc.rst
sound/soc/codecs/aw88261.c:651:7: warning: variable 'ret' is used uninitialized 
whenever 'if' condition is false [-Wsometimes-uninitialized]

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/gpu/drm/tests/drm_exec_test.c:166 test_prepare_array() error: 
uninitialized symbol 'ret'.
drivers/mtd/nand/raw/qcom_nandc.c:2590 qcom_op_cmd_mapping() error: 
uninitialized symbol 'ret'.
drivers/mtd/nand/raw/qcom_nandc.c:3017 qcom_check_op() warn: was && intended 
here instead of ||?
kernel/futex/waitwake.c:422 futex_wait_multiple_setup() warn: bitwise AND 
condition is false here
sh4-linux-gcc: internal compiler error: Segmentation fault signal terminated 
program cc1
{standard input}: Warning: end of file not at end of a line; newline inserted
{standard input}:927: Error: pcrel too far

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- arc-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- arm-allmodconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- arm-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- arm64-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- csky-randconfig-m041-20230808
|   `-- 
drivers-gpu-drm-tests-drm_exec_test.c-test_prepare_array()-error:uninitialized-symbol-ret-.
|-- i386-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- loongarch-allmodconfig
|   `-- 
lib-gcc-loongarch64-linux-..-plugin-include-config-loongarch-loongarch-opts.h:fatal-error:loongarch-def.h:No-such-file-or-directory
|-- mips-allmodconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- mips-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- parisc-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- powerpc-allmodconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- riscv-allmodconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- riscv-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|-- riscv-randconfig-m031-20230808
|   |-- 
drivers-gpu-drm-tests-drm_exec_test.c-test_prepare_array()-error:uninitialized-symbol-ret-.
|   |-- 
drivers-mtd-nand-raw-qcom_nandc.c-qcom_check_op()-warn:was-intended-here-instead-of
|   |-- 
drivers-mtd-nand-raw-qcom_nandc.c-qcom_op_cmd_mapping()-error:uninitialized-symbol-ret-.
|   `-- 
kernel-futex-waitwake.c-futex_wait_multiple_setup()-warn:bitwise-AND-condition-is-false-here
|-- s390-allyesconfig
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-dce-dmub_replay.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst

[PATCH] drm/radeon: check return value of radeon_ring_lock()

2023-08-08 Thread Nikita Zhandarovich
In the unlikely event of radeon_ring_lock() failing, its errno return
value should be processed. This patch checks said return value and
prints a debug message in case of an error.

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: 48c0c902e2e6 ("drm/radeon/kms: add support for CP setup on SI")
Signed-off-by: Nikita Zhandarovich 
---
 drivers/gpu/drm/radeon/si.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index 8d5e4b25609d..df1b2ebc37c2 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -3611,6 +3611,10 @@ static int si_cp_start(struct radeon_device *rdev)
for (i = RADEON_RING_TYPE_GFX_INDEX; i <= CAYMAN_RING_TYPE_CP2_INDEX; 
++i) {
ring = &rdev->ring[i];
r = radeon_ring_lock(rdev, ring, 2);
+   if (r) {
+   DRM_ERROR("radeon: cp failed to lock ring (%d).\n", r);
+   return r;
+   }
 
/* clear the compute context state */
radeon_ring_write(ring, PACKET3_COMPUTE(PACKET3_CLEAR_STATE, 
0));
-- 
2.25.1
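
The control flow the patch introduces can be sketched in plain C as follows. Everything here is a hypothetical stand-in: the ring_lock_fn callback plays the role of radeon_ring_lock(), the two sample callbacks exist only for demonstration, and -12 is used as an errno-style failure code. The point is simply that the loop stops at the first failed lock and hands the error back to the caller instead of writing to a ring it does not hold.

```c
/* Hypothetical lock callback: 0 on success, negative errno-style code on failure */
typedef int (*ring_lock_fn)(int ring_index);

/* Sample callback that always succeeds */
static int lock_always_ok(int ring_index)
{
	(void)ring_index;
	return 0;
}

/* Sample callback that fails for ring 1 only (-12 mirrors -ENOMEM) */
static int lock_fails_on_ring_1(int ring_index)
{
	return ring_index == 1 ? -12 : 0;
}

/*
 * Lock rings first..last in order. On the first failure, return the error
 * immediately; *started holds the number of rings locked before that.
 */
static int start_rings(int first, int last, ring_lock_fn lock, int *started)
{
	int i, r;

	*started = 0;
	for (i = first; i <= last; ++i) {
		r = lock(i);
		if (r)
			return r;
		(*started)++;
	}

	return 0;
}
```

With lock_fails_on_ring_1, start_rings(0, 2, ...) returns -12 after a single ring has been processed, which is the early-return behaviour the added check provides.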



[PATCH] video/hdmi: convert *_infoframe_init() functions to void

2023-08-08 Thread Nikita Zhandarovich
Four hdmi_*_infoframe_init() functions that initialize different
types of hdmi infoframes only return the default 0 value, contrary to
their descriptions. Yet these functions are still unnecessarily checked
against possible errors in case of failure.

Remove redundant error checks in calls to following functions:
- hdmi_spd_infoframe_init
- hdmi_audio_infoframe_init
- hdmi_vendor_infoframe_init
- hdmi_drm_infoframe_init
Also, convert these functions to 'void' and fix their descriptions.

Fixes: 2c676f378edb ("[media] hdmi: added unpack and logging functions for 
InfoFrames")
Signed-off-by: Nikita Zhandarovich 
---
 drivers/gpu/drm/display/drm_hdmi_helper.c |  5 +---
 drivers/gpu/drm/drm_edid.c|  5 +---
 drivers/gpu/drm/i915/display/intel_hdmi.c |  7 ++---
 drivers/gpu/drm/mediatek/mtk_hdmi.c   | 14 ++
 drivers/gpu/drm/radeon/r600_hdmi.c|  6 +---
 drivers/gpu/drm/sti/sti_hdmi.c|  6 +---
 drivers/gpu/drm/tegra/hdmi.c  |  7 +
 drivers/gpu/drm/tegra/sor.c   |  6 +---
 drivers/gpu/drm/vc4/vc4_hdmi.c|  7 +
 drivers/video/hdmi.c  | 46 ++-
 include/linux/hdmi.h  | 10 +++
 11 files changed, 25 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_hdmi_helper.c 
b/drivers/gpu/drm/display/drm_hdmi_helper.c
index faf5e9efa7d3..ce7038a3a183 100644
--- a/drivers/gpu/drm/display/drm_hdmi_helper.c
+++ b/drivers/gpu/drm/display/drm_hdmi_helper.c
@@ -27,7 +27,6 @@ int drm_hdmi_infoframe_set_hdr_metadata(struct 
hdmi_drm_infoframe *frame,
 {
struct drm_connector *connector;
struct hdr_output_metadata *hdr_metadata;
-   int err;
 
if (!frame || !conn_state)
return -EINVAL;
@@ -47,9 +46,7 @@ int drm_hdmi_infoframe_set_hdr_metadata(struct 
hdmi_drm_infoframe *frame,
connector->hdr_sink_metadata.hdmi_type1.eotf))
DRM_DEBUG_KMS("Unknown EOTF %d\n", 
hdr_metadata->hdmi_metadata_type1.eotf);
 
-   err = hdmi_drm_infoframe_init(frame);
-   if (err < 0)
-   return err;
+   hdmi_drm_infoframe_init(frame);
 
frame->eotf = hdr_metadata->hdmi_metadata_type1.eotf;
frame->metadata_type = hdr_metadata->hdmi_metadata_type1.metadata_type;
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index e0dbd9140726..d4933f215675 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -7235,7 +7235,6 @@ drm_hdmi_vendor_infoframe_from_display_mode(struct 
hdmi_vendor_infoframe *frame,
 */
bool has_hdmi_infoframe = connector ?
connector->display_info.has_hdmi_infoframe : false;
-   int err;
 
if (!frame || !mode)
return -EINVAL;
@@ -7243,9 +7242,7 @@ drm_hdmi_vendor_infoframe_from_display_mode(struct 
hdmi_vendor_infoframe *frame,
if (!has_hdmi_infoframe)
return -EINVAL;
 
-   err = hdmi_vendor_infoframe_init(frame);
-   if (err < 0)
-   return err;
+   hdmi_vendor_infoframe_init(frame);
 
/*
 * Even if it's not absolutely necessary to send the infoframe
diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c 
b/drivers/gpu/drm/i915/display/intel_hdmi.c
index 7ac5e6c5e00d..8b58127bca37 100644
--- a/drivers/gpu/drm/i915/display/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
@@ -763,12 +763,9 @@ intel_hdmi_compute_spd_infoframe(struct intel_encoder 
*encoder,
intel_hdmi_infoframe_enable(HDMI_INFOFRAME_TYPE_SPD);
 
if (IS_DGFX(i915))
-   ret = hdmi_spd_infoframe_init(frame, "Intel", "Discrete gfx");
+   hdmi_spd_infoframe_init(frame, "Intel", "Discrete gfx");
else
-   ret = hdmi_spd_infoframe_init(frame, "Intel", "Integrated gfx");
-
-   if (drm_WARN_ON(encoder->base.dev, ret))
-   return false;
+   hdmi_spd_infoframe_init(frame, "Intel", "Integrated gfx");
 
frame->sdi = HDMI_SPD_SDI_PC;
 
diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi.c 
b/drivers/gpu/drm/mediatek/mtk_hdmi.c
index 0a8e0a13f516..75899e4a011f 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi.c
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi.c
@@ -995,12 +995,7 @@ static int mtk_hdmi_setup_spd_infoframe(struct mtk_hdmi 
*hdmi,
u8 buffer[HDMI_INFOFRAME_HEADER_SIZE + HDMI_SPD_INFOFRAME_SIZE];
ssize_t err;
 
-   err = hdmi_spd_infoframe_init(&frame, vendor, product);
-   if (err < 0) {
-   dev_err(hdmi->dev, "Failed to initialize SPD infoframe: %zd\n",
-   err);
-   return err;
-   }
+   hdmi_spd_infoframe_init(&frame, vendor, product);
 
err = hdmi_spd_infoframe_pack(&frame, buffer, sizeof(buffer));
if (err < 0) {
@@ -1018,12 +1013,7 @@ static int mtk_hdmi_setup_audio_infoframe(struct 
mtk_hdmi *hdmi)
u8 buffer[HDMI_INFOFRAME_HEADER_SIZE + 

Re: [PATCH] drm/amd: Use pci_dev_id() to simplify the code

2023-08-08 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Aug 7, 2023 at 9:22 AM Xiongfeng Wang  wrote:
>
> PCI core API pci_dev_id() can be used to get the BDF number for a pci
> device. We don't need to compose it manually. Use pci_dev_id() to
> simplify the code a little bit.
>
> Signed-off-by: Xiongfeng Wang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> index 385c6acb5728..aee0cfdc6da3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> @@ -706,7 +706,7 @@ int amdgpu_acpi_pcie_performance_request(struct 
> amdgpu_device *adev,
>
> atcs_input.size = sizeof(struct atcs_pref_req_input);
> /* client id (bit 2-0: func num, 7-3: dev num, 15-8: bus num) */
> -   atcs_input.client_id = adev->pdev->devfn | (adev->pdev->bus->number 
> << 8);
> +   atcs_input.client_id = pci_dev_id(adev->pdev);
> atcs_input.valid_flags_mask = ATCS_VALID_FLAGS_MASK;
> atcs_input.flags = ATCS_WAIT_FOR_COMPLETION;
> if (advertise)
> @@ -776,7 +776,7 @@ int amdgpu_acpi_power_shift_control(struct amdgpu_device 
> *adev,
>
> atcs_input.size = sizeof(struct atcs_pwr_shift_input);
> /* dGPU id (bit 2-0: func num, 7-3: dev num, 15-8: bus num) */
> -   atcs_input.dgpu_id = adev->pdev->devfn | (adev->pdev->bus->number << 
> 8);
> +   atcs_input.dgpu_id = pci_dev_id(adev->pdev);
> atcs_input.dev_acpi_state = dev_state;
> atcs_input.drv_state = drv_state;
>
> @@ -1141,7 +1141,7 @@ int amdgpu_acpi_get_tmr_info(struct amdgpu_device 
> *adev, u64 *tmr_offset,
> if (!tmr_offset || !tmr_size)
> return -EINVAL;
>
> -   bdf = (adev->pdev->bus->number << 8) | adev->pdev->devfn;
> +   bdf = pci_dev_id(adev->pdev);
> dev_info = amdgpu_acpi_get_dev(bdf);
> if (!dev_info)
> return -ENOENT;
> @@ -1162,7 +1162,7 @@ int amdgpu_acpi_get_mem_info(struct amdgpu_device 
> *adev, int xcc_id,
> if (!numa_info)
> return -EINVAL;
>
> -   bdf = (adev->pdev->bus->number << 8) | adev->pdev->devfn;
> +   bdf = pci_dev_id(adev->pdev);
> dev_info = amdgpu_acpi_get_dev(bdf);
> if (!dev_info)
> return -ENOENT;
> --
> 2.20.1
>
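
For readers following the conversion above: a minimal userspace sketch of why the
open-coded expression and pci_dev_id() agree. The struct and helper names here
(fake_pci_dev, fake_devfn, bdf_open_coded, bdf_like_pci_dev_id) are made up for
illustration and are not kernel API; the sketch assumes pci_dev_id() packs the bus
number into bits 15-8 and devfn into bits 7-0, as the comments in the quoted diff
describe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct pci_dev fields involved. */
struct fake_pci_dev {
	uint8_t bus_number; /* stands in for pdev->bus->number */
	uint8_t devfn;      /* device number in bits 7-3, function in bits 2-0 */
};

/* Pack a device (slot) number and a function number into a devfn byte. */
static uint8_t fake_devfn(unsigned int slot, unsigned int func)
{
	assert(slot < 32 && func < 8);
	return (uint8_t)((slot << 3) | func);
}

/* The open-coded form that the patch removes. */
static uint16_t bdf_open_coded(const struct fake_pci_dev *pdev)
{
	return (uint16_t)(pdev->devfn | (pdev->bus_number << 8));
}

/* The same value, written the way a bus/devfn packing helper would. */
static uint16_t bdf_like_pci_dev_id(const struct fake_pci_dev *pdev)
{
	return (uint16_t)(((uint16_t)pdev->bus_number << 8) | pdev->devfn);
}
```

Both helpers return the same 16-bit value for any bus/devfn pair, which is why the
patch is a pure simplification with no functional change.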


[PATCH v3 RESEND] drm/i915/quirk: Add quirk for devices that cannot be dimmed

2023-08-08 Thread Allen Ballway
Cybernet T10C cannot be dimmed without the backlight strobing. Create a
new quirk to lock the minimum brightness to the highest supported value.
This aligns the device with its behavior on Windows, which will not
lower the brightness below maximum.

Signed-off-by: Allen Ballway 
---
V2 -> V3: Fix typo.
V1 -> V2: Fix style issue.

.../gpu/drm/i915/display/intel_backlight.c|  5 
 drivers/gpu/drm/i915/display/intel_quirks.c   | 27 +++
 drivers/gpu/drm/i915/display/intel_quirks.h   |  1 +
 3 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_backlight.c 
b/drivers/gpu/drm/i915/display/intel_backlight.c
index 2e8f17c045222..f015563d3ebd5 100644
--- a/drivers/gpu/drm/i915/display/intel_backlight.c
+++ b/drivers/gpu/drm/i915/display/intel_backlight.c
@@ -1192,6 +1192,11 @@ static u32 get_backlight_min_vbt(struct intel_connector 
*connector)

drm_WARN_ON(&i915->drm, panel->backlight.pwm_level_max == 0);

+   if (intel_has_quirk(i915, QUIRK_NO_DIM)) {
+   /* Cannot dim backlight, set minimum to highest value */
+   return panel->backlight.pwm_level_max - 1;
+   }
+
/*
 * XXX: If the vbt value is 255, it makes min equal to max, which leads
 * to problems. There are such machines out there. Either our
diff --git a/drivers/gpu/drm/i915/display/intel_quirks.c 
b/drivers/gpu/drm/i915/display/intel_quirks.c
index a280448df771a..910c95840a539 100644
--- a/drivers/gpu/drm/i915/display/intel_quirks.c
+++ b/drivers/gpu/drm/i915/display/intel_quirks.c
@@ -65,6 +65,12 @@ static void quirk_no_pps_backlight_power_hook(struct 
drm_i915_private *i915)
drm_info(&i915->drm, "Applying no pps backlight power quirk\n");
 }

+static void quirk_no_dim(struct drm_i915_private *i915)
+{
+   intel_set_quirk(i915, QUIRK_NO_DIM);
+   drm_info(&i915->drm, "Applying no dim quirk\n");
+}
+
 struct intel_quirk {
int device;
int subsystem_vendor;
@@ -90,6 +96,12 @@ static int intel_dmi_no_pps_backlight(const struct 
dmi_system_id *id)
return 1;
 }

+static int intel_dmi_no_dim(const struct dmi_system_id *id)
+{
+   DRM_INFO("No dimming allowed on %s\n", id->ident);
+   return 1;
+}
+
 static const struct intel_dmi_quirk intel_dmi_quirks[] = {
{
.dmi_id_list = &(const struct dmi_system_id[]) {
@@ -136,6 +148,20 @@ static const struct intel_dmi_quirk intel_dmi_quirks[] = {
},
.hook = quirk_no_pps_backlight_power_hook,
},
+   {
+   .dmi_id_list = &(const struct dmi_system_id[]) {
+   {
+   .callback = intel_dmi_no_dim,
+   .ident = "Cybernet T10C Tablet",
+   .matches = {DMI_EXACT_MATCH(DMI_BOARD_VENDOR,
+   "Cybernet 
Manufacturing Inc."),
+   DMI_EXACT_MATCH(DMI_BOARD_NAME, 
"T10C Tablet"),
+   },
+   },
+   { }
+   },
+   .hook = quirk_no_dim,
+   },
 };

 static struct intel_quirk intel_quirks[] = {
@@ -218,6 +244,7 @@ void intel_init_quirks(struct drm_i915_private *i915)
 q->subsystem_device == PCI_ANY_ID))
q->hook(i915);
}
+
for (i = 0; i < ARRAY_SIZE(intel_dmi_quirks); i++) {
if (dmi_check_system(*intel_dmi_quirks[i].dmi_id_list) != 0)
intel_dmi_quirks[i].hook(i915);
diff --git a/drivers/gpu/drm/i915/display/intel_quirks.h 
b/drivers/gpu/drm/i915/display/intel_quirks.h
index 10a4d163149fd..b41c7bbf0a5e3 100644
--- a/drivers/gpu/drm/i915/display/intel_quirks.h
+++ b/drivers/gpu/drm/i915/display/intel_quirks.h
@@ -17,6 +17,7 @@ enum intel_quirk_id {
QUIRK_INVERT_BRIGHTNESS,
QUIRK_LVDS_SSC_DISABLE,
QUIRK_NO_PPS_BACKLIGHT_POWER_HOOK,
+   QUIRK_NO_DIM,
 };

 void intel_init_quirks(struct drm_i915_private *i915);
--
2.41.0.255.g8b1d071c50-goog



Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-08 Thread Marek Olšák
It's the same situation as SIGSEGV. A process can catch the signal,
but if it doesn't, it gets killed. GL and Vulkan APIs give you a way
to catch the GPU error and prevent the process termination. If you
don't use the API, you'll get undefined behavior, which means anything
can happen, including process termination.



Marek

On Tue, Aug 8, 2023 at 8:14 AM Sebastian Wick  wrote:
>
> On Fri, Aug 4, 2023 at 3:03 PM Daniel Vetter  wrote:
> >
> > On Tue, Jun 27, 2023 at 10:23:23AM -0300, André Almeida wrote:
> > > Create a section that specifies how to deal with DRM device resets for
> > > kernel and userspace drivers.
> > >
> > > Acked-by: Pekka Paalanen 
> > > Signed-off-by: André Almeida 
> > > ---
> > >
> > > v4: 
> > > https://lore.kernel.org/lkml/20230626183347.55118-1-andrealm...@igalia.com/
> > >
> > > Changes:
> > >  - Grammar fixes (Randy)
> > >
> > >  Documentation/gpu/drm-uapi.rst | 68 ++
> > >  1 file changed, 68 insertions(+)
> > >
> > > diff --git a/Documentation/gpu/drm-uapi.rst 
> > > b/Documentation/gpu/drm-uapi.rst
> > > index 65fb3036a580..3cbffa25ed93 100644
> > > --- a/Documentation/gpu/drm-uapi.rst
> > > +++ b/Documentation/gpu/drm-uapi.rst
> > > @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a 
> > > third handler for
> > >  mmapped regular files. Threads cause additional pain with signal
> > >  handling as well.
> > >
> > > +Device reset
> > > +
> > > +
> > > +The GPU stack is really complex and is prone to errors, from hardware 
> > > bugs,
> > > +faulty applications and everything in between the many layers. Some 
> > > errors
> > > +require resetting the device in order to make the device usable again. 
> > > This
> > > +section describes the expectations for DRM and usermode drivers when a
> > > +device resets and how to propagate the reset status.
> > > +
> > > +Kernel Mode Driver
> > > +--
> > > +
> > > +The KMD is responsible for checking if the device needs a reset, and to 
> > > perform
> > > +it as needed. Usually a hang is detected when a job gets stuck 
> > > executing. KMD
> > > +should keep track of resets, because userspace can query any time about 
> > > the
> > > +reset stats for a specific context. This is needed to propagate to the 
> > > rest of
> > > +the stack that a reset has happened. Currently, this is implemented by 
> > > each
> > > +driver separately, with no common DRM interface.
> > > +
> > > +User Mode Driver
> > > +
> > > +
> > > +The UMD should check before submitting new commands to the KMD if the 
> > > device has
> > > +been reset, and this can be checked more often if the UMD requires it. 
> > > After
> > > +detecting a reset, UMD will then proceed to report it to the application 
> > > using
> > > +the appropriate API error code, as explained in the section below about
> > > +robustness.
> > > +
> > > +Robustness
> > > +--
> > > +
> > > +The only way to try to keep an application working after a reset is if it
> > > +complies with the robustness aspects of the graphical API that it is 
> > > using.
> > > +
> > > +Graphical APIs provide ways for applications to deal with device resets. 
> > > However,
> > > +there is no guarantee that the app will use such features correctly, and 
> > > the
> > > +UMD can implement policies to close the app if it is a repeating 
> > > offender,
> >
> > Not sure whether this one here is due to my input, but s/UMD/KMD. Repeat
> > offender killing is more a policy where the kernel enforces policy, and no
> > longer up to userspace to dtrt (because very clearly userspace is not
> > really doing the right thing anymore when it's just hanging the gpu in an
> > endless loop). Also maybe tune it down further to something like "the
> > kernel driver may implement ..."
> >
> > In my opinion the umd shouldn't implement these kind of magic guesses, the
> > entire point of robustness apis is to delegate responsibility for
> > correctly recovering to the application. And the kernel is left with
> > enforcing fair resource usage policies (which eventually might be a
> > cgroups limit on how much gpu time you're allowed to waste with gpu
> > resets).
>
> Killing apps that the kernel thinks are misbehaving really doesn't
> seem like a good idea to me. What if the process is a service getting
> restarted after getting killed? What if killing that process leaves
> the system in a bad state?
>
> Can't the kernel provide some information to user space so that e.g.
> systemd can handle those situations?
>
> > > +likely in a broken loop. This is done to ensure that it does not keep 
> > > blocking
> > > +the user interface from being correctly displayed. This should be done 
> > > even if
> > > +the app is correct but happens to trigger some bug in the 
> > > hardware/driver.
> > > +
> > > +OpenGL
> > > +~~
> > > +
> > > +Apps using OpenGL should use the available robust interfaces, like the
> > > +extension ``GL_ARB_robustness`` 

Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-08-08 Thread Daniel Vetter
On Thu, Jun 22, 2023 at 10:25:17AM +0200, David Hildenbrand wrote:
> On 22.06.23 09:27, Vivek Kasireddy wrote:
> > The first patch ensures that the mappings needed for handling mmap
> > operation would be managed by using the pfn instead of struct page.
> > The second patch restores support for mapping hugetlb pages where
> > subpages of a hugepage are not directly used anymore (main reason
> > for revert) and instead the hugetlb pages and the relevant offsets
> > are used to populate the scatterlist for dma-buf export and for
> > mmap operation.
> > 
> > Testcase: default_hugepagesz=2M hugepagesz=2M hugepages=2500 options
> > were passed to the Host kernel and Qemu was launched with these
> > relevant options: qemu-system-x86_64 -m 4096m
> > -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
> > -display gtk,gl=on
> > -object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M
> > -machine memory-backend=mem1
> > 
> > Replacing -display gtk,gl=on with -display gtk,gl=off above would
> > exercise the mmap handler.
> > 
> 
> While I think the VM_PFNMAP approach is much better and should fix that
> issue at hand, I thought more about missing memlock support and realized
> that we might have to fix something else. So I'm going to raise the issue
> here.
> 
> I think udmabuf chose the wrong interface to do what it's doing, that makes
> it harder to fix it eventually.
> 
> Instead of accepting a range in a memfd, it should just have accepted a user
> space address range and then used pin_user_pages(FOLL_WRITE|FOLL_LONGTERM)
> to longterm-pin the pages "officially".
> 
> So what's the issue? Udma effectively pins pages longterm ("possibly
> forever") simply by grabbing a reference on them. These pages might easily
> reside in ZONE_MOVABLE or in MIGRATE_CMA pageblocks.
> 
> So what udmabuf does is break memory hotunplug and CMA, because it turns
> pages that have to remain movable unmovable.
> 
> In the pin_user_pages(FOLL_LONGTERM) case we make sure to migrate these
> pages. See mm/gup.c:check_and_migrate_movable_pages() and especially
> folio_is_longterm_pinnable(). We'd probably have to implement something
> similar for udmabuf, where we detect such unpinnable pages and migrate them.
> 
> 
> For example, pairing udmabuf with vfio (which pins pages using
> pin_user_pages(FOLL_LONGTERM)) in QEMU will most probably not work in all
> cases: if udmabuf longterm pinned the pages "the wrong way", vfio will fail
> to migrate them during FOLL_LONGTERM and consequently fail pin_user_pages().
> As long as udmabuf holds a reference on these pages, that will never
> succeed.

Uh this is no good and I totally missed this, because the very first
version of udmabuf used pin_user_pages(FOLL_LONGTERM). I think what we
need here as first fix is a shmem_pin_mapping_page_longterm that does all
the equivalent of pin_user_pages(FOLL_LONGTERM), and use it in udmabuf.
From a quick look the folio conversions that already landed should help
there.

It might also be good if we convert all the gpu driver users of
shmem_read_mapping_page over to that new shmem_pin_mapping_page_longterm,
just for safety. gpu drivers use a private shmem file and adjust the gfp
mask to clear __GFP_MOVABLE, so the biggest issues shouldn't be possible.
But pin(LONGTERM) compared to just getting a page ref has gained quite a
few other differences in the past years, and it would be good to be
consistent I think.

Anything other than longterm pins won't work for udmabuf, because the
locking between struct page/gup.c/mmu_notifier and dma_buf is rather
fundamentally (and by design due to gpu driver requirements) incompatible
with dma_buf locking rules.
 
> There are *probably* more issues on the QEMU side when udmabuf is paired
> with things like MADV_DONTNEED/FALLOC_FL_PUNCH_HOLE used for virtio-balloon,
> virtio-mem, postcopy live migration, ... for example, in the vfio/vdpa case
> we make sure that we disallow most of these, because otherwise there can be
> an accidental "disconnect" between the pages mapped into the VM (guest view)
> and the pages mapped into the IOMMU (device view), for example, after a
> reboot.

I think once we have the proper longterm pinning for udmabuf we need to
look into what coherency issues are left, and how to best fix them.
udmabuf already requires that the memfd is size sealed to avoid some
issues, we might need to require more. Or on the other side, perhaps
reject or quietly ignore some of the hole punching for longterm pinned
pages, to maintain coherency.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH -next] drm/hisilicon: Remove unused function declaration hibmc_mm_init()

2023-08-08 Thread Yue Haibing
Commit 552a77bab3ff ("drm/hisilicon: Delete the entire file hibmc_ttm.c")
removed the implementation but left the declaration.

Signed-off-by: Yue Haibing 
---
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h 
b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
index f957552c6c50..a95fe13aefff 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
@@ -57,8 +57,6 @@ void hibmc_set_current_gate(struct hibmc_drm_private *priv,
 
 int hibmc_de_init(struct hibmc_drm_private *priv);
 int hibmc_vdac_init(struct hibmc_drm_private *priv);
-
-int hibmc_mm_init(struct hibmc_drm_private *hibmc);
 int hibmc_ddc_create(struct drm_device *drm_dev, struct hibmc_connector 
*connector);
 
 #endif
-- 
2.34.1



Re: [PATCH RESEND v4 2/2] drm/mediatek: Fix iommu fault during crtc enabling

2023-08-08 Thread 林睿祥


Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS

2023-08-08 Thread James Zhu
I would like to know whether these kernel patches are accepted by everyone
and, if so, when they can go upstream.


I have an MR for libdrm that supports up to 2^MINORBITS DRM nodes and works
with these patches:


https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Thanks!

James

On 2023-08-08 09:55, Christian König wrote:

Am 28.07.23 um 16:22 schrieb Simon Ser:
On Thursday, July 27th, 2023 at 14:01, Christian König 
 wrote:


We do need patches to stop trying to infer the node type from the 
minor

in libdrm, though. Emil has suggested using sysfs, which we already do
in a few places in libdrm.

That sounds like a really good idea to me as well.

But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
should use drmGetDevices2() like Emil suggested?

DRM_MAX_MINOR has been bumped to 64 now.

With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
because there is no "max minor per type" concept anymore: the minor no
longer contains the type.

So I'd suggest leaving it as-is (so that old apps still continue to
work on systems with < 64 devices like they do today) and mark it as
deprecated.


Sounds like a plan to me.

Regards,
Christian.


[PATCH] PCI/VGA: Make the vga_is_firmware_default() less arch-dependent

2023-08-08 Thread Sui Jingfeng
Currently, the vga_is_firmware_default() function works only on x86 and
IA64 architectures, but it is a no-op on ARM64, PPC, RISC-V, etc. This
patch completes the implementation by tracking the firmware framebuffer's
address range. The added code is trying to identify the VRAM aperture that
contains the firmware framebuffer. Once found, related information about
the VRAM aperture will be tracked.

Note that we need to identify the VRAM aperture before it gets moved. We
achieve this by using DECLARE_PCI_FIXUP_CLASS_HEADER(), which ensures that
vga_arb_firmware_fb_addr_tracker() gets called before PCI resource
allocation. Once we have found the VRAM aperture that contains the firmware
fb, we are able to monitor changes to its address. If the VRAM aperture of
the primary GPU does move, we update our cached firmware framebuffer
address range accordingly. This approach overcomes the VRAM BAR relocation
issue. Hence, this patch makes the vga_is_firmware_default() function work
on any arch that has UEFI GOP support, including x86 and IA64. But, as a
first step, we make it available only on platforms where PCI resource
relocation does happen. Once the provided method has proved to be effective
and reliable, it can be expanded to other arches easily.
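
The rebasing idea described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the patch's code: the names
fb_tracker, fb_base, fb_track and fb_range are hypothetical, the "valid" flag
stands in for the pdev != NULL check, and the aperture is passed in as plain
start/end addresses instead of being read from a PCI resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the tracked state. */
struct fb_tracker {
	bool     valid;  /* stands in for "pdev is not NULL" */
	uint64_t offset; /* firmware fb offset from the aperture start */
	uint64_t size;   /* firmware fb size, in bytes */
	uint64_t start;  /* cached fb range, inclusive */
	uint64_t end;
};

/* Combine the 32-bit base with its upper half when a 64-bit base is flagged. */
static uint64_t fb_base(uint32_t lfb_base, uint32_t ext_lfb_base, bool has_64bit_base)
{
	uint64_t base = lfb_base;

	if (has_64bit_base)
		base |= (uint64_t)ext_lfb_base << 32;
	return base;
}

/* Before relocation: remember where the fb sits inside the aperture. */
static bool fb_track(struct fb_tracker *t, uint64_t bar_start, uint64_t bar_end,
		     uint64_t fb_start, uint64_t fb_size)
{
	uint64_t fb_end;

	assert(t);
	if (!fb_start || !fb_size)
		return false; /* no firmware framebuffer */

	fb_end = fb_start + fb_size - 1;
	if (fb_start < bar_start || fb_end > bar_end)
		return false; /* this aperture does not contain the fb */

	t->valid = true;
	t->offset = fb_start - bar_start;
	t->size = fb_size;
	t->start = fb_start;
	t->end = fb_end;
	return true;
}

/* After relocation: rebase the cached range onto the new aperture start. */
static bool fb_range(struct fb_tracker *t, uint64_t new_bar_start,
		     uint64_t *start, uint64_t *end)
{
	assert(t && start && end);
	if (!t->valid)
		return false;

	t->start = new_bar_start + t->offset;
	t->end = t->start + t->size - 1;
	*start = t->start;
	*end = t->end;
	return true;
}
```

Storing the offset rather than the absolute address is what makes the cached
range survive a BAR move: only the aperture base changes, so the range can be
recomputed at query time.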

This patch is tested on
1) LS3A5000+LS7A2000 and LS3A5000+LS7A1000 platform.
2) Intel i3-8100 CPU + H110 D4L motherboard with triple video cards:

$ lspci | grep VGA

Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470] (rev cf)
ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)

Note that on x86, in order to test the new approach this patch provides,
we removed the vga_arb_get_fb_range_from_screen_info() call in the
vga_is_firmware_default() function, as follows.

-#if defined(CONFIG_X86) || defined(CONFIG_IA64)
-   ret = vga_arb_get_fb_range_from_screen_info(_start, _end);
-#else
ret = vga_arb_get_fb_range_from_tracker(_start, _end);
-#endif

It is just that we have not observed a case on x86 where the VRAM BAR of a
VGA-compatible controller moves, so there is no need to unify the two paths.
But on LoongArch, the VRAM BAR of AMDGPU does move.

v2:
* Fix test robot warnings and fix typos

v3:
* Fix linkage problems if the global screen_info is not exported

Signed-off-by: Sui Jingfeng 
---
 drivers/pci/vgaarb.c | 154 ++-
 1 file changed, 139 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 5a696078b382..e0919a70af3e 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -61,6 +61,92 @@ static bool vga_arbiter_used;
 static DEFINE_SPINLOCK(vga_lock);
 static DECLARE_WAIT_QUEUE_HEAD(vga_wait_queue);
 
+static struct firmware_fb_tracker {
+   /* The PCI(e) device who owns the firmware framebuffer */
+   struct pci_dev *pdev;
+   /* The index of the VRAM Bar */
+   unsigned int bar;
+   /* Firmware fb's offset from the VRAM aperture start */
+   resource_size_t offset;
+   /* The firmware fb's size, in bytes */
+   resource_size_t size;
+
+   /* Firmware fb's address range, suffer from change */
+   resource_size_t start;
+   resource_size_t end;
+} firmware_fb;
+
+/*
+ * Get the physical address range that the firmware framebuffer occupies.
+ *
+ * The global screen_info is arch-specific; it will not be exported if the
+ * CONFIG_EFI is not selected on Arm64. Hence, CONFIG_EFI is chosen as
+ * compile-time conditional to suppress linkage problems. This guard can be
+ * removed if the global screen_info became arch-independent one day.
+ */
+static bool vga_arb_get_fb_range_from_screen_info(resource_size_t *start,
+ resource_size_t *end)
+{
+   resource_size_t fb_start = 0;
+   resource_size_t fb_size = 0;
+   resource_size_t fb_end;
+
+#if defined(CONFIG_EFI)
+   fb_start = screen_info.lfb_base;
+   if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
+   fb_start |= (u64)screen_info.ext_lfb_base << 32;
+
+   fb_size = screen_info.lfb_size;
+#endif
+
+   /* No firmware framebuffer support */
+   if (!fb_start || !fb_size)
+   return false;
+
+   fb_end = fb_start + fb_size - 1;
+
+   *start = fb_start;
+   *end = fb_end;
+
+   return true;
+}
+
+static bool vga_arb_get_fb_range_from_tracker(resource_size_t *start,
+ resource_size_t *end)
+{
+   struct pci_dev *pdev = firmware_fb.pdev;
+   resource_size_t new_vram_base;
+   resource_size_t new_fb_start;
+   resource_size_t old_fb_start;
+   resource_size_t old_fb_end;
+
+   /*
+* No firmware framebuffer support or no aperture that contains the
+* firmware FB is found. In this case, the firmware_fb.pdev will be
+* NULL. We will return immediately.
+*/
+   if (!pdev)
+  

Re: [PATCH 4/8] drm/sched: Add generic scheduler message interface

2023-08-08 Thread Christian König

Am 08.08.23 um 16:06 schrieb Matthew Brost:

[SNIP]

Basically workqueues are the in kernel infrastructure for exactly that use
case and we are trying to re-create that here and that is usually a rather
bad idea.


Ok let me play around with what this would look like in Xe, what you are
suggesting would be ordered-wq per scheduler, work item for run job,
work item for clean up job, and work item for a message. That might
work I suppose? Only issue I see is scaling as this exposes an
ordered-wq creation directly to an IOCTL. No idea if that is actually a
concern though.


That's a very good question that I can't answer offhand either.

But from the history of work queues I know that they were invented to 
reduce the overhead/costs of having many kernel threads.


So my educated guess is that you probably won't find anything better at 
the moment. If work queues then indeed don't match this use case then we 
need to figure out how to improve them or find a different solution.


Christian.



Matt


Regards,
Christian.


Matt


Or what am I missing?


Regards,
Christian.


Worst case I think this isn't a dead-end and can be refactored to
internally use the workqueue services, with the new functions here
just being dumb wrappers until everyone is converted over. So it
doesn't look like an expensive mistake, if it turns out to be a
mistake.
-Daniel



Regards,
Christian.


Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/scheduler/sched_main.c | 52 +-
 include/drm/gpu_scheduler.h| 29 +-
 2 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 2597fb298733..84821a124ca2 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1049,6 +1049,49 @@ drm_sched_pick_best(struct drm_gpu_scheduler 
**sched_list,
 }
 EXPORT_SYMBOL(drm_sched_pick_best);

+/**
+ * drm_sched_add_msg - add scheduler message
+ *
+ * @sched: scheduler instance
+ * @msg: message to be added
+ *
+ * Can and will pass any jobs waiting on dependencies or in a runnable queue.
+ * Message processing will stop if the scheduler run wq is stopped and resume
+ * when the run wq is started.
+ */
+void drm_sched_add_msg(struct drm_gpu_scheduler *sched,
+struct drm_sched_msg *msg)
+{
+ spin_lock(&sched->job_list_lock);
+ list_add_tail(&msg->link, &sched->msgs);
+ spin_unlock(&sched->job_list_lock);
+
+ drm_sched_run_wq_queue(sched);
+}
+EXPORT_SYMBOL(drm_sched_add_msg);
+
+/**
+ * drm_sched_get_msg - get scheduler message
+ *
+ * @sched: scheduler instance
+ *
+ * Returns NULL or message
+ */
+static struct drm_sched_msg *
+drm_sched_get_msg(struct drm_gpu_scheduler *sched)
+{
+ struct drm_sched_msg *msg;
+
+ spin_lock(&sched->job_list_lock);
+ msg = list_first_entry_or_null(&sched->msgs,
+struct drm_sched_msg, link);
+ if (msg)
+ list_del(&msg->link);
+ spin_unlock(&sched->job_list_lock);
+
+ return msg;
+}
+
 /**
  * drm_sched_main - main scheduler thread
  *
@@ -1060,6 +1103,7 @@ static void drm_sched_main(struct work_struct *w)
 container_of(w, struct drm_gpu_scheduler, work_run);
 struct drm_sched_entity *entity;
 struct drm_sched_job *cleanup_job;
+ struct drm_sched_msg *msg;
 int r;

 if (READ_ONCE(sched->pause_run_wq))
@@ -1067,12 +1111,15 @@ static void drm_sched_main(struct work_struct *w)

 cleanup_job = drm_sched_get_cleanup_job(sched);
 entity = drm_sched_select_entity(sched);
+ msg = drm_sched_get_msg(sched);

- if (!entity && !cleanup_job)
+ if (!entity && !cleanup_job && !msg)
 return; /* No more work */

 if (cleanup_job)
 sched->ops->free_job(cleanup_job);
+ if (msg)
+ sched->ops->process_msg(msg);

 if (entity) {
 struct dma_fence *fence;
@@ -1082,7 +1129,7 @@ static void drm_sched_main(struct work_struct *w)
 sched_job = drm_sched_entity_pop_job(entity);
 if (!sched_job) {
 complete_all(&entity->entity_idle);
- if (!cleanup_job)
+ if (!cleanup_job && !msg)
 return; /* No more work */
 goto again;
 }
@@ -1177,6 +1224,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,

 init_waitqueue_head(&sched->job_scheduled);
 INIT_LIST_HEAD(&sched->pending_list);
+ INIT_LIST_HEAD(&sched->msgs);
 spin_lock_init(&sched->job_list_lock);
 atomic_set(&sched->hw_rq_count, 0);
 INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index df1993dd44ae..267bd060d178 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -394,6 +394,23 @@ enum drm_gpu_sched_stat {
 

Re: [PATCH 4/8] drm/sched: Add generic scheduler message interface

2023-08-08 Thread Matthew Brost
On Mon, Aug 07, 2023 at 05:46:16PM +0200, Christian König wrote:
> Am 04.08.23 um 16:13 schrieb Matthew Brost:
> > [SNIP]
> > Christian / Daniel - I've read both of your comments and am having a hard
> > time parsing them. I do not really understand the issue with this patch
> > or exactly what is being suggested instead. Let's try to work through
> > this.
> > 
> > > > > > I'm still extremely frowned on this.
> > > > > > 
> > > > > > If you need this functionality then let the drivers decide which
> > > > > > runqueue the scheduler should use.
> > What do you mean by runqueue here? Do you mean 'struct
> > workqueue_struct'? The scheduler in this context is 'struct
> > drm_gpu_scheduler', right?
> 
> Sorry for the confusing wording, your understanding is correct.
> 
> > Yes, we have added this functionality in the first patch.
> > 
> > > > > > When you then create a single threaded runqueue you can just submit 
> > > > > > work
> > > > > > to it and serialize this with the scheduler work.
> > > > > > 
> > We don't want to use a single threaded workqueue_struct in Xe, we want
> > to use a system_wq as run_job() can be executed in parallel across
> > multiple entities (or drm_gpu_scheduler as in Xe we have 1 to 1
> > relationship between entity and scheduler). What we want is on per
> > entity / scheduler granularity to be able to communicate into the
> > backend a message synchronously (run_job / free_job not executing,
> > scheduler execution not paused for a reset).
> > 
> > If I'm understanding what you're suggesting, in Xe we'd create an ordered
> > workqueue_struct per drm_gpu_scheduler and then queue messages on the
> > ordered workqueue_struct?
> 
> Yes, correct.
> 
> > This seems pretty messy to me as now we have
> > open coded a solution bypassing the scheduler, every drm_gpu_scheduler
> > creates its own workqueue_struct, and we'd also have to open code the
> > pausing of these messages for resets too.
> > 
> > IMO this is pretty clean solution that follows the pattern of cleanup
> > jobs already in place.
> 
> Yeah, exactly that's the point. Moving the job cleanup into the scheduler
> thread is seen as very very bad idea by me.
> 
> And I really don't want to exercise that again for different use cases.
> 
> > 
> > > > > > This way we wouldn't duplicate this core kernel function inside the
> > > > > > scheduler.
> > > > > Yeah that's essentially the design we picked for the tdr workers,
> > > > > where some drivers have requirements that all tdr work must be done on
> > > > > the same thread (because of cross-engine coordination issues). But
> > > > > that would require that we rework the scheduler as a pile of
> > > > > self-submitting work items, and I'm not sure that actually fits all
> > > > > that well into the core workqueue interfaces either.
> > This is the ordering between TDRs firing between different
> > drm_gpu_scheduler and larger external resets (GT in Xe) an ordered
> > workqueue_struct makes sense for this. Here we are talking about
> > ordering jobs and messages within a single drm_gpu_scheduler. Using the
> > main execution thread to do ordering makes sense in my opinion.
> 
> I completely disagree to that.
> 
> Take a look at how this came to be. This is a very very ugly hack and we
> already had a hard time making lockdep understand the different fence
> signaling dependencies with freeing the job and I'm pretty sure that is
> still not 100% correct.
> 
> > 
> > > > There were already patches floating around which did exactly that.
> > > > 
> > > > Last time I checked those were actually looking pretty good.
> > > > 
> > Link to patches for reference.
> > 
> > > > Additional to message passing advantage the real big issue with the
> > > > scheduler and 1 to 1 mapping is that we create a kernel thread for each
> > > > instance, which results in tons of overhead.
> > First patch in the series switches from kthread to work queue, that is
> > still a good idea.
> 
> This was the patch I was referring to. Sorry, I didn't remember that this was
> in the same patch set.
> 
> > 
> > > > Just using a work item which is submitted to a work queue completely 
> > > > avoids
> > > > that.
> > > Hm I should have read the entire series first, since that does the
> > > conversion still. Apologies for the confusion, and yeah we should be able
> > > to just submit other work to the same wq with the first patch? And so
> > > hand-rolling this infra here isn't needed at all?
> > > 
> > I wouldn't call this hand rolling, rather it follows a pattern already in
> > place.
> 
> Basically workqueues are the in kernel infrastructure for exactly that use
> case and we are trying to re-create that here and that is usually a rather
> bad idea.
> 

OK, let me play around with what this would look like in Xe. What you are
suggesting would be an ordered wq per scheduler, with a work item for run job,
a work item for clean up job, and a work item for a message. That might
work, I suppose? The only issue I see is scaling, as this exposes an

Re: [PATCH v6 3/4] drm: Expand max DRM device number to full MINORBITS

2023-08-08 Thread Christian König

Am 28.07.23 um 16:22 schrieb Simon Ser:

On Thursday, July 27th, 2023 at 14:01, Christian König wrote:


We do need patches to stop trying to infer the node type from the minor
in libdrm, though. Emil has suggested using sysfs, which we already do
in a few places in libdrm.

That sounds like a really good idea to me as well.

But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
should use drmGetDevices2() like Emil suggested?

DRM_MAX_MINOR has been bumped to 64 now.

With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
because there is no "max minor per type" concept anymore: the minor no
longer contains the type.

So I'd suggest leaving it as-is (so that old apps still continue to
work on systems with < 64 devices like they do today) and mark it as
deprecated.


Sounds like a plan to me.

Regards,
Christian.
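
For reference, the kind of inference that libdrm is being asked to stop doing can be sketched as below. This is a minimal illustration, assuming the legacy layout in which each node type owns a block of 64 minors (primary from 0, control from 64, render from 128). The enum and function names are made up for the example and are not libdrm's API.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum legacy_node_type {
	LEGACY_NODE_INVALID = -1,
	LEGACY_NODE_PRIMARY = 0,
	LEGACY_NODE_CONTROL = 1,
	LEGACY_NODE_RENDER  = 2,
};

#define LEGACY_MINORS_PER_TYPE 64

/* Legacy assumption: the node type is encoded in the minor's 64-wide block. */
enum legacy_node_type legacy_type_from_minor(int minor)
{
	int type;

	if (minor < 0)
		return LEGACY_NODE_INVALID;

	type = minor / LEGACY_MINORS_PER_TYPE;
	if (type > LEGACY_NODE_RENDER)
		return LEGACY_NODE_INVALID;

	return (enum legacy_node_type)type;
}

/* Minors 0, 64 and 128 land in three different blocks. */
void legacy_type_example(void)
{
	assert(legacy_type_from_minor(0) == LEGACY_NODE_PRIMARY);
	assert(legacy_type_from_minor(64) == LEGACY_NODE_CONTROL);
	assert(legacy_type_from_minor(128) == LEGACY_NODE_RENDER);
}
```

Once the minor no longer contains the type, a mapping like this would presumably misclassify or reject valid nodes, which is why looking the type up via sysfs or drmGetDevices2() is the suggested direction.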


Re: 2b5d1c29f6c4 ("drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts")

2023-08-08 Thread Borislav Petkov
On Tue, Aug 08, 2023 at 12:39:32PM +0200, Karol Herbst wrote:
> ahh, that would have been good to know :)

Yeah, I didn't see it before - it would only freeze. Only after I added
the printk you requested.

> Mind figuring out what's exactly NULL inside nvif_object_mthd? Or
> rather what line `nvif_object_mthd+0x136` belongs to, then it should
> be easy to figure out what's wrong here.

That looks like this:

816ddfee:   e8 8d 04 4e 00          callq  81bbe480 <__memcpy>
816ddff3:   41 8d 56 20             lea    0x20(%r14),%edx
816ddff7:   49 8b 44 24 08          mov    0x8(%r12),%rax
816ddffc:   83 fa 17                cmp    $0x17,%edx
816ddfff:   76 7d                   jbe    816de07e
816de001:   49 39 c4                cmp    %rax,%r12
816de004:   74 45                   je     816de04b

<--- RIP points here.

The 0x20 also fits the deref address: 0020.

Which means %rax is 0. Yap.

816de006:   48 8b 78 20             mov    0x20(%rax),%rdi
816de00a:   4c 89 64 24 10          mov    %r12,0x10(%rsp)
816de00f:   48 8b 40 38             mov    0x38(%rax),%rax
816de013:   c6 44 24 06 ff          movb   $0xff,0x6(%rsp)
816de018:   31 c9                   xor    %ecx,%ecx
816de01a:   48 89 e6                mov    %rsp,%rsi
816de01d:   48 8b 40 28             mov    0x28(%rax),%rax
816de021:   e8 3a 0c 4f 00          callq  81bcec60 <__x86_indirect_thunk_array>


Now, the preprocessed asm version of nvif/object.c says around here:


    call    memcpy  #
# drivers/gpu/drm/nouveau/nvif/object.c:160: ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);
    leal    32(%r14), %edx  #, _108
# drivers/gpu/drm/nouveau/nvif/object.c:33: struct nvif_client *client = object->client;
    movq    8(%r12), %rax   # object_19(D)->client, client
# drivers/gpu/drm/nouveau/nvif/object.c:38: if (size >= sizeof(*args) && args->v0.version == 0) {
    cmpl    $23, %edx       #, _108
    jbe     .L69    #,
# drivers/gpu/drm/nouveau/nvif/object.c:39: if (object != &client->object)
    cmpq    %rax, %r12      # client, object
    je      .L70    #,
# drivers/gpu/drm/nouveau/nvif/object.c:47: return client->driver->ioctl(client->object.priv, data, size, hack);
    movq    32(%rax), %rdi  # client_109->object.priv, client_109->object.priv


So I'd say that client is NULL. IINM.


    movq    %r12, 16(%rsp)  # object, MEM[(union  *)].v0.object
# drivers/gpu/drm/nouveau/nvif/object.c:47: return client->driver->ioctl(client->object.priv, data, size, hack);
    movq    56(%rax), %rax  # client_109->driver, client_109->driver
# drivers/gpu/drm/nouveau/nvif/object.c:43: args->v0.owner = NVIF_IOCTL_V0_OWNER_ANY;
    movb    $-1, 6(%rsp)    #, MEM[(union  *)].v0.owner
.L64:
# drivers/gpu/drm/nouveau/nvif/object.c:47: return client->driver->ioctl(client->object.priv, data, size, hack);
    xorl    %ecx, %ecx      #
    movq    %rsp, %rsi      #,
    movq    40(%rax), %rax  #, _77->ioctl
    call    __x86_indirect_thunk_rax
# drivers/gpu/drm/nouveau/nvif/object.c:161: memcpy(data, args->mthd.data, size);

> > [4.144676] #PF: supervisor read access in kernel mode
> > [4.144676] #PF: error_code(0x) - not-present page
> > [4.144676] PGD 0 P4D 0
> > [4.144676] Oops:  [#1] PREEMPT SMP PTI
> > [4.144676] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc5-dirty #1
> > [4.144676] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 
> > 05/11/2014
> > [4.144676] RIP: 0010:nvif_object_mthd+0x136/0x1e0
> > [4.144676] Code: f2 4c 89 ee 48 8d 7c 24 20 66 89 04 24 c6 44 24 18 00 
> > e8 8d 04 4e 00 41 8d 56 20 49 8b 44 24 08 83 fa 17 76 7d 49 39 c4 74 45 
> > <48> 8b 78 20 4c 89 64 24 10 48 8b 40 38 c6 44 24 06 ff 31 c9 48 89

Opcode bytes around RIP look correct too:

./scripts/decodecode < /tmp/oops
[ 4.144676] Code: f2 4c 89 ee 48 8d 7c 24 20 66 89 04 24 c6 44 24 18 00 e8 8d 
04 4e 00 41 8d 56 20 49 8b 44 24 08 83 fa 17 76 7d 49 39 c4 74 45 <48> 8b 78 20 
4c 89 64 24 10 48 8b 40 38 c6 44 24 06 ff 31 c9 48 89
All code

   0:   f2 4c 89 ee             repnz mov %r13,%rsi
   4:   48 8d 7c 24 20          lea    0x20(%rsp),%rdi
   9:   66 89 04 24             mov    %ax,(%rsp)
   d:   c6 44 24 18 00          movb   $0x0,0x18(%rsp)
  12:   e8 8d 04 4e 00          callq  0x4e04a4
  17:   41 8d 56 20             lea    0x20(%r14),%edx
  1b:   49 8b 44 24 08          mov    0x8(%r12),%rax
  20:   83 fa 17                cmp    $0x17,%edx
  23:   76 7d                   jbe    0xa2
  25:   49 39 c4                cmp    %rax,%r12
  28:   74 45                   je     0x6f
  2a:*  48 8b 78 20             mov    0x20(%rax),%rdi  <-- trapping
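
The arithmetic behind this diagnosis can be checked with a small sketch. The structs below are hypothetical stand-ins, not the real nvif definitions: their fields are placed only to match the offsets visible in the listing (object->client at 0x8, client->object.priv at 0x20, client->driver at 0x38). With a NULL client, the load of client->object.priv touches address 0 + 0x20, which fits the reported deref address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, not copied from the nouveau headers. Fixed-width
 * fields keep the offsets independent of the pointer size.
 */
struct fake_object {
	uint64_t parent;	/* 0x00 (placeholder) */
	uint64_t client;	/* 0x08: loaded by mov 0x8(%r12),%rax */
	uint64_t pad0[2];	/* 0x10..0x1f */
	uint64_t priv;		/* 0x20: loaded by mov 0x20(%rax),%rdi */
	uint64_t pad1[2];	/* 0x28..0x37 */
};

struct fake_client {
	struct fake_object object;	/* 0x00, so object.priv sits at 0x20 */
	uint64_t driver;		/* 0x38: loaded by mov 0x38(%rax),%rax */
};

/* Address touched by loading a field, given the base pointer value. */
uint64_t field_address(uint64_t base, size_t field_offset)
{
	return base + field_offset;
}

/* Offset of client->object.priv from the start of the client. */
size_t client_priv_offset(void)
{
	return offsetof(struct fake_client, object) +
	       offsetof(struct fake_object, priv);
}

/* A NULL client (%rax == 0) faults at exactly the priv offset. */
void null_client_example(void)
{
	assert(offsetof(struct fake_object, client) == 0x08);
	assert(offsetof(struct fake_client, driver) == 0x38);
	assert(field_address(0, client_priv_offset()) == 0x20);
}
```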

Re: [PATCH v4 03/17] drm/imagination/uapi: Add PowerVR driver UAPI

2023-08-08 Thread Michel Dänzer
On 7/14/23 16:25, Sarah Walker wrote:
> 
> +/**
> + * DOC: PowerVR IOCTL CREATE_BO interface
> + */
> +
> +/**
> + * DOC: Flags for CREATE_BO
> + *
> + * The  drm_pvr_ioctl_create_bo_args.flags field is 64 bits wide and 
> consists
> + * of three groups of flags: creation, device mapping and CPU mapping.
> + *
> + * We use "device" to refer to the GPU here because of the ambiguity between
> + * CPU and GPU in some fonts.
> + *
> + * Creation options
> + *These use the prefix ``DRM_PVR_BO_CREATE_``.
> + *
> + *:ZEROED: Require the allocated buffer to be zeroed before returning. 
> Note
> + *  that this is an active operation, and is never zero cost. Unless it 
> is
> + *  explicitly required, this option should not be set.

Making this optional is kind of problematic from a security standpoint 
(information leak, at least if the memory was previously used by a different 
process). See e.g. the discussion starting at 
https://gitlab.freedesktop.org/mesa/mesa/-/issues/9189#note_1972986 .

AFAICT the approach I suggested there (Clear freed memory in the background, 
and make it available for allocation again only once the clear has finished) 
isn't really possible with gem_shmem in its current state though. There seems 
to be ongoing work to do something like that for __GFP_ZERO in general, maybe 
gem_shmem could take advantage of that when it lands. I'm afraid this series 
can't depend on that though.


> +/**
> + * DOC: PowerVR IOCTL VM_MAP and VM_UNMAP interfaces
> + *
> + * The VM UAPI allows userspace to create buffer object mappings in GPU 
> virtual address space.
> + *
> + * The client is responsible for managing GPU address space. It should 
> allocate mappings within
> + * the heaps returned by %DRM_PVR_DEV_QUERY_HEAP_INFO_GET.
> + *
> + * %DRM_IOCTL_PVR_VM_MAP creates a new mapping. The client provides the 
> target virtual address for
> + * the mapping. Size and offset within the mapped buffer object can be 
> specified, so the client can
> + * partially map a buffer.
> + *
> + * %DRM_IOCTL_PVR_VM_UNMAP removes a mapping. The entire mapping will be 
> removed from GPU address
> + * space. For this reason only the start address is provided by the client.
> + */

FWIW, the amdgpu driver uses a single ioctl for VM map & unmap (plus two 
additional operations for partial residency). Maybe this would make sense for 
the PowerVR driver as well, in particular if it might support partial residency 
in the future.

(amdgpu also uses similar multiplexer ioctls for other things such as context 
create/destroy/...)

Just an idea, feel free to ignore.
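
As a rough illustration of the multiplexed style mentioned above, a single ioctl argument struct could carry an operation code next to the mapping parameters. Everything below is hypothetical (the example_vm_* names, the field layout, the toy one-slot address space). It is neither the amdgpu nor the PowerVR UAPI, only a sketch of the dispatch pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical operation codes for a single multiplexed VM ioctl. */
enum example_vm_op {
	EXAMPLE_VM_OP_MAP   = 1,
	EXAMPLE_VM_OP_UNMAP = 2,
};

/* Hypothetical argument struct shared by all operations. */
struct example_vm_args {
	uint32_t op;		/* one of enum example_vm_op */
	uint32_t handle;	/* buffer object handle (unused by UNMAP) */
	uint64_t device_addr;	/* start address in device address space */
	uint64_t offset;	/* offset into the buffer (unused by UNMAP) */
	uint64_t size;		/* mapping size (unused by UNMAP) */
};

/* Toy address space that tracks a single mapping. */
struct example_vm {
	int mapped;
	uint64_t device_addr;
	uint64_t size;
};

/* One entry point dispatching on args->op instead of one ioctl per operation. */
int example_vm_ioctl(struct example_vm *vm, const struct example_vm_args *args)
{
	switch (args->op) {
	case EXAMPLE_VM_OP_MAP:
		if (vm->mapped || args->size == 0)
			return -EINVAL;
		vm->mapped = 1;
		vm->device_addr = args->device_addr;
		vm->size = args->size;
		return 0;
	case EXAMPLE_VM_OP_UNMAP:
		/* Only the start address identifies the mapping to remove. */
		if (!vm->mapped || vm->device_addr != args->device_addr)
			return -ENOENT;
		vm->mapped = 0;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Map, unmap, then show that a second unmap is rejected. */
void example_vm_usage(void)
{
	struct example_vm vm = { 0 };
	struct example_vm_args args = {
		.op = EXAMPLE_VM_OP_MAP,
		.handle = 1,
		.device_addr = 0x10000,
		.offset = 0,
		.size = 0x1000,
	};

	assert(example_vm_ioctl(&vm, &args) == 0);
	args.op = EXAMPLE_VM_OP_UNMAP;
	assert(example_vm_ioctl(&vm, &args) == 0);
	assert(example_vm_ioctl(&vm, &args) == -ENOENT);
}
```

New operations (for example partial residency) would then be added as further op codes rather than as new ioctl numbers.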


> +/**
> + * DOC: Flags for SUBMIT_JOB ioctl geometry command.
> + *
> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_FIRST
> + *
> + *Indicates if this is the first command to be issued for a render.
> + *
> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_LAST

Does user space really need to pass in the FIRST/LAST flags, can't the kernel 
driver determine this implicitly? What happens if user space sets these 
incorrectly?


> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_FRAG_CMD_PREVENT_CDM_OVERLAP
> + *
> + *Disallow compute overlapped with this render.

Does this affect only compute from the same context, or also from other 
contexts?

(Similar question for DRM_PVR_SUBMIT_JOB_COMPUTE_CMD_PREVENT_ALL_OVERLAP)


P.S. I mostly just skimmed the other patches of the series, but my impression 
is that the patches and code are cleanly structured and well-documented.

-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v2 0/2] pwm: Manage owner assignment implicitly for drivers

2023-08-08 Thread Linus Walleij
On Fri, Aug 4, 2023 at 4:28 PM Uwe Kleine-König wrote:

> (implicit) v1 of this series can be found at
> https://lore.kernel.org/linux-pwm/20230803140633.138165-1-u.kleine-koe...@pengutronix.de
>  .
>
> Changes since then only affect documentation that I missed to adapt before.
> Thanks to Laurent for catching that
>
> Best regards
> Uwe
>
> Uwe Kleine-König (2):
>   pwm: Manage owner assignment implicitly for drivers
>   pwm: crc: Allow compilation as module and with COMPILE_TEST

Clearly the right thing to do! Nice patches.
Reviewed-by: Linus Walleij 

Yours,
Linus Walleij


Re: [PATCH] drm/test: drm_exec: fix memory leak on object prepare

2023-08-08 Thread Christian König

Am 28.07.23 um 01:10 schrieb Danilo Krummrich:

drm_exec_prepare_obj() and drm_exec_prepare_array() both reserve
dma-fence slots and hence a dma_resv_list without ever freeing it.

Make sure to call drm_gem_private_object_fini() for each GEM object
passed to drm_exec_prepare_obj()/drm_exec_prepare_array() throughout the
test to fix this up.

While at it, remove some trailing empty lines.

Fixes: 9710631cc8f3 ("drm: add drm_exec selftests v4")
Signed-off-by: Danilo Krummrich 


Thanks, can you please rebase on current drm-misc-next and re-send.

Thanks,
Christian.


---
  drivers/gpu/drm/tests/drm_exec_test.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tests/drm_exec_test.c 
b/drivers/gpu/drm/tests/drm_exec_test.c
index df31f89a7945..80761e734a15 100644
--- a/drivers/gpu/drm/tests/drm_exec_test.c
+++ b/drivers/gpu/drm/tests/drm_exec_test.c
@@ -118,8 +118,6 @@ static void test_duplicates(struct kunit *test)
	drm_exec_fini(&exec);
  }
  
-

-
  static void test_prepare(struct kunit *test)
  {
struct drm_gem_object gobj = { };
@@ -137,6 +135,8 @@ static void test_prepare(struct kunit *test)
break;
}
	drm_exec_fini(&exec);
+
+   drm_gem_private_object_fini(&gobj);
  }
  
  static void test_prepare_array(struct kunit *test)

@@ -156,6 +156,9 @@ static void test_prepare_array(struct kunit *test)
 1);
KUNIT_EXPECT_EQ(test, ret, 0);
	drm_exec_fini(&exec);
+
+   drm_gem_private_object_fini(&gobj1);
+   drm_gem_private_object_fini(&gobj2);
  }
  
  static struct kunit_case drm_exec_tests[] = {




Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-08 Thread Jason Gunthorpe
On Tue, Aug 08, 2023 at 07:37:19AM +, Kasireddy, Vivek wrote:
> Hi Jason,
> 
> > 
> > > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
> > > Although, I do not have THP enabled (or built-in), shmem does not evict
> > > the pages after hole punch as noted in the comment in shmem_fallocate():
> > 
> > This is the source of all your problems.
> > 
> > Things that are mm-centric are supposed to track the VMAs and changes to
> > the PTEs. If you do something in userspace and it doesn't cause the
> > CPU page tables to change then it certainly shouldn't cause any mmu
> > notifiers or hmm_range_fault changes.
> I am not doing anything out of the blue in the userspace. I think the behavior
> I am seeing with shmem (where an invalidation event (MMU_NOTIFY_CLEAR)
> does occur because of a hole punch but the PTEs don't really get updated)
> can arguably be considered an optimization. 

Your explanations don't make sense.

If MMU_NOTIFY_CLEAR was sent but the PTEs were left present then:

> > There should still be an invalidation notifier at some point when the
> > CPU tables do eventually change, whenever that is. Missing that
> > notification would be a bug.
> I clearly do not see any notification getting triggered (from both 
> shmem_fault()
> and hugetlb_fault()) when the PTEs do get updated as the hole is refilled
> due to writes. Are you saying that there needs to be an invalidation event
> (MMU_NOTIFY_CLEAR?) dispatched at this point?

You don't get to get shmem_fault in the first place.

If they were marked non-present during the CLEAR then the shadow side
remains non-present until it gets its own fault.

If they were made non-present without an invalidation then that is a
bug.

> > hmm_range_fault() is the correct API to use if you are working with
> > notifiers. Do not hack something together using pin_user_pages.

> I noticed that hmm_range_fault() does not seem to be working as expected
> given that it gets stuck(hangs) while walking hugetlb pages.

You are the first to report that, it sounds like a serious bug. Please
try to fix it.

> Regardless, as I mentioned above, the lack of notification when PTEs
> do get updated due to writes is the crux of the issue
> here. Therefore, AFAIU, triggering an invalidation event or some
> other kind of notification would help in fixing this issue.

You seem to be facing some kind of bug in the mm, it sounds pretty
serious, and it almost certainly is a missing invalidation.

Basically, anything that changes a PTE must eventually trigger an
invalidation. It is illegal to change a PTE from one present value to
another present value without invalidation notification.

It is not surprising something would be missed here.

Jason


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-08 Thread Sebastian Wick
On Fri, Aug 4, 2023 at 3:03 PM Daniel Vetter  wrote:
>
> On Tue, Jun 27, 2023 at 10:23:23AM -0300, André Almeida wrote:
> > Create a section that specifies how to deal with DRM device resets for
> > kernel and userspace drivers.
> >
> > Acked-by: Pekka Paalanen 
> > Signed-off-by: André Almeida 
> > ---
> >
> > v4: 
> > https://lore.kernel.org/lkml/20230626183347.55118-1-andrealm...@igalia.com/
> >
> > Changes:
> >  - Grammar fixes (Randy)
> >
> >  Documentation/gpu/drm-uapi.rst | 68 ++
> >  1 file changed, 68 insertions(+)
> >
> > diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> > index 65fb3036a580..3cbffa25ed93 100644
> > --- a/Documentation/gpu/drm-uapi.rst
> > +++ b/Documentation/gpu/drm-uapi.rst
> > @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third 
> > handler for
> >  mmapped regular files. Threads cause additional pain with signal
> >  handling as well.
> >
> > +Device reset
> > +
> > +
> > +The GPU stack is really complex and is prone to errors, from hardware bugs,
> > +faulty applications and everything in between the many layers. Some errors
> > +require resetting the device in order to make the device usable again. This
> > +sections describes the expectations for DRM and usermode drivers when a
> > +device resets and how to propagate the reset status.
> > +
> > +Kernel Mode Driver
> > +--
> > +
> > +The KMD is responsible for checking if the device needs a reset, and to 
> > perform
> > +it as needed. Usually a hang is detected when a job gets stuck executing. 
> > KMD
> > +should keep track of resets, because userspace can query any time about the
> > +reset stats for an specific context. This is needed to propagate to the 
> > rest of
> > +the stack that a reset has happened. Currently, this is implemented by each
> > +driver separately, with no common DRM interface.
> > +
> > +User Mode Driver
> > +
> > +
> > +The UMD should check before submitting new commands to the KMD if the 
> > device has
> > +been reset, and this can be checked more often if the UMD requires it. 
> > After
> > +detecting a reset, UMD will then proceed to report it to the application 
> > using
> > +the appropriate API error code, as explained in the section below about
> > +robustness.
> > +
> > +Robustness
> > +--
> > +
> > +The only way to try to keep an application working after a reset is if it
> > +complies with the robustness aspects of the graphical API that it is using.
> > +
> > +Graphical APIs provide ways to applications to deal with device resets. 
> > However,
> > +there is no guarantee that the app will use such features correctly, and 
> > the
> > +UMD can implement policies to close the app if it is a repeating offender,
>
> Not sure whether this one here is due to my input, but s/UMD/KMD. Repeat
> offender killing is more a policy where the kernel enforces policy, and no
> longer up to userspace to dtrt (because very clearly userspace is not
> really doing the right thing anymore when it's just hanging the gpu in an
> endless loop). Also maybe tune it down further to something like "the
> kernel driver may implement ..."
>
> In my opinion the umd shouldn't implement these kind of magic guesses, the
> entire point of robustness apis is to delegate responsibility for
> correctly recovering to the application. And the kernel is left with
> enforcing fair resource usage policies (which eventually might be a
> cgroups limit on how much gpu time you're allowed to waste with gpu
> resets).

Killing apps that the kernel thinks are misbehaving really doesn't
seem like a good idea to me. What if the process is a service getting
restarted after getting killed? What if killing that process leaves
the system in a bad state?

Can't the kernel provide some information to user space so that e.g.
systemd can handle those situations?

> > +likely in a broken loop. This is done to ensure that it does not keep 
> > blocking
> > +the user interface from being correctly displayed. This should be done 
> > even if
> > +the app is correct but happens to trigger some bug in the hardware/driver.
> > +
> > +OpenGL
> > +~~
> > +
> > +Apps using OpenGL should use the available robust interfaces, like the
> > +extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). 
> > This
> > +interface tells if a reset has happened, and if so, all the context state 
> > is
> > +considered lost and the app proceeds by creating new ones. If it is 
> > possible to
> > +determine that robustness is not in use, the UMD will terminate the app 
> > when a
> > +reset is detected, giving that the contexts are lost and the app won't be 
> > able
> > +to figure this out and recreate the contexts.
> > +
> > +Vulkan
> > +~~
> > +
> > +Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for 
> > submissions.
> > +This error code means, among other things, that a device reset has 
> > 

Re: [PATCH RFC v5 01/10] drm: Introduce pixel_source DRM plane property

2023-08-08 Thread Sebastian Wick
On Mon, Aug 7, 2023 at 7:52 PM Jessica Zhang  wrote:
>
>
>
> On 8/4/2023 6:15 AM, Sebastian Wick wrote:
> > On Fri, Jul 28, 2023 at 7:03 PM Jessica Zhang  
> > wrote:
> >>
> >> Add support for pixel_source property to drm_plane and related
> >> documentation. In addition, force pixel_source to
> >> DRM_PLANE_PIXEL_SOURCE_FB in DRM_IOCTL_MODE_SETPLANE as to not break
> >> legacy userspace.
> >>
> >> This enum property will allow user to specify a pixel source for the
> >> plane. Possible pixel sources will be defined in the
> >> drm_plane_pixel_source enum.
> >>
> >> The current possible pixel sources are DRM_PLANE_PIXEL_SOURCE_NONE and
> >> DRM_PLANE_PIXEL_SOURCE_FB with *_PIXEL_SOURCE_FB being the default value.
> >>
> >> Signed-off-by: Jessica Zhang 
> >> ---
> >>   drivers/gpu/drm/drm_atomic_state_helper.c |  1 +
> >>   drivers/gpu/drm/drm_atomic_uapi.c |  4 ++
> >>   drivers/gpu/drm/drm_blend.c   | 85 
> >> +++
> >>   drivers/gpu/drm/drm_plane.c   |  3 ++
> >>   include/drm/drm_blend.h   |  2 +
> >>   include/drm/drm_plane.h   | 21 
> >>   6 files changed, 116 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c 
> >> b/drivers/gpu/drm/drm_atomic_state_helper.c
> >> index 784e63d70a42..01638c51ce0a 100644
> >> --- a/drivers/gpu/drm/drm_atomic_state_helper.c
> >> +++ b/drivers/gpu/drm/drm_atomic_state_helper.c
> >> @@ -252,6 +252,7 @@ void __drm_atomic_helper_plane_state_reset(struct 
> >> drm_plane_state *plane_state,
> >>
> >>  plane_state->alpha = DRM_BLEND_ALPHA_OPAQUE;
> >>  plane_state->pixel_blend_mode = DRM_MODE_BLEND_PREMULTI;
> >> +   plane_state->pixel_source = DRM_PLANE_PIXEL_SOURCE_FB;
> >>
> >>  if (plane->color_encoding_property) {
> > >>  if (!drm_object_property_get_default_value(&plane->base,
> >> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
> >> b/drivers/gpu/drm/drm_atomic_uapi.c
> >> index d867e7f9f2cd..454f980e16c9 100644
> >> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> >> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> >> @@ -544,6 +544,8 @@ static int drm_atomic_plane_set_property(struct 
> >> drm_plane *plane,
> >>  state->src_w = val;
> >>  } else if (property == config->prop_src_h) {
> >>  state->src_h = val;
> >> +   } else if (property == plane->pixel_source_property) {
> >> +   state->pixel_source = val;
> >>  } else if (property == plane->alpha_property) {
> >>  state->alpha = val;
> >>  } else if (property == plane->blend_mode_property) {
> >> @@ -616,6 +618,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
> >>  *val = state->src_w;
> >>  } else if (property == config->prop_src_h) {
> >>  *val = state->src_h;
> >> +   } else if (property == plane->pixel_source_property) {
> >> +   *val = state->pixel_source;
> >>  } else if (property == plane->alpha_property) {
> >>  *val = state->alpha;
> >>  } else if (property == plane->blend_mode_property) {
> >> diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c
> >> index 6e74de833466..c500310a3d09 100644
> >> --- a/drivers/gpu/drm/drm_blend.c
> >> +++ b/drivers/gpu/drm/drm_blend.c
> >> @@ -185,6 +185,21 @@
> >>*  plane does not expose the "alpha" property, then this is
> >>*  assumed to be 1.0
> >>*
> >> + * pixel_source:
> >> + * pixel_source is set up with 
> >> drm_plane_create_pixel_source_property().
> >> + * It is used to toggle the active source of pixel data for the plane.
> >> + * The plane will only display data from the set pixel_source -- any
> >> + * data from other sources will be ignored.
> >> + *
> >> + * Possible values:
> >> + *
> >> + * "NONE":
> >> + * No active pixel source.
> >> + * Committing with a NONE pixel source will disable the plane.
> >> + *
> >> + * "FB":
> >> + * Framebuffer source set by the "FB_ID" property.
> >> + *
> >>* Note that all the property extensions described here apply either to 
> >> the
> >>* plane or the CRTC (e.g. for the background color, which currently is 
> >> not
> >>* exposed and assumed to be black).
> >> @@ -615,3 +630,73 @@ int drm_plane_create_blend_mode_property(struct 
> >> drm_plane *plane,
> >>  return 0;
> >>   }
> >>   EXPORT_SYMBOL(drm_plane_create_blend_mode_property);
> >> +
> >> +/**
> >> + * drm_plane_create_pixel_source_property - create a new pixel source 
> >> property
> >> + * @plane: DRM plane
> >> + * @extra_sources: Bitmask of additional supported pixel_sources for the 
> >> driver.
> >> + *DRM_PLANE_PIXEL_SOURCE_FB always be enabled as a 
> >> supported
> >> + *source.
> >> + *
> >> + * This creates a new property describing the current source of 

Re: [PATCH RESEND v4 2/2] drm/mediatek: Fix iommu fault during crtc enabling

2023-08-08 Thread Eugen Hristev

Hi Jason,

On 8/7/23 04:51, Jason-JH.Lin wrote:

The plane_state of drm_atomic_state is not sync to the mtk_plane_state
stored in mtk_crtc during crtc enabling.

So we need to update the mtk_plane_state stored in mtk_crtc by the
drm_atomic_state carried from mtk_drm_crtc_atomic_enable().

While updating mtk_plane_state, OVL layer should be disabled when the fb
in plane_state of drm_atomic_state is NULL.

Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.")
Signed-off-by: Jason-JH.Lin 
---
Change in RESEND v4:
Remove redundant plane_state assigning.
---
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c  | 14 ++
  drivers/gpu/drm/mediatek/mtk_drm_plane.c | 11 ---
  drivers/gpu/drm/mediatek/mtk_drm_plane.h |  2 ++
  3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index d40142842f85..7db4d6551da7 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -328,7 +328,7 @@ static void ddp_cmdq_cb(struct mbox_client *cl, void *mssg)
  }
  #endif
  
-static int mtk_crtc_ddp_hw_init(struct mtk_drm_crtc *mtk_crtc)

+static int mtk_crtc_ddp_hw_init(struct mtk_drm_crtc *mtk_crtc, struct 
drm_atomic_state *state)
  {
	struct drm_crtc *crtc = &mtk_crtc->base;
struct drm_connector *connector;
@@ -405,11 +405,17 @@ static int mtk_crtc_ddp_hw_init(struct mtk_drm_crtc 
*mtk_crtc)
/* Initially configure all planes */
for (i = 0; i < mtk_crtc->layer_nr; i++) {
	struct drm_plane *plane = &mtk_crtc->planes[i];
-   struct mtk_plane_state *plane_state;
+   struct drm_plane_state *new_state;
+   struct mtk_plane_state *plane_state = 
to_mtk_plane_state(plane->state);
struct mtk_ddp_comp *comp;
unsigned int local_layer;
  
-		plane_state = to_mtk_plane_state(plane->state);


Any reason why you moved the initialization of plane_state to the
declaration?



+   /* sync the new plane state from drm_atomic_state */
+   if (state->planes[i].ptr) {
+   new_state = drm_atomic_get_new_plane_state(state, 
state->planes[i].ptr);

Can drm_atomic_get_new_plane_state() fail, leaving new_state NULL?

I see mtk_plane_update_new_state() assumes new_state is a valid
pointer.


Regards,


+   mtk_plane_update_new_state(new_state, plane_state);
+   }
+
	comp = mtk_drm_ddp_comp_for_plane(crtc, plane, &local_layer);
if (comp)
mtk_ddp_comp_layer_config(comp, local_layer,
@@ -687,7 +693,7 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
return;
}
  
-	ret = mtk_crtc_ddp_hw_init(mtk_crtc);

+   ret = mtk_crtc_ddp_hw_init(mtk_crtc, state);
if (ret) {
pm_runtime_put(comp->dev);
return;
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c 
b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
index b1a918ffe457..ef4460f98c07 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
@@ -134,8 +134,8 @@ static int mtk_plane_atomic_async_check(struct drm_plane 
*plane,
   true, true);
  }
  
-static void mtk_plane_update_new_state(struct drm_plane_state *new_state,

-  struct mtk_plane_state *mtk_plane_state)
+void mtk_plane_update_new_state(struct drm_plane_state *new_state,
+   struct mtk_plane_state *mtk_plane_state)
  {
struct drm_framebuffer *fb = new_state->fb;
struct drm_gem_object *gem;
@@ -146,6 +146,11 @@ static void mtk_plane_update_new_state(struct 
drm_plane_state *new_state,
dma_addr_t hdr_addr = 0;
unsigned int hdr_pitch = 0;
  
+	if (!fb) {

+   mtk_plane_state->pending.enable = false;
+   return;
+   }
+
gem = fb->obj[0];
mtk_gem = to_mtk_gem_obj(gem);
addr = mtk_gem->dma_addr;
@@ -180,7 +185,7 @@ static void mtk_plane_update_new_state(struct 
drm_plane_state *new_state,
   fb->format->cpp[0] * (x_offset_in_blocks + 1);
}
  
-	mtk_plane_state->pending.enable = true;

+   mtk_plane_state->pending.enable = new_state->visible;
mtk_plane_state->pending.pitch = pitch;
mtk_plane_state->pending.hdr_pitch = hdr_pitch;
mtk_plane_state->pending.format = format;
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.h 
b/drivers/gpu/drm/mediatek/mtk_drm_plane.h
index 99aff7da0831..0a7d70d13e43 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.h
@@ -46,6 +46,8 @@ to_mtk_plane_state(struct drm_plane_state *state)
return container_of(state, struct mtk_plane_state, base);
  }
  
+void mtk_plane_update_new_state(struct 

[PATCH] drm/rockchip: Don't spam logs in atomic check

2023-08-08 Thread Daniel Stone
Userspace should not be able to trigger DRM_ERROR messages to spam the
logs; especially not through atomic commit parameters which are
completely legitimate for userspace to attempt.
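
As a rough userspace illustration of the difference (not kernel code): debug-category messages are only emitted when the matching category is enabled, so a rejected atomic check stays silent by default while still returning -EINVAL. The names below (`sketch_debug_mask`, `SKETCH_DBG_KMS`, `sketch_debug_kms()`, `sketch_check_yuv_xpos()`) are made up for this sketch and only mimic the shape of the odd-xpos check:

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a debug category bit. */
#define SKETCH_DBG_KMS 0x04u

/* Hypothetical stand-in for the debug mask; 0 means debug output is off. */
static unsigned int sketch_debug_mask;

/* Print only when the KMS category is enabled; returns 1 if printed. */
static int sketch_debug_kms(const char *fmt, ...)
{
	va_list ap;

	if (!(sketch_debug_mask & SKETCH_DBG_KMS))
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}

/*
 * Mimics the odd-xpos check: src_x1 is 16.16 fixed point, and YUV formats
 * need an even integer x position. The rejection is reported at debug level.
 */
static int sketch_check_yuv_xpos(int is_yuv, unsigned int src_x1)
{
	if (is_yuv && ((src_x1 >> 16) % 2)) {
		sketch_debug_kms("Invalid Source: odd xpos with YUV format\n");
		return -EINVAL;
	}

	return 0;
}
```

Userspace still gets the error code either way; only the log output is gated.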

Signed-off-by: Daniel Stone 
Fixes: 7707f7227f09 ("drm/rockchip: Add support for afbc")
---
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index 86fd9f51c692..14320bc73e5b 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -832,12 +832,12 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
 * need align with 2 pixel.
 */
if (fb->format->is_yuv && ((new_plane_state->src.x1 >> 16) % 2)) {
-   DRM_ERROR("Invalid Source: Yuv format not support odd xpos\n");
+   DRM_DEBUG_KMS("Invalid Source: Yuv format not support odd 
xpos\n");
return -EINVAL;
}
 
if (fb->format->is_yuv && new_plane_state->rotation & 
DRM_MODE_REFLECT_Y) {
-   DRM_ERROR("Invalid Source: Yuv format does not support this 
rotation\n");
+   DRM_DEBUG_KMS("Invalid Source: Yuv format does not support this 
rotation\n");
return -EINVAL;
}
 
@@ -845,7 +845,7 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
struct vop *vop = to_vop(crtc);
 
if (!vop->data->afbc) {
-   DRM_ERROR("vop does not support AFBC\n");
+   DRM_DEBUG_KMS("vop does not support AFBC\n");
return -EINVAL;
}
 
@@ -854,15 +854,16 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
return ret;
 
if (new_plane_state->src.x1 || new_plane_state->src.y1) {
-   DRM_ERROR("AFBC does not support offset display, 
xpos=%d, ypos=%d, offset=%d\n",
- new_plane_state->src.x1,
- new_plane_state->src.y1, fb->offsets[0]);
+   DRM_DEBUG_KMS("AFBC does not support offset display, " \
+ "xpos=%d, ypos=%d, offset=%d\n",
+ new_plane_state->src.x1, 
new_plane_state->src.y1,
+ fb->offsets[0]);
return -EINVAL;
}
 
if (new_plane_state->rotation && new_plane_state->rotation != 
DRM_MODE_ROTATE_0) {
-   DRM_ERROR("No rotation support in AFBC, rotation=%d\n",
- new_plane_state->rotation);
+   DRM_DEBUG_KMS("No rotation support in AFBC, 
rotation=%d\n",
+ new_plane_state->rotation);
return -EINVAL;
}
}
-- 
2.41.0



Re: 2b5d1c29f6c4 ("drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts")

2023-08-08 Thread Karol Herbst
On Mon, Aug 7, 2023 at 5:05 PM Borislav Petkov  wrote:
>
> On Mon, Aug 07, 2023 at 01:49:42PM +0200, Karol Herbst wrote:
> > in what way does it stop? Just not progressing? That would be kinda
> > concerning. Mind tracing with what arguments `nvkm_uevent_add` is
> > called with and without that patch?
>
> Well, me dumping those args I guess made the box not freeze before
> catching a #PF over serial. Does that help?
>
> 
> [3.410135] Unpacking initramfs...
> [3.416319] software IO TLB: mapped [mem 
> 0xa877d000-0xac77d000] (64MB)
> [3.418227] Initialise system trusted keyrings
> [3.432273] workingset: timestamp_bits=56 max_order=22 bucket_order=0
> [3.439006] ntfs: driver 2.1.32 [Flags: R/W].
> [3.443368] fuse: init (API version 7.38)
> [3.447601] 9p: Installing v9fs 9p2000 file system support
> [3.453223] Key type asymmetric registered
> [3.457332] Asymmetric key parser 'x509' registered
> [3.462236] Block layer SCSI generic (bsg) driver version 0.4 loaded 
> (major 250)
> [3.475865] efifb: probing for efifb
> [3.479458] efifb: framebuffer at 0xf900, using 1920k, total 1920k
> [3.485969] efifb: mode is 800x600x32, linelength=3200, pages=1
> [3.491872] efifb: scrolling: redraw
> [3.495438] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
> [3.502349] Console: switching to colour frame buffer device 100x37
> [3.509564] fb0: EFI VGA frame buffer device
> [3.514013] ACPI: \_PR_.CP00: Found 4 idle states
> [3.518850] ACPI: \_PR_.CP01: Found 4 idle states
> [3.523687] ACPI: \_PR_.CP02: Found 4 idle states
> [3.528515] ACPI: \_PR_.CP03: Found 4 idle states
> [3.533346] ACPI: \_PR_.CP04: Found 4 idle states
> [3.538173] ACPI: \_PR_.CP05: Found 4 idle states
> [3.543003] ACPI: \_PR_.CP06: Found 4 idle states
> [3.544219] Freeing initrd memory: 8196K
> [3.547844] ACPI: \_PR_.CP07: Found 4 idle states
> [3.609542] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [3.616224] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 
> 16550A
> [3.625552] serial :00:16.3: enabling device ( -> 0003)
> [3.633034] :00:16.3: ttyS1 at I/O 0xf0a0 (irq = 17, base_baud = 
> 115200) is a 16550A
> [3.642451] Linux agpgart interface v0.103
> [3.647141] ACPI: bus type drm_connector registered
> [3.653261] Console: switching to colour dummy device 80x25
> [3.659092] nouveau :03:00.0: vgaarb: deactivate vga console
> [3.665174] nouveau :03:00.0: NVIDIA GT218 (0a8c00b1)
> [3.784585] nouveau :03:00.0: bios: version 70.18.83.00.08
> [3.792244] nouveau :03:00.0: fb: 512 MiB DDR3
> [3.948786] nouveau :03:00.0: DRM: VRAM: 512 MiB
> [3.953755] nouveau :03:00.0: DRM: GART: 1048576 MiB
> [3.959073] nouveau :03:00.0: DRM: TMDS table version 2.0
> [3.964808] nouveau :03:00.0: DRM: DCB version 4.0
> [3.969938] nouveau :03:00.0: DRM: DCB outp 00: 02000360 
> [3.976367] nouveau :03:00.0: DRM: DCB outp 01: 02000362 00020010
> [3.982792] nouveau :03:00.0: DRM: DCB outp 02: 028003a6 0f220010
> [3.989223] nouveau :03:00.0: DRM: DCB outp 03: 01011380 
> [3.995647] nouveau :03:00.0: DRM: DCB outp 04: 08011382 00020010
> [4.002076] nouveau :03:00.0: DRM: DCB outp 05: 088113c6 0f220010
> [4.008511] nouveau :03:00.0: DRM: DCB conn 00: 00101064
> [4.014151] nouveau :03:00.0: DRM: DCB conn 01: 00202165
> [4.021710] nvkm_uevent_add: uevent: 0x888100242100, event: 
> 0x8881022de1a0, id: 0x0, bits: 0x1, func: 0x
> [4.033680] nvkm_uevent_add: uevent: 0x888100242300, event: 
> 0x8881022de1a0, id: 0x0, bits: 0x1, func: 0x
> [4.045429] nouveau :03:00.0: DRM: MM: using COPY for buffer copies
> [4.052059] stackdepot: allocating hash table of 1048576 entries via 
> kvcalloc
> [4.067191] nvkm_uevent_add: uevent: 0x888100242800, event: 
> 0x888104b3e260, id: 0x0, bits: 0x1, func: 0x
> [4.078936] nvkm_uevent_add: uevent: 0x888100242900, event: 
> 0x888104b3e260, id: 0x1, bits: 0x1, func: 0x
> [4.090514] nvkm_uevent_add: uevent: 0x888100242a00, event: 
> 0x888102091f28, id: 0x1, bits: 0x3, func: 0x8177b700
> [4.102118] tsc: Refined TSC clocksource calibration: 3591.345 MHz
> [4.108342] clocksource: tsc: mask: 0x max_cycles: 
> 0x33c4635c383, max_idle_ns: 440795314831 ns
> [4.108401] nvkm_uevent_add: uevent: 0x8881020b6000, event: 
> 0x888102091f28, id: 0xf, bits: 0x3, func: 0x8177b700
> [4.129864] clocksource: Switched to clocksource tsc
> [4.131478] [drm] Initialized nouveau 1.3.1 20120801 for :03:00.0 on 
> minor 0
> [4.143806] BUG: kernel NULL pointer dereference, address: 0020

ahh, that would have been good to know :) Mind figuring out 

[PATCH 10/20] drm/i915/dp: Add functions to get min/max src input bpc with DSC

2023-08-08 Thread Ankit Nautiyal
Separate out functions for getting maximum and minimum input BPC based
on platforms, when DSC is used.

v2: Use HAS_DSC macro instead of platform check while getting min input
bpc. (Stan)
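
The resulting logic can be sketched in plain userspace C as below. This is only a model of the helpers in the diff: the `sketch_` names are made up, and the display version and DSC capability are passed in as plain parameters instead of being read from the device:

```c
#include <stdbool.h>
#include <stdint.h>

/* Max DSC input bpc: 12 for display version 12+, 10 for version 11, else 0. */
static uint8_t sketch_dsc_max_src_input_bpc(int display_ver)
{
	if (display_ver >= 12)
		return 12;
	if (display_ver == 11)
		return 10;

	return 0;	/* older display versions: no limit to report */
}

/* Min DSC input bpc: 8 when the source supports DSC, else 0. */
static uint8_t sketch_dsc_min_src_input_bpc(bool has_dsc)
{
	return has_dsc ? 8 : 0;
}

/* Clamp the requested max bpc to the source limit; 0 means DSC is unusable. */
static uint8_t sketch_dsc_clamp_max_bpc(int display_ver, uint8_t max_req_bpc)
{
	uint8_t src_max = sketch_dsc_max_src_input_bpc(display_ver);

	if (!src_max)
		return 0;

	return src_max < max_req_bpc ? src_max : max_req_bpc;
}
```

For example, a request for 16 bpc on display version 11 is clamped to 10, and any request on a version below 11 yields 0.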

Signed-off-by: Ankit Nautiyal 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 35 +++--
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index c13efd0b7c98..b414d09b5e80 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -1535,6 +1535,18 @@ intel_dp_compute_link_config_wide(struct intel_dp 
*intel_dp,
return -EINVAL;
 }
 
+static
+u8 intel_dp_dsc_max_src_input_bpc(struct drm_i915_private *i915)
+{
+   /* Max DSC Input BPC for ICL is 10 and for TGL+ is 12 */
+   if (DISPLAY_VER(i915) >= 12)
+   return 12;
+   if (DISPLAY_VER(i915) == 11)
+   return 10;
+
+   return 0;
+}
+
 int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 max_req_bpc)
 {
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
@@ -1542,11 +1554,12 @@ int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, 
u8 max_req_bpc)
u8 dsc_bpc[3] = {0};
u8 dsc_max_bpc;
 
-   /* Max DSC Input BPC for ICL is 10 and for TGL+ is 12 */
-   if (DISPLAY_VER(i915) >= 12)
-   dsc_max_bpc = min_t(u8, 12, max_req_bpc);
-   else
-   dsc_max_bpc = min_t(u8, 10, max_req_bpc);
+   dsc_max_bpc = intel_dp_dsc_max_src_input_bpc(i915);
+
+   if (!dsc_max_bpc)
+   return dsc_max_bpc;
+
+   dsc_max_bpc = min_t(u8, dsc_max_bpc, max_req_bpc);
 
num_bpc = drm_dp_dsc_sink_supported_input_bpcs(intel_dp->dsc_dpcd,
   dsc_bpc);
@@ -1674,6 +1687,13 @@ static bool intel_dp_dsc_supports_format(struct intel_dp 
*intel_dp,
return drm_dp_dsc_sink_supports_format(intel_dp->dsc_dpcd, 
sink_dsc_format);
 }
 
+static
+u8 intel_dp_dsc_min_src_input_bpc(struct drm_i915_private *i915)
+{
+   /* Min DSC Input BPC for ICL+ is 8 */
+   return HAS_DSC(i915) ? 8 : 0;
+}
+
 int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
struct intel_crtc_state *pipe_config,
struct drm_connector_state *conn_state,
@@ -1707,10 +1727,9 @@ int intel_dp_dsc_compute_config(struct intel_dp 
*intel_dp,
pipe_bpp = pipe_config->pipe_bpp;
}
 
-   /* Min Input BPC for ICL+ is 8 */
-   if (pipe_bpp < 8 * 3) {
+   if (pipe_bpp < intel_dp_dsc_min_src_input_bpc(dev_priv) * 3) {
drm_dbg_kms(&dev_priv->drm,
-   "No DSC support for less than 8bpc\n");
+   "Computed BPC less than min supported by source for 
DSC\n");
return -EINVAL;
}
 
-- 
2.40.1



[RFC v6 3/3] drm/ttm/tests: Add tests for ttm_pool

2023-08-08 Thread Karolina Stolarek
Add KUnit tests that exercise page allocation using page pools
and freeing pages, either by returning them to the pool or
freeing them. Add a basic test for ttm_pool cleanup. Introduce
helpers to create a dummy ttm_buffer_object.

Signed-off-by: Karolina Stolarek 
---
 drivers/gpu/drm/ttm/tests/Makefile|   1 +
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c |  17 +
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |   4 +
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c | 437 ++
 4 files changed, 459 insertions(+)
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_pool_test.c

diff --git a/drivers/gpu/drm/ttm/tests/Makefile 
b/drivers/gpu/drm/ttm/tests/Makefile
index 7917805f37af..ec87c4fc1ad5 100644
--- a/drivers/gpu/drm/ttm/tests/Makefile
+++ b/drivers/gpu/drm/ttm/tests/Makefile
@@ -2,4 +2,5 @@
 
 obj-$(CONFIG_DRM_TTM_KUNIT_TEST) += \
 ttm_device_test.o \
+ttm_pool_test.o \
 ttm_kunit_helpers.o
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c 
b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
index dedc1857734b..81661d8827aa 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
@@ -25,6 +25,23 @@ int ttm_device_kunit_init(struct ttm_test_devices *priv,
 }
 EXPORT_SYMBOL_GPL(ttm_device_kunit_init);
 
+struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
+   struct ttm_test_devices *devs,
+   size_t size)
+{
+   struct drm_gem_object gem_obj = { .size = size };
+   struct ttm_buffer_object *bo;
+
+   bo = kunit_kzalloc(test, sizeof(*bo), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, bo);
+
+   bo->base = gem_obj;
+   bo->bdev = devs->ttm_dev;
+
+   return bo;
+}
+EXPORT_SYMBOL_GPL(ttm_bo_kunit_init);
+
 struct ttm_test_devices *ttm_test_devices_basic(struct kunit *test)
 {
struct ttm_test_devices *devs;
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h 
b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
index f9f5bc03e93a..e261e3660d0b 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -24,6 +25,9 @@ int ttm_device_kunit_init(struct ttm_test_devices *priv,
  struct ttm_device *ttm,
  bool use_dma_alloc,
  bool use_dma32);
+struct ttm_buffer_object *ttm_bo_kunit_init(struct kunit *test,
+   struct ttm_test_devices *devs,
+   size_t size);
 
 struct ttm_test_devices *ttm_test_devices_basic(struct kunit *test);
 struct ttm_test_devices *ttm_test_devices_all(struct kunit *test);
diff --git a/drivers/gpu/drm/ttm/tests/ttm_pool_test.c 
b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
new file mode 100644
index 000000000000..8d90870fb199
--- /dev/null
+++ b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
@@ -0,0 +1,437 @@
+// SPDX-License-Identifier: GPL-2.0 AND MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+#include 
+
+#include 
+#include 
+
+#include "ttm_kunit_helpers.h"
+
+struct ttm_pool_test_case {
+   const char *description;
+   unsigned int order;
+   bool use_dma_alloc;
+};
+
+struct ttm_pool_test_priv {
+   struct ttm_test_devices *devs;
+
+   /* Used to create mock ttm_tts */
+   struct ttm_buffer_object *mock_bo;
+};
+
+static struct ttm_operation_ctx simple_ctx = {
+   .interruptible = true,
+   .no_wait_gpu = false,
+};
+
+static int ttm_pool_test_init(struct kunit *test)
+{
+   struct ttm_pool_test_priv *priv;
+
+   priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, priv);
+
+   priv->devs = ttm_test_devices_basic(test);
+   test->priv = priv;
+
+   return 0;
+}
+
+static void ttm_pool_test_fini(struct kunit *test)
+{
+   struct ttm_pool_test_priv *priv = test->priv;
+
+   ttm_test_devices_put(test, priv->devs);
+}
+
+static struct ttm_tt *ttm_tt_kunit_init(struct kunit *test,
+   uint32_t page_flags,
+   enum ttm_caching caching,
+   size_t size)
+{
+   struct ttm_pool_test_priv *priv = test->priv;
+   struct ttm_buffer_object *bo;
+   struct ttm_tt *tt;
+   int err;
+
+   bo = ttm_bo_kunit_init(test, priv->devs, size);
+   KUNIT_ASSERT_NOT_NULL(test, bo);
+   priv->mock_bo = bo;
+
+   tt = kunit_kzalloc(test, sizeof(*tt), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, tt);
+
+   err = ttm_tt_init(tt, priv->mock_bo, page_flags, caching, 0);
+   KUNIT_ASSERT_EQ(test, err, 0);
+
+   return tt;
+}
+
+static struct ttm_pool *ttm_pool_pre_populated(struct kunit *test,
+  size_t size,
+  

[RFC v6 2/3] drm/ttm/tests: Add tests for ttm_device

2023-08-08 Thread Karolina Stolarek
Test initialization and cleanup of the ttm_device struct, including
some error paths. Verify the creation of page pools if use_dma_alloc
param is true.
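
The parameterized part can be pictured with the following userspace sketch. It is not KUnit code: `sketch_pools_initialised()` is a hypothetical model of the behaviour under test (pools exist only when use_dma_alloc is set), and the table just mirrors the four combinations of the two flags:

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_device_case {
	const char *description;
	bool use_dma_alloc;
	bool use_dma32;
	bool pools_init_expected;
};

static const struct sketch_device_case sketch_cases[] = {
	{ "No DMA allocations, no DMA32 required", false, false, false },
	{ "DMA allocations, DMA32 required",       true,  true,  true  },
	{ "No DMA allocations, DMA32 required",    false, true,  false },
	{ "DMA allocations, no DMA32 required",    true,  false, true  },
};

#define SKETCH_NUM_CASES (sizeof(sketch_cases) / sizeof(sketch_cases[0]))

/* Hypothetical model: page pools are created only when DMA alloc is used. */
static bool sketch_pools_initialised(bool use_dma_alloc, bool use_dma32)
{
	(void)use_dma32;	/* does not decide whether pools are created */
	return use_dma_alloc;
}

/* Run every case and return the number of mismatches. */
static size_t sketch_run_cases(void)
{
	size_t i, failures = 0;

	for (i = 0; i < SKETCH_NUM_CASES; i++) {
		const struct sketch_device_case *c = &sketch_cases[i];

		if (sketch_pools_initialised(c->use_dma_alloc, c->use_dma32) !=
		    c->pools_init_expected)
			failures++;
	}

	return failures;
}
```

Keeping the expectation in the table means a new flag combination only needs a new row, not a new test function.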

Signed-off-by: Karolina Stolarek 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/ttm/tests/ttm_device_test.c | 158 
 1 file changed, 158 insertions(+)

diff --git a/drivers/gpu/drm/ttm/tests/ttm_device_test.c 
b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
index 76d927d07501..b1b423b68cdf 100644
--- a/drivers/gpu/drm/ttm/tests/ttm_device_test.c
+++ b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
@@ -8,6 +8,13 @@
 
 #include "ttm_kunit_helpers.h"
 
+struct ttm_device_test_case {
+   const char *description;
+   bool use_dma_alloc;
+   bool use_dma32;
+   bool pools_init_expected;
+};
+
 static void ttm_device_init_basic(struct kunit *test)
 {
struct ttm_test_devices *priv = test->priv;
@@ -37,8 +44,159 @@ static void ttm_device_init_basic(struct kunit *test)
ttm_device_fini(ttm_dev);
 }
 
+static void ttm_device_init_multiple(struct kunit *test)
+{
+   struct ttm_test_devices *priv = test->priv;
+   struct ttm_device *ttm_devs;
+   unsigned int i, num_dev = 3;
+   int err;
+
+   ttm_devs = kunit_kcalloc(test, num_dev, sizeof(*ttm_devs), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_devs);
+
+   for (i = 0; i < num_dev; i++) {
+   err = ttm_device_kunit_init(priv, &ttm_devs[i], false, false);
+   KUNIT_ASSERT_EQ(test, err, 0);
+
+   KUNIT_EXPECT_PTR_EQ(test, ttm_devs[i].dev_mapping,
+   priv->drm->anon_inode->i_mapping);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_devs[i].wq);
+   KUNIT_EXPECT_PTR_EQ(test, ttm_devs[i].funcs, &ttm_dev_funcs);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_devs[i].man_drv[TTM_PL_SYSTEM]);
+   }
+
+   KUNIT_ASSERT_EQ(test, list_count_nodes(&ttm_devs[0].device_list), 
num_dev);
+
+   for (i = 0; i < num_dev; i++)
+   ttm_device_fini(&ttm_devs[i]);
+}
+
+static void ttm_device_fini_basic(struct kunit *test)
+{
+   struct ttm_test_devices *priv = test->priv;
+   struct ttm_device *ttm_dev;
+   struct ttm_resource_manager *man;
+   int err;
+
+   ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
+
+   err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+   KUNIT_ASSERT_EQ(test, err, 0);
+
+   man = ttm_manager_type(ttm_dev, TTM_PL_SYSTEM);
+   KUNIT_ASSERT_NOT_NULL(test, man);
+
+   ttm_device_fini(ttm_dev);
+
+   KUNIT_ASSERT_FALSE(test, man->use_type);
+   KUNIT_ASSERT_TRUE(test, list_empty(&man->lru[0]));
+   KUNIT_ASSERT_NULL(test, ttm_dev->man_drv[TTM_PL_SYSTEM]);
+}
+
+static void ttm_device_init_no_vma_man(struct kunit *test)
+{
+   struct ttm_test_devices *priv = test->priv;
+   struct drm_device *drm = priv->drm;
+   struct ttm_device *ttm_dev;
+   struct drm_vma_offset_manager *vma_man;
+   int err;
+
+   ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
+
+   /* Let's pretend there's no VMA manager allocated */
+   vma_man = drm->vma_offset_manager;
+   drm->vma_offset_manager = NULL;
+
+   err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+   KUNIT_EXPECT_EQ(test, err, -EINVAL);
+
+   /* Bring the manager back for a graceful cleanup */
+   drm->vma_offset_manager = vma_man;
+}
+
+static const struct ttm_device_test_case ttm_device_cases[] = {
+   {
+   .description = "No DMA allocations, no DMA32 required",
+   .use_dma_alloc = false,
+   .use_dma32 = false,
+   .pools_init_expected = false,
+   },
+   {
+   .description = "DMA allocations, DMA32 required",
+   .use_dma_alloc = true,
+   .use_dma32 = true,
+   .pools_init_expected = true,
+   },
+   {
+   .description = "No DMA allocations, DMA32 required",
+   .use_dma_alloc = false,
+   .use_dma32 = true,
+   .pools_init_expected = false,
+   },
+   {
+   .description = "DMA allocations, no DMA32 required",
+   .use_dma_alloc = true,
+   .use_dma32 = false,
+   .pools_init_expected = true,
+   },
+};
+
+static void ttm_device_case_desc(const struct ttm_device_test_case *t, char 
*desc)
+{
+   strscpy(desc, t->description, KUNIT_PARAM_DESC_SIZE);
+}
+
+KUNIT_ARRAY_PARAM(ttm_device, ttm_device_cases, ttm_device_case_desc);
+
+static void ttm_device_init_pools(struct kunit *test)
+{
+   struct ttm_test_devices *priv = test->priv;
+   const struct ttm_device_test_case *params = test->param_value;
+   struct ttm_device *ttm_dev;
+   struct ttm_pool *pool;
+   struct ttm_pool_type pt;
+   int err;
+
+   ttm_dev = 

[RFC v6 1/3] drm/ttm: Introduce KUnit test

2023-08-08 Thread Karolina Stolarek
Add the initial version of unit tests for ttm_device struct, together
with helper functions.

Signed-off-by: Karolina Stolarek 
---
 drivers/gpu/drm/Kconfig   | 15 +++
 drivers/gpu/drm/ttm/Makefile  |  1 +
 drivers/gpu/drm/ttm/tests/.kunitconfig|  4 +
 drivers/gpu/drm/ttm/tests/Makefile|  5 +
 drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 54 +++
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 96 +++
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h | 37 +++
 7 files changed, 212 insertions(+)
 create mode 100644 drivers/gpu/drm/ttm/tests/.kunitconfig
 create mode 100644 drivers/gpu/drm/ttm/tests/Makefile
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_device_test.c
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 2a44b9419d4d..9d1f0e04fd56 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -195,6 +195,21 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
 
+config DRM_TTM_KUNIT_TEST
+tristate "KUnit tests for TTM" if !KUNIT_ALL_TESTS
+default n
+depends on DRM && KUNIT
+select DRM_TTM
+select DRM_EXPORT_FOR_TESTS if m
+select DRM_KUNIT_TEST_HELPERS
+default KUNIT_ALL_TESTS
+help
+  Enables unit tests for TTM, a GPU memory manager subsystem used
+  to manage memory buffers. This option is mostly useful for kernel
+  developers.
+
+  If in doubt, say "N".
+
 config DRM_EXEC
tristate
depends on DRM
diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile
index f906b22959cf..dad298127226 100644
--- a/drivers/gpu/drm/ttm/Makefile
+++ b/drivers/gpu/drm/ttm/Makefile
@@ -8,3 +8,4 @@ ttm-y := ttm_tt.o ttm_bo.o ttm_bo_util.o ttm_bo_vm.o 
ttm_module.o \
 ttm-$(CONFIG_AGP) += ttm_agp_backend.o
 
 obj-$(CONFIG_DRM_TTM) += ttm.o
+obj-$(CONFIG_DRM_TTM_KUNIT_TEST) += tests/
diff --git a/drivers/gpu/drm/ttm/tests/.kunitconfig 
b/drivers/gpu/drm/ttm/tests/.kunitconfig
new file mode 100644
index 000000000000..75fdce0cd98e
--- /dev/null
+++ b/drivers/gpu/drm/ttm/tests/.kunitconfig
@@ -0,0 +1,4 @@
+CONFIG_KUNIT=y
+CONFIG_DRM=y
+CONFIG_DRM_KUNIT_TEST_HELPERS=y
+CONFIG_DRM_TTM_KUNIT_TEST=y
diff --git a/drivers/gpu/drm/ttm/tests/Makefile 
b/drivers/gpu/drm/ttm/tests/Makefile
new file mode 100644
index 000000000000..7917805f37af
--- /dev/null
+++ b/drivers/gpu/drm/ttm/tests/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0 AND MIT
+
+obj-$(CONFIG_DRM_TTM_KUNIT_TEST) += \
+ttm_device_test.o \
+ttm_kunit_helpers.o
diff --git a/drivers/gpu/drm/ttm/tests/ttm_device_test.c 
b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
new file mode 100644
index 000000000000..76d927d07501
--- /dev/null
+++ b/drivers/gpu/drm/ttm/tests/ttm_device_test.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0 AND MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+#include 
+#include 
+#include 
+
+#include "ttm_kunit_helpers.h"
+
+static void ttm_device_init_basic(struct kunit *test)
+{
+   struct ttm_test_devices *priv = test->priv;
+   struct ttm_device *ttm_dev;
+   struct ttm_resource_manager *ttm_sys_man;
+   int err;
+
+   ttm_dev = kunit_kzalloc(test, sizeof(*ttm_dev), GFP_KERNEL);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_dev);
+
+   err = ttm_device_kunit_init(priv, ttm_dev, false, false);
+   KUNIT_ASSERT_EQ(test, err, 0);
+
+   KUNIT_EXPECT_PTR_EQ(test, ttm_dev->funcs, &ttm_dev_funcs);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_dev->wq);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_dev->man_drv[TTM_PL_SYSTEM]);
+
+   ttm_sys_man = &ttm_dev->sysman;
+   KUNIT_ASSERT_NOT_NULL(test, ttm_sys_man);
+   KUNIT_EXPECT_TRUE(test, ttm_sys_man->use_tt);
+   KUNIT_EXPECT_TRUE(test, ttm_sys_man->use_type);
+   KUNIT_ASSERT_NOT_NULL(test, ttm_sys_man->func);
+
+   KUNIT_EXPECT_PTR_EQ(test, ttm_dev->dev_mapping,
+   priv->drm->anon_inode->i_mapping);
+
+   ttm_device_fini(ttm_dev);
+}
+
+static struct kunit_case ttm_device_test_cases[] = {
+   KUNIT_CASE(ttm_device_init_basic),
+   {}
+};
+
+static struct kunit_suite ttm_device_test_suite = {
+   .name = "ttm_device",
+   .init = ttm_test_devices_init,
+   .exit = ttm_test_devices_fini,
+   .test_cases = ttm_device_test_cases,
+};
+
+kunit_test_suites(&ttm_device_test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c 
b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
new file mode 100644
index 000000000000..dedc1857734b
--- /dev/null
+++ b/drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0 AND MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+#include 

[RFC v6 0/3] Introduce KUnit tests for TTM subsystem

2023-08-08 Thread Karolina Stolarek
This series introduces KUnit[1] tests for the TTM (Translation Table Manager)
subsystem, a memory manager used by graphics drivers to create and manage
memory buffers across different memory domains, such as system memory
or VRAM.

Unit tests implemented here cover two data structures:
  - ttm_device -- referred to as a buffer object device, which stores
resource managers and page pools
  - ttm_pool -- a struct of pools (ttm_pool_type) of different page
orders and caching attributes, with pages that can be reused on
the next buffer allocation
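
To give a feel for the reuse behaviour that the ttm_pool tests exercise, here is a small userspace sketch. It is not the TTM implementation: it keeps one free list per page order, ignores caching attributes entirely, and all `sketch_` names and limits are made up for the example:

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_ORDER 4	/* hypothetical limit, not TTM's */
#define SKETCH_PAGE_SIZE 4096u

/* A freed block stores the free-list link in its own memory. */
struct sketch_block {
	struct sketch_block *next;
};

/* One free list per page order (block size = PAGE_SIZE << order). */
struct sketch_pool {
	struct sketch_block *free_list[SKETCH_MAX_ORDER + 1];
};

/* Take a block from the pool if one is cached, otherwise allocate it. */
static void *sketch_pool_alloc(struct sketch_pool *pool, unsigned int order)
{
	struct sketch_block *blk;

	if (order > SKETCH_MAX_ORDER)
		return NULL;

	blk = pool->free_list[order];
	if (blk) {
		pool->free_list[order] = blk->next;
		return blk;
	}

	return malloc((size_t)SKETCH_PAGE_SIZE << order);
}

/* Return a block to the pool so the next allocation can reuse it. */
static void sketch_pool_free(struct sketch_pool *pool, void *mem,
			     unsigned int order)
{
	struct sketch_block *blk = mem;

	if (!mem || order > SKETCH_MAX_ORDER)
		return;

	blk->next = pool->free_list[order];
	pool->free_list[order] = blk;
}

/* Release everything that is still cached in the pool. */
static void sketch_pool_fini(struct sketch_pool *pool)
{
	unsigned int order;

	for (order = 0; order <= SKETCH_MAX_ORDER; order++) {
		while (pool->free_list[order]) {
			struct sketch_block *blk = pool->free_list[order];

			pool->free_list[order] = blk->next;
			free(blk);
		}
	}
}
```

Allocating order 2, freeing it and allocating order 2 again hands back the same block, which is the kind of behaviour the pool tests check for on the real structure.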

Use the kunit_tool script to run the tests manually:

$ ./tools/testing/kunit/kunit.py run --kunitconfig=drivers/gpu/drm/ttm/tests

To build a kernel with TTM KUnit tests, first enable CONFIG_KUNIT, and then
CONFIG_DRM_TTM_KUNIT_TEST.

For now, the tests are architecture-agnostic (i.e. the KUnit runner uses a UML
kernel), which means that we have limited coverage in some places. For
example, we can't fully test the initialization of global page pools,
such as global_write_combined. It is yet to be decided whether we want to
stick to UML or use CONFIG_X86 (at least to some extent).

These patches are just the beginning of the work to improve the test
coverage of TTM. Feel free to suggest changes, test cases or priorities.

Many thanks,
Karolina

v6:
  - Rebase the series on top of drm-misc-next (Christian)
  - Remove drm_dev_put() call from ttm_test_devices_put, the drm device is 
already freed in drm_kunit_helper_free_device()
  - Remove an unnecessary priv assignment in ttm_test_devices_all()
  - Delete ttm_bo_put() from ttm_pool_test_fini() (for now, we don't count
    krefs for dummy BOs)

v5:
  - Drop unnecessary brackets in 2/3
  - Rebase the Kconfig file on top of drm-tip

v4:
  - Test helpers have been changed to make the creation of init/fini
functions for each test suite easier:
+ Decouple device creation from test initialization by adding 
  helpers that initialize ttm_test_devices, a struct which stores
  DRM/TTM devices, and can be used in test-specific init/finis
  (see ttm_pool_tests.c for an example)
+ Introduce generic init/fini functions for tests that only need
  devices
+ Add ttm_device field to ttm_test_devices (previously
  ttm_test_devices_priv)
  - Make TTM buffer object outlive its TT (Christian)
  - Add a dedicated struct for ttm_pool_test (struct ttm_pool_test_priv)
  - Rename functions and structs:
+ struct ttm_test_devices_priv   --> struct ttm_test_devices
+ ttm_kunit_helper_init_device() --> ttm_device_kunit_init()
+ ttm_kunit_helper_ttm_bo_init() --> ttm_bo_kunit_init()
  - Split ttm_kunit_helper_init() into full config (with ttm_device
init) and basic (init only with device/drm_device) initialization
functions

v3:
  - Rename ttm_kunit_helper_alloc_device() to ttm_kunit_helper_init_device()
(Christian)
  - Don't leak a full-blown drm_gem_object in ttm_kunit_helper_ttm_bo_init().
(Christian). Create a small mock object just to get ttm_tt_init_fields()
to init the right number of pages
  - As a follow up to the change above, delete ttm_kunit_helper_ttm_bo_fini()
and just use ttm_bo_put()

v2:
  - Add missing symbol exports in ttm_kunit_helpers.c
  - Update helpers include to fix compilation issues (didn't catch it as
KUnit tests weren't enabled in the kernel I tested, an oversight
on my part)
  - Add checks for ttm_pool fields in ttm_pool_alloc_basic(), including the
one for NUMA node id
  - Rebase the changes on top of drm-tip


[1] - https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html

Karolina Stolarek (3):
  drm/ttm: Introduce KUnit test
  drm/ttm/tests: Add tests for ttm_device
  drm/ttm/tests: Add tests for ttm_pool

 drivers/gpu/drm/Kconfig   |  15 +
 drivers/gpu/drm/ttm/Makefile  |   1 +
 drivers/gpu/drm/ttm/tests/.kunitconfig|   4 +
 drivers/gpu/drm/ttm/tests/Makefile|   6 +
 drivers/gpu/drm/ttm/tests/ttm_device_test.c   | 212 +
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 113 +
 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h |  41 ++
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c | 437 ++
 8 files changed, 829 insertions(+)
 create mode 100644 drivers/gpu/drm/ttm/tests/.kunitconfig
 create mode 100644 drivers/gpu/drm/ttm/tests/Makefile
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_device_test.c
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.h
 create mode 100644 drivers/gpu/drm/ttm/tests/ttm_pool_test.c

-- 
2.25.1



Re: [RFC v5 0/3] Introduce KUnit tests for TTM subsystem

2023-08-08 Thread Karolina Stolarek

Hi Christian,

On 7.08.2023 17:06, Christian König wrote:

Am 07.08.23 um 14:21 schrieb Karolina Stolarek:

Hi Christian,

On 3.08.2023 09:56, Christian König wrote:
Feel free to add Reviewed-by: Christian König 
 to the whole series and push to 
drm-misc-next.


Thanks for reviewing the patches while I was away.

I don't have commit rights to push it to drm-misc-next, so I'll go and 
find someone to help me out. Still, I was wondering whether I should send v6 
of the series. I fixed a couple of small issues while working on new 
tests, like UAF warnings from my kunit helpers when running kunit.py 
with the --raw_output option, but I can include them as a separate patch 
in the new series. What's your preference?


Please send out the series once more based on current drm-misc-next and 
I can push it later today or tomorrow.


I'll send out the series in a minute or so. I decided _not_ to add your 
r-b to the patches I modified, so you can take a final look and make 
sure everything is fine. There were no conflicts during the rebase, so you 
can always take v5 if you prefer.


Thank you for your help!

All the best,
Karolina



Regards,
Christian.



All the best,
Karolina



Thanks,
Christian.

Am 14.07.23 um 16:10 schrieb Karolina Stolarek:
This series introduces KUnit[1] tests for TTM (Translation Table 
Manager)
subsystem, a memory manager used by graphics drivers to create and 
manage

memory buffers across different memory domains, such as system memory
or VRAM.

Unit tests implemented here cover two data structures:
   - ttm_device -- referred to as a buffer object device, which stores
 resource managers and page pools
   - ttm_pool -- a struct of pools (ttm_pool_type) of different page
 orders and caching attributes, with pages that can be reused on
 the next buffer allocation

Use kunit_tool script to manually run the tests:

$ ./tools/testing/kunit/kunit.py run 
--kunitconfig=drivers/gpu/drm/ttm/tests


To build a kernel with TTM KUnit tests, first enable CONFIG_KUNIT, 
and then

CONFIG_DRM_TTM_KUNIT_TEST.

For now, the tests are architecture-agnostic (i.e. the KUnit runner uses a UML
kernel), which means that we have limited coverage in some places. For
example, we can't fully test the initialization of global page pools,
such as global_write_combined. It is yet to be decided whether we want to
stick to UML or use CONFIG_X86 (at least to some extent).

These patches are just a beginning of the work to improve the test
coverage of TTM. Feel free to suggest changes, test cases or 
priorities.


Many thanks,
Karolina

v5:
   - Drop unnecessary brackets in 2/3
   - Rebase KConfig file on the top of drm-tip

v4:
   - Test helpers have been changed to make the creation of init/fini
 functions for each test suite easier:
 + Decouple device creation from test initialization by adding
   helpers that initialize ttm_test_devices, a struct which stores
   DRM/TTM devices, and can be used in test-specific init/finis
   (see ttm_pool_tests.c for an example)
 + Introduce generic init/fini functions for tests that only need
   devices
 + Add ttm_device field to ttm_test_devices (previously
   ttm_test_devices_priv)
   - Make TTM buffer object outlive its TT (Christian)
   - Add a dedicated struct for ttm_pool_test (struct 
ttm_pool_test_priv)

   - Rename functions and structs:
 + struct ttm_test_devices_priv   --> struct ttm_test_devices
 + ttm_kunit_helper_init_device() --> ttm_device_kunit_init()
 + ttm_kunit_helper_ttm_bo_init() --> ttm_bo_kunit_init()
   - Split ttm_kunit_helper_init() into full config (with ttm_device
 init) and basic (init only with device/drm_device) initialization
 functions

v3:
   - Rename ttm_kunit_helper_alloc_device() to ttm_kunit_helper_init_device()
 (Christian)
   - Don't leak a full-blown drm_gem_object in ttm_kunit_helper_ttm_bo_init()
 (Christian). Create a small mock object just to get ttm_tt_init_fields()
 to init the right number of pages
   - As a follow up to the change above, delete ttm_kunit_helper_ttm_bo_fini()
 and just use ttm_bo_put()

v2:
   - Add missing symbol exports in ttm_kunit_helpers.c
   - Update helpers include to fix compilation issues (didn't catch it as
 KUnit tests weren't enabled in the kernel I tested, an oversight
 on my part)
   - Add checks for ttm_pool fields in ttm_pool_alloc_basic(), including the
 one for NUMA node id
   - Rebase the changes on the top of drm-tip


[1] - https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html

Karolina Stolarek (3):
   drm/ttm: Introduce KUnit test
   drm/ttm/tests: Add tests for ttm_device
   drm/ttm/tests: Add tests for ttm_pool

  drivers/gpu/drm/Kconfig   |  15 +
  drivers/gpu/drm/ttm/Makefile  |   1 +
  drivers/gpu/drm/ttm/tests/.kunitconfig    |   4 +
  drivers/gpu/drm/ttm/tests/Makefile    |   6 +
  

Re: [PATCH RESEND v4 2/2] drm/mediatek: Fix iommu fault during crtc enabling

2023-08-08 Thread 林睿祥


Re: [PATCH 4/6] accel/ivpu: Add param ioctl to identify capabilities

2023-08-08 Thread Stanislaw Gruszka
On Thu, Aug 03, 2023 at 10:37:37AM +0200, Stanislaw Gruszka wrote:
> > Seems like we might want to decide this now, because if we define a iVPU
> > specific ioctl as proposed here, but then switch to an Accel-wide mechanism
> > later, iVPU is going to be stuck supporting both.
> 
> For the record, we do not add new ioctl in this patch, we just extend
> existing DRM_IOCTL_IVPU_GET_PARAM one.

To avoid confusion, I'll change the topic and commit message
before applying:

accel/ivpu: Extend get_param ioctl to identify capabilities

Add the DRM_IVPU_PARAM_CAPABILITIES parameter to the get_param ioctl to query
driver capabilities. For now, use it to identify the metric streamer and
new DMA memory range features. The current upstream version of intel_vpu
does not have those; they will be added in the future.

Regards
Stanislaw



[PATCH v8 7/7] phy: freescale: Add HDMI PHY driver for i.MX8MQ

2023-08-08 Thread Sandor Yu
Add Cadence HDP-TX HDMI PHY driver for i.MX8MQ.

The Cadence HDP-TX PHY can be put in either DP mode or
HDMI mode, based on the configuration chosen.
HDMI PHY mode is configured in this driver.

Signed-off-by: Sandor Yu 
Tested-by: Alexander Stein 
---
 drivers/phy/freescale/Kconfig   |   9 +
 drivers/phy/freescale/Makefile  |   1 +
 drivers/phy/freescale/phy-fsl-imx8mq-hdmi.c | 955 
 3 files changed, 965 insertions(+)
 create mode 100644 drivers/phy/freescale/phy-fsl-imx8mq-hdmi.c

diff --git a/drivers/phy/freescale/Kconfig b/drivers/phy/freescale/Kconfig
index 2999ba1e57d0..0c07fccba917 100644
--- a/drivers/phy/freescale/Kconfig
+++ b/drivers/phy/freescale/Kconfig
@@ -44,6 +44,15 @@ config PHY_FSL_IMX8MQ_DP_PHY
  Enable this to support the Cadence HDPTX DP PHY driver
  on i.MX8MQ SOC.
 
+config PHY_FSL_IMX8MQ_HDMI_PHY
+   tristate "Freescale i.MX8MQ HDMI PHY support"
+   depends on OF && HAS_IOMEM
+   depends on COMMON_CLK
+   select GENERIC_PHY
+   help
+ Enable this to support the Cadence HDPTX HDMI PHY driver
+ on i.MX8MQ SOC.
+
 endif
 
 config PHY_FSL_LYNX_28G
diff --git a/drivers/phy/freescale/Makefile b/drivers/phy/freescale/Makefile
index 915a429d9fbc..245783c04951 100644
--- a/drivers/phy/freescale/Makefile
+++ b/drivers/phy/freescale/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_PHY_MIXEL_MIPI_DPHY)   += phy-fsl-imx8-mipi-dphy.o
 obj-$(CONFIG_PHY_FSL_IMX8M_PCIE)   += phy-fsl-imx8m-pcie.o
 obj-$(CONFIG_PHY_FSL_LYNX_28G) += phy-fsl-lynx-28g.o
 obj-$(CONFIG_PHY_FSL_IMX8MQ_DP_PHY)+= phy-fsl-imx8mq-dp.o
+obj-$(CONFIG_PHY_FSL_IMX8MQ_HDMI_PHY)  += phy-fsl-imx8mq-hdmi.o
diff --git a/drivers/phy/freescale/phy-fsl-imx8mq-hdmi.c b/drivers/phy/freescale/phy-fsl-imx8mq-hdmi.c
new file mode 100644
index ..fffaaa888ba2
--- /dev/null
+++ b/drivers/phy/freescale/phy-fsl-imx8mq-hdmi.c
@@ -0,0 +1,955 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Cadence High-Definition Multimedia Interface (HDMI) PHY driver
+ *
+ * Copyright (C) 2022,2023 NXP Semiconductor, Inc.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define ADDR_PHY_AFE   0x8
+
+/* PHY registers */
+#define CMN_SSM_BIAS_TMR   0x0022
+#define CMN_PLLSM0_USER_DEF_CTRL   0x002f
+#define CMN_PSM_CLK_CTRL   0x0061
+#define CMN_CDIAG_REFCLK_CTRL  0x0062
+#define CMN_PLL0_VCOCAL_START  0x0081
+#define CMN_PLL0_VCOCAL_INIT_TMR   0x0084
+#define CMN_PLL0_VCOCAL_ITER_TMR   0x0085
+#define CMN_TXPUCAL_CTRL   0x00e0
+#define CMN_TXPDCAL_CTRL   0x00f0
+#define CMN_TXPU_ADJ_CTRL  0x0108
+#define CMN_TXPD_ADJ_CTRL  0x010c
+#define CMN_DIAG_PLL0_FBH_OVRD 0x01c0
+#define CMN_DIAG_PLL0_FBL_OVRD 0x01c1
+#define CMN_DIAG_PLL0_OVRD 0x01c2
+#define CMN_DIAG_PLL0_TEST_MODE 0x01c4
+#define CMN_DIAG_PLL0_V2I_TUNE 0x01c5
+#define CMN_DIAG_PLL0_CP_TUNE  0x01c6
+#define CMN_DIAG_PLL0_LF_PROG  0x01c7
+#define CMN_DIAG_PLL0_PTATIS_TUNE1 0x01c8
+#define CMN_DIAG_PLL0_PTATIS_TUNE2 0x01c9
+#define CMN_DIAG_PLL0_INCLK_CTRL   0x01ca
+#define CMN_DIAG_PLL0_PXL_DIVH 0x01cb
+#define CMN_DIAG_PLL0_PXL_DIVL 0x01cc
+#define CMN_DIAG_HSCLK_SEL 0x01e0
+#define XCVR_PSM_RCTRL 0x4001
+#define TX_TXCC_CAL_SCLR_MULT_0 0x4047
+#define TX_TXCC_CPOST_MULT_00_0 0x404c
+#define XCVR_DIAG_PLLDRC_CTRL  0x40e0
+#define XCVR_DIAG_PLLDRC_CTRL  0x40e0
+#define XCVR_DIAG_HSCLK_SEL 0x40e1
+#define XCVR_DIAG_BIDI_CTRL 0x40e8
+#define TX_PSC_A0  0x4100
+#define TX_PSC_A1  0x4101
+#define TX_PSC_A2  0x4102
+#define TX_PSC_A3  0x4103
+#define TX_DIAG_TX_CTRL 0x41e0
+#define TX_DIAG_TX_DRV 0x41e1
+#define TX_DIAG_BGREF_PREDRV_DELAY 0x41e7
+#define TX_DIAG_ACYA_0 0x41ff
+#define TX_DIAG_ACYA_1 0x43ff
+#define TX_DIAG_ACYA_2 0x45ff
+#define TX_DIAG_ACYA_3 0x47ff
+#define TX_ANA_CTRL_REG_1  0x5020
+#define TX_ANA_CTRL_REG_2  0x5021
+#define TX_DIG_CTRL_REG_2  0x5024
+#define TXDA_CYA_AUXDA_CYA 0x5025
+#define TX_ANA_CTRL_REG_3  0x5026
+#define TX_ANA_CTRL_REG_4  0x5027
+#define TX_ANA_CTRL_REG_5  0x5029
+#define 

[PATCH v8 6/7] phy: freescale: Add DisplayPort PHY driver for i.MX8MQ

2023-08-08 Thread Sandor Yu
Add Cadence HDP-TX DisplayPort PHY driver for i.MX8MQ

The Cadence HDP-TX PHY can be put in either DP mode or
HDMI mode, based on the configuration chosen.
DisplayPort PHY mode is configured in this driver.

Signed-off-by: Sandor Yu 
---
 drivers/phy/freescale/Kconfig |   9 +
 drivers/phy/freescale/Makefile|   1 +
 drivers/phy/freescale/phy-fsl-imx8mq-dp.c | 714 ++
 3 files changed, 724 insertions(+)
 create mode 100644 drivers/phy/freescale/phy-fsl-imx8mq-dp.c

diff --git a/drivers/phy/freescale/Kconfig b/drivers/phy/freescale/Kconfig
index 853958fb2c06..2999ba1e57d0 100644
--- a/drivers/phy/freescale/Kconfig
+++ b/drivers/phy/freescale/Kconfig
@@ -35,6 +35,15 @@ config PHY_FSL_IMX8M_PCIE
  Enable this to add support for the PCIE PHY as found on
  i.MX8M family of SOCs.
 
+config PHY_FSL_IMX8MQ_DP_PHY
+   tristate "Freescale i.MX8MQ DP PHY support"
+   depends on OF && HAS_IOMEM
+   depends on COMMON_CLK
+   select GENERIC_PHY
+   help
+ Enable this to support the Cadence HDPTX DP PHY driver
+ on i.MX8MQ SOC.
+
 endif
 
 config PHY_FSL_LYNX_28G
diff --git a/drivers/phy/freescale/Makefile b/drivers/phy/freescale/Makefile
index cedb328bc4d2..915a429d9fbc 100644
--- a/drivers/phy/freescale/Makefile
+++ b/drivers/phy/freescale/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_PHY_MIXEL_LVDS_PHY) += phy-fsl-imx8qm-lvds-phy.o
 obj-$(CONFIG_PHY_MIXEL_MIPI_DPHY)  += phy-fsl-imx8-mipi-dphy.o
 obj-$(CONFIG_PHY_FSL_IMX8M_PCIE)   += phy-fsl-imx8m-pcie.o
 obj-$(CONFIG_PHY_FSL_LYNX_28G) += phy-fsl-lynx-28g.o
+obj-$(CONFIG_PHY_FSL_IMX8MQ_DP_PHY)+= phy-fsl-imx8mq-dp.o
diff --git a/drivers/phy/freescale/phy-fsl-imx8mq-dp.c b/drivers/phy/freescale/phy-fsl-imx8mq-dp.c
new file mode 100644
index ..b1f45c0b27b5
--- /dev/null
+++ b/drivers/phy/freescale/phy-fsl-imx8mq-dp.c
@@ -0,0 +1,714 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Cadence HDP-TX Display Port Interface (DP) PHY driver
+ *
+ * Copyright (C) 2022, 2023 NXP Semiconductor, Inc.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define ADDR_PHY_AFE   0x8
+
+/* PHY registers */
+#define CMN_SSM_BIAS_TMR   0x0022
+#define CMN_PLLSM0_PLLEN_TMR   0x0029
+#define CMN_PLLSM0_PLLPRE_TMR  0x002a
+#define CMN_PLLSM0_PLLVREF_TMR 0x002b
+#define CMN_PLLSM0_PLLLOCK_TMR 0x002c
+#define CMN_PLLSM0_USER_DEF_CTRL   0x002f
+#define CMN_PSM_CLK_CTRL   0x0061
+#define CMN_PLL0_VCOCAL_START  0x0081
+#define CMN_PLL0_VCOCAL_INIT_TMR   0x0084
+#define CMN_PLL0_VCOCAL_ITER_TMR   0x0085
+#define CMN_PLL0_INTDIV 0x0094
+#define CMN_PLL0_FRACDIV   0x0095
+#define CMN_PLL0_HIGH_THR  0x0096
+#define CMN_PLL0_DSM_DIAG  0x0097
+#define CMN_PLL0_SS_CTRL2  0x0099
+#define CMN_ICAL_INIT_TMR  0x00c4
+#define CMN_ICAL_ITER_TMR  0x00c5
+#define CMN_RXCAL_INIT_TMR 0x00d4
+#define CMN_RXCAL_ITER_TMR 0x00d5
+#define CMN_TXPUCAL_INIT_TMR   0x00e4
+#define CMN_TXPUCAL_ITER_TMR   0x00e5
+#define CMN_TXPDCAL_INIT_TMR   0x00f4
+#define CMN_TXPDCAL_ITER_TMR   0x00f5
+#define CMN_ICAL_ADJ_INIT_TMR  0x0102
+#define CMN_ICAL_ADJ_ITER_TMR  0x0103
+#define CMN_RX_ADJ_INIT_TMR 0x0106
+#define CMN_RX_ADJ_ITER_TMR 0x0107
+#define CMN_TXPU_ADJ_INIT_TMR  0x010a
+#define CMN_TXPU_ADJ_ITER_TMR  0x010b
+#define CMN_TXPD_ADJ_INIT_TMR  0x010e
+#define CMN_TXPD_ADJ_ITER_TMR  0x010f
+#define CMN_DIAG_PLL0_FBH_OVRD 0x01c0
+#define CMN_DIAG_PLL0_FBL_OVRD 0x01c1
+#define CMN_DIAG_PLL0_OVRD 0x01c2
+#define CMN_DIAG_PLL0_TEST_MODE 0x01c4
+#define CMN_DIAG_PLL0_V2I_TUNE 0x01c5
+#define CMN_DIAG_PLL0_CP_TUNE  0x01c6
+#define CMN_DIAG_PLL0_LF_PROG  0x01c7
+#define CMN_DIAG_PLL0_PTATIS_TUNE1 0x01c8
+#define CMN_DIAG_PLL0_PTATIS_TUNE2 0x01c9
+#define CMN_DIAG_HSCLK_SEL 0x01e0
+#define CMN_DIAG_PER_CAL_ADJ   0x01ec
+#define CMN_DIAG_CAL_CTRL  0x01ed
+#define CMN_DIAG_ACYA  0x01ff
+#define XCVR_PSM_RCTRL 0x4001
+#define XCVR_PSM_CAL_TMR   0x4002
+#define XCVR_PSM_A0IN_TMR  0x4003
+#define TX_TXCC_CAL_SCLR_MULT_0 0x4047
+#define TX_TXCC_CPOST_MULT_00_0 0x404c
+#define XCVR_DIAG_PLLDRC_CTRL  

[PATCH v8 5/7] dt-bindings: phy: Add Freescale iMX8MQ DP and HDMI PHY

2023-08-08 Thread Sandor Yu
Add bindings for Freescale iMX8MQ DP and HDMI PHY.

Signed-off-by: Sandor Yu 
Reviewed-by: Rob Herring 
---
 .../bindings/phy/fsl,imx8mq-dp-hdmi-phy.yaml  | 53 +++
 1 file changed, 53 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/fsl,imx8mq-dp-hdmi-phy.yaml

diff --git a/Documentation/devicetree/bindings/phy/fsl,imx8mq-dp-hdmi-phy.yaml b/Documentation/devicetree/bindings/phy/fsl,imx8mq-dp-hdmi-phy.yaml
new file mode 100644
index ..917f113503dc
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/fsl,imx8mq-dp-hdmi-phy.yaml
@@ -0,0 +1,53 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/fsl,imx8mq-dp-hdmi-phy.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Cadence HDP-TX DP/HDMI PHY for Freescale i.MX8MQ SoC
+
+maintainers:
+  - Sandor Yu 
+
+properties:
+  compatible:
+    enum:
+      - fsl,imx8mq-dp-phy
+      - fsl,imx8mq-hdmi-phy
+
+  reg:
+    maxItems: 1
+
+  clocks:
+    items:
+      - description: PHY reference clock.
+      - description: APB clock.
+
+  clock-names:
+    items:
+      - const: ref
+      - const: apb
+
+  "#phy-cells":
+    const: 0
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - "#phy-cells"
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+dp_phy: phy@32c0 {
+compatible = "fsl,imx8mq-dp-phy";
+reg = <0x32c0 0x10>;
+#phy-cells = <0>;
+clocks = <_phy_27m>, < IMX8MQ_CLK_DISP_APB_ROOT>;
+clock-names = "ref", "apb";
+};
-- 
2.34.1



[PATCH v8 4/7] drm: bridge: Cadence: Add MHDP8501 DP/HDMI driver

2023-08-08 Thread Sandor Yu
Add a new DRM DisplayPort and HDMI bridge driver for the Cadence MHDP8501
used in the i.MX8MQ SoC. The MHDP8501 can support either the HDMI or the
DisplayPort standard, depending on the embedded firmware running on the uCPU.

On the i.MX8MQ SoC, the DisplayPort/HDMI firmware is loaded and activated by
the SoC's ROM code. A bootloader binary that includes the corresponding
firmware is required.

The driver checks the display connector type and
then loads the corresponding driver.

Signed-off-by: Sandor Yu 
Tested-by: Alexander Stein 
---
 drivers/gpu/drm/bridge/cadence/Kconfig|  15 +
 drivers/gpu/drm/bridge/cadence/Makefile   |   2 +
 .../drm/bridge/cadence/cdns-mhdp8501-core.c   | 313 +++
 .../drm/bridge/cadence/cdns-mhdp8501-core.h   | 410 +
 .../gpu/drm/bridge/cadence/cdns-mhdp8501-dp.c | 780 ++
 .../drm/bridge/cadence/cdns-mhdp8501-hdmi.c   | 674 +++
 6 files changed, 2194 insertions(+)
 create mode 100644 drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-core.c
 create mode 100644 drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-core.h
 create mode 100644 drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-dp.c
 create mode 100644 drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-hdmi.c

diff --git a/drivers/gpu/drm/bridge/cadence/Kconfig b/drivers/gpu/drm/bridge/cadence/Kconfig
index ec35215a2003..d9daf7ec0cd5 100644
--- a/drivers/gpu/drm/bridge/cadence/Kconfig
+++ b/drivers/gpu/drm/bridge/cadence/Kconfig
@@ -46,3 +46,18 @@ config DRM_CDNS_MHDP8546_J721E
  initializes the J721E Display Port and sets up the
  clock and data muxes.
 endif
+
+config DRM_CDNS_MHDP8501
+   tristate "Cadence MHDP8501 DP/HDMI bridge"
+   select DRM_KMS_HELPER
+   select DRM_PANEL_BRIDGE
+   select DRM_DISPLAY_DP_HELPER
+   select DRM_DISPLAY_HELPER
+   select DRM_CDNS_AUDIO
+   depends on OF
+   help
+ Support Cadence MHDP8501 DisplayPort/HDMI bridge.
+ Cadence MHDP8501 supports one or more protocols,
+ including DisplayPort and HDMI.
+ To use the DP and HDMI drivers, their respective
+ specific firmware is required.
diff --git a/drivers/gpu/drm/bridge/cadence/Makefile b/drivers/gpu/drm/bridge/cadence/Makefile
index c95fd5b81d13..ea327287d1c1 100644
--- a/drivers/gpu/drm/bridge/cadence/Makefile
+++ b/drivers/gpu/drm/bridge/cadence/Makefile
@@ -5,3 +5,5 @@ cdns-dsi-$(CONFIG_DRM_CDNS_DSI_J721E) += cdns-dsi-j721e.o
 obj-$(CONFIG_DRM_CDNS_MHDP8546) += cdns-mhdp8546.o
 cdns-mhdp8546-y := cdns-mhdp8546-core.o cdns-mhdp8546-hdcp.o
 cdns-mhdp8546-$(CONFIG_DRM_CDNS_MHDP8546_J721E) += cdns-mhdp8546-j721e.o
+obj-$(CONFIG_DRM_CDNS_MHDP8501) += cdns-mhdp8501.o
+cdns-mhdp8501-y := cdns-mhdp8501-core.o cdns-mhdp8501-dp.o cdns-mhdp8501-hdmi.o
diff --git a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-core.c b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-core.c
new file mode 100644
index ..29573ce247d1
--- /dev/null
+++ b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8501-core.c
@@ -0,0 +1,313 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Cadence Display Port Interface (DP) driver
+ *
+ * Copyright (C) 2023 NXP Semiconductor, Inc.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "cdns-mhdp8501-core.h"
+
+static int cdns_mhdp8501_read_hpd(struct cdns_mhdp_device *mhdp)
+{
+	u8 status;
+	int ret;
+
+	mutex_lock(&mhdp->mbox_mutex);
+
+	ret = cdns_mhdp_mailbox_send(mhdp, MB_MODULE_ID_GENERAL,
+				     GENERAL_GET_HPD_STATE, 0, NULL);
+	if (ret)
+		goto err_get_hpd;
+
+	ret = cdns_mhdp_mailbox_recv_header(mhdp, MB_MODULE_ID_GENERAL,
+					    GENERAL_GET_HPD_STATE,
+					    sizeof(status));
+	if (ret)
+		goto err_get_hpd;
+
+	ret = cdns_mhdp_mailbox_recv_data(mhdp, &status, sizeof(status));
+	if (ret)
+		goto err_get_hpd;
+
+	mutex_unlock(&mhdp->mbox_mutex);
+
+	return status;
+
+err_get_hpd:
+	DRM_ERROR("read hpd  failed: %d\n", ret);
+	mutex_unlock(&mhdp->mbox_mutex);
+
+	return ret;
+}
+
+enum drm_connector_status cdns_mhdp8501_detect(struct cdns_mhdp_device *mhdp)
+{
+	u8 hpd = 0xf;
+
+	hpd = cdns_mhdp8501_read_hpd(mhdp);
+	if (hpd == 1)
+		return connector_status_connected;
+	else if (hpd == 0)
+		return connector_status_disconnected;
+
+	DRM_INFO("Unknown cable status, hdp=%u\n", hpd);
+	return connector_status_unknown;
+}
+
+static void hotplug_work_func(struct work_struct *work)
+{
+	struct cdns_mhdp_device *mhdp = container_of(work,
+						     struct cdns_mhdp_device,
+						     hotplug_work.work);
+	enum drm_connector_status status = cdns_mhdp8501_detect(mhdp);
+
+	drm_bridge_hpd_notify(&mhdp->bridge, status);
+
+   if (status == 

[PATCH v8 3/7] dt-bindings: display: bridge: Add Cadence MHDP850

2023-08-08 Thread Sandor Yu
Add bindings for Cadence MHDP8501 DisplayPort/HDMI bridge.

Signed-off-by: Sandor Yu 
---
 .../display/bridge/cdns,mhdp8501.yaml | 104 ++
 1 file changed, 104 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/bridge/cdns,mhdp8501.yaml

diff --git a/Documentation/devicetree/bindings/display/bridge/cdns,mhdp8501.yaml b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp8501.yaml
new file mode 100644
index ..3ae643845cfe
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/bridge/cdns,mhdp8501.yaml
@@ -0,0 +1,104 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/bridge/cdns,mhdp8501.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Cadence MHDP8501 DP/HDMI bridge
+
+maintainers:
+  - Sandor Yu 
+
+description:
+  Cadence MHDP8501 DisplayPort/HDMI interface.
+
+properties:
+  compatible:
+    enum:
+      - fsl,imx8mq-mhdp8501
+
+  reg:
+    maxItems: 1
+
+  clocks:
+    maxItems: 1
+    description: MHDP8501 DP/HDMI APB clock.
+
+  phys:
+    maxItems: 1
+    description:
+      phandle to the DisplayPort or HDMI PHY
+
+  interrupts:
+    items:
+      - description: Hotplug cable plugin.
+      - description: Hotplug cable plugout.
+
+  interrupt-names:
+    items:
+      - const: plug_in
+      - const: plug_out
+
+  ports:
+    $ref: /schemas/graph.yaml#/properties/ports
+
+    properties:
+      port@0:
+        $ref: /schemas/graph.yaml#/properties/port
+        description:
+          Input port from display controller output.
+      port@1:
+        $ref: /schemas/graph.yaml#/properties/port
+        description:
+          Output port to DisplayPort or HDMI connector.
+
+    required:
+      - port@0
+      - port@1
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - interrupts
+  - interrupt-names
+  - phys
+  - ports
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+
+mhdp: display-bridge@32c0 {
+compatible = "fsl,imx8mq-mhdp8501";
+reg = <0x32c0 0x10>;
+interrupts = ,
+ ;
+interrupt-names = "plug_in", "plug_out";
+clocks = < IMX8MQ_CLK_DISP_APB_ROOT>;
+phys = <_phy>;
+
+ports {
+#address-cells = <1>;
+#size-cells = <0>;
+
+port@0 {
+reg = <0>;
+
+mhdp_in: endpoint {
+remote-endpoint = <_out>;
+};
+};
+
+port@1 {
+reg = <1>;
+
+mhdp_out: endpoint {
+remote-endpoint = <_connector>;
+};
+};
+};
+};
-- 
2.34.1



[PATCH v8 2/7] phy: Add HDMI configuration options

2023-08-08 Thread Sandor Yu
Allow HDMI PHYs to be configured through the generic
functions through a custom structure added to the generic union.

The parameters added here are based on HDMI PHY
implementation practices.  The current set of parameters
should cover the potential users.
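
As an illustration only, the following user-space sketch mirrors the structure added by this patch and shows how a consumer might fill it. The enum is a stand-in for the kernel's enum hdmi_colorspace, and hdmi_char_rate_khz() is a hypothetical helper that is not part of the patch; it assumes RGB or YUV444 output, where the TMDS character rate scales with bits per color.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's enum hdmi_colorspace; only the first values. */
enum hdmi_colorspace {
	HDMI_COLORSPACE_RGB,
	HDMI_COLORSPACE_YUV422,
	HDMI_COLORSPACE_YUV444,
	HDMI_COLORSPACE_YUV420,
};

/* Mirror of the structure introduced in include/linux/phy/phy-hdmi.h. */
struct phy_configure_opts_hdmi {
	unsigned int pixel_clk_rate;	/* pixel clock in kHz */
	unsigned int bpc;		/* bits per color channel */
	enum hdmi_colorspace color_space;
};

/*
 * Hypothetical helper (assumption, not in the patch): TMDS character rate
 * in kHz for RGB/YUV444, i.e. pixel clock scaled by bpc / 8.
 */
static unsigned int hdmi_char_rate_khz(const struct phy_configure_opts_hdmi *opts)
{
	assert(opts != NULL && opts->bpc >= 8);

	return (unsigned int)((unsigned long long)opts->pixel_clk_rate *
			      opts->bpc / 8);
}
```

In the kernel, a bridge driver would place such a structure in the hdmi member of union phy_configure_opts before calling phy_configure(); the sketch above only models the data being passed.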

Signed-off-by: Sandor Yu 
---
 include/linux/phy/phy-hdmi.h | 24 
 include/linux/phy/phy.h  |  7 ++-
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/phy/phy-hdmi.h

diff --git a/include/linux/phy/phy-hdmi.h b/include/linux/phy/phy-hdmi.h
new file mode 100644
index ..b7de88e9090f
--- /dev/null
+++ b/include/linux/phy/phy-hdmi.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2022 NXP
+ */
+
+#ifndef __PHY_HDMI_H_
+#define __PHY_HDMI_H_
+
+#include 
+/**
+ * struct phy_configure_opts_hdmi - HDMI configuration set
+ * @pixel_clk_rate: Pixel clock of video modes in KHz.
+ * @bpc: Maximum bits per color channel.
+ * @color_space: Colorspace in enum hdmi_colorspace.
+ *
+ * This structure is used to represent the configuration state of a HDMI phy.
+ */
+struct phy_configure_opts_hdmi {
+   unsigned int pixel_clk_rate;
+   unsigned int bpc;
+   enum hdmi_colorspace color_space;
+};
+
+#endif /* __PHY_HDMI_H_ */
diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
index f6d607ef0e80..94d489a8a163 100644
--- a/include/linux/phy/phy.h
+++ b/include/linux/phy/phy.h
@@ -17,6 +17,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 
@@ -42,7 +43,8 @@ enum phy_mode {
PHY_MODE_MIPI_DPHY,
PHY_MODE_SATA,
PHY_MODE_LVDS,
-   PHY_MODE_DP
+   PHY_MODE_DP,
+   PHY_MODE_HDMI,
 };
 
 enum phy_media {
@@ -60,11 +62,14 @@ enum phy_media {
  * the DisplayPort protocol.
  * @lvds:  Configuration set applicable for phys supporting
  * the LVDS phy mode.
+ * @hdmi:  Configuration set applicable for phys supporting
+ * the HDMI phy mode.
  */
 union phy_configure_opts {
struct phy_configure_opts_mipi_dphy mipi_dphy;
    struct phy_configure_opts_dp    dp;
struct phy_configure_opts_lvds  lvds;
+   struct phy_configure_opts_hdmi  hdmi;
 };
 
 /**
-- 
2.34.1



[PATCH v8 1/7] drm: bridge: Cadence: convert mailbox functions to macro functions

2023-08-08 Thread Sandor Yu
The MHDP8546 mailbox access functions will be shared with other MHDP drivers
and the Cadence HDP-TX HDMI/DP PHY drivers.
Move those functions to the header file include/drm/bridge/cdns-mhdp-mailbox.h
and convert them to macro functions.
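
For readers unfamiliar with the mailbox framing these helpers implement, here is a minimal user-space sketch of the 4-byte message header, modelled on the cdns_mhdp_mailbox_send() and cdns_mhdp_mailbox_recv_header() code removed below. The names mbox_pack_header() and mbox_check_header() are made up for this illustration, and the -1 return stands in for the kernel's -EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_HEADER_SIZE 4

/*
 * Build the header: opcode, module id, then the payload size as a
 * big-endian 16-bit value (put_unaligned_be16() in the kernel code).
 */
static void mbox_pack_header(uint8_t header[MBOX_HEADER_SIZE],
			     uint8_t module_id, uint8_t opcode, uint16_t size)
{
	assert(header != NULL);

	header[0] = opcode;
	header[1] = module_id;
	header[2] = (uint8_t)(size >> 8);
	header[3] = (uint8_t)(size & 0xff);
}

/*
 * Validate a received header against the expected reply.
 * Returns 0 on a match and -1 otherwise.
 */
static int mbox_check_header(const uint8_t header[MBOX_HEADER_SIZE],
			     uint8_t module_id, uint8_t opcode,
			     uint16_t req_size)
{
	uint16_t mbox_size = (uint16_t)((header[2] << 8) | header[3]);

	if (opcode != header[0] || module_id != header[1] ||
	    req_size != mbox_size)
		return -1;

	return 0;
}
```

The real helpers additionally drain the mailbox when the header does not match and must be called with mbox_mutex held; both aspects are left out of this sketch.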

Signed-off-by: Sandor Yu 
---
 .../drm/bridge/cadence/cdns-mhdp8546-core.c   | 195 +-
 .../drm/bridge/cadence/cdns-mhdp8546-core.h   |   1 -
 include/drm/bridge/cdns-mhdp-mailbox.h| 240 ++
 3 files changed, 241 insertions(+), 195 deletions(-)
 create mode 100644 include/drm/bridge/cdns-mhdp-mailbox.h

diff --git a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c
index f6822dfa3805..ddd3c633c7bf 100644
--- a/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c
+++ b/drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -54,200 +55,6 @@
 #include "cdns-mhdp8546-hdcp.h"
 #include "cdns-mhdp8546-j721e.h"
 
-static int cdns_mhdp_mailbox_read(struct cdns_mhdp_device *mhdp)
-{
-   int ret, empty;
-
-   WARN_ON(!mutex_is_locked(&mhdp->mbox_mutex));
-
-   ret = readx_poll_timeout(readl, mhdp->regs + CDNS_MAILBOX_EMPTY,
-empty, !empty, MAILBOX_RETRY_US,
-MAILBOX_TIMEOUT_US);
-   if (ret < 0)
-   return ret;
-
-   return readl(mhdp->regs + CDNS_MAILBOX_RX_DATA) & 0xff;
-}
-
-static int cdns_mhdp_mailbox_write(struct cdns_mhdp_device *mhdp, u8 val)
-{
-   int ret, full;
-
-   WARN_ON(!mutex_is_locked(&mhdp->mbox_mutex));
-
-   ret = readx_poll_timeout(readl, mhdp->regs + CDNS_MAILBOX_FULL,
-full, !full, MAILBOX_RETRY_US,
-MAILBOX_TIMEOUT_US);
-   if (ret < 0)
-   return ret;
-
-   writel(val, mhdp->regs + CDNS_MAILBOX_TX_DATA);
-
-   return 0;
-}
-
-static int cdns_mhdp_mailbox_recv_header(struct cdns_mhdp_device *mhdp,
-u8 module_id, u8 opcode,
-u16 req_size)
-{
-   u32 mbox_size, i;
-   u8 header[4];
-   int ret;
-
-   /* read the header of the message */
-   for (i = 0; i < sizeof(header); i++) {
-   ret = cdns_mhdp_mailbox_read(mhdp);
-   if (ret < 0)
-   return ret;
-
-   header[i] = ret;
-   }
-
-   mbox_size = get_unaligned_be16(header + 2);
-
-   if (opcode != header[0] || module_id != header[1] ||
-   req_size != mbox_size) {
-   /*
-* If the message in mailbox is not what we want, we need to
-* clear the mailbox by reading its contents.
-*/
-   for (i = 0; i < mbox_size; i++)
-   if (cdns_mhdp_mailbox_read(mhdp) < 0)
-   break;
-
-   return -EINVAL;
-   }
-
-   return 0;
-}
-
-static int cdns_mhdp_mailbox_recv_data(struct cdns_mhdp_device *mhdp,
-  u8 *buff, u16 buff_size)
-{
-   u32 i;
-   int ret;
-
-   for (i = 0; i < buff_size; i++) {
-   ret = cdns_mhdp_mailbox_read(mhdp);
-   if (ret < 0)
-   return ret;
-
-   buff[i] = ret;
-   }
-
-   return 0;
-}
-
-static int cdns_mhdp_mailbox_send(struct cdns_mhdp_device *mhdp, u8 module_id,
- u8 opcode, u16 size, u8 *message)
-{
-   u8 header[4];
-   int ret, i;
-
-   header[0] = opcode;
-   header[1] = module_id;
-   put_unaligned_be16(size, header + 2);
-
-   for (i = 0; i < sizeof(header); i++) {
-   ret = cdns_mhdp_mailbox_write(mhdp, header[i]);
-   if (ret)
-   return ret;
-   }
-
-   for (i = 0; i < size; i++) {
-   ret = cdns_mhdp_mailbox_write(mhdp, message[i]);
-   if (ret)
-   return ret;
-   }
-
-   return 0;
-}
-
-static
-int cdns_mhdp_reg_read(struct cdns_mhdp_device *mhdp, u32 addr, u32 *value)
-{
-   u8 msg[4], resp[8];
-   int ret;
-
-   put_unaligned_be32(addr, msg);
-
-   mutex_lock(&mhdp->mbox_mutex);
-
-   ret = cdns_mhdp_mailbox_send(mhdp, MB_MODULE_ID_GENERAL,
-GENERAL_REGISTER_READ,
-sizeof(msg), msg);
-   if (ret)
-   goto out;
-
-   ret = cdns_mhdp_mailbox_recv_header(mhdp, MB_MODULE_ID_GENERAL,
-   GENERAL_REGISTER_READ,
-   sizeof(resp));
-   if (ret)
-   goto out;
-
-   ret = cdns_mhdp_mailbox_recv_data(mhdp, resp, sizeof(resp));
-   if (ret)
-   goto out;
-
-   /* Returned address value should be the same as requested */
-   if (memcmp(msg, resp, 

[PATCH v8 0/7] Initial support Cadence MHDP8501(HDMI/DP) for i.MX8MQ

2023-08-08 Thread Sandor Yu
This patch set adds initial support for the Cadence MHDP8501 (HDMI/DP) DRM
bridge driver and the Cadence HDP-TX PHY (HDMI/DP) drivers for Freescale i.MX8MQ.

The patch set consists of DRM bridge drivers and PHY drivers.

Both of them need the following two patches to build:
  drm: bridge: Cadence: convert mailbox functions to macro functions
  phy: Add HDMI configuration options

DRM bridges driver patches:
  dt-bindings: display: bridge: Add Cadence MHDP850
  drm: bridge: Cadence: Add MHDP8501 DP/HDMI driver

PHY driver patches:
  dt-bindings: phy: Add Freescale iMX8MQ DP and HDMI PHY
  phy: freescale: Add DisplayPort PHY driver for i.MX8MQ
  phy: freescale: Add HDMI PHY driver for i.MX8MQ

v7->v8:
MHDP8501 HDMI/DP:
- Correct DT node name to "display-bridge".
- Remove "cdns,mhdp8501" from mhdp8501 dt-binding doc.

HDMI/DP PHY:
- Introduced functions `wait_for_ack` and `wait_for_ack_clear` to handle
  waiting with acknowledgment bits set and cleared respectively.
- Use FIELD_PREP() to set bitfields for both HDMI and DP PHY.
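
The two PHY changes above can be pictured with a small user-space sketch. Everything here is an assumption made for illustration: FIELD_PREP_SKETCH() is a stand-in for the kernel's FIELD_PREP() from linux/bitfield.h, the wait_for_ack()/wait_for_ack_clear() signatures are hypothetical and not taken from the driver, and the register is simulated so the code runs without hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FIELD_PREP(): shift val to the mask's position and mask it. */
#define FIELD_PREP_SKETCH(mask, val) \
	(((uint32_t)(val) << __builtin_ctz(mask)) & (uint32_t)(mask))

/* Hypothetical register accessor so the polling loops need no hardware. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/* Poll until every bit in ack_mask is set; 0 on success, -1 on timeout. */
static int wait_for_ack(reg_read_fn read, void *ctx, uint32_t ack_mask,
			unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if ((read(ctx) & ack_mask) == ack_mask)
			return 0;
	}
	return -1;
}

/* Poll until every bit in ack_mask is clear; 0 on success, -1 on timeout. */
static int wait_for_ack_clear(reg_read_fn read, void *ctx, uint32_t ack_mask,
			      unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if ((read(ctx) & ack_mask) == 0)
			return 0;
	}
	return -1;
}

/* Simulated register: reads 0 for a while, then reports the ack bit. */
struct fake_reg {
	unsigned int reads_until_ack;
	uint32_t ack_bit;
};

static uint32_t fake_reg_read(void *ctx)
{
	struct fake_reg *reg = ctx;

	if (reg->reads_until_ack > 0) {
		reg->reads_until_ack--;
		return 0;
	}
	return reg->ack_bit;
}
```

A real driver would sleep between polls (for example with usleep_range()) and bound the wait by time rather than by iteration count.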

v6->v7:
MHDP8501 HDMI/DP:
- Combine HDMI and DP driver into one mhdp8501 driver.
  Use the connector type to load the corresponding functions.
- Remove connector init functions.
- Add  in phy_hdmi.h to reuse ‘enum hdmi_colorspace’.

HDMI/DP PHY:
- Lowercase hex values
- Fix parameters indent issue on some functions
- Replace ‘udelay’ with ‘usleep_range’

v5->v6:
HDMI/DP bridge driver
- 8501 is the part number of Cadence MHDP on i.MX8MQ.
  Use MHDP8501 to name hdmi/dp drivers and files. 
- Add compatible "fsl,imx8mq-mhdp8501-dp" for i.MX8MQ DP driver
- Add compatible "fsl,imx8mq-mhdp8501-hdmi" for i.MX8MQ HDMI driver
- Combine HDMI and DP dt-bindings into one file cdns,mhdp8501.yaml
- Fix HDMI scrambling not being enabled when the driver works in 4Kp60
  mode.
- Add HDMI/DP PHY API mailbox protect.

HDMI/DP PHY driver:
- Rename DP and HDMI PHY files and move to folder phy/freescale/
- Remove properties num_lanes and link_rate from DP PHY driver.
- Combine HDMI and DP dt-bindings into one file fsl,imx8mq-dp-hdmi-phy.yaml
- Update compatible string to "fsl,imx8mq-dp-phy".
- Update compatible string to "fsl,imx8mq-hdmi-phy".

v4->v5:
- Drop "clk" suffix in clock name.
- Add output port property in the example of hdmi/dp.

v3->v4:
dt-bindings:
- Correct dt-bindings coding style and address review comments.
- Add apb_clk description.
- Add output port for HDMI/DP connector
PHY:
- Alphabetically sorted in Kconfig and Makefile for DP and HDMI PHY
- Remove unused registers define from HDMI and DP PHY drivers.
- More description in phy_hdmi.h.
- Add apb_clk to HDMI and DP phy driver.
HDMI/DP:
- Use get_unaligned_le32() to replace hardcode type conversion
  in HDMI AVI infoframe data fill function.
- Add mailbox mutex lock in HDMI/DP driver for phy functions
  to resolve race conditions between HDMI/DP and PHY drivers.
- Add apb_clk to both HDMI and DP driver.
- Rename some function names and add prefix with "cdns_hdmi/cdns_dp".
- Remove the bpc 12 and 16 options, which are not supported.

v2->v3:
Address comments for dt-bindings files.
- Correct dts-bindings file names 
  Rename phy-cadence-hdptx-dp.yaml to cdns,mhdp-imx8mq-dp.yaml
  Rename phy-cadence-hdptx-hdmi.yaml to cdns,mhdp-imx8mq-hdmi.yaml
- Drop redundant words and descriptions.
- Correct hdmi/dp node name.

v2 is a completely different version compared to v1.
The previous v1 is available here [1].

v1->v2:
- Reuse Cadence mailbox access functions from mhdp8546 instead of
  rockchip DP.
- Mailbox access functions are converted to macro functions
  that will be referenced by the HDP-TX PHY (HDMI/DP) driver too.
- Plain bridge instead of component driver.
- Standalone Cadence HDP-TX PHY(HDMI/DP) driver.
- The audio driver is removed from the patch set; it will be added in another
  patch set later.

[1] https://patchwork.kernel.org/project/linux-rockchip/cover/cover.1590982881.git.sandor...@nxp.com/

Sandor Yu (7):
  drm: bridge: Cadence: convert mailbox functions to macro functions
  phy: Add HDMI configuration options
  dt-bindings: display: bridge: Add Cadence MHDP850
  drm: bridge: Cadence: Add MHDP8501 DP/HDMI driver
  dt-bindings: phy: Add Freescale iMX8MQ DP and HDMI PHY
  phy: freescale: Add DisplayPort PHY driver for i.MX8MQ
  phy: freescale: Add HDMI PHY driver for i.MX8MQ

 .../display/bridge/cdns,mhdp8501.yaml | 104 ++
 .../bindings/phy/fsl,imx8mq-dp-hdmi-phy.yaml  |  53 +
 drivers/gpu/drm/bridge/cadence/Kconfig|  15 +
 drivers/gpu/drm/bridge/cadence/Makefile   |   2 +
 .../drm/bridge/cadence/cdns-mhdp8501-core.c   | 313 ++
 .../drm/bridge/cadence/cdns-mhdp8501-core.h   | 410 
 .../gpu/drm/bridge/cadence/cdns-mhdp8501-dp.c | 780 ++
 .../drm/bridge/cadence/cdns-mhdp8501-hdmi.c   | 674 
 .../drm/bridge/cadence/cdns-mhdp8546-core.c   | 195 +---
 .../drm/bridge/cadence/cdns-mhdp8546-core.h   |   1 -
 drivers/phy/freescale/Kconfig |  18 +
 drivers/phy/freescale/Makefile|   2 +
 

Re: [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-08 Thread Qi Zheng

Hi Dave,

On 2023/8/8 10:44, Dave Chinner wrote:

On Mon, Aug 07, 2023 at 07:09:34PM +0800, Qi Zheng wrote:

Like global slab shrink, this commit also uses refcount+RCU method to make
memcg slab shrink lockless.


This patch does random code cleanups amongst the actual RCU changes.
Can you please move the cleanups to a separate patch to reduce the
noise in this one?


Sure, will do.




diff --git a/mm/shrinker.c b/mm/shrinker.c
index d318f5621862..fee6f62904fb 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -107,6 +107,12 @@ static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg,
 lockdep_is_held(&shrinker_rwsem));
  }
  
+static struct shrinker_info *shrinker_info_rcu(struct mem_cgroup *memcg,

+  int nid)
+{
+   return rcu_dereference(memcg->nodeinfo[nid]->shrinker_info);
+}


This helper doesn't add value. It doesn't tell me that
rcu_read_lock() needs to be held when it is called, for one


How about adding a comment or an assertion here?




  static int expand_one_shrinker_info(struct mem_cgroup *memcg, int new_size,
int old_size, int new_nr_max)
  {
@@ -198,7 +204,7 @@ void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id)
struct shrinker_info_unit *unit;
  
  		rcu_read_lock();

-   info = rcu_dereference(memcg->nodeinfo[nid]->shrinker_info);
+   info = shrinker_info_rcu(memcg, nid);


... whilst the original code here was obviously correct.


unit = info->unit[shriner_id_to_index(shrinker_id)];
if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) {
/* Pairs with smp mb in shrink_slab() */
@@ -211,7 +217,7 @@ void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id)
  
  static DEFINE_IDR(shrinker_idr);
  
-static int prealloc_memcg_shrinker(struct shrinker *shrinker)

+static int shrinker_memcg_alloc(struct shrinker *shrinker)


Cleanups in a separate patch.


OK.




@@ -253,10 +258,15 @@ static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker,
  {
struct shrinker_info *info;
struct shrinker_info_unit *unit;
+   long nr_deferred;
  
-	info = shrinker_info_protected(memcg, nid);

+   rcu_read_lock();
+   info = shrinker_info_rcu(memcg, nid);
unit = info->unit[shriner_id_to_index(shrinker->id)];
-   return atomic_long_xchg(&unit->nr_deferred[shriner_id_to_offset(shrinker->id)], 0);
+   nr_deferred = atomic_long_xchg(&unit->nr_deferred[shriner_id_to_offset(shrinker->id)], 0);
+   rcu_read_unlock();
+
+   return nr_deferred;
  }


This adds two rcu_read_lock() sections to every call to
do_shrink_slab(). It's not at all clear from any of the other code
that do_shrink_slab() now has internal rcu_read_lock() sections


The xchg_nr_deferred_memcg() will only be called in shrink_slab_memcg(),
so other code doesn't need to know that information?




@@ -464,18 +480,23 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, 
int nid,
if (!mem_cgroup_online(memcg))
return 0;
  
-	if (!down_read_trylock(&shrinker_rwsem))

-   return 0;
-
-   info = shrinker_info_protected(memcg, nid);
+again:
+   rcu_read_lock();
+   info = shrinker_info_rcu(memcg, nid);
if (unlikely(!info))
goto unlock;
  
-	for (; index < shriner_id_to_index(info->map_nr_max); index++) {

+   if (index < shriner_id_to_index(info->map_nr_max)) {
struct shrinker_info_unit *unit;
  
  		unit = info->unit[index];
  
+		/*

+* The shrinker_info_unit will not be freed, so we can
+* safely release the RCU lock here.
+*/
+   rcu_read_unlock();


Why - what guarantees that the shrinker_info_unit exists at this
point? We hold no reference to it, we hold no reference to any
shrinker, etc. What provides this existence guarantee?


The shrinker_info_unit is never freed unless the memcg is destroyed.
Here we hold the refcount of this memcg (mem_cgroup_iter() -->
css_tryget()), so the shrinker_info_unit will not be freed.




+
for_each_set_bit(offset, unit->map, SHRINKER_UNIT_BITS) {
struct shrink_control sc = {
.gfp_mask = gfp_mask,
@@ -485,12 +506,14 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, 
int nid,
struct shrinker *shrinker;
int shrinker_id = calc_shrinker_id(index, offset);
  
+			rcu_read_lock();

shrinker = idr_find(&shrinker_idr, shrinker_id);
-   if (unlikely(!shrinker || !(shrinker->flags & 
SHRINKER_REGISTERED))) {
-   if (!shrinker)
-   clear_bit(offset, unit->map);
+   if (unlikely(!shrinker || 

Re: [PATCH v4 19/48] rcu: dynamically allocate the rcu-kfree shrinker

2023-08-08 Thread Muchun Song



> On Aug 7, 2023, at 19:09, Qi Zheng  wrote:
> 
> Use new APIs to dynamically allocate the rcu-kfree shrinker.
> 
> Signed-off-by: Qi Zheng 

Reviewed-by: Muchun Song 




RE: [PATCH v2 1/2] pwm: Manage owner assignment implicitly for drivers

2023-08-08 Thread nobuhiro1.iwamatsu
Hi Uwe,

> -----Original Message-----
> From: Uwe Kleine-König 
> Sent: Friday, August 4, 2023 11:27 PM
> To: Thierry Reding ; Laurent Pinchart
> 
> Cc: Linus Walleij ; Bartosz Golaszewski
> ; Andy Shevchenko ; Douglas Anderson
> ; Andrzej Hajda ; Neil
> Armstrong ; Robert Foss ;
> Jonas Karlman ; Jernej Skrabec
> ; David Airlie ; Daniel Vetter
> ; Pavel Machek ; Lee Jones
> ; Hector Martin ; Sven Peter
> ; Alyssa Rosenzweig ; Nicolas
> Ferre ; Alexandre Belloni
> ; Claudiu Beznea
> ; Ray Jui ; Scott
> Branden ; Broadcom internal kernel review list
> ; Florian Fainelli
> ; Alexander Shiyan ;
> Benson Leung ; Guenter Roeck
> ; Shawn Guo ; Sascha
> Hauer ; Pengutronix Kernel Team
> ; Fabio Estevam ; NXP
> Linux Team ; Paul Cercueil ;
> Vladimir Zapolskiy ; Kevin Hilman ;
> Jerome Brunet ; Martin Blumenstingl
> ; Conor Dooley
> ; Daire McNamara
> ; Matthias Brugger
> ; AngeloGioacchino Del Regno
> ; Jonathan Neuschäfer
> ; Heiko Stuebner ; Krzysztof
> Kozlowski ; Alim Akhtar
> ; Palmer Dabbelt ; Paul
> Walmsley ; Michael Walle ;
> Orson Zhai ; Baolin Wang
> ; Chunyan Zhang
> ; Fabrice Gasnier ;
> Maxime Coquelin ; Alexandre Torgue
> ; Chen-Yu Tsai ; Samuel
> Holland ; Hammer Hsieh
> ; Jonathan Hunter ;
> iwamatsu nobuhiro(岩松 信洋 ○DITC□DIT○OST)
> ; Sean Anderson
> ; Michal Simek ;
> Johan Hovold ; Alex Elder ; Greg
> Kroah-Hartman ; Anjelique Melendez
> ; Dmitry Baryshkov
> ; Luca Weiss ; Bjorn
> Andersson ; linux-...@vger.kernel.org;
> linux-g...@vger.kernel.org; dri-devel@lists.freedesktop.org;
> linux-l...@vger.kernel.org; as...@lists.linux.dev;
> linux-arm-ker...@lists.infradead.org; linux-rpi-ker...@lists.infradead.org;
> chrome-platf...@lists.linux.dev; linux-m...@vger.kernel.org;
> linux-amlo...@lists.infradead.org; linux-ri...@lists.infradead.org;
> linux-media...@lists.infradead.org; linux-rockc...@lists.infradead.org;
> linux-samsung-...@vger.kernel.org;
> linux-st...@st-md-mailman.stormreply.com; linux-su...@lists.linux.dev;
> linux-te...@vger.kernel.org; greybus-...@lists.linaro.org;
> linux-stag...@lists.linux.dev
> Subject: [PATCH v2 1/2] pwm: Manage owner assignment implicitly for drivers
> 
> Instead of requiring each driver to care for assigning the owner member of
> struct pwm_ops, handle that implicitly using a macro. Note that the owner
> member has to be moved to struct pwm_chip, as the ops structure usually lives
> in read-only memory and so cannot be modified.
> 
> The upside is that new low-level drivers cannot forget the assignment and save
> one line each. The pwm-crc driver didn't assign .owner; that's not a problem in
> practice though, as the driver cannot be compiled as a module.
> 
> Signed-off-by: Uwe Kleine-König 
> ---
>  drivers/gpio/gpio-mvebu.c |  1 -
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c |  1 -
>  drivers/leds/rgb/leds-qcom-lpg.c  |  1 -
>  drivers/pwm/core.c| 24 ++--
>  drivers/pwm/pwm-ab8500.c  |  1 -
>  drivers/pwm/pwm-apple.c   |  1 -
>  drivers/pwm/pwm-atmel-hlcdc.c |  1 -
>  drivers/pwm/pwm-atmel-tcb.c   |  1 -
>  drivers/pwm/pwm-atmel.c   |  1 -
>  drivers/pwm/pwm-bcm-iproc.c   |  1 -
>  drivers/pwm/pwm-bcm-kona.c|  1 -
>  drivers/pwm/pwm-bcm2835.c |  1 -
>  drivers/pwm/pwm-berlin.c  |  1 -
>  drivers/pwm/pwm-brcmstb.c |  1 -
>  drivers/pwm/pwm-clk.c |  1 -
>  drivers/pwm/pwm-clps711x.c|  1 -
>  drivers/pwm/pwm-cros-ec.c |  1 -
>  drivers/pwm/pwm-dwc.c |  1 -
>  drivers/pwm/pwm-ep93xx.c  |  1 -
>  drivers/pwm/pwm-fsl-ftm.c |  1 -
>  drivers/pwm/pwm-hibvt.c   |  1 -
>  drivers/pwm/pwm-img.c |  1 -
>  drivers/pwm/pwm-imx-tpm.c |  1 -
>  drivers/pwm/pwm-imx1.c|  1 -
>  drivers/pwm/pwm-imx27.c   |  1 -
>  drivers/pwm/pwm-intel-lgm.c   |  1 -
>  drivers/pwm/pwm-iqs620a.c |  1 -
>  drivers/pwm/pwm-jz4740.c  |  1 -
>  drivers/pwm/pwm-keembay.c |  1 -
>  drivers/pwm/pwm-lp3943.c  |  1 -
>  drivers/pwm/pwm-lpc18xx-sct.c |  1 -
>  drivers/pwm/pwm-lpc32xx.c |  1 -
>  drivers/pwm/pwm-lpss.c|  1 -
>  drivers/pwm/pwm-mediatek.c|  1 -
>  drivers/pwm/pwm-meson.c   |  1 -
>  drivers/pwm/pwm-microchip-core.c  |  1 -
>  drivers/pwm/pwm-mtk-disp.c|  1 -
>  drivers/pwm/pwm-mxs.c |  1 -
>  drivers/pwm/pwm-ntxec.c   |  1 -
>  drivers/pwm/pwm-omap-dmtimer.c|  1 -
>  drivers/pwm/pwm-pca9685.c |  1 -
>  drivers/pwm/pwm-pxa.c |  1 -
>  drivers/pwm/pwm-raspberrypi-poe.c |  1 -
>  drivers/pwm/pwm-rcar.c|  1 -
>  drivers/pwm/pwm-renesas-tpu.c |  1 -
>  drivers/pwm/pwm-rockchip.c|  1 -
>  drivers/pwm/pwm-rz-mtu3.c |  1 -
>  

[PATCH -next v2] drm/kmb: Remove unused variable layer_irqs

2023-08-08 Thread GUO Zihua
layer_irqs does not seem to have any user, so remove it completely.

This resolves sparse warning:
  warning: symbol 'layer_irqs' was not declared. Should it be static?

Signed-off-by: GUO Zihua 
---

v2:
  V1 was titled "drm/kmb: Make layer_irqs static". This patch removes
layer_irqs completely as there does not seem to be a user for it.

---
 drivers/gpu/drm/kmb/kmb_plane.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/kmb/kmb_plane.c b/drivers/gpu/drm/kmb/kmb_plane.c
index 9e0562aa2bcb..5a8c7cbf27b0 100644
--- a/drivers/gpu/drm/kmb/kmb_plane.c
+++ b/drivers/gpu/drm/kmb/kmb_plane.c
@@ -17,13 +17,6 @@
 #include "kmb_plane.h"
 #include "kmb_regs.h"
 
-const u32 layer_irqs[] = {
-   LCD_INT_VL0,
-   LCD_INT_VL1,
-   LCD_INT_GL0,
-   LCD_INT_GL1
-};
-
 /* Conversion (yuv->rgb) matrix from myriadx */
 static const u32 csc_coef_lcd[] = {
1024, 0, 1436,
-- 
2.17.1



Re: [PATCH v4 06/48] binder: dynamically allocate the android-binder shrinker

2023-08-08 Thread Muchun Song



> On Aug 7, 2023, at 19:08, Qi Zheng  wrote:
> 
> Use new APIs to dynamically allocate the android-binder shrinker.
> 
> Signed-off-by: Qi Zheng 

Reviewed-by: Muchun Song 




RE: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-08 Thread Kasireddy, Vivek
Hi Jason,

> 
> > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
> > Although, I do not have THP enabled (or built-in), shmem does not evict
> > the pages after hole punch as noted in the comment in shmem_fallocate():
> 
> This is the source of all your problems.
> 
> Things that are mm-centric are supposed to track the VMAs and changes to
> the PTEs. If you do something in userspace and it doesn't cause the
> CPU page tables to change then it certainly shouldn't cause any mmu
> notifiers or hmm_range_fault changes.
I am not doing anything out of the ordinary in userspace. I think the behavior
I am seeing with shmem (where an invalidation event (MMU_NOTIFY_CLEAR)
does occur because of a hole punch but the PTEs don't really get updated)
can arguably be considered an optimization. 

> 
> There should still be an invalidation notifier at some point when the
> CPU tables do eventually change, whenever that is. Missing that
> notification would be a bug.
I clearly do not see any notification getting triggered (from both shmem_fault()
and hugetlb_fault()) when the PTEs do get updated as the hole is refilled
due to writes. Are you saying that there needs to be an invalidation event
(MMU_NOTIFY_CLEAR?) dispatched at this point?

> 
> > If I force it to read-fault or write-fault (by hacking hmm_pte_need_fault()),
> > it gets indefinitely stuck in the do while loop in hmm_range_fault().
> > AFAIU, unless there is a way to fault-in zero pages (or any scratch pages)
> > after hole punch that get invalidated because of writes, I do not see how
> > using hmm_range_fault() can help with my use-case.
> 
> hmm_range_fault() is the correct API to use if you are working with
> notifiers. Do not hack something together using pin_user_pages.
I noticed that hmm_range_fault() does not seem to be working as expected
given that it gets stuck (hangs) while walking hugetlb pages. Regardless,
as I mentioned above, the lack of notification when PTEs do get updated due
to writes is the crux of the issue here. Therefore, AFAIU, triggering an
invalidation event or some other kind of notification would help in fixing
this issue.

Thanks,
Vivek

> 
> Jason



Re: [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-08 Thread Qi Zheng

Hi Dave,

On 2023/8/8 10:24, Dave Chinner wrote:

On Mon, Aug 07, 2023 at 07:09:33PM +0800, Qi Zheng wrote:

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index eb342994675a..f06225f18531 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -4,6 +4,8 @@
  
  #include 

  #include 
+#include 
+#include 
  
  #define SHRINKER_UNIT_BITS	BITS_PER_LONG
  
@@ -87,6 +89,10 @@ struct shrinker {

int seeks;  /* seeks to recreate an obj */
unsigned flags;
  
+	refcount_t refcount;

+   struct completion done;
+   struct rcu_head rcu;


Documentation, please. What does the refcount protect, what does the
completion provide, etc.


How about the following:

/*
 * Reference count of this shrinker. Holding a reference guarantees
 * that the shrinker will not be released.
 */
refcount_t refcount;
/*
 * Wait for shrinker::refcount to reach 0, that is, no shrinker
 * is running or will run again.
 */
struct completion done;
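
To make the intended semantics of the two fields concrete, here is a minimal userspace sketch of the get/put/completion handshake, using C11 atomics in place of refcount_t and a plain flag in place of struct completion. All demo_* names are hypothetical, and the initial reference count of 1 (held by the owner until teardown) is an assumption inferred from the shrinker_put() followed by wait_for_completion() in shrinker_free(). It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the new struct shrinker fields. */
struct demo_shrinker {
	atomic_int refcount;	/* 0 means "being torn down, do not use" */
	atomic_bool done;	/* stands in for struct completion */
};

static void demo_shrinker_init(struct demo_shrinker *s)
{
	atomic_init(&s->refcount, 1);	/* assumed: owner holds the initial reference */
	atomic_init(&s->done, false);
}

/* Take a reference unless the count has already dropped to zero. */
static bool demo_shrinker_try_get(struct demo_shrinker *s)
{
	int old = atomic_load(&s->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&s->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the final put signals "done" to the waiter. */
static void demo_shrinker_put(struct demo_shrinker *s)
{
	int old = atomic_fetch_sub(&s->refcount, 1);

	assert(old > 0);
	if (old == 1)
		atomic_store(&s->done, true);
}
```

In this sketch a walker that wins demo_shrinker_try_get() keeps the object alive until its matching put, while the teardown path drops the owner's reference and waits for done before freeing. Once the count has reached zero, every later try_get fails.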




+
void *private_data;
  
  	/* These are for internal use */

@@ -120,6 +126,17 @@ struct shrinker *shrinker_alloc(unsigned int flags, const 
char *fmt, ...);
  void shrinker_register(struct shrinker *shrinker);
  void shrinker_free(struct shrinker *shrinker);
  
+static inline bool shrinker_try_get(struct shrinker *shrinker)

+{
+   return refcount_inc_not_zero(&shrinker->refcount);
+}
+
+static inline void shrinker_put(struct shrinker *shrinker)
+{
+   if (refcount_dec_and_test(&shrinker->refcount))
+   complete(&shrinker->done);
+}
+
  #ifdef CONFIG_SHRINKER_DEBUG
  extern int __printf(2, 3) shrinker_debugfs_rename(struct shrinker *shrinker,
  const char *fmt, ...);
diff --git a/mm/shrinker.c b/mm/shrinker.c
index 1911c06b8af5..d318f5621862 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -2,6 +2,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  #include "internal.h"

@@ -577,33 +578,42 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct 
mem_cgroup *memcg,
if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
  
-	if (!down_read_trylock(&shrinker_rwsem))

-   goto out;
-
-   list_for_each_entry(shrinker, &shrinker_list, list) {
+   rcu_read_lock();
+   list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
struct shrink_control sc = {
.gfp_mask = gfp_mask,
.nid = nid,
.memcg = memcg,
};
  
+		if (!shrinker_try_get(shrinker))

+   continue;
+
+   /*
+* We can safely unlock the RCU lock here since we already
+* hold the refcount of the shrinker.
+*/
+   rcu_read_unlock();
+
ret = do_shrink_slab(&sc, shrinker, priority);
if (ret == SHRINK_EMPTY)
ret = 0;
freed += ret;
+
/*
-* Bail out if someone want to register a new shrinker to
-* prevent the registration from being stalled for long periods
-* by parallel ongoing shrinking.
+* This shrinker may be deleted from shrinker_list and freed
+* after the shrinker_put() below, but this shrinker is still
+* used for the next traversal. So it is necessary to hold the
+* RCU lock first to prevent this shrinker from being freed,
+* which also ensures that the next shrinker that is traversed
+* will not be freed (even if it is deleted from shrinker_list
+* at the same time).
 */


This needs to be moved to the head of the function, and document
the whole list walk, get, put and completion parts of the algorithm
that make it safe. There's more to this than "we hold a reference
count", especially the tricky "we might see the shrinker before it
is fully initialised" case


How about moving this documentation to just before list_for_each_entry_rcu(),
and then adding a comment at the head of shrink_slab_memcg() to explain the
memcg slab shrink case?




.

  void shrinker_free(struct shrinker *shrinker)
  {
struct dentry *debugfs_entry = NULL;
@@ -686,9 +712,18 @@ void shrinker_free(struct shrinker *shrinker)
if (!shrinker)
return;
  
+	if (shrinker->flags & SHRINKER_REGISTERED) {

+   shrinker_put(shrinker);
+   wait_for_completion(&shrinker->done);
+   }


Needs a comment explaining why we need to wait here...


/*
 * Wait for all lookups of the shrinker to complete, after that, no
 * shrinker is running or will run again, then we can safely free
 * the structure where the shrinker is located, such as super_block
 * etc.
 */


+
down_write(&shrinker_rwsem);
if 

Re: [PATCH drm-misc-next v9 01/11] drm/gem: fix lockdep check for dma-resv lock

2023-08-08 Thread Boris Brezillon
On Thu,  3 Aug 2023 18:52:20 +0200
Danilo Krummrich  wrote:

> When no custom lock is set to protect a GEM's GPUVA list, lockdep checks
> should fall back to the GEM object's dma-resv lock. With the current
> implementation we're setting the lock_dep_map of the GEM object's 'resv'
> pointer (in case no custom lock_dep_map is set yet) on
> drm_gem_private_object_init().
> 
> However, the GEM object's 'resv' pointer might still change after
> drm_gem_private_object_init() is called, e.g. through
> ttm_bo_init_reserved(). This can result in the wrong lock being tracked.
> 
> To fix this, call dma_resv_held() directly from
> drm_gem_gpuva_assert_lock_held() and fall back to the GEM's lock_dep_map
> pointer only if an actual custom lock is set.
> 
> Fixes: e6303f323b1a ("drm: manager to keep track of GPUs VA mappings")
> Signed-off-by: Danilo Krummrich 

Reviewed-by: Boris Brezillon 

but I'm wondering if it wouldn't be a good thing to add a
drm_gem_set_resv() helper, so the core can control drm_gem_object::resv
re-assignments (block them if it's happening after the GEM has been
exposed to the outside world or update auxiliary data if it's happening
before that).

> ---
>  include/drm/drm_gem.h | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index c0b13c43b459..bc9f6aa2f3fe 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -551,15 +551,17 @@ int drm_gem_evict(struct drm_gem_object *obj);
>   * @lock: the lock used to protect the gpuva list. The locking primitive
>   * must contain a dep_map field.
>   *
> - * Call this if you're not proctecting access to the gpuva list
> - * with the dma-resv lock, otherwise, drm_gem_gpuva_init() takes care
> - * of initializing lock_dep_map for you.
> + * Call this if you're not proctecting access to the gpuva list with the
> + * dma-resv lock, but with a custom lock.
>   */
>  #define drm_gem_gpuva_set_lock(obj, lock) \
> - if (!(obj)->gpuva.lock_dep_map) \
> + if (!WARN((obj)->gpuva.lock_dep_map, \
> +   "GEM GPUVA lock should be set only once.")) \
>   (obj)->gpuva.lock_dep_map = &(lock)->dep_map
>  #define drm_gem_gpuva_assert_lock_held(obj) \
> - lockdep_assert(lock_is_held((obj)->gpuva.lock_dep_map))
> + lockdep_assert((obj)->gpuva.lock_dep_map ? \
> +lock_is_held((obj)->gpuva.lock_dep_map) : \
> +dma_resv_held((obj)->resv))
>  #else
>  #define drm_gem_gpuva_set_lock(obj, lock) do {} while (0)
>  #define drm_gem_gpuva_assert_lock_held(obj) do {} while (0)
> @@ -573,11 +575,12 @@ int drm_gem_evict(struct drm_gem_object *obj);
>   *
>   * Calling this function is only necessary for drivers intending to support the
>   * _driver_feature DRIVER_GEM_GPUVA.
> + *
> + * See also drm_gem_gpuva_set_lock().
>   */
>  static inline void drm_gem_gpuva_init(struct drm_gem_object *obj)
>  {
>   INIT_LIST_HEAD(>gpuva.list);
> - drm_gem_gpuva_set_lock(obj, >resv->lock.base);
>  }
>  
>  /**



Re: [PATCH] drm: atmel-hlcdc: Support inverting the pixel clock polarity

2023-08-08 Thread Miquel Raynal
Hi Sam,

s...@ravnborg.org wrote on Mon, 7 Aug 2023 18:52:45 +0200:

> Hi Miquel,
> 
> On Mon, Aug 07, 2023 at 11:12:46AM +0200, Miquel Raynal wrote:
> > Hi Sam,
> > 
> > s...@ravnborg.org wrote on Sat, 10 Jun 2023 22:05:15 +0200:
> >   
> > > On Fri, Jun 09, 2023 at 04:48:43PM +0200, Miquel Raynal wrote:  
> > > > On the SoC host controller, the pixel clock can be:
> > > > * standard: data is launched on the rising edge
> > > > * inverted: data is launched on the falling edge
> > > > 
> > > > Some panels may need the inverted option to be used so let's support
> > > > this DRM flag.
> > > > 
> > > > Signed-off-by: Miquel Raynal 
> > > 
> > > Hi Miquel,
> > > 
> > > the patch is:
> > > Reviewed-by: Sam Ravnborg 
> > > 
> > > I hope someone else can pick it up and apply it to drm-misc as
> > > my drm-misc setup is hopelessly outdated atm.  
> > 
> > I haven't been notified that this patch was picked up; is your tree still
> > outdated or can you take care of it?  
> 
> I am still hopelessly behind on stuff.

No problem.

> I copied a few people on this mail that I hope can help.

Nice, thanks a lot!

> Link to the original patch:
> https://lore.kernel.org/dri-devel/20230609144843.851327-1-miquel.ray...@bootlin.com/
> 
>   Sam

Let me know in case it's easier if I re-send it.

Thanks,
Miquèl


Re: [PATCH v4 44/48] mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}

2023-08-08 Thread Qi Zheng

Hi Dave,

On 2023/8/8 10:12, Dave Chinner wrote:

On Mon, Aug 07, 2023 at 07:09:32PM +0800, Qi Zheng wrote:

Currently, we maintain two linear arrays per node per memcg, which are
shrinker_info::map and shrinker_info::nr_deferred. And we need to resize
them when the shrinker_nr_max is exceeded, that is, allocate a new array,
and then copy the old array to the new array, and finally free the old
array by RCU.

For shrinker_info::map, we do set_bit() under the RCU lock, so we may set
the value into the old map which is about to be freed. This may cause the
value set to be lost. The current solution is not to copy the old map when
resizing, but to set all the corresponding bits in the new map to 1. This
solves the data loss problem, but brings the overhead of more pointless
loops while doing memcg slab shrink.

For shrinker_info::nr_deferred, we will only modify it under the read lock
of shrinker_rwsem, so it will not run concurrently with the resizing. But
after we make memcg slab shrink lockless, there will be the same data loss
problem as shrinker_info::map, and we can't work around it like the map.

For such resizable arrays, the most straightforward idea is to change it
to xarray, like we did for list_lru [1]. We need to do xa_store() in the
list_lru_add()-->set_shrinker_bit(), but this will cause memory
allocation, and the list_lru_add() doesn't accept failure. A possible
solution is to pre-allocate, but the location of pre-allocation is not
well determined.


So you implemented a two level array that preallocates leaf
nodes to work around it? It's remarkably complex for what it does,


Yes, here I have implemented a two level array like the following:

+---------------+--------+--------+-----+
| shrinker_info | unit 0 | unit 1 | ... | (secondary array)
+---------------+--------+--------+-----+
                     ^
                     |
                +---------------+-----+
                | nr_deferred[] | map | (leaf array)
                +---------------+-----+
                (shrinker_info_unit)

The leaf array is never freed unless the memcg is destroyed. The
secondary array will be resized every time the shrinker id exceeds
shrinker_nr_max.
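
A minimal userspace sketch of this layout, assuming hypothetical demo_* names and plain calloc()/memcpy() in place of the kernel allocators and RCU: resizing copies only the pointers in the secondary array, so a writer that still holds the old secondary array updates the same leaf that readers of the new array see.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_UNIT_BITS ((int)(sizeof(long) * CHAR_BIT))

/* Leaf array: allocated once, never freed or moved on resize. */
struct demo_unit {
	long nr_deferred[sizeof(long) * CHAR_BIT];
	unsigned long map;
};

/* Secondary array: replaced as a whole whenever it has to grow. */
struct demo_info {
	int nr_units;
	struct demo_unit *unit[];
};

/* Build a larger secondary array that shares all existing leaves. */
static struct demo_info *demo_info_expand(const struct demo_info *old, int nr_units)
{
	int old_nr = old ? old->nr_units : 0;
	struct demo_info *grown;
	int i;

	assert(nr_units >= old_nr);
	grown = calloc(1, sizeof(*grown) + (size_t)nr_units * sizeof(grown->unit[0]));
	if (!grown)
		return NULL;
	grown->nr_units = nr_units;

	/* Copy only the pointers; the leaves themselves are shared. */
	if (old_nr > 0)
		memcpy(grown->unit, old->unit, (size_t)old_nr * sizeof(grown->unit[0]));

	for (i = old_nr; i < nr_units; i++) {
		grown->unit[i] = calloc(1, sizeof(struct demo_unit));
		if (!grown->unit[i]) {
			/* Undo only the leaves allocated by this call. */
			while (i-- > old_nr)
				free(grown->unit[i]);
			free(grown);
			return NULL;
		}
	}
	return grown;
}

static void demo_set_bit(struct demo_info *info, int id)
{
	struct demo_unit *u = info->unit[id / DEMO_UNIT_BITS];

	u->map |= 1UL << (id % DEMO_UNIT_BITS);
}

static int demo_test_bit(const struct demo_info *info, int id)
{
	const struct demo_unit *u = info->unit[id / DEMO_UNIT_BITS];

	return (u->map >> (id % DEMO_UNIT_BITS)) & 1UL;
}
```

The sketch deliberately leaves out freeing the old secondary array (deferred via RCU in the patch) and any locking around the resize itself.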


I can't help but think a radix tree using a special holder for
nr_deferred values of zero would end up being simpler...


I tried. If the shrinker uses list_lru, then we can preallocate the
xa node where list_lru_one is pre-allocated. But for other types of
shrinkers, the location of pre-allocation is not easy to determine
(such as deferred_split_shrinker). And we can't force all memcg aware
shrinkers to use list_lru, so I gave up using xarray and implemented the
above two-level array.





Therefore, this commit chooses to introduce a secondary array for
shrinker_info::{map, nr_deferred}, so that we only need to copy this
secondary array every time it is resized. Then even if we get the old
secondary array under the RCU lock, the map and nr_deferred found through
it are still valid, so no data is lost.


I don't understand what you are trying to describe here. If we get
the old array, then don't we get either a stale nr_deferred value,
or the update we do gets lost because the next shrinker lookup will
find the new array and so the deferred value stored to the old one
is never seen again?


As shown above, the leaf array will not be freed when shrinker_info is
expanded, so the shrinker_info_unit can be indexed from both the old
and the new shrinker_info->unit[x]. So the updated nr_deferred and map
will not be lost.





[1]. https://lore.kernel.org/all/20220228122126.37293-13-songmuc...@bytedance.com/

Signed-off-by: Qi Zheng 
Reviewed-by: Muchun Song 
---

.

diff --git a/mm/shrinker.c b/mm/shrinker.c
index a27779ed3798..1911c06b8af5 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -12,15 +12,50 @@ DECLARE_RWSEM(shrinker_rwsem);
  #ifdef CONFIG_MEMCG
  static int shrinker_nr_max;
  
-/* The shrinker_info is expanded in a batch of BITS_PER_LONG */

-static inline int shrinker_map_size(int nr_items)
+static inline int shrinker_unit_size(int nr_items)
  {
-   return (DIV_ROUND_UP(nr_items, BITS_PER_LONG) * sizeof(unsigned long));
+   return (DIV_ROUND_UP(nr_items, SHRINKER_UNIT_BITS) * sizeof(struct shrinker_info_unit *));
  }
  
-static inline int shrinker_defer_size(int nr_items)

+static inline void shrinker_unit_free(struct shrinker_info *info, int start)
  {
-   return (round_up(nr_items, BITS_PER_LONG) * sizeof(atomic_long_t));
+   struct shrinker_info_unit **unit;
+   int nr, i;
+
+   if (!info)
+   return;
+
+   unit = info->unit;
+   nr = DIV_ROUND_UP(info->map_nr_max, SHRINKER_UNIT_BITS);
+
+   for (i = start; i < nr; i++) {
+   if (!unit[i])
+   break;
+
+   kvfree(unit[i]);
+   unit[i] = NULL;
+   }
+}
+
+static inline int shrinker_unit_alloc(struct shrinker_info *new,
+  struct shrinker_info *old, int 

Re: [PATCH drm-misc-next v9 06/11] drm/nouveau: fence: separate fence alloc and emit

2023-08-08 Thread Christian König




On 07.08.23 at 20:54, Danilo Krummrich wrote:

Hi Christian,

On 8/7/23 20:07, Christian König wrote:

On 03.08.23 at 18:52, Danilo Krummrich wrote:

The new (VM_BIND) UAPI exports DMA fences through DRM syncobjs. Hence,
in order to emit fences within DMA fence signalling critical sections
(e.g. as typically done in the DRM GPU scheduler's run_job() callback) we
need to separate fence allocation and fence emitting.


At least from the description that sounds like it might be illegal. 
Daniel can you take a look as well.


What exactly are you doing here?


I'm basically doing exactly the same as amdgpu_fence_emit() does in 
amdgpu_ib_schedule() called by amdgpu_job_run().


The difference - and this is what this patch is for - is that I 
separate the fence allocation from emitting the fence, such that the 
fence structure is allocated before the job is submitted to the GPU 
scheduler. amdgpu solves this with GFP_ATOMIC within 
amdgpu_fence_emit() to allocate the fence structure in this case.
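
The split described here can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: the demo_* names, the sequence-number channel and the use of calloc() are hypothetical and are not the nouveau or amdgpu code. The point is that the step that may allocate runs before the critical section, and the emit step only fills in a structure that already exists.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a fence and the channel it is emitted on. */
struct demo_fence {
	bool emitted;
	unsigned int seqno;
};

struct demo_chan {
	unsigned int next_seqno;
};

/* May allocate, so it must run before the signalling critical section. */
static int demo_fence_new(struct demo_fence **pfence)
{
	struct demo_fence *f = calloc(1, sizeof(*f));

	if (!f)
		return -ENOMEM;
	*pfence = f;
	return 0;
}

/* No allocation here: only assigns the next sequence number on the channel. */
static int demo_fence_emit(struct demo_fence *f, struct demo_chan *chan)
{
	assert(f && !f->emitted);
	f->seqno = ++chan->next_seqno;
	f->emitted = true;
	return 0;
}

/* Release the fence and clear the caller's pointer. */
static void demo_fence_unref(struct demo_fence **pfence)
{
	free(*pfence);
	*pfence = NULL;
}
```

A caller following this shape would allocate at job submission time, emit from the run_job()-style callback, and unref on the error path if emit fails, which mirrors the new/emit/unref sequence visible in the diff below.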


Yeah, that use case is perfectly valid. Maybe update the commit message 
a bit to better describe that.


Something like "Separate fence allocation and emitting to avoid 
allocation within DMA fence signalling critical sections inside the DRM 
scheduler. This helps implementing the new UAPI".


Regards,
Christian.



- Danilo



Regards,
Christian.



Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/nouveau/dispnv04/crtc.c |  9 -
  drivers/gpu/drm/nouveau/nouveau_bo.c    | 52 
+++--

  drivers/gpu/drm/nouveau/nouveau_chan.c  |  6 ++-
  drivers/gpu/drm/nouveau/nouveau_dmem.c  |  9 +++--
  drivers/gpu/drm/nouveau/nouveau_fence.c | 16 +++-
  drivers/gpu/drm/nouveau/nouveau_fence.h |  3 +-
  drivers/gpu/drm/nouveau/nouveau_gem.c   |  5 ++-
  7 files changed, 59 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
b/drivers/gpu/drm/nouveau/dispnv04/crtc.c

index a6f2e681bde9..a34924523133 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -1122,11 +1122,18 @@ nv04_page_flip_emit(struct nouveau_channel 
*chan,

  PUSH_NVSQ(push, NV_SW, NV_SW_PAGE_FLIP, 0x);
  PUSH_KICK(push);
-    ret = nouveau_fence_new(chan, false, pfence);
+    ret = nouveau_fence_new(pfence);
  if (ret)
  goto fail;
+    ret = nouveau_fence_emit(*pfence, chan);
+    if (ret)
+    goto fail_fence_unref;
+
  return 0;
+
+fail_fence_unref:
+    nouveau_fence_unref(pfence);
  fail:
  spin_lock_irqsave(>event_lock, flags);
  list_del(>head);
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c

index 057bc995f19b..e9cbbf594e6f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -820,29 +820,39 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object 
*bo, int evict,

  mutex_lock(>mutex);
  else
  mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING);
+
  ret = nouveau_fence_sync(nouveau_bo(bo), chan, true, 
ctx->interruptible);

-    if (ret == 0) {
-    ret = drm->ttm.move(chan, bo, bo->resource, new_reg);
-    if (ret == 0) {
-    ret = nouveau_fence_new(chan, false, );
-    if (ret == 0) {
-    /* TODO: figure out a better solution here
- *
- * wait on the fence here explicitly as going through
- * ttm_bo_move_accel_cleanup somehow doesn't seem 
to do it.

- *
- * Without this the operation can timeout and we'll 
fallback to a
- * software copy, which might take several minutes 
to finish.

- */
-    nouveau_fence_wait(fence, false, false);
-    ret = ttm_bo_move_accel_cleanup(bo,
-    >base,
-    evict, false,
-    new_reg);
-    nouveau_fence_unref();
-    }
-    }
+    if (ret)
+    goto out_unlock;
+
+    ret = drm->ttm.move(chan, bo, bo->resource, new_reg);
+    if (ret)
+    goto out_unlock;
+
+    ret = nouveau_fence_new();
+    if (ret)
+    goto out_unlock;
+
+    ret = nouveau_fence_emit(fence, chan);
+    if (ret) {
+    nouveau_fence_unref();
+    goto out_unlock;
  }
+
+    /* TODO: figure out a better solution here
+ *
+ * wait on the fence here explicitly as going through
+ * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
+ *
+ * Without this the operation can timeout and we'll fallback to a
+ * software copy, which might take several minutes to finish.
+ */
+    nouveau_fence_wait(fence, false, false);
+    ret = ttm_bo_move_accel_cleanup(bo, >base, evict, false,
+    new_reg);
+    nouveau_fence_unref();
+
+out_unlock:
  mutex_unlock(>mutex);
  return ret;
  }
diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.c 
