Re: [PATCH] drm/buddy: Fix the range bias clear memory allocation issue

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 12:27:20PM +0530, Arunpravin Paneer Selvam wrote:
> Problem statement: During system boot, when an application requests a
> bulk volume of cleared range-bias memory while clear_avail is zero, we
> don't fall back to the normal allocation method, because an unnecessary
> clear_avail check prevents the fallback. This leads to an fb allocation
> failure, after which the system becomes unresponsive.
> 
> Solution: Remove the unnecessary clear_avail check in the range bias
> allocation function.
> 
> Signed-off-by: Arunpravin Paneer Selvam 
> Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
> Reviewed-by: Matthew Auld 
> ---
>  drivers/gpu/drm/drm_buddy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Can you please also add a kunit test case to exercise this corner case and
make sure it stays fixed?

Thanks, Sima
> 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index 284ebae71cc4..831929ac95eb 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -574,7 +574,7 @@ __drm_buddy_alloc_range_bias(struct drm_buddy *mm,
>  
>   block = __alloc_range_bias(mm, start, end, order,
>  flags, fallback);
> - if (IS_ERR(block) && mm->clear_avail)
> + if (IS_ERR(block))
>   return __alloc_range_bias(mm, start, end, order,
> flags, !fallback);
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] Documentation/gpu: Document the situation with unqualified drm-memory-

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 06:06:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 03/05/2024 16:58, Alex Deucher wrote:
> > On Fri, May 3, 2024 at 11:33 AM Daniel Vetter  wrote:
> > > 
> > > On Fri, May 03, 2024 at 01:58:38PM +0100, Tvrtko Ursulin wrote:
> > > > 
> > > > [And I forgot dri-devel.. doing well!]
> > > > 
> > > > On 03/05/2024 13:40, Tvrtko Ursulin wrote:
> > > > > 
> > > > > [Correcting Christian's email]
> > > > > 
> > > > > On 03/05/2024 13:36, Tvrtko Ursulin wrote:
> > > > > > From: Tvrtko Ursulin 
> > > > > > 
> > > > > > Currently it is not well defined what is drm-memory- compared to 
> > > > > > other
> > > > > > categories.
> > > > > > 
> > > > > > In practice the only driver which emits these keys is amdgpu and in 
> > > > > > them
> > > > > > exposes the total memory use (including shared).
> > > > > > 
> > > > > > Document that drm-memory- and drm-total-memory- are aliases to
> > > > > > prevent any
> > > > > > confusion in the future.
> > > > > > 
> > > > > > While at it also clarify that the reserved sub-string 'memory' 
> > > > > > refers to
> > > > > > the memory region component.
> > > > > > 
> > > > > > Signed-off-by: Tvrtko Ursulin 
> > > > > > Cc: Alex Deucher 
> > > > > > Cc: Christian König 
> > > > > 
> > > > > Mea culpa, I copied the mistake from
> > > > > 77d17c4cd0bf52eacfad88e63e8932eb45d643c5. :)
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > Tvrtko
> > > > > 
> > > > > > Cc: Rob Clark 
> > > > > > ---
> > > > > >Documentation/gpu/drm-usage-stats.rst | 10 +-
> > > > > >1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/Documentation/gpu/drm-usage-stats.rst
> > > > > > b/Documentation/gpu/drm-usage-stats.rst
> > > > > > index 6dc299343b48..ef5c0a0aa477 100644
> > > > > > --- a/Documentation/gpu/drm-usage-stats.rst
> > > > > > +++ b/Documentation/gpu/drm-usage-stats.rst
> > > > > > @@ -128,7 +128,9 @@ Memory
> > > > > >Each possible memory type which can be used to store buffer
> > > > > > objects by the
> > > > > >GPU in question shall be given a stable and unique name to be
> > > > > > returned as the
> > > > > > -string here.  The name "memory" is reserved to refer to normal
> > > > > > system memory.
> > > > > > +string here.
> > > > > > +
> > > > > > +The region name "memory" is reserved to refer to normal system 
> > > > > > memory.
> > > > > >Value shall reflect the amount of storage currently consumed by
> > > > > > the buffer
> > > > > >objects belong to this client, in the respective memory region.
> > > > > > @@ -136,6 +138,9 @@ objects belong to this client, in the respective
> > > > > > memory region.
> > > > > >Default unit shall be bytes with optional unit specifiers of 
> > > > > > 'KiB'
> > > > > > or 'MiB'
> > > > > >indicating kibi- or mebi-bytes.
> > > > > > +This is an alias for drm-total- and only one of the two
> > > > > > should be
> > > > > > +present.
> > > 
> > > This feels a bit awkward and seems to needlessly complicate fdinfo uapi.
> > > 
> > > - Could we just patch amdgpu to follow everyone else, and avoid the
> > >special case? If there's no tool that relies on the special amdgpu
> > >prefix then that would be a lot easier.
> > > 
> > > - If that's not on the table, could we make everyone (with a suitable
> > >helper or something) just print both variants, so that we again have
> > >consistent fdinfo output? Or does that break a different set of
> > >existing tools?
> > > 
> > > - Finally maybe could we get away with fixing amd by adding the common
> > >format there, deprecating the old, fixing

Re: [PATCH] Documentation/gpu: Document the situation with unqualified drm-memory-

2024-05-03 Thread Daniel Vetter
On Fri, May 03, 2024 at 01:58:38PM +0100, Tvrtko Ursulin wrote:
> 
> [And I forgot dri-devel.. doing well!]
> 
> On 03/05/2024 13:40, Tvrtko Ursulin wrote:
> > 
> > [Correcting Christian's email]
> > 
> > On 03/05/2024 13:36, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin 
> > > 
> > > Currently it is not well defined what is drm-memory- compared to other
> > > categories.
> > > 
> > > In practice the only driver which emits these keys is amdgpu and in them
> > > exposes the total memory use (including shared).
> > > 
> > > Document that drm-memory- and drm-total-memory- are aliases to
> > > prevent any
> > > confusion in the future.
> > > 
> > > While at it also clarify that the reserved sub-string 'memory' refers to
> > > the memory region component.
> > > 
> > > Signed-off-by: Tvrtko Ursulin 
> > > Cc: Alex Deucher 
> > > Cc: Christian König 
> > 
> > Mea culpa, I copied the mistake from
> > 77d17c4cd0bf52eacfad88e63e8932eb45d643c5. :)
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > > Cc: Rob Clark 
> > > ---
> > >   Documentation/gpu/drm-usage-stats.rst | 10 +-
> > >   1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/gpu/drm-usage-stats.rst
> > > b/Documentation/gpu/drm-usage-stats.rst
> > > index 6dc299343b48..ef5c0a0aa477 100644
> > > --- a/Documentation/gpu/drm-usage-stats.rst
> > > +++ b/Documentation/gpu/drm-usage-stats.rst
> > > @@ -128,7 +128,9 @@ Memory
> > >   Each possible memory type which can be used to store buffer
> > > objects by the
> > >   GPU in question shall be given a stable and unique name to be
> > > returned as the
> > > -string here.  The name "memory" is reserved to refer to normal
> > > system memory.
> > > +string here.
> > > +
> > > +The region name "memory" is reserved to refer to normal system memory.
> > >   Value shall reflect the amount of storage currently consumed by
> > > the buffer
> > >   objects belong to this client, in the respective memory region.
> > > @@ -136,6 +138,9 @@ objects belong to this client, in the respective
> > > memory region.
> > >   Default unit shall be bytes with optional unit specifiers of 'KiB'
> > > or 'MiB'
> > >   indicating kibi- or mebi-bytes.
> > > +This is an alias for drm-total- and only one of the two
> > > should be
> > > +present.

This feels a bit awkward and seems to needlessly complicate fdinfo uapi.

- Could we just patch amdgpu to follow everyone else, and avoid the
  special case? If there's no tool that relies on the special amdgpu
  prefix then that would be a lot easier.

- If that's not on the table, could we make everyone (with a suitable
  helper or something) just print both variants, so that we again have
  consistent fdinfo output? Or does that break a different set of existing
  tools?

- Finally maybe could we get away with fixing amd by adding the common
  format there, deprecating the old, fixing the tools that would break and
  then maybe if we're lucky, remove the old one from amdgpu in a year or
  so?

Uapi that's "either do $foo or on this one driver, do $bar" is just
guaranteed to fragment the ecosystem, so imo that should be the absolute
last resort.
-Sima

> > > +
> > >   - drm-shared-:  [KiB|MiB]
> > >   The total size of buffers that are shared with another file (e.g.,
> > > have more
> > > @@ -145,6 +150,9 @@ than a single handle).
> > >   The total size of buffers that including shared and private memory.
> > > +This is an alias for drm-memory- and only one of the two
> > > should be
> > > +present.
> > > +
> > >   - drm-resident-:  [KiB|MiB]
> > >   The total size of buffers that are resident in the specified region.

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH v4 00/42] Color Pipeline API w/ VKMS

2024-02-29 Thread Daniel Vetter
>  drivers/gpu/drm/drm_mode_config.c |   7
>  drivers/gpu/drm/drm_plane.c   |  52 ++
>  drivers/gpu/drm/tests/Makefile|   3 +-
>  drivers/gpu/drm/tests/drm_fixp_test.c |  69 ++
>  drivers/gpu/drm/vkms/Kconfig  |  20 +
>  drivers/gpu/drm/vkms/Makefile |   4 +-
>  drivers/gpu/drm/vkms/tests/.kunitconfig   |   4 +
>  drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 449 ++
>  drivers/gpu/drm/vkms/vkms_colorop.c   | 100 +++
>  drivers/gpu/drm/vkms/vkms_composer.c  | 135 ++-
>  drivers/gpu/drm/vkms/vkms_drv.h   |   8 +
>  drivers/gpu/drm/vkms/vkms_luts.c  | 802 ++
>  drivers/gpu/drm/vkms/vkms_luts.h  |  12 +
>  drivers/gpu/drm/vkms/vkms_plane.c |   2 +
>  include/drm/drm_atomic.h  | 122 +++
>  include/drm/drm_atomic_uapi.h |   3 +
>  include/drm/drm_colorop.h | 301 +++
>  include/drm/drm_file.h|   7 +
>  include/drm/drm_fixed.h   |  35 +-
>  include/drm/drm_mode_config.h |  18 +
>  include/drm/drm_plane.h   |  13 +
>  include/uapi/drm/drm.h|  16 +
>  include/uapi/drm/drm_mode.h   |  14 +
>  38 files changed, 3882 insertions(+), 30 deletions(-)
>  create mode 100644 Documentation/gpu/rfc/color_pipeline.rst
>  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.c
>  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.h
>  create mode 100644 drivers/gpu/drm/drm_colorop.c
>  create mode 100644 drivers/gpu/drm/tests/drm_fixp_test.c
>  create mode 100644 drivers/gpu/drm/vkms/Kconfig
>  create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig
>  create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_colorop.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_luts.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_luts.h
>  create mode 100644 include/drm/drm_colorop.h
> 
> --
> 2.44.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Kernel 6.7+ broke under-powering of my RX 6700XT. (Archlinux, mesa/amdgpu)

2024-02-26 Thread Daniel Vetter
> > >>>>>> Side note: I assume those "lower bounds checking" is done round about
> > >>>>>> the same way by the Windows driver? Does that one allow users to go
> > >>>>>> lower somehow? Say after modifying the registry or something like 
> > >>>>>> that?
> > >>>>>> Or through external tools?
> > >>>>> Windows uses the same limit.  I'm not aware of any way to override the
> > >>>>> limit on windows off hand.
> > >>>>>
> > >>>>> Alex
> > >>>>>
> > >>>>>
> > >>>>>> Ciao, Thorsten
> > >>>>>>
> > >>>>>>>>>>> Roman posted something that apparently was meant to go to the 
> > >>>>>>>>>>> list, so
> > >>>>>>>>>>> let me put it here:
> > >>>>>>>>>>>
> > >>>>>>>>>>> """
> > >>>>>>>>>>> UPDATE: User fililip already posted patch, but it need to be 
> > >>>>>>>>>>> merged,
> > >>>>>>>>>>> discussion is on gitlab link below.
> > >>>>>>>>>>>
> > >>>>>>>>>>> (PS: I hope I am replying correctly to "all" now? - using 
> > >>>>>>>>>>> original addr.)
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>> it seems that commit was already found(see user's 'fililip' 
> > >>>>>>>>>>>> comment):
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> https://gitlab.freedesktop.org/drm/amd/-/issues/3183
> > >>>>>>>>>>>> commit 1958946858a62b6b5392ed075aa219d199bcae39
> > >>>>>>>>>>>> Author: Ma Jun 
> > >>>>>>>>>>>> Date:   Thu Oct 12 09:33:45 2023 +0800
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>   drm/amd/pm: Support for getting power1_cap_min value
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>   Support for getting power1_cap_min value on smu13 and 
> > >>>>>>>>>>>> smu11.
> > >>>>>>>>>>>>   For other Asics, we still use 0 as the default value.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>   Signed-off-by: Ma Jun 
> > >>>>>>>>>>>>   Reviewed-by: Kenneth Feng 
> > >>>>>>>>>>>>   Signed-off-by: Alex Deucher 
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> However, this is not good as it remove under-powering range 
> > >>>>>>>>>>>> too far. I
> > >>>>>>>>>>> was getting only about 7% less performance but 90W(!) less 
> > >>>>>>>>>>> consumption
> > >>>>>>>>>>> when set to my 115W before. Also I wonder if we as a OS of 
> > >>>>>>>>>>> options and
> > >>>>>>>>>>> freedom have to stick to such very high reference for min 
> > >>>>>>>>>>> values without
> > >>>>>>>>>>> ability to override them through some sys ctrls. Commit was 
> > >>>>>>>>>>> done by amd
> > >>>>>>>>>>> guy and I wonder if because of maybe this post that I made few 
> > >>>>>>>>>>> months
> > >>>>>>>>>>> ago(business strategy?):
> > >>>>>>>>>>> https://www.reddit.com/r/Amd/comments/183gye7/rx_6700xt_from_230w_to_capped_115w_at_only_10/
> > >>>>>>>>>>>> This is not a dangerous OC upwards where I can understand 
> > >>>>>>>>>>>> desire to
> > >>>>>>>>>>> protect HW, it is downward, having min cap at 190W when card 
> > >>>>>>>>>>> pull on
> > >>>>>>>>>>> 115W almost same speed is IMO crazy to deny. We don't talk 
> > >>>>>>>>>>> about default
> > >>>>>>>>>>> or reference values here either, just a move to lower the range 
> > >>>>>>>>>>> of
> > >>>>>>>>>>> options for whatever reason.
> > >>>>>>>>>>>> I don't know how much power you guys have over them, but please
> > >>>>>>>>>>> consider either reverting this change, or give us an option to 
> > >>>>>>>>>>> set
> > >>>>>>>>>>> min_cap through say /sys (right now param is readonly, even for 
> > >>>>>>>>>>> root).
> > >>>>>>>>>>>> Thank you in advance for looking into this, with regards:  
> > >>>>>>>>>>>> Romano
> > >>>>>>>>>>> """
> > >>>>>>>>>>>
> > >>>>>>>>>>> And while at it, let me add this issue to the tracking as well
> > >>>>>>>>>>>
> > >>>>>>>>>>> [TLDR: I'm adding this report to the list of tracked Linux 
> > >>>>>>>>>>> kernel
> > >>>>>>>>>>> regressions; the text you find below is based on a few templates
> > >>>>>>>>>>> paragraphs you might have encountered already in similar form.
> > >>>>>>>>>>> See link in footer if these mails annoy you.]
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks for the report. To be sure the issue doesn't fall 
> > >>>>>>>>>>> through the
> > >>>>>>>>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel 
> > >>>>>>>>>>> regression
> > >>>>>>>>>>> tracking bot:
> > >>>>>>>>>>>
> > >>>>>>>>>>> #regzbot introduced 1958946858a62b /
> > >>>>>>>>>>> #regzbot title drm: amdgpu: under-powering broke
> > >>>>>>>>>>>
> > >>>>>>>>>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression 
> > >>>>>>>>>>> tracker' hat)
> > >>>>>>>>>>> --
> > >>>>>>>>>>> Everything you wanna know about Linux kernel regression 
> > >>>>>>>>>>> tracking:
> > >>>>>>>>>>> https://linux-regtracking.leemhuis.info/about/#tldr
> > >>>>>>>>>>> That page also explains what to do if mails like this annoy you.
> > >>>>>>>
> > >
> > >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/6] tracing, dma-buf: add a trace_dma_fence_sync_to event

2024-02-16 Thread Daniel Vetter
On Fri, Feb 16, 2024 at 05:51:59PM +0100, Christian König wrote:
> Am 16.02.24 um 17:32 schrieb Daniel Vetter:
> > On Tue, Feb 13, 2024 at 04:50:26PM +0100, Pierre-Eric Pelloux-Prayer wrote:
> > > This new event can be used to trace where a given dma_fence is added
> > > as a dependency of some other work.
> > How?
> > 
> > What I'd expected here is that you add a dependency chain from one fence
> > to another, but this only has one fence.
> 
> That's what I though initially as well, but at the point we add the
> dependency fences to the scheduler job we don't have the scheduler fence
> initialized yet.
> 
> We could change this so that we only trace all the fences after we have
> initialized the scheduler fence, but then we loose the information where the
> dependency comes from.

Hm right, I thought we'd dump the hashed pointer value into the fence
events too, then you could make the connection. But we don't, so this is a
bit annoying ...

Maybe walk the entire scheduler dependency chain at trace_dma_fence_emit
time (or something similar)?
-Sima

> > How do you figure out what's the
> > next dma_fence that will stall on this dependency?
> 
> I'm not fully sure on that either. Pierre?
> 
> Christian.
> 
> 
> >   Like in the gpu
> > scheduler we do know what will be the fence that userspace gets back, so
> > we can make that connection. And same for the atomic code (although you
> > don't wire that up at all).
> > 
> > I'm very confused on how this works and rather worried it's a brittle
> > amdgpu-only solution ...
> > -Sima
> > 
> > > I plan to use it in amdgpu.
> > > 
> > > Signed-off-by: Pierre-Eric Pelloux-Prayer 
> > > 
> > > ---
> > >   drivers/dma-buf/dma-fence.c  |  1 +
> > >   include/trace/events/dma_fence.h | 34 
> > >   2 files changed, 35 insertions(+)
> > > 
> > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > > index e0fd99e61a2d..671a499a5ccd 100644
> > > --- a/drivers/dma-buf/dma-fence.c
> > > +++ b/drivers/dma-buf/dma-fence.c
> > > @@ -23,6 +23,7 @@
> > >   EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
> > >   EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
> > >   EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
> > > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_sync_to);
> > >   static DEFINE_SPINLOCK(dma_fence_stub_lock);
> > >   static struct dma_fence dma_fence_stub;
> > > diff --git a/include/trace/events/dma_fence.h 
> > > b/include/trace/events/dma_fence.h
> > > index 3963e79ca7b4..9b3875f7aa79 100644
> > > --- a/include/trace/events/dma_fence.h
> > > +++ b/include/trace/events/dma_fence.h
> > > @@ -83,6 +83,40 @@ DEFINE_EVENT(dma_fence, dma_fence_wait_end,
> > >   TP_ARGS(fence)
> > >   );
> > > +DECLARE_EVENT_CLASS(dma_fence_from,
> > > +
> > > + TP_PROTO(struct dma_fence *fence, const char *reason),
> > > +
> > > + TP_ARGS(fence, reason),
> > > +
> > > + TP_STRUCT__entry(
> > > + __string(driver, fence->ops->get_driver_name(fence))
> > > + __string(timeline, fence->ops->get_timeline_name(fence))
> > > + __field(unsigned int, context)
> > > + __field(unsigned int, seqno)
> > > + __string(reason, reason)
> > > + ),
> > > +
> > > + TP_fast_assign(
> > > + __assign_str(driver, fence->ops->get_driver_name(fence));
> > > + __assign_str(timeline, fence->ops->get_timeline_name(fence));
> > > + __entry->context = fence->context;
> > > + __entry->seqno = fence->seqno;
> > > + __assign_str(reason, reason);
> > > + ),
> > > +
> > > + TP_printk("driver=%s timeline=%s context=%u seqno=%u reason=%s",
> > > +   __get_str(driver), __get_str(timeline), __entry->context,
> > > +   __entry->seqno, __get_str(reason))
> > > +);
> > > +
> > > +DEFINE_EVENT(dma_fence_from, dma_fence_sync_to,
> > > +
> > > + TP_PROTO(struct dma_fence *fence, const char *reason),
> > > +
> > > + TP_ARGS(fence, reason)
> > > +);
> > > +
> > >   #endif /*  _TRACE_DMA_FENCE_H */
> > >   /* This part must be outside protection */
> > > -- 
> > > 2.40.1
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 1/3] drm: Add drm_get_acpi_edid() helper

2024-02-16 Thread Daniel Vetter
On Mon, Feb 12, 2024 at 01:27:57PM +0200, Jani Nikula wrote:
> On Sat, 10 Feb 2024, Mario Limonciello  wrote:
> > On 2/9/2024 12:57, Daniel Vetter wrote:
> >> On Fri, Feb 09, 2024 at 09:34:13AM -0600, Mario Limonciello wrote:
> >>> On 2/9/2024 05:07, Daniel Vetter wrote:
> >>>> On Thu, Feb 08, 2024 at 11:57:11AM +0200, Jani Nikula wrote:
> >>>>> On Wed, 07 Feb 2024, Mario Limonciello  
> >>>>> wrote:
> >>>>>> Some manufacturers have intentionally put an EDID that differs from
> >>>>>> the EDID on the internal panel on laptops.  Drivers can call this
> >>>>>> helper to attempt to fetch the EDID from the BIOS's ACPI _DDC method.
> >>>>>>
> >>>>>> Signed-off-by: Mario Limonciello 
> >>>>>> ---
> >>>>>>drivers/gpu/drm/Kconfig|  5 +++
> >>>>>>drivers/gpu/drm/drm_edid.c | 77 
> >>>>>> ++
> >>>>>>include/drm/drm_edid.h |  1 +
> >>>>>>3 files changed, 83 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> >>>>>> index 6ec33d36f3a4..ec2bb71e8b36 100644
> >>>>>> --- a/drivers/gpu/drm/Kconfig
> >>>>>> +++ b/drivers/gpu/drm/Kconfig
> >>>>>> @@ -21,6 +21,11 @@ menuconfig DRM
> >>>>>>select KCMP
> >>>>>>select VIDEO_CMDLINE
> >>>>>>select VIDEO_NOMODESET
> >>>>>> +  select ACPI_VIDEO if ACPI
> >>>>>> +  select BACKLIGHT_CLASS_DEVICE if ACPI
> >>>>>> +  select INPUT if ACPI
> >>>>>> +  select X86_PLATFORM_DEVICES if ACPI && X86
> >>>>>> +  select ACPI_WMI if ACPI && X86
> >>>>>
> >>>>> I think I'll defer to drm maintainers on whether this is okay or
> >>>>> something to be avoided.
> >>>>
> >>>> Uh yeah this is a bit much, and select just messes with everything. Just
> >>>> #ifdef this in the code with a dummy alternative; if users configure 
> >>>> their
> >>>> kernel without acpi but need it, they get to keep all the pieces.
> >>>>
> >>>> Alternatively make a DRM_ACPI_HELPERS symbol, but imo a Kconfig for every
> >>>> function is also not great. And just using #ifdef in the code also works
> >>>> for CONFIG_OF, which is exactly the same thing for platforms using dt to
> >>>> describe hw.
> >>>>
> >>>> Also I'd expect ACPI code to already provide dummy functions if ACPI is
> >>>> disabled, so you probably don't even need all that much #ifdef in the 
> >>>> code.
> >>>>
> >>>> What we defo can't do is select platform/hw stuff just because you enable
> >>>> CONFIG_DRM.
> >>>> -Sima
> >>>
> >>> The problem was with linking.  I'll experiment with #ifdef for the next
> >>> version.
> >> 
> >> Ah yes, if e.g. acpi is a module but drm is built-in then it will compile,
> >> but not link.
> >> 
> >> You need
> >> 
> >>depends on (ACPI || ACPI=n)
> >> 
> >> for this. Looks a bit funny but works for all combinations.
> >
> > Nope; this fails at link time with this combination:
> >
> > CONFIG_ACPI=y
> > CONFIG_ACPI_VIDEO=m
> > CONFIG_DRM=y
> >
> > ld: drivers/gpu/drm/drm_edid.o: in function `drm_do_probe_acpi_edid':
> > drm_edid.c:(.text+0xd34): undefined reference to `acpi_video_get_edid'
> > make[5]: *** [scripts/Makefile.vmlinux:37: vmlinux] Error 1
> >
> > So the logical solution is to try
> > depends on (ACPI_VIDEO || ACPI_VIDEO=n)
> >
> > But that leads me back to the rabbit hole of why I had the selects moved 
> > to drm instead of drivers in the first place:
> >
> > drivers/gpu/drm/Kconfig:8:error: recursive dependency detected!
> > drivers/gpu/drm/Kconfig:8:  symbol DRM depends on ACPI_VIDEO
> > drivers/acpi/Kconfig:213:   symbol ACPI_VIDEO depends on 
> > BACKLIGHT_CLASS_DEVICE
> > drivers/video/backlight/Kconfig:136:symbol BACKLIGHT_CLASS_DEVICE is 
> > selected by DRM_RADEON
> > drivers/gpu/drm/radeon/Kconfig:3:   symbol DRM_RADEON depends on DRM
> 
> Generally 

Re: [PATCH v2 6/6] drm: add drm_mode_atomic_commit event

2024-02-16 Thread Daniel Vetter
On Tue, Feb 13, 2024 at 11:20:17AM -0500, Steven Rostedt wrote:
> On Tue, 13 Feb 2024 16:50:31 +0100
> Pierre-Eric Pelloux-Prayer  wrote:
> 
> > @@ -1503,6 +1504,24 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
> > drm_mode_object_put(obj);
> > }
> >  
> > +   if (trace_drm_mode_atomic_commit_enabled()) {
> > +   struct drm_crtc_state *crtc_state;
> > +   struct drm_crtc *crtc;
> > +   int *crtcs;
> > +   int i, num_crtcs;
> > +
> > +   crtcs = kcalloc(dev->mode_config.num_crtc, sizeof(int),
> > +   GFP_KERNEL);
> 
> If the above allocation fails, this will cause a NULL kernel dereference.

Yeah, can't we somehow iterate directly into the trace subsystem? If
nothing else works, I guess a per-crtc event should do.

The more fundamental issue: I don't get how this works. For atomic we
have:
- explicitly handed in in-fences as dependencies with the IN_FENCE
  property
- dependencies that drivers fish out of the dma_resv object of the
  underlying gem buffer objects for each framebuffer. That has become
  pretty much entirely generic code since everyone uses the same helpers, and so
  imo the dependency tracking should be fully generic too

- atomic has an out-fence too, so we could even do the full fence->fence
  dependency tracking. It's just not created as a userspace object if all
  userspace asks for is a drm vblank event, but it is very much there. And
  I think if you want fence tracking, we really should have fence tracking
  :-) Also the out-fence should be 100% generic (or it's a driver bug)
  because the driver functions hide the differences between generating a
  vblank event and signalling a dma_fence.

Cheers, Sima


> 
> -- Steve
> 
> > +
> > +   num_crtcs = 0;
> > +   for_each_new_crtc_in_state(state, crtc, crtc_state, i)
> > +   crtcs[num_crtcs++] = drm_crtc_index(crtc);
> > +
> > +   trace_drm_mode_atomic_commit(file_priv, crtcs, num_crtcs, 
> > arg->flags);
> > +
> > +   kfree(crtcs);
> > +   }
> > +
> > ret = prepare_signaling(dev, state, arg, file_priv, _state,
> > _fences);
> > if (ret)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/6] tracing, dma-buf: add a trace_dma_fence_sync_to event

2024-02-16 Thread Daniel Vetter
On Tue, Feb 13, 2024 at 04:50:26PM +0100, Pierre-Eric Pelloux-Prayer wrote:
> This new event can be used to trace where a given dma_fence is added
> as a dependency of some other work.

How?

What I'd expected here is that you add a dependency chain from one fence
to another, but this only has one fence. How do you figure out what's the
next dma_fence that will stall on this dependency? Like in the gpu
scheduler we do know what will be the fence that userspace gets back, so
we can make that connection. And same for the atomic code (although you
don't wire that up at all).

I'm very confused on how this works and rather worried it's a brittle
amdgpu-only solution ...
-Sima

> I plan to use it in amdgpu.
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer 
> ---
>  drivers/dma-buf/dma-fence.c  |  1 +
>  include/trace/events/dma_fence.h | 34 
>  2 files changed, 35 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index e0fd99e61a2d..671a499a5ccd 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -23,6 +23,7 @@
>  EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
>  EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
>  EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
> +EXPORT_TRACEPOINT_SYMBOL(dma_fence_sync_to);
>  
>  static DEFINE_SPINLOCK(dma_fence_stub_lock);
>  static struct dma_fence dma_fence_stub;
> diff --git a/include/trace/events/dma_fence.h 
> b/include/trace/events/dma_fence.h
> index 3963e79ca7b4..9b3875f7aa79 100644
> --- a/include/trace/events/dma_fence.h
> +++ b/include/trace/events/dma_fence.h
> @@ -83,6 +83,40 @@ DEFINE_EVENT(dma_fence, dma_fence_wait_end,
>   TP_ARGS(fence)
>  );
>  
> +DECLARE_EVENT_CLASS(dma_fence_from,
> +
> + TP_PROTO(struct dma_fence *fence, const char *reason),
> +
> + TP_ARGS(fence, reason),
> +
> + TP_STRUCT__entry(
> + __string(driver, fence->ops->get_driver_name(fence))
> + __string(timeline, fence->ops->get_timeline_name(fence))
> + __field(unsigned int, context)
> + __field(unsigned int, seqno)
> + __string(reason, reason)
> + ),
> +
> + TP_fast_assign(
> + __assign_str(driver, fence->ops->get_driver_name(fence));
> + __assign_str(timeline, fence->ops->get_timeline_name(fence));
> + __entry->context = fence->context;
> + __entry->seqno = fence->seqno;
> + __assign_str(reason, reason);
> + ),
> +
> + TP_printk("driver=%s timeline=%s context=%u seqno=%u reason=%s",
> +   __get_str(driver), __get_str(timeline), __entry->context,
> +   __entry->seqno, __get_str(reason))
> +);
> +
> +DEFINE_EVENT(dma_fence_from, dma_fence_sync_to,
> +
> + TP_PROTO(struct dma_fence *fence, const char *reason),
> +
> + TP_ARGS(fence, reason)
> +);
> +
>  #endif /*  _TRACE_DMA_FENCE_H */
>  
>  /* This part must be outside protection */
> -- 
> 2.40.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 0/6] dma-fence, drm, amdgpu new trace events

2024-02-16 Thread Daniel Vetter
On Tue, Feb 13, 2024 at 04:50:25PM +0100, Pierre-Eric Pelloux-Prayer wrote:
> This series adds new events to make it easier for tools
> like gpuvis or umr to graph the GPUs, kernel and applications
> activity.
> 
> UMR patches using these events can be found here:
> https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
> 
> V1:
> https://patchwork.kernel.org/project/linux-media/patch/20240117184329.479554-1-pierre-eric.pelloux-pra...@amd.com/
> 
> Changes from V1:
> * uses trace_dma_fence_sync_to from dma-fence-chain.c
> * new amdgpu events
> * new drm plane commit event

I think a patch to add this to the drm/sched dependency tracking would be
really neat. With the addition of drm_sched_job_add_dependency() and
friends that would wire up some basic dependency tracking for a _lot_ of
drivers.

It should also be done before the amdgpu specific additions, because
amdgpu is also using that and we don't want to duplicate fence dependency
tracking in drivers that should be in common code.

Cheers, Sima
> 
> Pierre-Eric Pelloux-Prayer (6):
>   tracing, dma-buf: add a trace_dma_fence_sync_to event
>   dma-buf/fence-chain: use trace_dma_fence_sync_to
>   amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync
>   drm/amdgpu: add BO clear event
>   drm/amdgpu: add a amdgpu_cs_ioctl2 event
>   drm: add drm_mode_atomic_commit event
> 
>  drivers/dma-buf/dma-fence-chain.c |  4 +++
>  drivers/dma-buf/dma-fence.c   |  1 +
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  8 ++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 16 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c   |  8 ++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   |  4 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c  | 11 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h  |  3 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c  |  4 +--
>  drivers/gpu/drm/drm_atomic_uapi.c | 19 +++
>  drivers/gpu/drm/drm_trace.h   | 28 +--
>  include/trace/events/dma_fence.h  | 34 +++
>  14 files changed, 144 insertions(+), 26 deletions(-)
> 
> -- 
> 2.40.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Daniel Vetter
On Sat, Feb 10, 2024 at 12:06:58AM +0530, Arunpravin Paneer Selvam wrote:
> Hi Daniel,
> 
> On 2/9/2024 11:34 PM, Daniel Vetter wrote:
> > On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
> > > A few users have observed display corruption when booting into
> > > KDE Plasma or while playing games. We root-caused the problem:
> > > whenever alloc_range() couldn't find the required memory blocks,
> > > the function was returning success in some corner cases.
> > > 
> > > The right approach: if the total allocated size is less than
> > > the required size, the function should return -ENOSPC.
> > > 
> > > Cc:   # 6.7+
> > > Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
> > > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
> > > Tested-by: Mario Limonciello 
> > > Link: 
> > > https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
> > > Acked-by: Christian König 
> > > Reviewed-by: Matthew Auld 
> > > Signed-off-by: Arunpravin Paneer Selvam 
> > A new unit test for this would be most excellent - this kind of missed edge
> > case is exactly what kunit is for. Can you please follow up with one, since
> > we don't want to hold up the bugfix any longer?
> Matthew Auld has added a new unit test for this case. Please let us know if
> this will suffice.
> https://patchwork.freedesktop.org/patch/577497/?series=129671=1

Ah yeah, might be best to submit them both together as one series (you
just need to add your own signed-off-by if you resend other people's
patches). That way bots can pick them up together, since the new testcase
and the bugfix only make sense together.
-Sima

> 
> Thanks,
> Arun.
> > -Sima
> > 
> > > ---
> > >   drivers/gpu/drm/drm_buddy.c | 6 ++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> > > index f57e6d74fb0e..c1a99bf4dffd 100644
> > > --- a/drivers/gpu/drm/drm_buddy.c
> > > +++ b/drivers/gpu/drm/drm_buddy.c
> > > @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
> > >   } while (1);
> > >   list_splice_tail(, blocks);
> > > +
> > > + if (total_allocated < size) {
> > > + err = -ENOSPC;
> > > + goto err_free;
> > > + }
> > > +
> > >   return 0;
> > >   err_undo:
> > > -- 
> > > 2.25.1
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 1/3] drm: Add drm_get_acpi_edid() helper

2024-02-09 Thread Daniel Vetter
On Fri, Feb 09, 2024 at 09:34:13AM -0600, Mario Limonciello wrote:
> On 2/9/2024 05:07, Daniel Vetter wrote:
> > On Thu, Feb 08, 2024 at 11:57:11AM +0200, Jani Nikula wrote:
> > > On Wed, 07 Feb 2024, Mario Limonciello  wrote:
> > > > Some manufacturers have intentionally put an EDID that differs from
> > > > the EDID on the internal panel on laptops.  Drivers can call this
> > > > helper to attempt to fetch the EDID from the BIOS's ACPI _DDC method.
> > > > 
> > > > Signed-off-by: Mario Limonciello 
> > > > ---
> > > >   drivers/gpu/drm/Kconfig|  5 +++
> > > >   drivers/gpu/drm/drm_edid.c | 77 ++
> > > >   include/drm/drm_edid.h |  1 +
> > > >   3 files changed, 83 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > > > index 6ec33d36f3a4..ec2bb71e8b36 100644
> > > > --- a/drivers/gpu/drm/Kconfig
> > > > +++ b/drivers/gpu/drm/Kconfig
> > > > @@ -21,6 +21,11 @@ menuconfig DRM
> > > > select KCMP
> > > > select VIDEO_CMDLINE
> > > > select VIDEO_NOMODESET
> > > > +   select ACPI_VIDEO if ACPI
> > > > +   select BACKLIGHT_CLASS_DEVICE if ACPI
> > > > +   select INPUT if ACPI
> > > > +   select X86_PLATFORM_DEVICES if ACPI && X86
> > > > +   select ACPI_WMI if ACPI && X86
> > > 
> > > I think I'll defer to drm maintainers on whether this is okay or
> > > something to be avoided.
> > 
> > Uh yeah this is a bit much, and select just messes with everything. Just
> > #ifdef this in the code with a dummy alternative, if users configure their
> > kernel without acpi but need it, they get to keep all the pieces.
> > 
> > Alternatively make a DRM_ACPI_HELPERS symbol, but imo a Kconfig for every
> > function is also not great. And just using #ifdef in the code also works
> > for CONFIG_OF, which is exactly the same thing for platforms using dt to
> > describe hw.
> > 
> > Also I'd expect ACPI code to already provide dummy functions if ACPI is
> > provided, so you probably dont even need all that much #ifdef in the code.
> > 
> > What we defo cant do is select platform/hw stuff just because you enable
> > CONFIG_DRM.
> > -Sima
> 
> The problem was with linking.  I'll experiment with #ifdef for the next
> version.

Ah yes, if e.g. acpi is a module but drm is built-in then it will compile,
but not link.

You need

depends on (ACPI || ACPI=n)

for this. Looks a bit funny but works for all combinations.

Since this gets messy it might be useful to have a DRM_ACPI_HELPERS Kconfig
that controls all this.
-Sima

> 
> > 
> > > 
> > > 
> > > > help
> > > >   Kernel-level support for the Direct Rendering Infrastructure 
> > > > (DRI)
> > > >   introduced in XFree86 4.0. If you say Y here, you need to 
> > > > select
> > > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > > > index 923c4423151c..c649b4f9fd8e 100644
> > > > --- a/drivers/gpu/drm/drm_edid.c
> > > > +++ b/drivers/gpu/drm/drm_edid.c
> > > > @@ -28,6 +28,7 @@
> > > >* DEALINGS IN THE SOFTWARE.
> > > >*/
> > > > +#include 
> > > >   #include 
> > > >   #include 
> > > >   #include 
> > > > @@ -2188,6 +2189,49 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, 
> > > > unsigned int block, size_t len)
> > > > return ret == xfers ? 0 : -1;
> > > >   }
> > > > +/**
> > > > + * drm_do_probe_acpi_edid() - get EDID information via ACPI _DDC
> > > > + * @data: struct drm_device
> > > > + * @buf: EDID data buffer to be filled
> > > > + * @block: 128 byte EDID block to start fetching from
> > > > + * @len: EDID data buffer length to fetch
> > > > + *
> > > > + * Try to fetch EDID information by calling acpi_video_get_edid() 
> > > > function.
> > > > + *
> > > > + * Return: 0 on success or error code on failure.
> > > > + */
> > > > +static int
> > > > +drm_do_probe_acpi_edid(void *data, u8 *buf, unsigned int block, size_t 
> > > > len)
> > > > +{
> > > > +   struct drm_device *ddev = data;
> > > > +   struct acpi_device *acpidev = ACPI_COMPANI
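
The `depends on (ACPI || ACPI=n)` idiom Sima suggests above forbids exactly
the one broken combination (ACPI built as a module while the DRM user is
built in, which compiles but fails to link) while still allowing ACPI to be
absent. A hypothetical DRM_ACPI_HELPERS symbol using it might look like this
(a sketch only, not the actual Kconfig):

```kconfig
config DRM_ACPI_HELPERS
	bool "ACPI-based helpers for DRM drivers"
	depends on DRM
	# ACPI=y and ACPI=n are both fine; only ACPI=m with a built-in
	# user would fail at link time, and this line rules that out.
	depends on ACPI || ACPI=n
```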

Re: [PATCH] drm/buddy: Fix alloc_range() error handling code

2024-02-09 Thread Daniel Vetter
On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
> Few users have observed display corruption when they boot
> the machine to KDE Plasma or playing games. We have root
> caused the problem that whenever alloc_range() couldn't
> find the required memory blocks the function was returning
> SUCCESS in some of the corner cases.
> 
> The right approach would be if the total allocated size
> is less than the required size, the function should
> return -ENOSPC.
> 
> Cc:   # 6.7+
> Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
> Tested-by: Mario Limonciello 
> Link: 
> https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-arunpravin.paneersel...@amd.com/
> Acked-by: Christian König 
> Reviewed-by: Matthew Auld 
> Signed-off-by: Arunpravin Paneer Selvam 

A new unit test for this would be most excellent - this kind of missed edge
case is exactly what kunit is for. Can you please follow up with one, since
we don't want to hold up the bugfix any longer?
-Sima

> ---
>  drivers/gpu/drm/drm_buddy.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index f57e6d74fb0e..c1a99bf4dffd 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
>   } while (1);
>  
>   list_splice_tail(, blocks);
> +
> + if (total_allocated < size) {
> + err = -ENOSPC;
> + goto err_free;
> + }
> +
>   return 0;
>  
>  err_undo:
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 1/3] drm: Add drm_get_acpi_edid() helper

2024-02-09 Thread Daniel Vetter
ll the other struct
> drm_edid based EDID reading functions.
> 
> > + * @connector: connector we're probing
> > + *
> > + * Use the BIOS to attempt to grab EDID data if possible.
> > + *
> > + * The returned pointer must be freed using drm_edid_free().
> > + *
> > + * Return: Pointer to valid EDID or NULL if we couldn't find any.
> > + */
> > +const struct drm_edid *drm_get_acpi_edid(struct drm_connector *connector)
> > +{
> > +   const struct drm_edid *drm_edid;
> > +
> > +   switch (connector->connector_type) {
> > +   case DRM_MODE_CONNECTOR_LVDS:
> > +   case DRM_MODE_CONNECTOR_eDP:
> > +   break;
> > +   default:
> > +   return NULL;
> > +   }
> > +
> > +   if (connector->force == DRM_FORCE_OFF)
> > +   return NULL;
> > +
> > +   drm_edid = drm_edid_read_custom(connector, drm_do_probe_acpi_edid, 
> > connector->dev);
> > +
> > +   /* Note: Do *not* call connector updates here. */
> > +
> > +   return drm_edid;
> > +}
> > +EXPORT_SYMBOL(drm_get_acpi_edid);
> > +
> >  /**
> >   * drm_edid_read_custom - Read EDID data using given EDID block read 
> > function
> >   * @connector: Connector to use
> > diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
> > index 7923bc00dc7a..ca41be289fc6 100644
> > --- a/include/drm/drm_edid.h
> > +++ b/include/drm/drm_edid.h
> > @@ -410,6 +410,7 @@ struct edid *drm_do_get_edid(struct drm_connector 
> > *connector,
> > void *data);
> >  struct edid *drm_get_edid(struct drm_connector *connector,
> >   struct i2c_adapter *adapter);
> > +const struct drm_edid *drm_get_acpi_edid(struct drm_connector *connector);
> 
> There's a comment
> 
> /* Interface based on struct drm_edid */
> 
> towards the end of the file, gathering all the new API under it.
> 
> Other than that, LGTM,
> 
> BR,
> Jani.
> 
> >  u32 drm_edid_get_panel_id(struct i2c_adapter *adapter);
> >  struct edid *drm_get_edid_switcheroo(struct drm_connector *connector,
> >  struct i2c_adapter *adapter);
> 
> -- 
> Jani Nikula, Intel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 3/3] drm/amdgpu: wire up the can_remove() callback

2024-02-09 Thread Daniel Vetter
On Tue, Feb 06, 2024 at 07:42:49PM +0100, Christian König wrote:
> Am 06.02.24 um 15:29 schrieb Daniel Vetter:
> > On Fri, Feb 02, 2024 at 03:40:03PM -0800, Greg Kroah-Hartman wrote:
> > > On Fri, Feb 02, 2024 at 05:25:56PM -0500, Hamza Mahfooz wrote:
> > > > Removing an amdgpu device that still has user space references allocated
> > > > to it causes undefined behaviour.
> > > Then fix that please.  There should not be anything special about your
> > > hardware that all of the tens of thousands of other devices can't handle
> > > today.
> > > 
> > > What happens when I yank your device out of a system with a pci hotplug
> > > bus?  You can't prevent that either, so this should not be any different
> > > at all.
> > > 
> > > sorry, but please, just fix your driver.
> > fwiw Christian König from amd already rejected this too, I have no idea
> > why this was submitted
> 
> Well that was my fault.
> 
> I commented on an internal bug tracker that when sysfs bind/undbind is a
> different code path from PCI remove/re-scan we could try to reject it.
> 
> Turned out it isn't a different code path.

Yeah it's exactly the same code, and removing the sysfs stuff means we
can't test hotunplug without physically hotunplugging devices anymore. So
really not great - if one is buggy so is the other, and sysfs allows us to
control the timing a lot better to hit specific issues.
-Sima

> >   since the very elaborate plan I developed with a
> > bunch of amd folks was to fix the various lifetime lolz we still have in
> > drm. We unfortunately export the world of internal objects to userspace as
> > uabi objects with dma_buf, dma_fence and everything else, but it's all
> > fixable and we have the plan even documented:
> > 
> > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#device-hot-unplug
> > 
> > So yeah anything that isn't that plan of record is very much no-go for drm
> > drivers. Unless we change that plan of course, but that needs a
> > documentation patch first and a big discussion.
> > 
> > Aside from an absolute massive pile of kernel-internal refcounting bugs
> > the really big one we agreed on after a lot of discussion is that SIGBUS
> > on dma-buf mmaps is no-go for drm drivers, because it would break way too
> > much userspace in ways which are simply not fixable (since sig handlers
> > are shared in a process, which means the gl/vk driver cannot use it).
> > 
> > Otherwise it's bog standard "fix the kernel bugs" work, just a lot of it.
> 
> Ignoring a few memory leaks because of messed up refcounting we actually got
> that working quite nicely.
> 
> At least hot unplug / hot add seems to be working rather reliable in our
> internal testing.
> 
> So it can't be that messed up.
> 
> Regards,
> Christian.
> 
> > 
> > Cheers, Sima
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 3/3] drm/amdgpu: wire up the can_remove() callback

2024-02-06 Thread Daniel Vetter
On Fri, Feb 02, 2024 at 03:40:03PM -0800, Greg Kroah-Hartman wrote:
> On Fri, Feb 02, 2024 at 05:25:56PM -0500, Hamza Mahfooz wrote:
> > Removing an amdgpu device that still has user space references allocated
> > to it causes undefined behaviour.
> 
> Then fix that please.  There should not be anything special about your
> hardware that all of the tens of thousands of other devices can't handle
> today.
> 
> What happens when I yank your device out of a system with a pci hotplug
> bus?  You can't prevent that either, so this should not be any different
> at all.
> 
> sorry, but please, just fix your driver.

fwiw Christian König from amd already rejected this too, I have no idea
why this was submitted since the very elaborate plan I developed with a
bunch of amd folks was to fix the various lifetime lolz we still have in
drm. We unfortunately export the world of internal objects to userspace as
uabi objects with dma_buf, dma_fence and everything else, but it's all
fixable and we have the plan even documented:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#device-hot-unplug

So yeah anything that isn't that plan of record is very much no-go for drm
drivers. Unless we change that plan of course, but that needs a
documentation patch first and a big discussion.

Aside from an absolute massive pile of kernel-internal refcounting bugs
the really big one we agreed on after a lot of discussion is that SIGBUS
on dma-buf mmaps is no-go for drm drivers, because it would break way too
much userspace in ways which are simply not fixable (since sig handlers
are shared in a process, which means the gl/vk driver cannot use it).

Otherwise it's bog standard "fix the kernel bugs" work, just a lot of it.

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/1] drm/virtio: Implement device_attach

2024-01-30 Thread Daniel Vetter
On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
> On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
> > As vram objects don't have backing pages and thus can't implement
> > drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> > callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> > and implement virtgpu specific map/unmap/attach callbacks to support
> > both of shmem objects and vram objects.
> > 
> > Signed-off-by: Julia Zhang 
> > ---
> >  drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++---
> >  1 file changed, 36 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c 
> > b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > index 44425f20d91a..b490a5343b06 100644
> > --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> > +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment 
> > *attach,
> >  {
> > struct drm_gem_object *obj = attach->dmabuf->priv;
> > struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> > +   struct sg_table *sgt;
> > +   int ret;
> >  
> > if (virtio_gpu_is_vram(bo))
> > return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
> >  
> > -   return drm_gem_map_dma_buf(attach, dir);
> > +   sgt = drm_prime_pages_to_sg(obj->dev,
> > +   to_drm_gem_shmem_obj(obj)->pages,
> > +   obj->size >> PAGE_SHIFT);
> > +   if (IS_ERR(sgt))
> > +   return sgt;
> > +
> > +   ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +   if (ret) {
> > +   sg_free_table(sgt);
> > +   kfree(sgt);
> > +   return ERR_PTR(ret);
> > +   }
> > +
> > +   return sgt;
> >  }
> >  
> >  static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> > @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct 
> > dma_buf_attachment *attach,
> > struct drm_gem_object *obj = attach->dmabuf->priv;
> > struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> >  
> > +   if (!sgt)
> > +   return;
> > +
> > if (virtio_gpu_is_vram(bo)) {
> > virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> > -   return;
> > +   } else {
> > +   dma_unmap_sgtable(attach->dev, sgt, dir, 
> > DMA_ATTR_SKIP_CPU_SYNC);
> > +   sg_free_table(sgt);
> > +   kfree(sgt);
> > }
> > +}
> > +
> > +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> > +struct dma_buf_attachment *attach)
> > +{
> > +   struct drm_gem_object *obj = attach->dmabuf->priv;
> > +   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> > +   int ret = 0;
> > +
> > +   if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> > +   ret = obj->funcs->pin(obj);
> >  
> > -   drm_gem_unmap_dma_buf(attach, sgt, dir);
> > +   return ret;
> 
> This doesn't look like what I expected. There should be no need to
> change the map/unmap functions, especially not for the usual gem bo case.
> We should definitely keep using the exact same code for that. Instead all
> I expected is roughly
> 
> virtgpu_gem_device_attach()
> {
>   if (virtio_gpu_is_vram(bo)) {
>   if (can_access_virtio_vram_directly(attach->dev)
>   return 0;
>   else
>   return -EBUSY;
>   } else {
>   return drm_gem_map_attach();
>   }
> }
> 
> Note that I think can_access_virtio_vram_directly() needs to be
> implemented first. I'm not even sure it's possible, might be that all the
> importers need to set the attachment->peer2peer flag. Which is why this
> thing exists really. But that's a pile more work to do.
> 
> Frankly the more I look at the original patch that added vram export
> support the more this just looks like a "pls revert, this is just too
> broken".

The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
exported vram"). The commit message definitely needs to cite that one, and
also needs a cc: stable because not rejecting invalid imports is a pretty
big deal.

Also adding David.
-Sima

> 
> We should definitely not open-code any functions for the gem_bo export
> case, which your patch seems to do? Or maybe I'm just extremely confused.
> -Sima
> 
> >  
> >  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> > @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops 
> > virtgpu_dmabuf_ops =  {
> > .vmap = drm_gem_dmabuf_vmap,
> > .vunmap = drm_gem_dmabuf_vunmap,
> > },
> > -   .device_attach = drm_gem_map_attach,
> > +   .device_attach = virtgpu_gem_device_attach,
> > .get_uuid = virtgpu_virtio_get_uuid,
> >  };
> >  
> > -- 
> > 2.34.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/1] drm/virtio: Implement device_attach

2024-01-30 Thread Daniel Vetter
On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
> As vram objects don't have backing pages and thus can't implement
> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> and implement virtgpu specific map/unmap/attach callbacks to support
> both of shmem objects and vram objects.
> 
> Signed-off-by: Julia Zhang 
> ---
>  drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++---
>  1 file changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c 
> b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..b490a5343b06 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>  {
>   struct drm_gem_object *obj = attach->dmabuf->priv;
>   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> + struct sg_table *sgt;
> + int ret;
>  
>   if (virtio_gpu_is_vram(bo))
>   return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>  
> - return drm_gem_map_dma_buf(attach, dir);
> + sgt = drm_prime_pages_to_sg(obj->dev,
> + to_drm_gem_shmem_obj(obj)->pages,
> + obj->size >> PAGE_SHIFT);
> + if (IS_ERR(sgt))
> + return sgt;
> +
> + ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> + if (ret) {
> + sg_free_table(sgt);
> + kfree(sgt);
> + return ERR_PTR(ret);
> + }
> +
> + return sgt;
>  }
>  
>  static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct 
> dma_buf_attachment *attach,
>   struct drm_gem_object *obj = attach->dmabuf->priv;
>   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>  
> + if (!sgt)
> + return;
> +
>   if (virtio_gpu_is_vram(bo)) {
>   virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> - return;
> + } else {
> + dma_unmap_sgtable(attach->dev, sgt, dir, 
> DMA_ATTR_SKIP_CPU_SYNC);
> + sg_free_table(sgt);
> + kfree(sgt);
>   }
> +}
> +
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +  struct dma_buf_attachment *attach)
> +{
> + struct drm_gem_object *obj = attach->dmabuf->priv;
> + struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> + int ret = 0;
> +
> + if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> + ret = obj->funcs->pin(obj);
>  
> - drm_gem_unmap_dma_buf(attach, sgt, dir);
> + return ret;

This doesn't look like what I expected. There should be no need to
change the map/unmap functions, especially not for the usual gem bo case.
We should definitely keep using the exact same code for that. Instead all
I expected is roughly

virtgpu_gem_device_attach()
{
if (virtio_gpu_is_vram(bo)) {
if (can_access_virtio_vram_directly(attach->dev)
return 0;
else
return -EBUSY;
} else {
return drm_gem_map_attach();
}
}

Note that I think can_access_virtio_vram_directly() needs to be
implemented first. I'm not even sure it's possible, might be that all the
importers need to set the attachment->peer2peer flag. Which is why this
thing exists really. But that's a pile more work to do.

Frankly the more I look at the original patch that added vram export
support the more this just looks like a "pls revert, this is just too
broken".

We should definitely not open-code any functions for the gem_bo export
case, which your patch seems to do? Or maybe I'm just extremely confused.
-Sima

>  
>  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops 
> =  {
>   .vmap = drm_gem_dmabuf_vmap,
>   .vunmap = drm_gem_dmabuf_vunmap,
>   },
> - .device_attach = drm_gem_map_attach,
> + .device_attach = virtgpu_gem_device_attach,
>   .get_uuid = virtgpu_virtio_get_uuid,
>  };
>  
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch



Re: [PATCH v3 3/3] drm/amdgpu: Implement check_async_props for planes

2024-01-30 Thread Daniel Vetter
On Sun, Jan 28, 2024 at 06:25:15PM -0300, André Almeida wrote:
> AMD GPUs can do async flips with changes on more properties than just
> the FB ID, so implement a custom check_async_props for AMD planes.
> 
> Allow amdgpu to do async flips with overlay planes as well.
> 
> Signed-off-by: André Almeida 
> ---
> v3: allow overlay planes

This comment is very much written with a lack of clearly better ideas, but:

Do we really need this much flexibility, especially for the first driver
adding the first few additional properties?

A simple bool on struct drm_plane to indicate whether async flips are ok
or not should also do this job here? Maybe a bit of work to roll that out
to the primary planes for current drivers, but not much. And wouldn't need
drivers to implement some very uapi-marshalling atomic code ...

Also we could probably remove the current drm_mode_config.async_flip flag
and entirely replace it with the per-plane one.
-Sima
> 
>  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 29 +++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> index 116121e647ca..ed75b69636b4 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
> @@ -25,6 +25,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1430,6 +1431,33 @@ static void 
> amdgpu_dm_plane_drm_plane_destroy_state(struct drm_plane *plane,
>   drm_atomic_helper_plane_destroy_state(plane, state);
>  }
>  
> +static int amdgpu_dm_plane_check_async_props(struct drm_property *prop,
> +   struct drm_plane *plane,
> +   struct drm_plane_state *plane_state,
> +   struct drm_mode_object *obj,
> +   u64 prop_value, u64 old_val)
> +{
> + struct drm_mode_config *config = >dev->mode_config;
> + int ret;
> +
> + if (prop != config->prop_fb_id &&
> + prop != config->prop_in_fence_fd) {
> + ret = drm_atomic_plane_get_property(plane, plane_state,
> + prop, _val);
> + return drm_atomic_check_prop_changes(ret, old_val, prop_value, 
> prop);
> + }
> +
> + if (plane_state->plane->type != DRM_PLANE_TYPE_PRIMARY &&
> + plane_state->plane->type != DRM_PLANE_TYPE_OVERLAY) {
> + drm_dbg_atomic(prop->dev,
> +"[OBJECT:%d] Only primary or overlay planes can 
> be changed during async flip\n",
> +obj->id);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
>  static const struct drm_plane_funcs dm_plane_funcs = {
>   .update_plane   = drm_atomic_helper_update_plane,
>   .disable_plane  = drm_atomic_helper_disable_plane,
> @@ -1438,6 +1466,7 @@ static const struct drm_plane_funcs dm_plane_funcs = {
>   .atomic_duplicate_state = amdgpu_dm_plane_drm_plane_duplicate_state,
>   .atomic_destroy_state = amdgpu_dm_plane_drm_plane_destroy_state,
>   .format_mod_supported = amdgpu_dm_plane_format_mod_supported,
> + .check_async_props = amdgpu_dm_plane_check_async_props,
>  };
>  
>  int amdgpu_dm_plane_init(struct amdgpu_display_manager *dm,
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 3/7] drm/amd/display: Add handling for new "active color format" property

2024-01-10 Thread Daniel Vetter
On Wed, 10 Jan 2024 at 13:53, Andri Yngvason  wrote:
>
> mið., 10. jan. 2024 kl. 11:10 skrifaði Daniel Vetter :
> >
> > On Tue, Jan 09, 2024 at 06:11:00PM +, Andri Yngvason wrote:
> > > + /* Extract information from crtc to communicate it to userspace as 
> > > connector properties */
> > > + for_each_new_connector_in_state(state, connector, new_con_state, i) 
> > > {
> > > + struct drm_crtc *crtc = new_con_state->crtc;
> > > + struct dc_stream_state *stream;
> > > +
> > > + if (crtc) {
> > > + new_crtc_state = 
> > > drm_atomic_get_new_crtc_state(state, crtc);
> > > + dm_new_crtc_state = 
> > > to_dm_crtc_state(new_crtc_state);
> > > + stream = dm_new_crtc_state->stream;
> > > +
> > > + if (stream) {
> > > + 
> > > drm_connector_set_active_color_format_property(connector,
> > > + 
> > > convert_dc_pixel_encoding_into_drm_color_format(
> > > + 
> > > dm_new_crtc_state->stream->timing.pixel_encoding));
> > > + }
> > > + } else {
> > > + 
> > > drm_connector_set_active_color_format_property(connector, 0);
> >
> > Just realized an even bigger reason why your current design doesn't work:
> > You don't have locking here.
> >
> > And you cannot grab the required lock, which is
> > drm_dev->mode_config.mutex, because that would result in deadlocks. So
> > this really needs to use the atomic state based design I've described.
> >
>
> Maybe we should just drop "actual color format" and instead fail the
> modeset if the "preferred color format" property cannot be satisfied?
> It seems like the simplest thing to do here, though it is perhaps less
> convenient for userspace. In that case, the "preferred color format"
> property should just be called "color format".

Yeah that's more in line with how other atomic properties work. This
way userspace can figure out what works with a TEST_ONLY commit too.
And for this to work you probably want to have an "automatic" setting
too.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 3/7] drm/amd/display: Add handling for new "active color format" property

2024-01-10 Thread Daniel Vetter
.c
> @@ -600,6 +600,10 @@ dm_dp_add_mst_connector(struct drm_dp_mst_topology_mgr 
> *mgr,
>   if (connector->max_bpc_property)
>   drm_connector_attach_max_bpc_property(connector, 8, 16);
>  
> + connector->active_color_format_property = 
> master->base.active_color_format_property;
> + if (connector->active_color_format_property)
> + 
> drm_connector_attach_active_color_format_property(>base);
> +
>   connector->vrr_capable_property = master->base.vrr_capable_property;
>   if (connector->vrr_capable_property)
>   drm_connector_attach_vrr_capable_property(connector);
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/1] drm/virtio: Implement device_attach

2024-01-10 Thread Daniel Vetter
On Wed, Jan 10, 2024 at 11:46:35AM +0100, Christian König wrote:
> Am 10.01.24 um 11:22 schrieb Daniel Vetter:
> > On Wed, Jan 10, 2024 at 11:19:37AM +0100, Christian König wrote:
> > > Am 10.01.24 um 10:56 schrieb Julia Zhang:
> > > > drm_gem_map_attach() requires drm_gem_object_funcs.get_sg_table to be
> > > > implemented, or else return ENOSYS. Virtio has no get_sg_table
> > > > implemented for vram object. To fix this, add a new device_attach to
> > > > call drm_gem_map_attach() for shmem object and return 0 for vram object
> > > > instead of calling drm_gem_map_attach for both of these two kinds of
> > > > object.
> > > Well as far as I can see this is nonsense from the DMA-buf side of things.
> > > 
> > > SG tables are always needed as long as you don't re-import the same object
> > > into your driver and then you shouldn't end up in this function in the 
> > > first
> > > place.
> > > 
> > > So that drm_gem_map_attach() requires get_sg_table to be implemented is
> > > intentional and should never be overridden like this.
> > See my reply, tldr; you're allowed to reject ->attach with -EBUSY to
> > handle exactly this case of non-shareable buffer types. But definitely
> > don't silently fail, that's a "we'll oops on map_attachment" kind of bug
> > :-)
> 
> Ah, yes that makes much more sense!
> 
> So basically just the "return 0;" needs to be "return -EBUSY;".

Well plus 2nd patch to polish the virtio_dma_buf docs a bit, that would be
nice :-D
-Sima

> 
> Regards,
> Christian.
> 
> > -Sima
> > 
> > > Regards,
> > > Christian.
> > > 
> > > > Signed-off-by: Julia Zhang 
> > > > ---
> > > >drivers/gpu/drm/virtio/virtgpu_prime.c | 14 +-
> > > >1 file changed, 13 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c 
> > > > b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > > > index 44425f20d91a..f0b0ff6f3813 100644
> > > > --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> > > > +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > > > @@ -71,6 +71,18 @@ static void virtgpu_gem_unmap_dma_buf(struct 
> > > > dma_buf_attachment *attach,
> > > > drm_gem_unmap_dma_buf(attach, sgt, dir);
> > > >}
> > > > +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> > > > +struct dma_buf_attachment *attach)
> > > > +{
> > > > +   struct drm_gem_object *obj = attach->dmabuf->priv;
> > > > +   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> > > > +
> > > > +   if (virtio_gpu_is_vram(bo))
> > > > +   return 0;
> > > > +
> > > > +   return drm_gem_map_attach(dma_buf, attach);
> > > > +}
> > > > +
> > > >static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> > > > .ops = {
> > > > .cache_sgt_mapping = true,
> > > > @@ -83,7 +95,7 @@ static const struct virtio_dma_buf_ops 
> > > > virtgpu_dmabuf_ops =  {
> > > > .vmap = drm_gem_dmabuf_vmap,
> > > > .vunmap = drm_gem_dmabuf_vunmap,
> > > > },
> > > > -   .device_attach = drm_gem_map_attach,
> > > > +   .device_attach = virtgpu_gem_device_attach,
> > > > .get_uuid = virtgpu_virtio_get_uuid,
> > > >};
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/1] drm/virtio: Implement RESOURCE_GET_LAYOUT ioctl

2024-01-10 Thread Daniel Vetter
L_RINGS_MASK is in
>   * effect.  The event size is sizeof(drm_event), since there is no additional
> @@ -261,6 +278,10 @@ struct drm_virtgpu_context_init {
>   DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_CONTEXT_INIT,   \
>   struct drm_virtgpu_context_init)
>  
> +#define DRM_IOCTL_VIRTGPU_RESOURCE_QUERY_LAYOUT  
> \
> + DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_RESOURCE_QUERY_LAYOUT,  \
> + struct drm_virtgpu_resource_query_layout)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h
> index f556fde07b76..547575232376 100644
> --- a/include/uapi/linux/virtio_gpu.h
> +++ b/include/uapi/linux/virtio_gpu.h
> @@ -65,6 +65,11 @@
>   */
>  #define VIRTIO_GPU_F_CONTEXT_INIT4
>  
> +/*
> + * VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT
> + */
> +#define VIRTIO_GPU_F_RESOURCE_QUERY_LAYOUT 5
> +
>  enum virtio_gpu_ctrl_type {
>   VIRTIO_GPU_UNDEFINED = 0,
>  
> @@ -95,6 +100,7 @@ enum virtio_gpu_ctrl_type {
>   VIRTIO_GPU_CMD_SUBMIT_3D,
>   VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB,
>   VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB,
> + VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT,
>  
>   /* cursor commands */
>   VIRTIO_GPU_CMD_UPDATE_CURSOR = 0x0300,
> @@ -108,6 +114,7 @@ enum virtio_gpu_ctrl_type {
>   VIRTIO_GPU_RESP_OK_EDID,
>   VIRTIO_GPU_RESP_OK_RESOURCE_UUID,
>   VIRTIO_GPU_RESP_OK_MAP_INFO,
> + VIRTIO_GPU_RESP_OK_RESOURCE_LAYOUT,
>  
>   /* error responses */
>   VIRTIO_GPU_RESP_ERR_UNSPEC = 0x1200,
> @@ -453,4 +460,27 @@ struct virtio_gpu_resource_unmap_blob {
>   __le32 padding;
>  };
>  
> +/* VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT */
> +struct virtio_gpu_resource_query_layout {
> + struct virtio_gpu_ctrl_hdr hdr;
> + __le32 resource_id;
> + __le32 width;
> + __le32 height;
> + __le32 format;
> + __le32 bind;
> +};
> +
> +
> +/* VIRTIO_GPU_RESP_OK_RESOURCE_LAYOUT */
> +#define VIRTIO_GPU_RES_MAX_PLANES 4
> +struct virtio_gpu_resp_resource_layout {
> + struct virtio_gpu_ctrl_hdr hdr;
> + __le64 modifier;
> + __le32 num_planes;
> + struct virtio_gpu_resource_plane {
> + __le64 offset;
> + __le32 stride;
> + } planes[VIRTIO_GPU_RES_MAX_PLANES];
> +};
> +
>  #endif
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/7] drm/uAPI: Add "active color format" drm property as feedback for userspace

2024-01-10 Thread Daniel Vetter
On Tue, Jan 09, 2024 at 11:12:11PM +, Andri Yngvason wrote:
> Hi Daniel,
> 
> þri., 9. jan. 2024 kl. 22:32 skrifaði Daniel Stone :
> 
> > On Tue, 9 Jan 2024 at 18:12, Andri Yngvason  wrote:
> > > + * active color format:
> > > + * This read-only property tells userspace the color format
> > actually used
> > > + * by the hardware display engine "on the cable" on a connector.
> > The chosen
> > > + * value depends on hardware capabilities, both display engine and
> > > + * connected monitor. Drivers shall use
> > > + * drm_connector_attach_active_color_format_property() to install
> > this
> > > + * property. Possible values are "not applicable", "rgb",
> > "ycbcr444",
> > > + * "ycbcr422", and "ycbcr420".
> >
> > How does userspace determine what's happened without polling? Will it
> > only change after an `ALLOW_MODESET` commit, and be guaranteed to be
> > updated after the commit has completed and the event being sent?
> > Should it send a HOTPLUG event? Other?
> >
> 
> Userspace does not determine what's happened without polling. The purpose
> of this property is not for programmatic verification that the preferred
> property was applied. It is my understanding that it's mostly intended for
> debugging purposes. It should only change as a consequence of modesetting,
> although I didn't actually look into what happens if you set the "preferred
> color format" outside of a modeset.

This feels a bit irksome to me, since we don't have any synchronization and
it kinda breaks how userspace gets to know about stuff.

For context the current immutable properties are all stuff that's derived
from the sink (like edid, or things like that). Userspace is guaranteed to
get a hotplug event (minus driver bugs as usual) if any of these change,
and we've added infrastructure so that the hotplug event even contains the
specific property so that userspace can avoid re-reading them all (which
can cause some costly re-probing).

As an example you can look at drm_connector_set_link_status_property,
which drivers follow by a call to drm_kms_helper_connector_hotplug_event
to make sure userspace knows about what's up. Could be optimized I think.

This thing here works entirely differently, and I think we need somewhat
new semantics for this:

- I agree it should be read-only for userspace, so immutable sounds right.

- But I also agree with Daniel Stone that this should be tied more
  directly to the modeset state.

So I think the better approach would be to put the output type into
drm_connector_state, require that drivers compute it in their
->atomic_check code (which in the future would allow us to report it out
for TEST_ONLY commits too), and so guarantee that the value is updated
right after the kms ioctl returns (and not sometime later for non-blocking
commits).

You probably need a bit of work to be able to handle immutable properties
with the atomic state infrastructure, but I think otherwise this should
fit all rather neatly.

Cheers, Sima
> 
> The way I've implemented things in sway, calling the
> "preferred_signal_format" command triggers a modeset with the "preferred
> color format" set and calling "get_outputs", immediately queries the
> "actual color format" and displays it.
> 
> Regards,
> Andri

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/1] drm/virtio: Implement device_attach

2024-01-10 Thread Daniel Vetter
On Wed, Jan 10, 2024 at 11:19:37AM +0100, Christian König wrote:
> Am 10.01.24 um 10:56 schrieb Julia Zhang:
> > drm_gem_map_attach() requires drm_gem_object_funcs.get_sg_table to be
> > implemented, or else return ENOSYS. Virtio has no get_sg_table
> > implemented for vram object. To fix this, add a new device_attach to
> > call drm_gem_map_attach() for shmem object and return 0 for vram object
> > instead of calling drm_gem_map_attach for both of these two kinds of
> > object.
> 
> Well as far as I can see this is nonsense from the DMA-buf side of things.
> 
> SG tables are always needed as long as you don't re-import the same object
> into your driver and then you shouldn't end up in this function in the first
> place.
> 
> So that drm_gem_map_attach() requires get_sg_table to be implemented is
> intentional and should never be overridden like this.

See my reply, tldr; you're allowed to reject ->attach with -EBUSY to
handle exactly this case of non-shareable buffer types. But definitely
don't silently fail, that's a "we'll oops on map_attachment" kind of bug
:-)
-Sima

> 
> Regards,
> Christian.
> 
> > 
> > Signed-off-by: Julia Zhang 
> > ---
> >   drivers/gpu/drm/virtio/virtgpu_prime.c | 14 +-
> >   1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c 
> > b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > index 44425f20d91a..f0b0ff6f3813 100644
> > --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> > +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > @@ -71,6 +71,18 @@ static void virtgpu_gem_unmap_dma_buf(struct 
> > dma_buf_attachment *attach,
> > drm_gem_unmap_dma_buf(attach, sgt, dir);
> >   }
> > +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> > +struct dma_buf_attachment *attach)
> > +{
> > +   struct drm_gem_object *obj = attach->dmabuf->priv;
> > +   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> > +
> > +   if (virtio_gpu_is_vram(bo))
> > +   return 0;
> > +
> > +   return drm_gem_map_attach(dma_buf, attach);
> > +}
> > +
> >   static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> > .ops = {
> > .cache_sgt_mapping = true,
> > @@ -83,7 +95,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops 
> > =  {
> > .vmap = drm_gem_dmabuf_vmap,
> > .vunmap = drm_gem_dmabuf_vunmap,
> > },
> > -   .device_attach = drm_gem_map_attach,
> > +   .device_attach = virtgpu_gem_device_attach,
> > .get_uuid = virtgpu_virtio_get_uuid,
> >   };
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/1] drm/virtio: Implement device_attach

2024-01-10 Thread Daniel Vetter
On Wed, Jan 10, 2024 at 05:56:28PM +0800, Julia Zhang wrote:
> drm_gem_map_attach() requires drm_gem_object_funcs.get_sg_table to be
> implemented, or else return ENOSYS. Virtio has no get_sg_table
> implemented for vram object. To fix this, add a new device_attach to
> call drm_gem_map_attach() for shmem object and return 0 for vram object
> instead of calling drm_gem_map_attach for both of these two kinds of
> object.
> 
> Signed-off-by: Julia Zhang 
> ---
>  drivers/gpu/drm/virtio/virtgpu_prime.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c 
> b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..f0b0ff6f3813 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -71,6 +71,18 @@ static void virtgpu_gem_unmap_dma_buf(struct 
> dma_buf_attachment *attach,
>   drm_gem_unmap_dma_buf(attach, sgt, dir);
>  }
>  
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +  struct dma_buf_attachment *attach)
> +{
> + struct drm_gem_object *obj = attach->dmabuf->priv;
> + struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +
> + if (virtio_gpu_is_vram(bo))
> + return 0;

You need to reject attach here because these vram buffer objects cannot be
used by any other driver. In that case dma_buf_attach _must_ fail, not
silently succeed.

Because if it silently succeeds then the subsequent dma_buf_map_attachment
will blow up because you don't have the ->get_sg_table hook implemented.

Per the documentation the error code for this case must be -EBUSY, see the
section for the attach hook here:

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#c.dma_buf_ops

Since you're looking into this area, please make sure there's not other
similar mistake in virtio code.

Also can you please make a kerneldoc patch for struct virtio_dma_buf_ops
to improve the documentation there? I think it would be good to move those
to the inline style and then at least put a kernel-doc hyperlink to struct
dma_buf_ops.attach and mention that attach must fail for non-shareable
buffers.

In general the virtio_dma_buf kerneldoc seems to be on the "too minimal,
explains nothing" side of things :-/

Cheers, Sima

> +
> + return drm_gem_map_attach(dma_buf, attach);
> +}
> +
>  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>   .ops = {
>   .cache_sgt_mapping = true,
> @@ -83,7 +95,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops = 
>  {
>   .vmap = drm_gem_dmabuf_vmap,
>   .vunmap = drm_gem_dmabuf_vunmap,
>   },
> - .device_attach = drm_gem_map_attach,
> + .device_attach = virtgpu_gem_device_attach,
>   .get_uuid = virtgpu_virtio_get_uuid,
>  };
>  
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/2] drm/amdgpu: add shared fdinfo stats

2024-01-09 Thread Daniel Vetter
On Tue, 9 Jan 2024 at 14:25, Tvrtko Ursulin
 wrote:
>
>
> On 09/01/2024 12:54, Daniel Vetter wrote:
> > On Tue, Jan 09, 2024 at 09:30:15AM +, Tvrtko Ursulin wrote:
> >>
> >> On 09/01/2024 07:56, Christian König wrote:
> >>> Am 07.12.23 um 19:02 schrieb Alex Deucher:
> >>>> Add shared stats.  Useful for seeing shared memory.
> >>>>
> >>>> v2: take dma-buf into account as well
> >>>>
> >>>> Signed-off-by: Alex Deucher 
> >>>> Cc: Rob Clark 
> >>>> ---
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  4 
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 +++
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  6 ++
> >>>>3 files changed, 21 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> index 5706b282a0c7..c7df7fa3459f 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> >>>> @@ -97,6 +97,10 @@ void amdgpu_show_fdinfo(struct drm_printer *p,
> >>>> struct drm_file *file)
> >>>>   stats.requested_visible_vram/1024UL);
> >>>>drm_printf(p, "amd-requested-gtt:\t%llu KiB\n",
> >>>>   stats.requested_gtt/1024UL);
> >>>> +drm_printf(p, "drm-shared-vram:\t%llu KiB\n",
> >>>> stats.vram_shared/1024UL);
> >>>> +drm_printf(p, "drm-shared-gtt:\t%llu KiB\n",
> >>>> stats.gtt_shared/1024UL);
> >>>> +drm_printf(p, "drm-shared-cpu:\t%llu KiB\n",
> >>>> stats.cpu_shared/1024UL);
> >>>> +
> >>>>for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
> >>>>if (!usage[hw_ip])
> >>>>continue;
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> index d79b4ca1ecfc..1b37d95475b8 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>> @@ -1287,25 +1287,36 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
> >>>>  struct amdgpu_mem_stats *stats)
> >>>>{
> >>>>uint64_t size = amdgpu_bo_size(bo);
> >>>> +struct drm_gem_object *obj;
> >>>>unsigned int domain;
> >>>> +bool shared;
> >>>>/* Abort if the BO doesn't currently have a backing store */
> >>>>if (!bo->tbo.resource)
> >>>>return;
> >>>> +obj = &bo->tbo.base;
> >>>> +shared = (obj->handle_count > 1) || obj->dma_buf;
> >>>
> >>> I still think that looking at handle_count is the completely wrong
> >>> approach, we should really only look at obj->dma_buf.
> >>
> >> Yeah it is all a bit tricky with the handle table walk. I don't think it is
> >> even possible to claim it is shared with obj->dma_buf could be the same
> >> process creating say via udmabuf and importing into drm. It is a wild
> >> scenario yes, but it could be private memory in that case. Not sure where 
> >> it
> >> would leave us if we said this is just a limitation of a BO based tracking.
> >>
> >> Would adding a new category "imported" help?
> >>
> >> Hmm or we simply change drm-usage-stats.rst:
> >>
> >> """
> >> - drm-shared-:  [KiB|MiB]
> >>
> >> The total size of buffers that are shared with another file (ie. have more
> >> than than a single handle).
> >> """
> >>
> >> Changing ie into eg coule be get our of jail free card to allow the
> >> "(obj->handle_count > 1) || obj->dma_buf;" condition?
> >>
> >> Because of the shared with another _file_ wording would cover my wild
> >> udmabuf self-import case. Unless there are more such creative private 
> >> import
> >> options.
> >
> > Yeah I think clarifying that we can only track sharing with other fd and
> > have no idea whether this means sharing with another process or not is
> > probably si

Re: [PATCH 2/2] drm/amdgpu: add shared fdinfo stats

2024-01-09 Thread Daniel Vetter
visible_vram(bo))
> > >   stats->visible_vram += size;
> > > +    if (shared)
> > > +    stats->vram_shared += size;
> > >   break;
> > >   case AMDGPU_GEM_DOMAIN_GTT:
> > >   stats->gtt += size;
> > > +    if (shared)
> > > +    stats->gtt_shared += size;
> > >   break;
> > >   case AMDGPU_GEM_DOMAIN_CPU:
> > >   default:
> > >   stats->cpu += size;
> > > +    if (shared)
> > > +    stats->cpu_shared += size;
> > >   break;
> > >   }
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > index d28e21baef16..0503af75dc26 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > > @@ -138,12 +138,18 @@ struct amdgpu_bo_vm {
> > >   struct amdgpu_mem_stats {
> > >   /* current VRAM usage, includes visible VRAM */
> > >   uint64_t vram;
> > > +    /* current shared VRAM usage, includes visible VRAM */
> > > +    uint64_t vram_shared;
> > >   /* current visible VRAM usage */
> > >   uint64_t visible_vram;
> > >   /* current GTT usage */
> > >   uint64_t gtt;
> > > +    /* current shared GTT usage */
> > > +    uint64_t gtt_shared;
> > >   /* current system memory usage */
> > >   uint64_t cpu;
> > > +    /* current shared system memory usage */
> > > +    uint64_t cpu_shared;
> > >   /* sum of evicted buffers, includes visible VRAM */
> > >   uint64_t evicted_vram;
> > >   /* sum of evicted buffers due to CPU access */
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v5 00/32] drm/amd/display: add AMD driver-specific properties for color mgmt

2023-11-30 Thread Daniel Vetter
On Tue, 28 Nov 2023 at 23:11, Harry Wentland  wrote:
>
> On 2023-11-16 14:57, Melissa Wen wrote:
> > Hello,
> >
> > This series extends the current KMS color management API with AMD
> > driver-specific properties to enhance the color management support on
> > AMD Steam Deck. The key additions to the color pipeline include:
> >
>
> snip
>
> > Melissa Wen (18):
> >   drm/drm_mode_object: increase max objects to accommodate new color
> > props
> >   drm/drm_property: make replace_property_blob_from_id a DRM helper
> >   drm/drm_plane: track color mgmt changes per plane
>
> If all patches are merged through amd-staging-drm-next I worry that
> conflicts creep in if any code around replace_property_blob_from_id
> changes in DRM.
>
> My plan is to merge DRM patches through drm-misc-next, as well
> as include them in the amd-staging-drm-next merge. They should then
> fall out at the next amd-staging-drm-next pull and (hopefully)
> ensure that there is no conflict.
>
> If no objections I'll go ahead with that later this week.

Double-merging tends to be the worst because git doesn't realize the
commits match, which actually makes the conflicts worse when they
happen (because the 3-way merge diff gets absolutely confused by all the
changed context and misplaces everything to the max). So please don't;
_only_ ever cherry-pick when a patch in -next is also needed in
-fixes, and we didn't put it into the right tree. But even that is a
bit tricky and should only be done by maintainers (using dim
cherry-pick if it's drm-misc) because the conflicts tend to be bad and
need to be sorted out with backmerges sooner than later.

For this case merge everything through one tree with the right acks,
pull to drm-next asap and then backmerge into the other tree. Here
probably amdgpu-next with drm-misc maintainer acks for the 3 core
patches. Or if amdgpu-next pull won't come for a while, put them into
drm-misc-next and just wait a week until it's in drm-next, then
forward amdgpu-next.

Cheers, Sima

> Harry
>
> >   drm/amd/display: add driver-specific property for plane degamma LUT
> >   drm/amd/display: explicitly define EOTF and inverse EOTF
> >   drm/amd/display: document AMDGPU pre-defined transfer functions
> >   drm/amd/display: add plane 3D LUT driver-specific properties
> >   drm/amd/display: add plane shaper LUT and TF driver-specific
> > properties
> >   drm/amd/display: add CRTC gamma TF driver-specific property
> >   drm/amd/display: add comments to describe DM crtc color mgmt behavior
> >   drm/amd/display: encapsulate atomic regamma operation
> >   drm/amd/display: decouple steps for mapping CRTC degamma to DC plane
> >   drm/amd/display: reject atomic commit if setting both plane and CRTC
> > degamma
> >   drm/amd/display: add plane shaper LUT support
> >   drm/amd/display: add plane shaper TF support
> >   drm/amd/display: add plane 3D LUT support
> >   drm/amd/display: add plane CTM driver-specific property
> >   drm/amd/display: add plane CTM support
> >
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  91 ++
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  34 +-
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 108 +++
> >  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 818 --
> >  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c|  72 ++
> >  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 232 -
> >  .../gpu/drm/amd/display/include/fixed31_32.h  |  12 +
> >  drivers/gpu/drm/arm/malidp_crtc.c |   2 +-
> >  drivers/gpu/drm/drm_atomic.c  |   1 +
> >  drivers/gpu/drm/drm_atomic_state_helper.c |   1 +
> >  drivers/gpu/drm/drm_atomic_uapi.c |  43 +-
> >  drivers/gpu/drm/drm_property.c|  49 ++
> >  include/drm/drm_mode_object.h |   2 +-
> >  include/drm/drm_plane.h   |   7 +
> >  include/drm/drm_property.h|   6 +
> >  include/uapi/drm/drm_mode.h   |   8 +
> >  16 files changed, 1377 insertions(+), 109 deletions(-)
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.7

2023-11-17 Thread Daniel Vetter
On Fri, Nov 17, 2023 at 01:34:41AM -0500, Alex Deucher wrote:
> Hi Dave, Sima,
> 
> Fixes for 6.7.
> 
> The following changes since commit b85ea95d086471afb4ad062012a4d73cd328fa86:
> 
>   Linux 6.7-rc1 (2023-11-12 16:19:07 -0800)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.7-2023-11-17
> 
> for you to fetch changes up to e8c2d3e25b844ad8f7c8b269a7cfd65285329264:
> 
>   drm/amdgpu/gmc9: disable AGP aperture (2023-11-17 00:58:41 -0500)

Pulled to drm-fixes, thanks!
-Sima
> 
> 
> amd-drm-fixes-6.7-2023-11-17:
> 
> amdgpu:
> - DMCUB fixes
> - SR-IOV fix
> - GMC9 fix
> - Documentation fix
> - DSC MST fix
> - CS chunk parsing fix
> - SMU13.0.6 fixes
> - 8K tiled display fix
> - Fix potential NULL pointer dereferences
> - Cursor lag fix
> - Backlight fix
> - DCN s0ix fix
> - XGMI fix
> - DCN encoder disable logic fix
> - AGP aperture fixes
> 
> 
> Alex Deucher (5):
>   drm/amdgpu/gmc11: fix logic typo in AGP check
>   drm/amdgpu: add a module parameter to control the AGP aperture
>   drm/amdgpu/gmc11: disable AGP aperture
>   drm/amdgpu/gmc10: disable AGP aperture
>   drm/amdgpu/gmc9: disable AGP aperture
> 
> Asad Kamal (2):
>   drm/amd/pm: Update metric table for smu v13_0_6
>   drm/amd/pm: Fill pcie error counters for gpu v1_4
> 
> Duncan Ma (1):
>   drm/amd/display: Negate IPS allow and commit bits
> 
> Fangzhi Zuo (1):
>   drm/amd/display: Fix DSC not Enabled on Direct MST Sink
> 
> José Pekkarinen (1):
>   drm/amd/display: fix NULL dereference
> 
> Le Ma (1):
>   drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini
> 
> Lewis Huang (1):
>   drm/amd/display: Change the DMCUB mailbox memory location from FB to 
> inbox
> 
> Lijo Lazar (1):
>   drm/amd/pm: Don't send unload message for reset
> 
> Mario Limonciello (1):
>   drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()
> 
> Muhammad Ahmed (1):
>   drm/amd/display: Add null checks for 8K60 lightup
> 
> Nicholas Kazlauskas (1):
>   drm/amd/display: Guard against invalid RPTR/WPTR being set
> 
> Nicholas Susanto (1):
>   drm/amd/display: Fix encoder disable logic
> 
> Paul Hsieh (1):
>   drm/amd/display: Clear dpcd_sink_ext_caps if not set
> 
> Shiwu Zhang (1):
>   drm/amdgpu: add and populate the port num into xgmi topology info
> 
> Srinivasan Shanmugam (1):
>   drm/amdgpu: Address member 'ring' not described in 'amdgpu_ vce, 
> uvd_entity_init()'
> 
> Tianci Yin (1):
>   drm/amd/display: Enable fast plane updates on DCN3.2 and above
> 
> Victor Lu (1):
>   drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under SRIOV (v2)
> 
> Yang Wang (1):
>   drm/amdgpu: fix ras err_data null pointer issue in amdgpu_ras.c
> 
> YuanShang (1):
>   drm/amdgpu: correct chunk_ptr to a pointer to chunk.
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 10 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c|  5 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|  1 +
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |  2 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c |  5 ++-
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  7 +--
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c|  6 +--
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 24 ++-
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c  |  5 +--
>  .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 29 ++---
>  .../amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c   | 18 
>  drivers/gpu/drm/amd/display/dc/core/dc.c   |  6 +--
>  drivers/gpu/drm/amd/display/dc/core/dc_resource.c  |  3 ++
>  drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c   | 10 ++---
>  drivers/gpu/drm/amd/display/dc/dc_types.h  |  1 +
>  .../display/dc/dcn35/dcn35_dio_stream_encoder.c| 10 ++---
>  .../gpu/drm/amd/display/dc/link/link_detection.c   |  3 ++
>  drivers/gpu/drm/amd/display/dmub/dmub_srv.h| 22 ++
>  drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c| 50 
> +-
>  .../amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_pmfw.h| 10 -
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   | 10 -
>  26 files changed, 160 insertions(+), 84 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-next-6.7

2023-11-10 Thread Daniel Vetter
vers/gpu/drm/amd/amdkfd/kfd_svm.c   |   8 +-
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c  |   3 +
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h|   2 +-
>  .../amd/display/dc/clk_mgr/dcn35/dcn35_clk_mgr.c   |  21 +-
>  drivers/gpu/drm/amd/display/dc/core/dc.c   |  27 +-
>  drivers/gpu/drm/amd/display/dc/dc.h|   2 +-
>  drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c   |  74 +
>  drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h   |   8 +
>  drivers/gpu/drm/amd/display/dc/dc_dp_types.h   |   3 +-
>  drivers/gpu/drm/amd/display/dc/dc_types.h  |   4 +-
>  drivers/gpu/drm/amd/display/dc/dce/dce_abm.h   |  15 -
>  drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h  | 186 +---
>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.c   |  10 +-
>  drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dccg.c  |  73 ++---
>  .../gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.c   |  10 +-
>  .../gpu/drm/amd/display/dc/dcn35/dcn35_pg_cntl.h   |   1 +
>  .../gpu/drm/amd/display/dc/dcn35/dcn35_resource.c  |  37 ++-
>  .../amd/display/dc/dml/dcn30/display_mode_vba_30.c |   2 +-
>  .../amd/display/dc/dml2/dml2_dc_resource_mgmt.c|  61 ++--
>  .../drm/amd/display/dc/dml2/dml2_internal_types.h  |   4 +-
>  .../amd/display/dc/dml2/dml2_translation_helper.c  |  55 +++-
>  .../amd/display/dc/dml2/dml2_translation_helper.h  |   2 +-
>  drivers/gpu/drm/amd/display/dc/dml2/dml2_utils.c   |  18 +-
>  drivers/gpu/drm/amd/display/dc/dml2/dml2_wrapper.c |   2 +-
>  drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c|  11 +
>  .../gpu/drm/amd/display/dc/hwss/dce/dce_hwseq.h|  18 +-
>  .../drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c|  17 +-
>  .../drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c|  34 ++-
>  drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h   |   5 +
>  drivers/gpu/drm/amd/display/dc/inc/hw/dsc.h|   2 +
>  drivers/gpu/drm/amd/display/dc/inc/hw/optc.h   | 219 ++
>  drivers/gpu/drm/amd/display/dc/inc/hw/pg_cntl.h|   2 +
>  .../amd/display/dc/link/accessories/link_dp_cts.c  |  17 +-
>  .../dc/link/protocols/link_dp_irq_handler.c|  15 +-
>  drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h|  25 +-
>  drivers/gpu/drm/amd/pm/amdgpu_dpm.c|  12 +-
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c |  29 +-
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c  |   5 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   | 315 
> +
>  85 files changed, 1688 insertions(+), 773 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/display/dc/inc/hw/optc.h

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 0/4] drm/amd/display: stop using drm_edid_override_connector_update()

2023-09-04 Thread Daniel Vetter
, you really need to do the reviews
> on the mailing lists.

Aye. Maybe with the clarification that if the embargoed code touches
areas that are common code (or really should be handled in common
code), then the cross-driver parts also need to be reviewed in public
as upfront prep patches. If that's not possible (try to fix your
process to make that possible please), at least ping stakeholders in
private to give them a heads up, so that when the IP enabling gets
published it's not going to be held up in the review for the necessary
common changes. What's not good is if code that should be reviewed on
dri-devel bypasses all that just because it's part of a hardware
enabling series.

Cheers, Sima

> Alex
>
>
> > >
> > >
> > > BR,
> > > Jani.
> > >
> > >
> > >>
> > >> With the patch. both following git grep commands return nothing in
> > >> amd-staging-drm-next.
> > >>
> > >> $ git grep drm_edid_override_connector_update -- drivers/gpu/drm/amd
> > >> $ git grep edid_override -- drivers/gpu/drm/amd
> > >>
> > >> Best regards,
> > >> Alex Hung
> > >>
> > >>>>>
> > >>>>> What is the goal of the reverts?  I don't disagree that we may be
> > >>>>> using the interfaces wrong, but reverting them will regess
> > >>>>> functionality in the driver.
> > >>>>
> > >>>> The commits are in v6.5-rc1, but not yet in a release. No user depends
> > >>>> on them yet. I'd strongly prefer them not reaching v6.5 final and 
> > >>>> users.
> > >>>
> > >>> Sorry for confusion here, that's obviously come and gone already. :(
> > >>>
> > >>>> The firmware EDID, override EDID, connector forcing, the EDID property,
> > >>>> etc. have been and somewhat still are a hairy mess that we must keep
> > >>>> untangling, and this isn't helping.
> > >>>>
> > >>>> I've put in crazy amounts of work on this, and I've added kernel-doc
> > >>>> comments about stuff that should and should not be done, but they go
> > >>>> unread and ignored.
> > >>>>
> > >>>> I really don't want to end up having to clean this up myself before I
> > >>>> can embark on further cleanups and refactoring.
> > >>>>
> > >>>> And again, if the functionality in the driver depends on conflating two
> > >>>> things that should be separate, it's probably not such a hot idea to 
> > >>>> let
> > >>>> it reach users either. Even if it's just debugfs.
> > >>>>
> > >>>>
> > >>>> BR,
> > >>>> Jani.
> > >>>
> > >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 0/4] drm/amd/display: stop using drm_edid_override_connector_update()

2023-08-30 Thread Daniel Vetter
On Wed, Aug 30, 2023 at 10:29:46AM +0300, Jani Nikula wrote:
> Upstream code should be reviewed in public.
 
Yup
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-04 Thread Daniel Vetter
+first place. DRM devices should make use of devcoredump to store relevant
> +information about the reset, so this information can be added to user bug
> +reports.

Since we do not seem to have a solid consensus in the community about
non-robust userspace, maybe we could just document that lack of consensus
to unblock this patch? Something like this:

Non-Robust Userspace


Userspace that doesn't support robust interfaces (like a non-robust
OpenGL context or an API without any robustness support, like libva) leaves the
robustness handling entirely to the userspace driver. There is no strong
community consensus on what the userspace driver should do in that case,
since all reasonable approaches have some clear downsides.

With the s/UMD/KMD/ further up and maybe something added to record the
non-robustness non-consensus:

Acked-by: Daniel Vetter 

Cheers, Daniel



> +
>  .. _drm_driver_ioctl:
>  
>  IOCTL Support on Device Nodes
> -- 
> 2.41.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu, amdkfd, radeon drm-next-6.6

2023-08-04 Thread Daniel Vetter
/drm/amd/display/dc/dcn30/dcn30_optc.h  |   3 +
>  .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c  |   4 +-
>  drivers/gpu/drm/amd/display/dc/dcn301/Makefile |   3 +-
>  .../gpu/drm/amd/display/dc/dcn301/dcn301_optc.c| 185 +++
>  .../gpu/drm/amd/display/dc/dcn301/dcn301_optc.h|  36 ++
>  .../drm/amd/display/dc/dcn301/dcn301_resource.c|  10 +-
>  .../drm/amd/display/dc/dcn303/dcn303_resource.c|   2 +-
>  drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c  |  52 +-
>  drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.h  |   5 +
>  .../amd/display/dc/dcn31/dcn31_dio_link_encoder.c  |   2 +-
>  .../display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c |   2 +-
>  .../gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c|   1 +
>  .../drm/amd/display/dc/dcn314/dcn314_resource.c|  18 +-
>  .../drm/amd/display/dc/dcn315/dcn315_resource.c|   2 +-
>  drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c  |   5 +-
>  drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c |   2 -
>  .../gpu/drm/amd/display/dc/dcn32/dcn32_resource.c  |   2 +-
>  .../amd/display/dc/dcn32/dcn32_resource_helpers.c  |  24 +-
>  .../amd/display/dc/dml/dcn21/display_mode_vba_21.c |   2 +-
>  .../amd/display/dc/dml/dcn31/display_mode_vba_31.c |   2 +-
>  .../gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c |  31 +-
>  .../display/dc/dml/dcn314/display_mode_vba_314.c   |   2 +-
>  .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c   |  24 +-
>  .../dc/dml/dcn32/display_mode_vba_util_32.c|   9 +-
>  drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c|  66 ++-
>  drivers/gpu/drm/amd/display/dc/inc/hw/abm.h|   6 +
>  drivers/gpu/drm/amd/display/dc/inc/hw/aux_engine.h |   2 -
>  drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h   |   5 +
>  .../amd/display/dc/irq/dcn314/irq_service_dcn314.c |   7 +-
>  .../amd/display/dc/link/accessories/link_dp_cts.c  | 107 ++--
>  .../amd/display/dc/link/hwss/link_hwss_hpo_dp.c|  10 +
>  .../gpu/drm/amd/display/dc/link/link_detection.c   |   3 +-
>  drivers/gpu/drm/amd/display/dc/link/link_dpms.c|  21 +-
>  .../gpu/drm/amd/display/dc/link/link_validation.c  |   8 +-
>  .../drm/amd/display/dc/link/protocols/link_ddc.c   |   2 +-
>  .../display/dc/link/protocols/link_dp_capability.c |  22 +-
>  .../display/dc/link/protocols/link_dp_training.c   |   9 +-
>  .../link_dp_training_fixed_vs_pe_retimer.c |  90 +++-
>  .../dc/link/protocols/link_edp_panel_control.c |  80 +--
>  .../dc/link/protocols/link_edp_panel_control.h |   1 +
>  drivers/gpu/drm/amd/display/dmub/dmub_srv.h|   7 +
>  drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h| 131 +
>  .../drm/amd/display/dmub/inc/dmub_subvp_state.h| 183 ---
>  drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c  |   8 +
>  drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h  |   2 +
>  drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c|  31 +-
>  .../drm/amd/display/include/link_service_types.h   |   2 +-
>  drivers/gpu/drm/amd/include/amd_shared.h   |   1 +
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h|   9 +-
>  drivers/gpu/drm/amd/include/kgd_pp_interface.h |  69 +++
>  drivers/gpu/drm/amd/include/mes_v11_api_def.h  |   4 +-
>  drivers/gpu/drm/amd/include/yellow_carp_offset.h   |   6 +-
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c |   3 +-
>  drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h |   3 +-
>  drivers/gpu/drm/amd/pm/inc/smu_v13_0_0_pptable.h   |  21 +-
>  .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c|  14 +-
>  .../gpu/drm/amd/pm/swsmu/inc/smu_11_0_cdr_table.h  |   6 +-
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h   |   4 +
>  .../gpu/drm/amd/pm/swsmu/inc/smu_v13_0_7_pptable.h |  21 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c  |   6 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c|  27 +-
>  .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c|  99 +---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c   | 109 +++-
>  drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c|   6 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu12/smu_v12_0.c |   3 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c |   2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c |  48 ++
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c   |  37 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   |   2 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c   |  35 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c |   9 +-
>  drivers/gpu/drm/radeon/atom.c  |  18 +-
>  drivers/gpu/drm/radeon/clearstate_si.h |   3 +-
>  drivers/gpu/drm/radeon/r300.c  |   6 +-
>  drivers/gpu/drm/radeon/radeon_atombios.c   |  12 +-
>  drivers/gpu/drm/radeon/radeon_atpx_handler.c   |  18 +-
>  drivers/gpu/drm/radeon/radeon_combios.c|   4 +-
>  drivers/gpu/drm/radeon/radeon_connectors.c |  11 +-
>  drivers/gpu/drm/radeon/radeon_drv.c|  51 +-
>  drivers/gpu/drm/radeon/radeon_drv.h|  13 +
>  drivers/gpu/drm/radeon/radeon_encoders.c   |  22 +-
>  drivers/gpu/drm/radeon/radeon_gart.c   |  37 +-
>  drivers/gpu/drm/radeon/radeon_gem.c|   4 +-
>  drivers/gpu/drm/radeon/radeon_kms.c|  10 +-
>  drivers/gpu/drm/radeon/radeon_legacy_tv.c  |   6 +-
>  drivers/gpu/drm/radeon/radeon_test.c   |   8 +-
>  drivers/gpu/drm/radeon/radeon_vce.c|   4 +-
>  drivers/gpu/drm/radeon/rv770.c |  33 +-
>  drivers/gpu/drm/radeon/rv770_smc.c |  36 +-
>  drivers/gpu/drm/radeon/sislands_smc.h  |  51 +-
>  245 files changed,  insertions(+), 2621 deletions(-)
>  create mode 100644 Documentation/gpu/amdgpu/flashing.rst
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.h
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
>  rename drivers/gpu/drm/amd/amdgpu/{aqua_vanjaram_reg_init.c => 
> aqua_vanjaram.c} (99%)
>  create mode 100644 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_optc.c
>  create mode 100644 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_optc.h
>  delete mode 100644 drivers/gpu/drm/amd/display/dmub/inc/dmub_subvp_state.h

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch



Re: [PATCH v5 6/6] drm/doc: Define KMS atomic state set

2023-08-02 Thread Daniel Vetter
On Mon, 31 Jul 2023 at 04:01, André Almeida  wrote:
>
> Em 13/07/2023 04:51, Pekka Paalanen escreveu:
> > On Tue, 11 Jul 2023 10:57:57 +0200
> > Daniel Vetter  wrote:
> >
> >> On Fri, Jul 07, 2023 at 07:40:59PM -0300, André Almeida wrote:
> >>> From: Pekka Paalanen 
> >>>
> >>> Specify how the atomic state is maintained between userspace and
> >>> kernel, plus the special case for async flips.
> >>>
> >>> Signed-off-by: Pekka Paalanen 
> >>> Signed-off-by: André Almeida 
> >>> ---
> >>> v4: total rework by Pekka
> >>> ---
> >>>   Documentation/gpu/drm-uapi.rst | 41 ++
> >>>   1 file changed, 41 insertions(+)
> >>>
> >>> diff --git a/Documentation/gpu/drm-uapi.rst 
> >>> b/Documentation/gpu/drm-uapi.rst
> >>> index 65fb3036a580..6a1662c08901 100644
> >>> --- a/Documentation/gpu/drm-uapi.rst
> >>> +++ b/Documentation/gpu/drm-uapi.rst
> >>> @@ -486,3 +486,44 @@ and the CRTC index is its position in this array.
> >>>
> >>>   .. kernel-doc:: include/uapi/drm/drm_mode.h
> >>>  :internal:
> >>> +
> >>> +KMS atomic state
> >>> +================
> >>> +
> >>> +An atomic commit can change multiple KMS properties in an atomic fashion,
> >>> +without ever applying intermediate or partial state changes.  Either the 
> >>> whole
> >>> +commit succeeds or fails, and it will never be applied partially. This 
> >>> is the
> >>> +fundamental improvement of the atomic API over the older non-atomic API 
> >>> which is
> >>> +referred to as the "legacy API".  Applying intermediate state could 
> >>> unexpectedly
> >>> +fail, cause visible glitches, or delay reaching the final state.
> >>> +
> >>> +An atomic commit can be flagged with DRM_MODE_ATOMIC_TEST_ONLY, which 
> >>> means the
> >>> +complete state change is validated but not applied.  Userspace should 
> >>> use this
> >>> +flag to validate any state change before asking to apply it. If 
> >>> validation fails
> >>> +for any reason, userspace should attempt to fall back to another, perhaps
> >>> +simpler, final state.  This allows userspace to probe for various 
> >>> configurations
> >>> +without causing visible glitches on screen and without the need to undo a
> >>> +probing change.
> >>> +
> >>> +The changes recorded in an atomic commit apply on top of the current KMS 
> >>> state in
> >>> +the kernel. Hence, the complete new KMS state is the complete old KMS 
> >>> state with
> >>> +the committed property settings done on top. The kernel will 
> >>> automatically avoid
> >>> +no-operation changes, so it is safe and even expected for userspace to 
> >>> send
> >>> +redundant property settings.  No-operation changes do not count towards 
> >>> actually
> >>> +needed changes, e.g.  setting MODE_ID to a different blob with identical
> >>> +contents as the current KMS state shall not be a modeset on its own.
> >>
> >> Small clarification: The kernel indeed tries very hard to make redundant
> >> changes a no-op, and I think we should consider any issues here bugs. But
> >> it still has to check, which means it needs to acquire the right locks and
> >> put in the right (cross-crtc) synchronization points, and due to
> >> implementation challenges it's very hard to try to avoid that in all cases.
> >> So adding redundant changes especially across crtc (and their connected
> >> planes/connectors) might result in some oversynchronization issues, and
> >> userspace should therefore avoid them if feasible.
> >>
> >> With some sentences added to clarify this:
> >>
> >> Reviewed-by: Daniel Vetter 
> >
> > After talking on IRC yesterday, we realized that the no-op rule is
> > nowhere near as generic as I have believed. Roughly:
> > https://oftc.irclog.whitequark.org/dri-devel/2023-07-12#1689152446-1689157291;
> >
> >
>
> How about:
>
> The changes recorded in an atomic commit apply on top of the current KMS
> state in the kernel. Hence, the complete new KMS state is the complete
> old KMS state with the committed property settings done on top. The
> kernel will try to avoid no-operation changes, so it is safe for
>

Re: [PATCH v5 6/6] drm/doc: Define KMS atomic state set

2023-07-11 Thread Daniel Vetter
On Fri, Jul 07, 2023 at 07:40:59PM -0300, André Almeida wrote:
> From: Pekka Paalanen 
> 
> Specify how the atomic state is maintained between userspace and
> kernel, plus the special case for async flips.
> 
> Signed-off-by: Pekka Paalanen 
> Signed-off-by: André Almeida 
> ---
> v4: total rework by Pekka
> ---
>  Documentation/gpu/drm-uapi.rst | 41 ++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 65fb3036a580..6a1662c08901 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -486,3 +486,44 @@ and the CRTC index is its position in this array.
>  
>  .. kernel-doc:: include/uapi/drm/drm_mode.h
> :internal:
> +
> +KMS atomic state
> +================
> +
> +An atomic commit can change multiple KMS properties in an atomic fashion,
> +without ever applying intermediate or partial state changes.  Either the 
> whole
> +commit succeeds or fails, and it will never be applied partially. This is the
> +fundamental improvement of the atomic API over the older non-atomic API 
> which is
> +referred to as the "legacy API".  Applying intermediate state could 
> unexpectedly
> +fail, cause visible glitches, or delay reaching the final state.
> +
> +An atomic commit can be flagged with DRM_MODE_ATOMIC_TEST_ONLY, which means 
> the
> +complete state change is validated but not applied.  Userspace should use 
> this
> +flag to validate any state change before asking to apply it. If validation 
> fails
> +for any reason, userspace should attempt to fall back to another, perhaps
> +simpler, final state.  This allows userspace to probe for various 
> configurations
> +without causing visible glitches on screen and without the need to undo a
> +probing change.
> +
> +The changes recorded in an atomic commit apply on top of the current KMS state 
> in
> +the kernel. Hence, the complete new KMS state is the complete old KMS state 
> with
> +the committed property settings done on top. The kernel will automatically 
> avoid
> +no-operation changes, so it is safe and even expected for userspace to send
> +redundant property settings.  No-operation changes do not count towards 
> actually
> +needed changes, e.g.  setting MODE_ID to a different blob with identical
> +contents as the current KMS state shall not be a modeset on its own.

Small clarification: The kernel indeed tries very hard to make redundant
changes a no-op, and I think we should consider any issues here bugs. But
it still has to check, which means it needs to acquire the right locks and
put in the right (cross-crtc) synchronization points, and due to
implementation challenges it's very hard to try to avoid that in all cases.
So adding redundant changes especially across crtc (and their connected
planes/connectors) might result in some oversynchronization issues, and
userspace should therefore avoid them if feasible.

With some sentences added to clarify this:

Reviewed-by: Daniel Vetter 

> +
> +A "modeset" is a change in KMS state that might enable, disable, or 
> temporarily
> +disrupt the emitted video signal, possibly causing visible glitches on 
> screen. A
> +modeset may also take considerably more time to complete than other kinds of
> +changes, and the video sink might also need time to adapt to the new signal
> +properties. Therefore a modeset must be explicitly allowed with the flag
> +DRM_MODE_ATOMIC_ALLOW_MODESET.  This in combination with
> +DRM_MODE_ATOMIC_TEST_ONLY allows userspace to determine if a state change is
> +likely to cause visible disruption on screen and avoid such changes when end
> +users do not expect them.
> +
> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> +effectively change only the FB_ID property on any planes. No-operation 
> changes
> +are ignored as always. Changing any other property will cause the commit to 
> be
> +rejected.
> -- 
> 2.41.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.3

2023-04-13 Thread Daniel Vetter
On Wed, Apr 12, 2023 at 05:56:37PM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Fixes for 6.3.
> 
> The following changes since commit 09a9639e56c01c7a00d6c0ca63f4c7c41abe075d:
> 
>   Linux 6.3-rc6 (2023-04-09 11:15:57 -0700)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.3-2023-04-12
> 
> for you to fetch changes up to b9a24d8bd51e2db425602fa82d7f4c06aa3db852:
> 
>   drm/amd/pm: correct the pcie link state check for SMU13 (2023-04-12 
> 16:11:22 -0400)

Pulled, thanks
> 
> 
> amd-drm-fixes-6.3-2023-04-12:
> 
> amdgpu:
> - SMU13 fixes
> - DP MST fix
> 
> 
> Evan Quan (1):
>   drm/amd/pm: correct the pcie link state check for SMU13
> 
> Horatio Zhang (2):
>   drm/amd/pm: correct SMU13.0.7 pstate profiling clock settings
>   drm/amd/pm: correct SMU13.0.7 max shader clock reporting
> 
> Wayne Lin (1):
>   drm/amd/display: Pass the right info to drm_dp_remove_payload
> 
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c  | 57 --
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h   |  6 ++
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c   |  4 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c   | 87 
> +++---
>  4 files changed, 135 insertions(+), 19 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v3 3/7] drm/amdgpu: Switch to fdinfo helper

2023-04-12 Thread Daniel Vetter
On Tue, Apr 11, 2023 at 03:56:08PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h |  2 +-
>  3 files changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index f5ffca24def4..3611cfd5f076 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2752,7 +2752,7 @@ static const struct file_operations 
> amdgpu_driver_kms_fops = {
>   .compat_ioctl = amdgpu_kms_compat_ioctl,
>  #endif
>  #ifdef CONFIG_PROC_FS
> - .show_fdinfo = amdgpu_show_fdinfo
> + .show_fdinfo = drm_fop_show_fdinfo,
>  #endif
>  };
>  
> @@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = {
>   .dumb_map_offset = amdgpu_mode_dumb_mmap,
>   .fops = &amdgpu_driver_kms_fops,
>   .release = &amdgpu_driver_release_kms,
> + .show_fdinfo = amdgpu_show_fdinfo,
>  
>   .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
>   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> index 99a7855ab1bc..c2fdd5e448d1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> @@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
>   [AMDGPU_HW_IP_VCN_JPEG] =   "jpeg",
>  };
>  
> -void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
> +void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
>  {
> - struct drm_file *file = f->private_data;
>   struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
>   struct amdgpu_fpriv *fpriv = file->driver_priv;
>   struct amdgpu_vm *vm = &fpriv->vm;
> @@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file 
> *f)
>* **
>*/
>  
> - seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
> - seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
> - seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
> - seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
> - seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
> - seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
> - seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
> + drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
> + drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
> + drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
> + drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);

random aside, but we're not super consistent here, some of these have an
additional ' ' space.

I guess a next step would be a drm_fdinfo_printf(drm_printer *p, const
char *name, const char *printf, ...) and maybe some specialized ones that
dtrt for specific parameters, like drm_fdinfo_llu().

But that's for next one I guess :-)
-Daniel


>   for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
>   if (!usage[hw_ip])
>   continue;
>  
> - seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
> + drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
>  ktime_to_ns(usage[hw_ip]));
>   }
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
> index e86834bfea1d..0398f5a159ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
> @@ -37,6 +37,6 @@
>  #include "amdgpu_ids.h"
>  
>  uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id);
> -void amdgpu_show_fdinfo(struct seq_file *m, struct file *f);
> +void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file);
>  
>  #endif
> -- 
> 2.39.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: fdinfo blew up? (Was: [RFC PATCH 0/4] uapi, drm: Add and implement RLIMIT_GPUPRIO)

2023-04-05 Thread Daniel Vetter
On Wed, 5 Apr 2023 at 11:11, Tvrtko Ursulin
 wrote:
>
>
> On 05/04/2023 09:28, Daniel Vetter wrote:
> > On Tue, 4 Apr 2023 at 12:45, Tvrtko Ursulin
> >  wrote:
> >>
> >>
> >> Hi,
> >>
> >> On 03/04/2023 20:40, Joshua Ashton wrote:
> >>> Hello all!
> >>>
> >>> I would like to propose a new API for allowing processes to control
> >>> the priority of GPU queues similar to RLIMIT_NICE/RLIMIT_RTPRIO.
> >>>
> >>> The main reason for this is for compositors such as Gamescope and
> >>> SteamVR vrcompositor to be able to create realtime async compute
> >>> queues on AMD without the need of CAP_SYS_NICE.
> >>>
> >>> The current situation is bad for a few reasons, one being that in order
> >>> to setcap the executable, typically one must run as root which involves
> >>> a pretty high privilege escalation in order to achieve one
> >>> small feat, a realtime async compute queue for VR or a compositor.
> >>> The executable cannot be setcap'ed inside a
> >>> container nor can the setcap'ed executable be run in a container with
> >>> NO_NEW_PRIVS.
> >>>
> >>> I go into more detail in the description in
> >>> `uapi: Add RLIMIT_GPUPRIO`.
> >>>
> >>> My initial proposal here is to add a new RLIMIT, `RLIMIT_GPUPRIO`,
> >>> which seems to make most initial sense to me to solve the problem.
> >>>
> >>> I am definitely not set that this is the best formulation however
> >>> or if this should be linked to DRM (in terms of it's scheduler
> >>> priority enum/definitions) in any way and and would really like other
> >>> people's opinions across the stack on this.
> >>>
> >>> One initial concern is that potentially this RLIMIT could outlive
> >>> the lifespan of DRM. It sounds crazy saying it right now, something
> >>> that definitely popped into my mind when touching `resource.h`. :-)
> >>>
> >>> Anyway, please let me know what you think!
> >>> Definitely open to any feedback and advice you may have. :D
> >>
> >> Interesting! I tried to solve a similar problem two times in the past 
> >> already.
> >>
> >> First time I was proposing to tie nice to DRM scheduling priority [1] - if 
> >> the latter has been left at default - drawing the analogy with the 
> >> nice+ionice handling. That was rejected and I was nudged towards the 
> >> cgroups route.
> >>
> >> So with that second attempt I implemented a hierarchical opaque 
> >> drm.priority cgroup controller [2]. I think it would allow you to solve 
> >> your use case too by placing your compositor in a cgroup with an elevated 
> >> priority level.
> >>
> >> Implementation wise in my proposal it was left to individual drivers to 
> >> "meld" the opaque cgroup drm.priority with the driver specific priority 
> >> concept.
> >>
> >> That too wasn't too popular with the feedback (AFAIR) that the priority is 
> >> a too subsystem specific concept.
> >>
> >> Finally I was left with a weight based drm cgroup controller, exactly 
> >> following the controls of the CPU and IO ones, but with much looser 
> >> runtime guarantees. [3]
> >>
> >> I don't think this last one works for your use case, at least not at the 
> >> current state for drm scheduling capability, where the implementation is a 
> >> "bit" too reactive for realtime.
> >>
> >> Depending on how the discussion around your rlimit proposal goes, perhaps 
> >> one alternative could be to go the cgroup route and add an attribute like 
> >> drm.realtime. That perhaps sounds abstract and generic enough to be 
> >> passable. Built as a simplification of [2] it wouldn't be too complicated.
> >>
> >> On the actual proposal of RLIMIT_GPUPRIO...
> >>
> >> The name would be problematic since we have generic hw accelerators (not 
> >> just GPUs) under the DRM subsystem. Perhaps RLIMIT_DRMPRIO would be better 
> >> but I think you will need to copy some more mailing lists and people on 
> >> that one. Because I can imagine one or two more fundamental questions this 
> >> opens up, as you have eluded in your cover letter as well.
> >
> > So I don't want to get into the bikeshed, I think Tvrtko summarized
> > pretty well that this is a hard problem with lots of attempts (I 

Re: [RFC PATCH 0/4] uapi, drm: Add and implement RLIMIT_GPUPRIO

2023-04-05 Thread Daniel Vetter
omething like the minimal fdinfo stuff
(minimal I guess to avoid wider discussion) which then blew up because
it wasn't thought out well enough.

Adding at least some of the people who probably should be cc'ed on
this. Please add more.

Cheers, Daniel


>
> Regards,
>
> Tvrtko
>
> [1] 
> https://lore.kernel.org/dri-devel/20220407152806.3387898-1-tvrtko.ursu...@linux.intel.com/T/
> [2] 
> https://lore.kernel.org/lkml/20221019173254.3361334-4-tvrtko.ursu...@linux.intel.com/T/#u
> [3] 
> https://lore.kernel.org/lkml/20230314141904.1210824-1-tvrtko.ursu...@linux.intel.com/



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu, amdkfd, radeon drm-next-6.4

2023-04-03 Thread Daniel Vetter
smu11/vangogh_ppt.c   | 2 +
>  drivers/gpu/drm/radeon/Makefile| 3 +-
>  drivers/gpu/drm/radeon/radeon.h| 2 +
>  drivers/gpu/drm/radeon/radeon_display.c| 4 -
>  drivers/gpu/drm/radeon/radeon_drv.c| 3 +-
>  drivers/gpu/drm/radeon/radeon_drv.h| 1 -
>  drivers/gpu/drm/radeon/radeon_fb.c |   400 -
>  drivers/gpu/drm/radeon/radeon_fbdev.c  |   422 +
>  drivers/gpu/drm/radeon/radeon_gem.c|24 +
>  drivers/gpu/drm/radeon/radeon_kms.c|18 -
>  drivers/gpu/drm/radeon/radeon_mode.h   |20 +-
>  134 files changed, 119288 insertions(+), 930 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.h
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.h
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/athub/athub_1_8_0_offset.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/athub/athub_1_8_0_sh_mask.h
>  create mode 100644 drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_3_offset.h
>  create mode 100644 drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_3_sh_mask.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/mmhub/mmhub_1_8_0_offset.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/mmhub/mmhub_1_8_0_sh_mask.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_9_0_offset.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_9_0_sh_mask.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/oss/osssys_4_4_2_offset.h
>  create mode 100644 
> drivers/gpu/drm/amd/include/asic_reg/oss/osssys_4_4_2_sh_mask.h
>  delete mode 100644 drivers/gpu/drm/radeon/radeon_fb.c
>  create mode 100644 drivers/gpu/drm/radeon/radeon_fbdev.c

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.3

2023-03-30 Thread Daniel Vetter
On Thu, Mar 30, 2023 at 11:38:59AM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> A regression fix for 6.3.
> 
> The following changes since commit 68dc1846c3a44d5e633be145c169ce2fd5420695:
> 
>   drm/amd/display: Take FEC Overhead into Timeslot Calculation (2023-03-29 
> 17:21:06 -0400)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.3-2023-03-30
> 
> for you to fetch changes up to 2fec9dc8e0acc3dfb56d1389151bcf405f087b10:
> 
>   drm/amdgpu: allow more APUs to do mode2 reset when go to S4 (2023-03-30 
> 11:23:58 -0400)
> 
> 
> amd-drm-fixes-6.3-2023-03-30:
> 
> amdgpu:
> - Hibernation regression fix

Yeah for a regression fix a 2nd pull makes sense, pulled, thanks.
> 
> 
> Tim Huang (1):
>   drm/amdgpu: allow more APUs to do mode2 reset when go to S4
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.3

2023-03-30 Thread Daniel Vetter
On Wed, Mar 29, 2023 at 06:00:59PM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Fixes for 6.3.
> 
> The following changes since commit 197b6b60ae7bc51dd0814953c562833143b292aa:
> 
>   Linux 6.3-rc4 (2023-03-26 14:40:20 -0700)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.3-2023-03-29
> 
> for you to fetch changes up to 68dc1846c3a44d5e633be145c169ce2fd5420695:
> 
>   drm/amd/display: Take FEC Overhead into Timeslot Calculation (2023-03-29 
> 17:21:06 -0400)

Pulled, thanks

> 
> 
> amd-drm-fixes-6.3-2023-03-29:
> 
> amdgpu:
> - Two DP MST fixes
> 
> 
> Fangzhi Zuo (2):
>   drm/amd/display: Add DSC Support for Synaptics Cascaded MST Hub
>   drm/amd/display: Take FEC Overhead into Timeslot Calculation
> 
>  .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 51 
> ++
>  .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h| 15 +++
>  2 files changed, 58 insertions(+), 8 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.3

2023-03-24 Thread Daniel Vetter
On Thu, Mar 23, 2023 at 12:19:39PM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Fixes for 6.3.
> 
> The following changes since commit e8d018dd0257f744ca50a729e3d042cf2ec9da65:
> 
>   Linux 6.3-rc3 (2023-03-19 13:27:55 -0700)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.3-2023-03-23
> 
> for you to fetch changes up to f9537b1fa7fb51c2162bc15ce469cbbf1ca0fbfe:
> 
>   drm/amd/display: Set dcn32 caps.seamless_odm (2023-03-23 09:39:34 -0400)

Pulled, thanks.
-Daniel

> 
> 
> amd-drm-fixes-6.3-2023-03-23:
> 
> amdgpu:
> - S4 fix
> - Soft reset fixes
> - SR-IOV fix
> - Remove an out of date comment in the DC code
> - ASPM fix
> - DCN 3.2 fixes
> 
> 
> Alex Hung (1):
>   drm/amd/display: remove outdated 8bpc comments
> 
> Hersen Wu (2):
>   drm/amd/display: fix wrong index used in dccg32_set_dpstreamclk
>   drm/amd/display: Set dcn32 caps.seamless_odm
> 
> Jane Jian (1):
>   drm/amdgpu/gfx: set cg flags to enter/exit safe mode
> 
> Kai-Heng Feng (1):
>   drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
> 
> Tim Huang (2):
>   drm/amdgpu: reposition the gpu reset checking for reuse
>   drm/amdgpu: skip ASIC reset for APUs when go to S4
> 
> Tong Liu01 (1):
>   drm/amdgpu: add mes resume when do gfx post soft reset
> 
> YuBiao Wang (1):
>   drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  5 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c   | 41 
> --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  5 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  9 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 14 
>  drivers/gpu/drm/amd/amdgpu/nv.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/vi.c| 17 +
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |  1 -
>  drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c  |  3 +-
>  .../gpu/drm/amd/display/dc/dcn32/dcn32_resource.c  |  1 +
>  11 files changed, 72 insertions(+), 41 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fb-helper: Remove drm_fb_helper_unprepare() from drm_fb_helper_fini()

2023-02-17 Thread Daniel Vetter
On Fri, Feb 17, 2023 at 09:18:54AM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 16.02.23 um 21:11 schrieb Daniel Vetter:
> > On Thu, Feb 16, 2023 at 03:06:20PM +0100, Thomas Zimmermann wrote:
> > > Move drm_fb_helper_unprepare() from drm_fb_helper_fini() into the
> > > calling fbdev implementation. Avoids a possible stale mutex with
> > > generic fbdev code.
> > > 
> > > As indicated by its name, drm_fb_helper_prepare() prepares struct
> > > drm_fb_helper before setting up the fbdev support with a call to
> > > drm_fb_helper_init(). In legacy fbdev emulation, this happens next
> > > to each other. If successful, drm_fb_helper_fini() later tears down
> > > the fbdev device and also unprepares it via drm_fb_helper_unprepare().
> > > 
> > > Generic fbdev emulation prepares struct drm_fb_helper immediately
> > > after allocating the instance. It only calls drm_fb_helper_init()
> > > as part of processing a hotplug event. If the hotplug-handling fails,
> > > it runs drm_fb_helper_fini(). This unprepares the fb-helper instance
> > > and the next hotplug event runs on stale data.
> > > 
> > > Solve this by moving drm_fb_helper_unprepare() from drm_fb_helper_fini()
> > > into the fbdev implementations. Call it right before freeing the
> > > fb-helper instance.
> > > 
> > > Fixes: 4825797c36da ("drm/fb-helper: Introduce drm_fb_helper_unprepare()")
> > > Cc: Thomas Zimmermann 
> > > Cc: Javier Martinez Canillas 
> > > Cc: Maarten Lankhorst 
> > > Cc: Maxime Ripard 
> > > Cc: David Airlie 
> > > Cc: Daniel Vetter 
> > > Cc: dri-de...@lists.freedesktop.org
> > > 
> > > Signed-off-by: Thomas Zimmermann 
> > 
> > This reminds me of an old patch I just recently stumbled over again:
> > 
> > https://lore.kernel.org/dri-devel/Y3St2VHJ7jEmcNFw@phenom.ffwll.local/
> > 
> > Should I resurrect that one maybe and send it out? I think that also ties
> > a bit into your story here.
> 
> I don't think it will be necessary. I began to convert the existing fbdev
> emulation to make use of drm_client, which should resolve a number of
> problems. I expect to post this after the various trees have merged the
> recent changes to fbdev helpers.

The only version the patch is fixing is the client one; the old one is
unfixable (I think at least, hence just the comments). Note that the link
is from before the splitting; I do have a rebased version here.

I'll just send that out and head into vacations :-)
-Daniel

> 
> Best regards
> Thomas
> 
> > 
> > > ---
> > >   drivers/gpu/drm/armada/armada_fbdev.c  | 3 +++
> > >   drivers/gpu/drm/drm_fb_helper.c| 2 --
> > >   drivers/gpu/drm/drm_fbdev_generic.c| 2 ++
> > >   drivers/gpu/drm/exynos/exynos_drm_fbdev.c  | 3 ++-
> > >   drivers/gpu/drm/gma500/framebuffer.c   | 2 ++
> > >   drivers/gpu/drm/i915/display/intel_fbdev.c | 1 +
> > >   drivers/gpu/drm/msm/msm_fbdev.c| 2 ++
> > >   drivers/gpu/drm/omapdrm/omap_fbdev.c   | 2 ++
> > >   drivers/gpu/drm/radeon/radeon_fb.c | 2 ++
> > >   drivers/gpu/drm/tegra/fb.c | 1 +
> > >   10 files changed, 17 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/armada/armada_fbdev.c 
> > > b/drivers/gpu/drm/armada/armada_fbdev.c
> > > index 07e410c62b7a..0e44f53e9fa4 100644
> > > --- a/drivers/gpu/drm/armada/armada_fbdev.c
> > > +++ b/drivers/gpu/drm/armada/armada_fbdev.c
> > > @@ -147,6 +147,7 @@ int armada_fbdev_init(struct drm_device *dev)
> > >err_fb_setup:
> > >   drm_fb_helper_fini(fbh);
> > >err_fb_helper:
> > > + drm_fb_helper_unprepare(fbh);
> > >   priv->fbdev = NULL;
> > >   return ret;
> > >   }
> > > @@ -164,6 +165,8 @@ void armada_fbdev_fini(struct drm_device *dev)
> > >   if (fbh->fb)
> > >   fbh->fb->funcs->destroy(fbh->fb);
> > > + drm_fb_helper_unprepare(fbh);
> > > +
> > >   priv->fbdev = NULL;
> > >   }
> > >   }
> > > diff --git a/drivers/gpu/drm/drm_fb_helper.c 
> > > b/drivers/gpu/drm/drm_fb_helper.c
> > > index 28c428e9c530..a39998047f8a 100644
> > > --- a/drivers/gpu/drm/drm_fb_helper.c
> > > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > > @@ -590,8 +590,6 @@ void drm_fb_helper_fini(struct drm_fb_helper 
> > > *fb

Re: [PATCH] drm/fb-helper: Remove drm_fb_helper_unprepare() from drm_fb_helper_fini()

2023-02-16 Thread Daniel Vetter
On Thu, Feb 16, 2023 at 03:06:20PM +0100, Thomas Zimmermann wrote:
> Move drm_fb_helper_unprepare() from drm_fb_helper_fini() into the
> calling fbdev implementation. Avoids a possible stale mutex with
> generic fbdev code.
> 
> As indicated by its name, drm_fb_helper_prepare() prepares struct
> drm_fb_helper before setting up the fbdev support with a call to
> drm_fb_helper_init(). In legacy fbdev emulation, this happens next
> to each other. If successful, drm_fb_helper_fini() later tears down
> the fbdev device and also unprepares it via drm_fb_helper_unprepare().
> 
> Generic fbdev emulation prepares struct drm_fb_helper immediately
> after allocating the instance. It only calls drm_fb_helper_init()
> as part of processing a hotplug event. If the hotplug-handling fails,
> it runs drm_fb_helper_fini(). This unprepares the fb-helper instance
> and the next hotplug event runs on stale data.
> 
> Solve this by moving drm_fb_helper_unprepare() from drm_fb_helper_fini()
> into the fbdev implementations. Call it right before freeing the
> fb-helper instance.
> 
> Fixes: 4825797c36da ("drm/fb-helper: Introduce drm_fb_helper_unprepare()")
> Cc: Thomas Zimmermann 
> Cc: Javier Martinez Canillas 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> 
> Signed-off-by: Thomas Zimmermann 

This reminds me of an old patch I just recently stumbled over again:

https://lore.kernel.org/dri-devel/Y3St2VHJ7jEmcNFw@phenom.ffwll.local/

Should I resurrect that one maybe and send it out? I think that also ties
a bit into your story here.

> ---
>  drivers/gpu/drm/armada/armada_fbdev.c  | 3 +++
>  drivers/gpu/drm/drm_fb_helper.c| 2 --
>  drivers/gpu/drm/drm_fbdev_generic.c| 2 ++
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c  | 3 ++-
>  drivers/gpu/drm/gma500/framebuffer.c   | 2 ++
>  drivers/gpu/drm/i915/display/intel_fbdev.c | 1 +
>  drivers/gpu/drm/msm/msm_fbdev.c| 2 ++
>  drivers/gpu/drm/omapdrm/omap_fbdev.c   | 2 ++
>  drivers/gpu/drm/radeon/radeon_fb.c | 2 ++
>  drivers/gpu/drm/tegra/fb.c | 1 +
>  10 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/armada/armada_fbdev.c 
> b/drivers/gpu/drm/armada/armada_fbdev.c
> index 07e410c62b7a..0e44f53e9fa4 100644
> --- a/drivers/gpu/drm/armada/armada_fbdev.c
> +++ b/drivers/gpu/drm/armada/armada_fbdev.c
> @@ -147,6 +147,7 @@ int armada_fbdev_init(struct drm_device *dev)
>   err_fb_setup:
>   drm_fb_helper_fini(fbh);
>   err_fb_helper:
> + drm_fb_helper_unprepare(fbh);
>   priv->fbdev = NULL;
>   return ret;
>  }
> @@ -164,6 +165,8 @@ void armada_fbdev_fini(struct drm_device *dev)
>   if (fbh->fb)
>   fbh->fb->funcs->destroy(fbh->fb);
>  
> + drm_fb_helper_unprepare(fbh);
> +
>   priv->fbdev = NULL;
>   }
>  }
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 28c428e9c530..a39998047f8a 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -590,8 +590,6 @@ void drm_fb_helper_fini(struct drm_fb_helper *fb_helper)

I think it would be good to update the kerneldoc of _init() and _fini()
here to mention each another like we usually do with these pairs. Same
with prepare/unprepare() although the latter rerfences _prepare() already.

>   }
>   mutex_unlock(&kernel_fb_helper_lock);
>  
> - drm_fb_helper_unprepare(fb_helper);
> -
>   if (!fb_helper->client.funcs)
>   drm_client_release(&fb_helper->client);
>  }
> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c 
> b/drivers/gpu/drm/drm_fbdev_generic.c
> index 365f80717fa1..4d6325e91565 100644
> --- a/drivers/gpu/drm/drm_fbdev_generic.c
> +++ b/drivers/gpu/drm/drm_fbdev_generic.c
> @@ -65,6 +65,8 @@ static void drm_fbdev_fb_destroy(struct fb_info *info)
>  
>   drm_client_framebuffer_delete(fb_helper->buffer);
> + drm_client_release(&fb_helper->client);
> +
> + drm_fb_helper_unprepare(fb_helper);
>   kfree(fb_helper);
>  }
>  
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c 
> b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> index b89e33af8da8..4929ffe5a09a 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> @@ -183,8 +183,8 @@ int exynos_drm_fbdev_init(struct drm_device *dev)
>  
>  err_setup:
>   drm_fb_helper_fini(helper);
> -
>  err_init:
> + drm_fb_helper_unprepare(helper);
>   private->fb_helper = NULL;
>   kfree(fbdev);
>  
> @@ -219,6 +

Re: [PATCH 1/6] drm/amdgpu: Generalize KFD dmabuf import

2023-02-16 Thread Daniel Vetter
On Tue, Jan 17, 2023 at 04:06:05AM +0300, Dmitry Osipenko wrote:
> 16.01.2023 18:11, Christian König пишет:
> > 
> >>>>>
> >>>>>> mmapping the memory with that new offset should still work. The
> >>>>>> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c
> >>>>>> supports mapping of SG BOs.
> >>>>>
> >>>>> Actually it shouldn't. This can go boom really easily.
> >>>>
> >>>> OK. I don't think we're doing this, but after Xiaogang raised the
> >>>> question I went looking through the code whether it's theoretically
> >>>> possible. I didn't find anything in the code that says that mmapping
> >>>> imported dmabufs would be prohibited or even dangerous. On the
> >>>> contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs.
> >>>>
> >>>>
> >>>>>
> >>>>> When you have imported a BO the only correct way of to mmap() it is
> >>>>> to do so on the original exporter.
> >>>>
> >>>> That seems sensible, and this is what we do today. That said, if
> >>>> mmapping an imported BO is dangerous, I'm missing a mechanism to
> >>>> protect against this. It could be as simple as setting
> >>>> AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj.
> >>>
> >>> At least for the GEM mmap() handler this is double checked very early
> >>> by looking at obj->import_attach and then either rejecting it or
> >>> redirecting the request to the DMA-buf file instead.
> >>
> >> Can you point me at where this check is? I see a check for
> >> obj->import_attach in drm_gem_dumb_map_offset. But I can't see how
> >> this function is called in amdgpu. I don't think it is used at all.
> > 
> > Uff, good question! @Thomas and @Dmitry: I clearly remember that one of
> > you guys was involved in the DRM/GEM mmap cleanup and DMA-buf with
> > workarounds for the KFD and DMA-buf.
> > 
> > What was the final solution to this? I can't find it of hand any more.
> 
> I was looking at it. The AMDGPU indeed allows to map imported GEMs, but
> then touching the mapped area by CPU results in a bus fault. You,
> Christian, suggested that this an AMDGPU bug that should be fixed by
> prohibiting the mapping in the first place and I was going to fix it,
> but then the plan changed from prohibiting the mapping into fixing it.
> 
> The first proposal was to make DRM core to handle the dma-buf mappings
> for all drivers universally [1]. Then we decided that will be better to
> prohibit mapping of imported GEMs [2]. In the end, Rob Clark argued that
> it's better to implement [1], since current userspace (Android) would
> be broken if the mapping were prohibited.
> 
> The last question was about the cache syncing of imported dma-bufs, how
> to ensure that drivers will do the cache maintenance/syncing properly.
> Rob suggested that it should be a problem for drivers and not for DRM core.
> 
> I was going to re-send the [1], but other things were getting priority.
> It's good that you reminded me about it :) I may re-send it sometime
> soon if there are no new objections.
> 
> [1] https://patchwork.freedesktop.org/patch/487481/
> 
> [2]
> https://lore.kernel.org/all/20220701090240.1896131-1-dmitry.osipe...@collabora.com/

Hm I still don't like allowing this in general, because in general it just
doesn't work.

I think more like a per-driver opt-in or something might be needed, so
that drivers which "know" that it's ok to just mmap without coherency can
allow that. Allowing this in general essentially gives up on the entire
idea of dma-buf cache flushing completely.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 5.10 1/1] drm/amdkfd: Check for null pointer after calling kmemdup

2023-01-19 Thread Daniel Vetter
On Thu, Jan 12, 2023 at 05:45:42PM +0100, Greg KH wrote:
> On Thu, Jan 12, 2023 at 04:26:45PM +0100, Daniel Vetter wrote:
> > On Thu, 12 Jan 2023 at 13:47, Greg KH  wrote:
> > > On Wed, Jan 04, 2023 at 07:56:33PM +0200, Dragos-Marian Panait wrote:
> > > > From: Jiasheng Jiang 
> > > >
> > > > [ Upstream commit abfaf0eee97925905e742aa3b0b72e04a918fa9e ]
> > > >
> > > > As the possible failure of the allocation, kmemdup() may return NULL
> > > > pointer.
> > > > Therefore, it should be better to check the 'props2' in order to prevent
> > > > the dereference of NULL pointer.
> > > >
> > > > Fixes: 3a87177eb141 ("drm/amdkfd: Add topology support for dGPUs")
> > > > Signed-off-by: Jiasheng Jiang 
> > > > Reviewed-by: Felix Kuehling 
> > > > Signed-off-by: Felix Kuehling 
> > > > Signed-off-by: Alex Deucher 
> > > > Signed-off-by: Dragos-Marian Panait 
> > > > ---
> > > >  drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
> > > > b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > > > index 86b4dadf772e..02e3c650ed1c 100644
> > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > > > @@ -408,6 +408,9 @@ static int kfd_parse_subtype_iolink(struct 
> > > > crat_subtype_iolink *iolink,
> > > >   return -ENODEV;
> > > >   /* same everything but the other direction */
> > > >   props2 = kmemdup(props, sizeof(*props2), GFP_KERNEL);
> > > > + if (!props2)
> > > > + return -ENOMEM;
> > >
> > > Not going to queue this up as this is a bogus CVE.
> > 
> > Are we at the point where CVE presence actually contraindicates
> > backporting?
> 
> Some would say that that point passed a long time ago :)
> 
> > At least I'm getting a bit the feeling there's a surge of
> > automated (security) fixes that just don't hold up to any scrutiny.
> 
> That has been happening a lot more in the past 6-8 months than in years
> past with the introduction of more automated tools being present.

Ok, gut feeling confirmed, I'll try and keep more a lookout for these.

I guess next step is that people will use chatgpt to write the patches for
these bugs.

> > Last week I had to toss out an fbdev locking patch due to static
> > checker that has no clue at all how refcounting works, and so
> > complained that things need more locking ... (that was -fixes, but
> > would probably have gone to stable too if I didn't catch it).
> > 
> > Simple bugfixes from random people was nice when it was checkpatch
> > stuff and I was fairly happy to take these aggressively in drm. But my
> > gut feeling says things seem to be shifting towards more advanced
> > tooling, but without more advanced understanding by submitters. Does
> > that hold in other areas too?
> 
> Again, yes, I have seen that a lot recently, especially with regards to
> patches that purport to fix bugs yet obviously were never tested.
> 
> That being said, there are a few developers who are doing great things
> with fault-injection testing and providing good patches for that.  So we
> can't just say that everyone using these tools has no clue.

Oh yes there's definitely awesome stuff happening, which is why I do not
want to throw them all out. And waiting until the name is recognizable
for individual maintainers like me that don't see the entire fixes flood
is also not really an approach.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH 00/17] DRM_USE_DYNAMIC_DEBUG regression

2023-01-13 Thread Daniel Vetter
On Fri, Jan 13, 2023 at 11:29:57AM -0700, jim.cro...@gmail.com wrote:
> On Wed, Jan 11, 2023 at 4:09 PM Daniel Vetter  wrote:
> >
> > On Mon, Dec 05, 2022 at 05:34:07PM -0700, Jim Cromie wrote:
> > > Hi everyone,
> > >
> > > DRM_USE_DYNAMIC_DEBUG=y has a regression on rc-*
> > >
> > > Regression is due to a chicken-egg problem loading modules; on
> > > `modprobe i915`, drm is loaded 1st, and drm.debug is set.  When
> > > drm_debug_enabled() tested __drm_debug at runtime, that just worked.
> > >
> > > But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> > > post-load enablement of drm_dbg/dyndbg callsites (static-keys), via
> > > dyndbg's callback on __drm_debug.  Since all drm-drivers need drm.ko,
> > > it is loaded 1st, then drm.debug=X is applied, then drivers load, but
> > > too late for drm_dbgs to be enabled.
> > >
> > > STATUS
> > >
> > > For all-loadable drm,i915,amdgpu configs, it almost works, but
> > > propagating drm.debug to dependent modules doesnt actually apply,
> > > though the motions are there.  This is not the problem I want to chase
> > > here.
> > >
> > > The more basic trouble is:
> > >
> > > For builtin drm + helpers, things are broken pretty early; at the
> > > beginning of dynamic_debug_init().  As the ddebug_sanity() commit-msg
> > > describes in some detail, the records added by _USE fail to reference
> > > the struct ddebug_class_map created and exported by _DEFINE, but get
> > > separate addresses to "other" data that segv's when used as the
> > > expected pointer. FWIW, the pointer val starts with "revi".
> >
> > So I honestly have no idea here, linker stuff is way beyond where I have
> > clue. So what's the way forward here?
> >
> 
> Ive fixed this aspect.
> Unsurprisingly, it wasnt the linker :-}

Awesome!

> > The DEFINE/USE split does like the right thing to do at least from the
> > "how it's used in drivers" pov. But if we're just running circles not
> > quite getting there I dunno :-/
> > -Daniel
> >
> 
> Sending new rev next.
> I think its getting close.

Thanks a lot for keeping on pushing this.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] Revert "drm/display/dp_mst: Move all payload info into the atomic state"

2023-01-13 Thread Daniel Vetter
On Fri, Jan 13, 2023 at 12:16:57PM +0200, Jani Nikula wrote:
> 
> Cc: intel-gfx, drm maintainers
> 
> Please have the courtesy of Cc'ing us for changes impacting us, and
> maybe try to involve us earlier instead of surprising us like
> this. Looks like this has been debugged for at least three months, and
> the huge revert at this point is going to set us back with what's been
> developed on top of it for i915 DP MST and DSC.

tbf I assumed this wouldn't land when I saw it fly by. It feels a bit much:
living under a rock for half a year and then creating a mess for
everyone else who's been building on top of this is not great.

Like yes it's a regression, but apparently not a blatantly obvious one,
and I think if we just ram this in there's considerable community goodwill
down the drain. Someone needs to get that goodwill up the drain again.

> It's a regression, I get that, but this is also going to be really nasty
> to deal with. It's a 2500-line commit, plus the dependencies, which I
> don't think are accounted for here. (What's the baseline for the revert
> anyway?) I don't expect all the dependent commits to be easy to revert
> or backport to v6.1 or v6.2.
> 
> *sad trombone*

Yeah that's the other thing. A 2500-line patch revert is not cc: stable material.
So this isn't even really helping users all that much.

Unless it also comes with full amounts of backports of the reverts on all
affected drivers for all curent stable trees, fully validated.

This is bad. I do think we need to have some understanding first of what
"fix this in amdgpu" would look like as plan B. Because plan A does not
look like a happy one at all.
-Daniel

> BR,
> Jani.
> 
> 
> On Thu, 12 Jan 2023, Wayne Lin  wrote:
> > This reverts commit 4d07b0bc403403438d9cf88450506240c5faf92f.
> >
> > [Why]
> > Changes cause regression on amdgpu mst.
> > E.g.
> > In fill_dc_mst_payload_table_from_drm(), amdgpu expects to add/remove 
> > payload
> > one by one and call fill_dc_mst_payload_table_from_drm() to update the HW
> > maintained payload table. But previous change tries to go through all the
> > payloads in mst_state and update amdpug hw maintained table in once 
> > everytime
> > driver only tries to add/remove a specific payload stream only. The newly
> > design idea conflicts with the implementation in amdgpu nowadays.
> >
> > [How]
> > Revert this patch first. After addressing all regression problems caused by
> > this previous patch, will add it back and adjust it.
> >
> > Signed-off-by: Wayne Lin 
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
> > Cc: sta...@vger.kernel.org # 6.1
> > Cc: Lyude Paul 
> > Cc: Harry Wentland 
> > Cc: Mario Limonciello 
> > Cc: Ville Syrjälä 
> > Cc: Ben Skeggs 
> > Cc: Stanislav Lisovskiy 
> > Cc: Fangzhi Zuo 
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  53 +-
> >  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 106 ++-
> >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  87 ++-
> >  .../amd/display/include/link_service_types.h  |   3 -
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 724 --
> >  drivers/gpu/drm/i915/display/intel_dp_mst.c   |  67 +-
> >  drivers/gpu/drm/i915/display/intel_hdcp.c |  24 +-
> >  drivers/gpu/drm/nouveau/dispnv50/disp.c   | 167 ++--
> >  include/drm/display/drm_dp_mst_helper.h   | 177 +++--
> >  9 files changed, 878 insertions(+), 530 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 77277d90b6e2..674f5dc1102b 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -6548,7 +6548,6 @@ static int dm_encoder_helper_atomic_check(struct 
> > drm_encoder *encoder,
> > const struct drm_display_mode *adjusted_mode = 
> > &crtc_state->adjusted_mode;
> > struct drm_dp_mst_topology_mgr *mst_mgr;
> > struct drm_dp_mst_port *mst_port;
> > -   struct drm_dp_mst_topology_state *mst_state;
> > enum dc_color_depth color_depth;
> > int clock, bpp = 0;
> > bool is_y420 = false;
> > @@ -6562,13 +6561,6 @@ static int dm_encoder_helper_atomic_check(struct 
> > drm_encoder *encoder,
> > if (!crtc_state->connectors_changed && !crtc_state->mode_changed)
> > return 0;
> >  
> > -   mst_state = drm_atomic_get_mst_topology_state(state, mst_mgr);
> > -   if (IS_ERR(mst_state))
> > -   return PTR_ERR(mst_state);
> > -
> > -   if (!mst_state->pbn_div)
> > -   mst_state->pbn_div = 
> > dm_mst_get_pbn_divider(aconnector->mst_port->dc_link);
> > -
> > if (!state->duplicated) {
> > int max_bpc = conn_state->max_requested_bpc;
> > is_y420 = drm_mode_is_420_also(>display_info, 
> > adjusted_mode) &&
> > @@ -6580,10 +6572,11 @@ static int dm_encoder_helper_atomic_check(struct 
> > drm_encoder *encoder,
> > clock = adjusted_mode->clock;
> > dm_new_connector_state->pbn = 

Re: [PATCH 5.10 1/1] drm/amdkfd: Check for null pointer after calling kmemdup

2023-01-12 Thread Daniel Vetter
On Thu, 12 Jan 2023 at 13:47, Greg KH  wrote:
> On Wed, Jan 04, 2023 at 07:56:33PM +0200, Dragos-Marian Panait wrote:
> > From: Jiasheng Jiang 
> >
> > [ Upstream commit abfaf0eee97925905e742aa3b0b72e04a918fa9e ]
> >
> > As the possible failure of the allocation, kmemdup() may return NULL
> > pointer.
> > Therefore, it should be better to check the 'props2' in order to prevent
> > the dereference of NULL pointer.
> >
> > Fixes: 3a87177eb141 ("drm/amdkfd: Add topology support for dGPUs")
> > Signed-off-by: Jiasheng Jiang 
> > Reviewed-by: Felix Kuehling 
> > Signed-off-by: Felix Kuehling 
> > Signed-off-by: Alex Deucher 
> > Signed-off-by: Dragos-Marian Panait 
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
> > b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > index 86b4dadf772e..02e3c650ed1c 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > @@ -408,6 +408,9 @@ static int kfd_parse_subtype_iolink(struct 
> > crat_subtype_iolink *iolink,
> >   return -ENODEV;
> >   /* same everything but the other direction */
> >   props2 = kmemdup(props, sizeof(*props2), GFP_KERNEL);
> > + if (!props2)
> > + return -ENOMEM;
>
> Not going to queue this up as this is a bogus CVE.

Are we at the point where CVE presence actually contraindicates
backporting? At least I'm getting a bit the feeling there's a surge of
automated (security) fixes that just don't hold up to any scrutiny.
Last week I had to toss out an fbdev locking patch due to static
checker that has no clue at all how refcounting works, and so
complained that things need more locking ... (that was -fixes, but
would probably have gone to stable too if I didn't catch it).

Simple bugfixes from random people was nice when it was checkpatch
stuff and I was fairly happy to take these aggressively in drm. But my
gut feeling says things seem to be shifting towards more advanced
tooling, but without more advanced understanding by submitters. Does
that hold in other areas too?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH 00/17] DRM_USE_DYNAMIC_DEBUG regression

2023-01-11 Thread Daniel Vetter
On Mon, Dec 05, 2022 at 05:34:07PM -0700, Jim Cromie wrote:
> Hi everyone,
> 
> DRM_USE_DYNAMIC_DEBUG=y has a regression on rc-*
> 
> Regression is due to a chicken-egg problem loading modules; on
> `modprobe i915`, drm is loaded 1st, and drm.debug is set.  When
> drm_debug_enabled() tested __drm_debug at runtime, that just worked.
> 
> But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> post-load enablement of drm_dbg/dyndbg callsites (static-keys), via
> dyndbg's callback on __drm_debug.  Since all drm-drivers need drm.ko,
> it is loaded 1st, then drm.debug=X is applied, then drivers load, but
> too late for drm_dbgs to be enabled.
> 
> STATUS
> 
> For all-loadable drm,i915,amdgpu configs, it almost works, but
> propagating drm.debug to dependent modules doesn't actually apply,
> though the motions are there.  This is not the problem I want to chase
> here.
> 
> The more basic trouble is:
> 
> For builtin drm + helpers, things are broken pretty early; at the
> beginning of dynamic_debug_init().  As the ddebug_sanity() commit-msg
> describes in some detail, the records added by _USE fail to reference
> the struct ddebug_class_map created and exported by _DEFINE, but get
> separate addresses to "other" data that segv's when used as the
> expected pointer. FWIW, the pointer val starts with "revi".

So I honestly have no idea here, linker stuff is way beyond where I have
clue. So what's the way forward here?

The DEFINE/USE split does like the right thing to do at least from the
"how it's used in drivers" pov. But if we're just running circles not
quite getting there I dunno :-/
-Daniel

> 
> OVERVIEW
> 
> DECLARE_DYNDBG_CLASSMAP is broken: it is one-size-fits-all-poorly.
> It muddles the distinction between a (single) definition, and multiple
> references.  Something exported should suffice.
> 
> The core of this patchset splits it into:
> 
> DYNDBG_CLASSMAP_DEFINE   used once per subsystem to define each classmap
> DYNDBG_CLASSMAP_USE   declare dependence on a DEFINEd classmap
> 
> This makes the weird coordinated-changes-by-identical-classmaps
> "feature" unnecessary; the DEFINE can export the var, and USE refers
> to the exported var.
> 
> So this patchset adds another section: __dyndbg_class_refs.
> 
> It is like __dyndbg_classes; it is scanned under ddebug_add_module(),
> and attached to each module's ddebug_table.  Once attached, it can be
> used like classes to validate and apply class FOO >control queries.
> 
> It also maps the class user -> definer explicitly, so that when the
> module is loaded, the section scan can find the kernel-param that is
> wired to dyndbg's kparam-callback, and apply its state-var, forex:
> __drm_debug to the just loaded helper/driver module.
> 
> There's plenty to address, I'm sure.
> 
> Jim Cromie (17):
>   test-dyndbg: fixup CLASSMAP usage error
>   test-dyndbg: show that DEBUG enables prdbgs at compiletime
>   dyndbg: fix readback value on LEVEL_NAMES interfaces
>   dyndbg: replace classmap list with a vector
>   dyndbg: make ddebug_apply_class_bitmap more selective
>   dyndbg: dynamic_debug_init - use pointer inequality, not strcmp
>   dyndbg: drop NUM_TYPE_ARRAY
>   dyndbg: reduce verbose/debug clutter
>   dyndbg-API: replace DECLARE_DYNDBG_CLASSMAP with
> DYNDBG_CLASSMAP(_DEFINE|_USE)
>   dyndbg-API: specialize DYNDBG_CLASSMAP_(DEFINE|USE)
>   dyndbg-API: DYNDBG_CLASSMAP_USE drop extra args
>   dyndbg-API: DYNDBG_CLASSMAP_DEFINE() improvements
>   drm_print: fix stale macro-name in comment
>   dyndbg: unwrap __ddebug_add_module inner function NOTYET
>   dyndbg: ddebug_sanity()
>   dyndbg: mess-w-dep-class
>   dyndbg: miss-on HACK
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  14 +-
>  drivers/gpu/drm/display/drm_dp_helper.c |  14 +-
>  drivers/gpu/drm/drm_crtc_helper.c   |  14 +-
>  drivers/gpu/drm/drm_print.c |  22 +--
>  drivers/gpu/drm/i915/i915_params.c  |  14 +-
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  14 +-
>  include/asm-generic/vmlinux.lds.h   |   3 +
>  include/drm/drm_print.h |   6 +-
>  include/linux/dynamic_debug.h   |  57 --
>  include/linux/map.h |  54 ++
>  kernel/module/main.c|   2 +
>  lib/dynamic_debug.c | 240 +++-
>  lib/test_dynamic_debug.c|  47 ++---
>  13 files changed, 344 insertions(+), 157 deletions(-)
>  create mode 100644 include/linux/map.h
> 
> -- 
> 2.38.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH 13/17] drm_print: fix stale macro-name in comment

2023-01-11 Thread Daniel Vetter
On Mon, Dec 05, 2022 at 05:34:20PM -0700, Jim Cromie wrote:
> Cited commit uses stale macro name, fix this, and explain better.
> 
> When DRM_USE_DYNAMIC_DEBUG=y, DYNDBG_CLASSMAP_DEFINE() maps DRM_UT_*
> onto BITs in drm.debug.  This still uses enum drm_debug_category, but
> it is somewhat indirect, with the ordered set of DRM_UT_* enum-vals.
> This requires that the macro args (the DRM_UT_* list) be kept in sync
> and in order.
> 
> Fixes: f158936b60a7 ("drm: POC drm on dyndbg - use in core, 2 helpers, 3 
> drivers.")
> Signed-off-by: Jim Cromie 

Should I land this already?
-Daniel

> ---
> . emphasize ABI non-change despite enum val change - Jani Nikula
> . reorder to back of patchset to follow API name changes.
> ---
>  include/drm/drm_print.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index 6a27e8f26770..7695ba31b3a4 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -276,7 +276,10 @@ static inline struct drm_printer drm_err_printer(const 
> char *prefix)
>   *
>   */
>  enum drm_debug_category {
> - /* These names must match those in DYNAMIC_DEBUG_CLASSBITS */
> + /*
> +  * Keep DYNDBG_CLASSMAP_DEFINE args in sync with changes here,
> +  * the enum-values define BIT()s in drm.debug, so are ABI.
> +  */
>   /**
>* @DRM_UT_CORE: Used in the generic drm code: drm_ioctl.c, drm_mm.c,
>* drm_memory.c, ...
> -- 
> 2.38.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/fb-helper: Set framebuffer for vga-switcheroo clients

2023-01-11 Thread Daniel Vetter
On Wed, Jan 11, 2023 at 04:38:13PM +0100, Thomas Zimmermann wrote:
> Set the framebuffer info for drivers that support VGA switcheroo. Only
> affects the amdgpu driver, which uses VGA switcheroo and generic fbdev
> emulation. For other drivers, this does nothing.
> 
> Amdgpu's lastclose helper called vga_switcheroo_process_delayed_switch().
> But as amdgpu uses generic fbdev emulation, it's better to call the helper
> from drm_lastclose(), after the kernel client's screen has been restored.
> So all drivers and clients can benefit. Radeon and nouveau with modernized
> fbdev code are possible candidates.
> 
> There was an earlier patchset to do something similar. [1]
> 
> Suggested-by: Alexander Deucher 
> Signed-off-by: Thomas Zimmermann 
> Link: 
> https://lore.kernel.org/amd-gfx/20221020143603.563929-1-alexander.deuc...@amd.com/
>  # 1

Indeed, vga_switcheroo_client_fb_set is a no-op if no client is registered
on that pdev.

Reviewed-by: Daniel Vetter 

t-b/ack from amd would still be good I think (or maybe they'll pick this
one up so it goes through their CI).
-Daniel

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 12 
>  drivers/gpu/drm/drm_fb_helper.c |  8 
>  drivers/gpu/drm/drm_file.c  |  3 +++
>  5 files changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 63c921c55fb9..7120b9b6e580 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1330,7 +1330,6 @@ extern const int amdgpu_max_kms_ioctl;
>  
>  int amdgpu_driver_load_kms(struct amdgpu_device *adev, unsigned long flags);
>  void amdgpu_driver_unload_kms(struct drm_device *dev);
> -void amdgpu_driver_lastclose_kms(struct drm_device *dev);
>  int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file 
> *file_priv);
>  void amdgpu_driver_postclose_kms(struct drm_device *dev,
>struct drm_file *file_priv);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index ebc6e6cbe2ab..02d636f781a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2784,7 +2784,6 @@ static const struct drm_driver amdgpu_kms_driver = {
>   DRIVER_SYNCOBJ_TIMELINE,
>   .open = amdgpu_driver_open_kms,
>   .postclose = amdgpu_driver_postclose_kms,
> - .lastclose = amdgpu_driver_lastclose_kms,
>   .ioctls = amdgpu_ioctls_kms,
>   .num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
>   .dumb_create = amdgpu_mode_dumb_create,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 7aa7e52ca784..886739576d3d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -1104,18 +1104,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void 
> *data, struct drm_file *filp)
>  /*
>   * Outdated mess for old drm with Xorg being in charge (void function now).
>   */
> -/**
> - * amdgpu_driver_lastclose_kms - drm callback for last close
> - *
> - * @dev: drm dev pointer
> - *
> - * Switch vga_switcheroo state after last close (all asics).
> - */
> -void amdgpu_driver_lastclose_kms(struct drm_device *dev)
> -{
> - drm_fb_helper_lastclose(dev);
> - vga_switcheroo_process_delayed_switch();
> -}
>  
>  /**
>   * amdgpu_driver_open_kms - drm callback for open
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 427631706128..5e445c61252d 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -30,7 +30,9 @@
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  
>  #include 
> +#include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1940,6 +1942,7 @@ static int drm_fb_helper_single_fb_probe(struct 
> drm_fb_helper *fb_helper,
>int preferred_bpp)
>  {
>   struct drm_client_dev *client = _helper->client;
> + struct drm_device *dev = fb_helper->dev;
>   struct drm_fb_helper_surface_size sizes;
>   int ret;
>  
> @@ -1961,6 +1964,11 @@ static int drm_fb_helper_single_fb_probe(struct 
> drm_fb_helper *fb_helper,
>   return ret;
>  
>   strcpy(fb_helper->fb->comm, "[fbcon]");
> +
> + /* Set the fb info for vgaswitcheroo clients. Does nothing otherwise. */
> + if (dev_is_pci(dev->dev))
> + vga_switcheroo_client_fb

Re: [PATCH] drm/radeon: free iio for atombios when driver shutdown

2023-01-06 Thread Daniel Vetter
Just a quick drive-by. For these simple cases where we just need to make
sure that memory is freed using drmm_kmalloc and friends should help
simplify things. Probably not worth it for radeon, but figured I'll throw
it out there.

For more functional code switching to drmm is harder because you need the
right order. But for these all that matters is that stuff gets freed so
there's no leak, and drmm can take care of that without ordering
constraints.
-Daniel

On Fri, Jan 06, 2023 at 10:36:53AM -0500, Alex Deucher wrote:
> Applied.  Thanks!
> 
> Alex
> 
> On Fri, Jan 6, 2023 at 5:00 AM Liwei Song  wrote:
> >
> > Fix below kmemleak when unload radeon driver:
> >
> > unreferenced object 0x9f8608ede200 (size 512):
> >   comm "systemd-udevd", pid 326, jiffies 4294682822 (age 716.338s)
> >   hex dump (first 32 bytes):
> > 00 00 00 00 c4 aa ec aa 14 ab 00 00 00 00 00 00  
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> >   backtrace:
> > [<62fadebe>] kmem_cache_alloc_trace+0x2f1/0x500
> > [<b6883cea>] atom_parse+0x117/0x230 [radeon]
> > [<158c23fd>] radeon_atombios_init+0xab/0x170 [radeon]
> > [<683f672e>] si_init+0x57/0x750 [radeon]
> > [<566cc31f>] radeon_device_init+0x559/0x9c0 [radeon]
> > [<46efabb3>] radeon_driver_load_kms+0xc1/0x1a0 [radeon]
> > [<b5155064>] drm_dev_register+0xdd/0x1d0
> > [<45fec835>] radeon_pci_probe+0xbd/0x100 [radeon]
> > [<e69ecca3>] pci_device_probe+0xe1/0x160
> > [<19484b76>] really_probe.part.0+0xc1/0x2c0
> > [<3f2649da>] __driver_probe_device+0x96/0x130
> > [<231c5bb1>] driver_probe_device+0x24/0xf0
> > [<00a42377>] __driver_attach+0x77/0x190
> > [<d7574da6>] bus_for_each_dev+0x7f/0xd0
> > [<633166d2>] driver_attach+0x1e/0x30
> > [<313b05b8>] bus_add_driver+0x12c/0x1e0
> >
> > iio was allocated in atom_index_iio() called by atom_parse(),
> > but it doesn't get released when the driver is shut down.
> > Fix this kmemleak by free it in radeon_atombios_fini().
> >
> > Signed-off-by: Liwei Song 
> > ---
> >  drivers/gpu/drm/radeon/radeon_device.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
> > b/drivers/gpu/drm/radeon/radeon_device.c
> > index 92905ebb7b45..1c005e0ddd38 100644
> > --- a/drivers/gpu/drm/radeon/radeon_device.c
> > +++ b/drivers/gpu/drm/radeon/radeon_device.c
> > @@ -1022,6 +1022,7 @@ void radeon_atombios_fini(struct radeon_device *rdev)
> >  {
> > if (rdev->mode_info.atom_context) {
> > kfree(rdev->mode_info.atom_context->scratch);
> > +   kfree(rdev->mode_info.atom_context->iio);
> > }
> > kfree(rdev->mode_info.atom_context);
> > rdev->mode_info.atom_context = NULL;
> > --
> > 2.33.1
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/2] drm_print: fix stale macro-name in comment

2023-01-05 Thread Daniel Vetter
On Mon, Dec 05, 2022 at 09:10:05AM -0700, Jim Cromie wrote:
> Cited commit uses stale macro name, fix this, and explain better.
> 
> When DRM_USE_DYNAMIC_DEBUG=y, DYNDBG_CLASSMAP_DEFINE() maps DRM_UT_*
> onto BITs in drm.debug.  This still uses enum drm_debug_category, but
> it is somewhat indirect, with the ordered set of DRM_UT_* enum-vals.
> This requires that the macro args: DRM_UT_* list must be kept in sync
> and in order.
> 
> Fixes: f158936b60a7 ("drm: POC drm on dyndbg - use in core, 2 helpers, 3 
> drivers.")
> Signed-off-by: Jim Cromie 

What's the status of this series?

Greg, you landed the original entire pile that wasn't quite ready yet? Or
should I apply these two?
-Daniel

> ---
> . emphasize ABI non-change despite enum val change - Jani Nikula
> . reorder to back of patchset to follow API name changes.
> ---
>  include/drm/drm_print.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index a44fb7ef257f..e4c0c7e6d49d 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -276,7 +276,10 @@ static inline struct drm_printer drm_err_printer(const 
> char *prefix)
>   *
>   */
>  enum drm_debug_category {
> - /* These names must match those in DYNAMIC_DEBUG_CLASSBITS */
> + /*
> +  * Keep DYNDBG_CLASSMAP_DEFINE args in sync with changes here,
> +  * the enum-values define BIT()s in drm.debug, so are ABI.
> +  */
>   /**
>* @DRM_UT_CORE: Used in the generic drm code: drm_ioctl.c, drm_mm.c,
>* drm_memory.c, ...
> -- 
> 2.38.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v3 0/2] drm: Add GPU reset sysfs

2023-01-05 Thread Daniel Vetter
On Thu, 8 Dec 2022 at 05:54, Alex Deucher  wrote:
>
> On Wed, Nov 30, 2022 at 6:11 AM Daniel Vetter  wrote:
> >
> > On Fri, Nov 25, 2022 at 02:52:01PM -0300, André Almeida wrote:
> > > This patchset adds a udev event for DRM device's resets.
> > >
> > > Userspace apps can trigger GPU resets by misuse of graphical APIs or 
> > > driver
> > > bugs. Either way, the GPU reset might lead the system to a broken 
> > > state[1], that
> > > might be recovered if user has access to a tty or a remote shell. 
> > > Arguably, this
> > > recovery could happen automatically by the system itself, thus this is 
> > > the goal
> > > of this patchset.
> > >
> > > For debugging and report purposes, device coredump support was already 
> > > added
> > > for amdgpu[2], but it's not suitable for programmatic usage like this one 
> > > given
> > > the uAPI not being stable and the need for parsing.
> > >
> > > GL/VK is out of scope for this use, giving that we are dealing with device
> > > resets regardless of API.
> > >
> > > A basic userspace daemon is provided at [3] showing how the interface is 
> > > used
> > > to recovery from resets.
> > >
> > > [1] A search for "reset" in DRM/AMD issue tracker shows reports of resets
> > > making the system unusable:
> > > https://gitlab.freedesktop.org/drm/amd/-/issues/?search=reset
> > >
> > > [2] 
> > > https://lore.kernel.org/amd-gfx/20220602081538.1652842-2-amaranath.somalapu...@amd.com/
> > >
> > > [3] https://gitlab.freedesktop.org/andrealmeid/gpu-resetd
> > >
> > > v2: 
> > > https://lore.kernel.org/dri-devel/20220308180403.75566-1-contactshashanksha...@gmail.com/
> > >
> > > André Almeida (1):
> > >   drm/amdgpu: Add work function for GPU reset event
> > >
> > > Shashank Sharma (1):
> > >   drm: Add GPU reset sysfs event
> >
> > This seems a bit much amd specific, and a bit much like an ad-hoc stopgap.
> >
> > On the amd specific piece:
> >
> > - amd's gpus suck the most for gpu hangs, because aside from the shader
> >   unblock, there's only device reset, which thrashes vram and display and
> >   absolutely everything. Which is terrible. Everyone else has engine only
> >   reset since years (i.e. doesn't thrash display or vram), and very often
> >   even just context reset (i.e. unless the driver is busted somehow or hw
> >   bug, malicious userspace will _only_ ever impact itself).
> >
> > - robustness extensions for gl/vk already have very clear specifications
> >   of all cases of reset, and this work here just ignores that. Yes on amd
> >   you only have device reset, but this is drm infra, so you need to be
> >   able to cope with ctx reset or reset which only affected a limited set
> >   of context. If this is for compute and compute apis lack robustness
> >   extensions, then those apis need to be fixed to fill that gap.
> >
> > - the entire deamon thing feels a bit like overkill and I'm not sure why
> >   it exists. I think for a start it would be much simpler if we just have
> >   a (per-device maybe) sysfs knob to enable automatic killing of process
> >   that die and which don't have arb robustness enabled (for gl case, for
> >   vk case the assumption is that _every_ app supports VK_DEVICE_LOST and
> >   can recover).
>
> Thinking about this a bit more, I think there are useful cases for the
> GPU reset event and a daemon.  When I refer to a daemon here, it could
> be a standalone thing or integrated into the desktop manager like
> logind or whatever.
> 1. For APIs that don't have robustness support (e.g., video
> encode/decode APIs).  This one I could kind of go either way on since
> it probably makes sense to just kill the app if there is no
> robustness mechanism in the API.

I think transcode might also be a case where the userspace driver can
recover, at least on the decode side. But that would most likely
require some extension to make it clear to the app what's going on.

Or people just use vk video and be done, reset support comes built-in there :-)

> 2. Telemetry collection.  It would be useful to have a central place
> to collect telemetry information about what apps seem to be
> problematic, etc.

Yeah I think standardizing reset messages and maybe device state dumps
makes sense. But that's telemetry, not making decisions about what to
kill.

> 3. A policy manager in userspace.  If you want to make some decision
> about what to do about repeat offenders or apps

Re: [pull] amdgpu, amdkfd drm-fixes-6.2

2023-01-05 Thread Daniel Vetter
On Wed, Jan 04, 2023 at 10:38:39PM -0500, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Fixes for 6.2.
> 
> The following changes since commit c8de526215fdab9f2dd0d9675582cf9f1391a919:
> 
>   Merge tag 'drm-misc-next-fixes-2023-01-03' of 
> git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2023-01-03 
> 21:02:57 +0100)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.2-2023-01-04
> 
> for you to fetch changes up to 6fe6ece398f7431784847e922a2c8c385dc58a35:
> 
>   Revert "drm/amd/display: Enable Freesync Video Mode by default" (2023-01-04 
> 22:29:32 -0500)

Pulled, thanks a lot.
-Daniel
> 
> 
> amd-drm-fixes-6.2-2023-01-04:
> 
> amdgpu:
> - DCN 3.2 fix
> - Display fix
> 
> amdkfd:
> - Fix kernel warning
> 
> 
> Michel Dänzer (1):
>   Revert "drm/amd/display: Enable Freesync Video Mode by default"
> 
> Mukul Joshi (1):
>   drm/amdkfd: Fix kernel warning during topology setup
> 
> Samson Tam (1):
>   drm/amd/display: Uninitialized variables causing 4k60 UCLK to stay at 
> DPM1 and not DPM0
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 27 
> ++
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c  |  2 +-
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 12 ++
>  .../dc/dml/dcn32/display_mode_vba_util_32.c|  6 ++---
>  5 files changed, 39 insertions(+), 9 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 7/9] drm/i915: stop using ttm_bo_wait

2022-11-30 Thread Daniel Vetter
On Wed, 30 Nov 2022 at 14:03, Tvrtko Ursulin
 wrote:
> On 29/11/2022 18:05, Matthew Auld wrote:
> > On Fri, 25 Nov 2022 at 11:14, Tvrtko Ursulin
> >  wrote:
> >>
> >>
> >> + Matt
> >>
> >> On 25/11/2022 10:21, Christian König wrote:
> >>> TTM is just wrapping core DMA functionality here, remove the mid-layer.
> >>> No functional change.
> >>>
> >>> Signed-off-by: Christian König 
> >>> ---
> >>>drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 9 ++---
> >>>1 file changed, 6 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
> >>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> >>> index 5247d88b3c13..d409a77449a3 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> >>> @@ -599,13 +599,16 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object 
> >>> *obj,
> >>>static int i915_ttm_truncate(struct drm_i915_gem_object *obj)
> >>>{
> >>>struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> >>> - int err;
> >>> + long err;
> >>>
> >>>WARN_ON_ONCE(obj->mm.madv == I915_MADV_WILLNEED);
> >>>
> >>> - err = ttm_bo_wait(bo, true, false);
> >>> - if (err)
> >>> + err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
> >>> + true, 15 * HZ);
> >>
> >> This 15 second stuck out a bit for me and then on a slightly deeper look
> >> it seems this timeout will "leak" into a few of i915 code paths. If we
> >> look at the difference between the legacy shmem and ttm backend I am not
> >> sure if the legacy one is blocking or not - but if it can block I don't
> >> think it would have an arbitrary timeout like this. Matt your thoughts?
> >
> > Not sure what is meant by leak here, but the legacy shmem must also
> > wait/block when unbinding each VMA, before calling truncate. It's the
>
> By "leak" I meant if 15s timeout propagates into some code paths visible
> from userspace which with a legacy backend instead have an indefinite
> wait. If we have that it's probably not very good to have this
> inconsistency, or to apply an arbitrary timeout to those path to start with.
>
> > same story for the ttm backend, except slightly more complicated in
> > that there might be no currently bound VMA, and yet the GPU could
> > still be accessing the pages due to async unbinds, kernel moves etc,
> > which the wait here (and in i915_ttm_shrink) is meant to protect
> > against. If the wait times out it should just fail gracefully. I guess
> > we could just use MAX_SCHEDULE_TIMEOUT here? Not sure if it really
> > matters though.
>
> Right, depends if it can leak or not to userspace and diverge between
> backends.

Generally lock_timeout() is a design bug. It's either
lock_interruptible (or maybe lock_killable) or try_lock, but
lock_timeout is just duct-tape. I haven't dug in to figure out what
should be here, but it smells fishy.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v3 0/2] drm: Add GPU reset sysfs

2022-11-30 Thread Daniel Vetter
nt reporting
  requirements to make sure robustness extensions work correctly.

- pid isn't enough once you have engine/context reset, you need pid (well
  drm_file really, but I guess we can bind those to pid somehow) and gpu
  ctx id. Both gl and vk allow you to allocate limitless gpu context on
  the same device, and so this matters.

- igt for this stuff. Probably needs some work to generalize the i915
  infra for endless batchbuffers so that you can make very controlled gpu
  hangs.

Cheers, Daniel

>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  4 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 30 ++
>  drivers/gpu/drm/drm_sysfs.c| 26 +++
>  include/drm/drm_sysfs.h| 13 ++
>  4 files changed, 73 insertions(+)
> 
> -- 
> 2.38.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v7 18/21] dma-buf: Move dma_buf_mmap() to dynamic locking specification

2022-11-07 Thread Daniel Vetter
On Mon, 17 Oct 2022 at 19:25, Dmitry Osipenko
 wrote:
>
> Move dma_buf_mmap() function to the dynamic locking specification by
> taking the reservation lock. Neither of the today's drivers take the
> reservation lock within the mmap() callback, hence it's safe to enforce
> the locking.
>
> Acked-by: Sumit Semwal 
> Acked-by: Christian König 
> Signed-off-by: Dmitry Osipenko 

Just noticed this while reading code ... this patch seems to have
missed dma_buf_mmap_internal()?

Might be good if at least some drivers gain a dma_resv_assert_held in
that path to make sure we're not quite this bad, together with fixing
this issue.
-Daniel

> ---
>  drivers/dma-buf/dma-buf.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index f54c649f922a..f149b384f4dd 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1390,6 +1390,8 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_end_cpu_access, DMA_BUF);
>  int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
>  unsigned long pgoff)
>  {
> +   int ret;
> +
> if (WARN_ON(!dmabuf || !vma))
> return -EINVAL;
>
> @@ -1410,7 +1412,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct 
> vm_area_struct *vma,
> vma_set_file(vma, dmabuf->file);
> vma->vm_pgoff = pgoff;
>
> -   return dmabuf->ops->mmap(dmabuf, vma);
> +   dma_resv_lock(dmabuf->resv, NULL);
> +   ret = dmabuf->ops->mmap(dmabuf, vma);
> +   dma_resv_unlock(dmabuf->resv);
> +
> +   return ret;
>  }
>  EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, DMA_BUF);
>
> --
> 2.37.3
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.0

2022-09-30 Thread Daniel Vetter
On Fri, Sep 30, 2022 at 05:04:54PM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Sorry, some last minute changes to deal with updated firmwares/bioses and
> board revisions containing new IPs added in this cycle.  It required
> pulling in some cleanup patches for the RLC firmware handing, but they
> are only applied to GC 11 in this case.  I figured that would be cleaner
> than a bunch of local fixes that would cause merge conflicts for -next,
> and time was getting short for 6.0. They are only applied to GC 11, so no
> chance of regression on existing asics.
> 
> V2: fixed S-O-Bs.
> 
> The following changes since commit 83ca5fb40e758e0a0257bf4e3a1148dd52c6d0f2:
> 
>   drm/amd/display: Prevent OTG shutdown during PSR SU (2022-09-29 10:07:42 
> -0400)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.0-2022-09-30-1

Pullled and forwarded.

Cheers, Daniel

> 
> for you to fetch changes up to 0fd85e89b5bf18447e56099a010ee5be5dc9f2b0:
> 
>   drm/amdgpu/gfx11: switch to amdgpu_gfx_rlc_init_microcode (2022-09-30 
> 16:59:06 -0400)
> 
> 
> amd-drm-fixes-6.0-2022-09-30-1:
> 
> amdgpu:
> - VCN 4.x fixes
> - RLC fixes for GC 11.x
> 
> 
> Hawking Zhang (8):
>   drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfx
>   drm/amdgpu: add helper to init rlc fw in header v2_0
>   drm/amdgpu: add helper to init rlc fw in header v2_1
>   drm/amdgpu: add helper to init rlc fw in header v2_2
>   drm/amdgpu: add helper to init rlc fw in header v2_3
>   drm/amdgpu: add helper to init rlc fw in header v2_4
>   drm/amdgpu: add helper to init rlc firmware
>   drm/amdgpu/gfx11: switch to amdgpu_gfx_rlc_init_microcode
> 
> Sonny Jiang (2):
>   drm/amdgpu: Enable VCN DPG for GC11_0_1
>   drm/amdgpu: Enable sram on vcn_4_0_2
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   4 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c   | 264 
> ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h   |   4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h |   4 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   |   2 +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c| 151 +
>  drivers/gpu/drm/amd/amdgpu/soc21.c|   1 +
>  7 files changed, 281 insertions(+), 149 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [pull] amdgpu drm-fixes-6.0

2022-09-30 Thread Daniel Vetter
On Fri, Sep 30, 2022 at 09:58:10AM -0400, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> Sorry, some last minute changes to deal with updated firmwares/bioses and
> board revisions containing new IPs added in this cycle.  It required
> pulling in some cleanup patches for the RLC firmware handing, but they
> are only applied to GC 11 in this case.  I figured that would be cleaner
> than a bunch of local fixes that would cause merge conflicts for -next,
> and time was getting short for 6.0. They are only applied to GC 11, so no
> chance of regression on existing asics.
> 
> The following changes since commit 83ca5fb40e758e0a0257bf4e3a1148dd52c6d0f2:
> 
>   drm/amd/display: Prevent OTG shutdown during PSR SU (2022-09-29 10:07:42 
> -0400)
> 
> are available in the Git repository at:
> 
>   https://gitlab.freedesktop.org/agd5f/linux.git 
> tags/amd-drm-fixes-6.0-2022-09-30
> 
> for you to fetch changes up to 5369e662f99087b4ad38b151aaefecb690117f10:
> 
>   drm/amdgpu/gfx11: switch to amdgpu_gfx_rlc_init_microcode (2022-09-29 
> 15:22:31 -0400)
> 
> 
> amd-drm-fixes-6.0-2022-09-30:

dim isn't entirely happy:

dim: 5369e662f990 ("drm/amdgpu/gfx11: switch to 
amdgpu_gfx_rlc_init_microcode"): committer Signed-off-by missing.
dim: dd00f3eeba5b ("drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfx"): 
committer Signed-off-by missing.

Can you pls respin?
-Daniel

> 
> amdgpu:
> - VCN 4.x fixes
> - RLC fixes for GC 11.x
> 
> 
> Hawking Zhang (8):
>   drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfx
>   drm/amdgpu: add helper to init rlc fw in header v2_0
>   drm/amdgpu: add helper to init rlc fw in header v2_1
>   drm/amdgpu: add helper to init rlc fw in header v2_2
>   drm/amdgpu: add helper to init rlc fw in header v2_3
>   drm/amdgpu: add helper to init rlc fw in header v2_4
>   drm/amdgpu: add helper to init rlc firmware
>   drm/amdgpu/gfx11: switch to amdgpu_gfx_rlc_init_microcode
> 
> Sonny Jiang (2):
>   drm/amdgpu: Enable VCN DPG for GC11_0_1
>   drm/amdgpu: Enable sram on vcn_4_0_2
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   4 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c   | 264 
> ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h   |   4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h |   4 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   |   2 +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c| 151 +
>  drivers/gpu/drm/amd/amdgpu/soc21.c|   1 +
>  7 files changed, 281 insertions(+), 149 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 1/6] drm/ttm: Add new callbacks to ttm res mgr

2022-09-07 Thread Daniel Vetter
On Wed, Sep 07, 2022 at 08:45:22AM +0200, Christian König wrote:
> Am 06.09.22 um 21:58 schrieb Daniel Vetter:
> > On Tue, Aug 16, 2022 at 10:33:16AM +0530, Arunpravin Paneer Selvam wrote:
> > > 
> > > On 8/15/2022 4:35 PM, Christian König wrote:
> > > > Am 12.08.22 um 15:30 schrieb Arunpravin Paneer Selvam:
> > > > > We are adding two new callbacks to ttm resource manager
> > > > > function to handle intersection and compatibility of
> > > > > placement and resources.
> > > > > 
> > > > > v2: move the amdgpu and ttm_range_manager changes to
> > > > >   separate patches (Christian)
> > > > > v3: rename "intersect" to "intersects" (Matthew)
> > > > > v4: move !place check to the !res if and return false
> > > > >   in ttm_resource_compatible() function (Christian)
> > > > > v5: move bits of code from patch number 6 to avoid
> > > > >   temporary driver breakup (Christian)
> > > > > 
> > > > > Signed-off-by: Christian König 
> > > > > Signed-off-by: Arunpravin Paneer Selvam
> > > > > 
> > > > Patch #6 could still be cleaned up more now that we have the workaround
> > > > code in patch #1, but that not really a must have.
> > > > 
> > > > Reviewed-by: Christian König  for the entire
> > > > series.
> > > > 
> > > > Do you already have commit rights?
> > > Hi Christian,
> > > I applied for drm-misc commit rights, waiting for the project maintainers 
> > > to
> > > approve my request.
> > Why do all drivers have to implement the current behaviour? Can we have a
> > default implementation, which gets called if nothing is set instead?
> 
> We do have a default implementation in the range manager which is used by
> radeon, GEM VRAM helpers, VMWGFX and amdgpu (but there only for some
> domains).
> 
> > I'm a bit confused why the bloat here ...
> 
> Drivers do have specialized implementations of the backend, e.g. VMWGFX has
> its own handle backend, amdgpu the VRAM backend with special placements, i915 is
> completely special as well.
> 
> Here we only move the decision if resources intersect or are compatible into
> those specialized backends. Previously we had all this in a centralized
> callback for all backends of a driver.
> 
> See the switch in amdgpu_ttm_bo_eviction_valuable() for an example. Final
> goal is to move all this stuff into the specialized backends and remove this
> callback.
> 
> The only driver where I couldn't figure out why we have duplicated all this
> from the standard implementation is Nouveau.

Yeah I didn't read this too carefully, apologies.

> > Also please document new callbacks precisely with inline kerneldoc. I know
> > ttm docs aren't great yet, but they don't get better if we don't start
> > somewhere. I think the in-depth comments for modeset vfuncs (e.g. in
> > drm_modeset_helper_vtables.h) are a good standard here.
> 
> I thought we already did that. Please be a bit more specific.

Yeah rushed this too, but the kerneldoc isn't too great yet. It's
definitely not formatted correctly (you can't do a full function
definition in a struct unfortunately, see the examples I linked). And it
would be good to specificy what the default implementation is, that there
is one (i.e. the hook is optional) and when exactly a driver would want to
overwrite this. Atm it's a one-liner that explains exactly as much as you
can guess from the function interface anyway, so that's not super useful.
-Daniel



> 
> Thanks,
> Christian.
> 
> > -Daniel
> > 
> > > Thanks,
> > > Arun
> > > > Regards,
> > > > Christian.
> > > > 
> > > > > ---
> > > > >    drivers/gpu/drm/ttm/ttm_bo.c   |  9 ++--
> > > > >    drivers/gpu/drm/ttm/ttm_resource.c | 77 
> > > > > +-
> > > > >    include/drm/ttm/ttm_resource.h | 40 
> > > > >    3 files changed, 119 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c 
> > > > > b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > index c1bd006a5525..f066e8124c50 100644
> > > > > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > @@ -518,6 +518,9 @@ static int ttm_bo_evict(struct ttm_buffer_object
> > > > > *bo,
> > > > >    bool ttm_bo_eviction

Re: [PATCH v6 39/57] dyndbg/drm: POC add tracebits sysfs-knob

2022-09-07 Thread Daniel Vetter
On Sun, Sep 04, 2022 at 03:41:16PM -0600, Jim Cromie wrote:
> clone DRM.debug interface to DRM.tracebits: ie map bits to
> drm-debug-categories, except this interface enables messages to
> tracefs, not to syslog.
> 
> 1- we reuse the class-map added previously.
>this reflects the single source of both syslog/trace events
> 
> 2- add a 2nd struct ddebug_classes_bitmap_param
>refs 1, reusing it.
>flags = "T", to enable trace-events on this callsite.
> 
> 3- module_param_cb([2]) - does the sysfs part
> 
> Signed-off-by: Jim Cromie 

All the drm patches (excluding nouveau) I haven't commented on:

Reviewed-by: Daniel Vetter 

I think nouveau I'll leave up to nouveau folks.
-Daniel



> ---
>  drivers/gpu/drm/drm_print.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index c50edbf443d3..75d0cecd7e86 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -45,6 +45,9 @@
>  unsigned long __drm_debug;
>  EXPORT_SYMBOL(__drm_debug);
>  
> +unsigned long __drm_trace;
> +EXPORT_SYMBOL(__drm_trace);
> +
>  MODULE_PARM_DESC(debug, "Enable debug output, where each bit enables a debug 
> category.\n"
>  "\t\tBit 0 (0x01)  will enable CORE messages (drm core code)\n"
>  "\t\tBit 1 (0x02)  will enable DRIVER messages (drm controller code)\n"
> @@ -77,6 +80,13 @@ static struct ddebug_class_param drm_debug_bitmap = {
>   .map = &drm_debug_classes,
>  };
>  module_param_cb(debug, &param_ops_dyndbg_classes, &drm_debug_bitmap, 0600);
> +
> +static struct ddebug_class_param drm_trace_bitmap = {
> + .bits = &__drm_trace,
> + .flags = "T",
> + .map = &drm_debug_classes,
> +};
> +module_param_cb(tracecats, &param_ops_dyndbg_classes, &drm_trace_bitmap, 
> 0600);
>  #endif
>  
>  void __drm_puts_coredump(struct drm_printer *p, const char *str)
> -- 
> 2.37.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 28/57] drm_print: refine drm_debug_enabled for jump-label

2022-09-07 Thread Daniel Vetter
On Sun, Sep 04, 2022 at 03:41:05PM -0600, Jim Cromie wrote:
> In order to use dynamic-debug's jump-label optimization in drm-debug,
> its clarifying to refine drm_debug_enabled into 3 uses:
> 
> 1.   drm_debug_enabled - legacy, public
> 2. __drm_debug_enabled - optimized for dyndbg jump-label enablement.
> 3.  _drm_debug_enabled - pr_debug instrumented, observable
> 
> 1. The legacy version always checks the bits.
> 
> 2. is privileged, for use by __drm_dbg(), __drm_dev_dbg(), which do an
> early return unless the category is enabled.  For dyndbg builds, debug
> callsites are selectively "pre-enabled", so __drm_debug_enabled()
> short-circuits to true there.  Remaining callers of 1 may be able to
> use 2, case by case.
> 
> 3. is 1st wrapped in a macro, with a pr_debug, which reports each
> usage in /proc/dynamic_debug/control, making it observable in the
> logs.  The macro lets the pr_debug see the real caller, not an inline
> function.
> 
> When plugged into 1, 3 identified ~10 remaining callers of the
> function, leading to the follow-on cleanup patch, and would allow
> activating the pr_debugs, estimating the callrate, and the potential
> savings by using the wrapper macro.  It is unused ATM, but it fills
> out the picture.
> 
> Signed-off-by: Jim Cromie 

So instead of keeping 3 here as a "you need to hack it in to see what
should be converted" tool, I have a slightly different idea: could we make
the public version also a dyndbg callsite (like the printing wrappers),
but instead of a dynamic call we'd get a dynamically fixed value out?
I think that would take care of everything you list here as an open issue.

Otherwise I'd just drop 3 for the series we're going to merge.
-Daniel

> ---
>  drivers/gpu/drm/drm_print.c |  4 ++--
>  include/drm/drm_print.h | 28 
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 29a29949ad0b..cb203d63b286 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -285,7 +285,7 @@ void __drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
>   struct va_format vaf;
>   va_list args;
>  
> - if (!drm_debug_enabled(category))
> + if (!__drm_debug_enabled(category))
>   return;
>  
>   va_start(args, format);
> @@ -308,7 +308,7 @@ void ___drm_dbg(enum drm_debug_category category, const char *format, ...)
>   struct va_format vaf;
>   va_list args;
>  
> - if (!drm_debug_enabled(category))
> + if (!__drm_debug_enabled(category))
>   return;
>  
>   va_start(args, format);
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index dfdd81c3287c..7631b5fb669e 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -321,11 +321,39 @@ enum drm_debug_category {
>   DRM_UT_DRMRES
>  };
>  
> +/*
> + * 3 name flavors of drm_debug_enabled:
> + *   drm_debug_enabled - public/legacy, always checks bits
> + *  _drm_debug_enabled - instrumented to observe call-rates, est overheads.
> + * __drm_debug_enabled - privileged - knows jump-label state, can short-circuit
> + */
>  static inline bool drm_debug_enabled(enum drm_debug_category category)
>  {
>   return unlikely(__drm_debug & BIT(category));
>  }
>  
> +/*
> + * Wrap fn in macro, so that the pr_debug sees the actual caller, not
> + * the inline fn.  Using this name creates a callsite entry / control
> + * point in /proc/dynamic_debug/control.
> + */
> +#define _drm_debug_enabled(category) \
> + ({  \
> + pr_debug("todo: maybe avoid via dyndbg\n"); \
> + drm_debug_enabled(category);\
> + })
> +
> +#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
> +/*
> + * dyndbg is wrapping the drm.debug API, so as to avoid the runtime
> + * bit-test overheads of drm_debug_enabled() in those api calls.
> + * In this case, executed callsites are known enabled, so true.
> + */
> +#define __drm_debug_enabled(category)	true
> +#else
> +#define __drm_debug_enabled(category)	drm_debug_enabled(category)
> +#endif
> +
>  /*
>   * struct device based logging
>   *
> -- 
> 2.37.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 23/57] drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.

2022-09-07 Thread Daniel Vetter
  "DRM_UT_CORE",
> + "DRM_UT_DRIVER",
> +     "DRM_UT_KMS",
> + "DRM_UT_PRIME",
> + "DRM_UT_ATOMIC",
> + "DRM_UT_VBL",
> + "DRM_UT_STATE",
> + "DRM_UT_LEASE",
> + "DRM_UT_DP",
> + "DRM_UT_DRMRES");
> +
>  MODULE_PARM_DESC(config, "option string to pass to driver core");
>  static char *nouveau_config;
>  module_param_named(config, nouveau_config, charp, 0400);
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index b3b470440e46..668273e36c2c 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -35,7 +35,7 @@
>  #include 
>  
>  /* Do *not* use outside of drm_print.[ch]! */
> -extern unsigned int __drm_debug;
> +extern unsigned long __drm_debug;
>  
>  /**
>   * DOC: print
> @@ -275,6 +275,7 @@ static inline struct drm_printer drm_err_printer(const char *prefix)
>   *
>   */
>  enum drm_debug_category {
> + /* These names must match those in DYNAMIC_DEBUG_CLASSBITS */

I'd just put this into the kerneldoc, then you can also link to the
DRM_PRINT_DECLARE_DEBUG_CLASSBITS macro or whatever you'll call the thing
so drivers don't have to copypaste it all.
-Daniel

>   /**
>* @DRM_UT_CORE: Used in the generic drm code: drm_ioctl.c, drm_mm.c,
>* drm_memory.c, ...
> -- 
> 2.37.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 22/57] drm_print: condense enum drm_debug_category

2022-09-07 Thread Daniel Vetter
On Sun, Sep 04, 2022 at 03:40:59PM -0600, Jim Cromie wrote:
> enum drm_debug_category has 10 categories, but is initialized with
> bitmasks which require 10 bits of underlying storage.  By using
> natural enumeration, and moving the BIT(cat) into drm_debug_enabled(),
> the enum fits in 4 bits, allowing the category to be represented
> directly in pr_debug callsites, via the ddebug.class_id field.
> 
> While this slightly pessimizes the bit-test in drm_debug_enabled(),
> using dyndbg with JUMP_LABEL will avoid the function entirely.
> 
> NOTE: this change forecloses the possibility of doing:
> 
>   drm_dbg(DRM_UT_CORE|DRM_UT_KMS, "weird 2-cat experiment")
> 
> but that's already strongly implied by the use of the enum itself; it's
> not a normal enum if it can be 2 values simultaneously.
> 
> Signed-off-by: Jim Cromie 

Reviewed-by: Daniel Vetter 

I guess this would also be a good patch to apply already, so we reduce the
patch set size somewhat?
-Daniel

> ---
>  include/drm/drm_print.h | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index 22fabdeed297..b3b470440e46 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -279,49 +279,49 @@ enum drm_debug_category {
>* @DRM_UT_CORE: Used in the generic drm code: drm_ioctl.c, drm_mm.c,
>* drm_memory.c, ...
>*/
> - DRM_UT_CORE = 0x01,
> + DRM_UT_CORE,
>   /**
>* @DRM_UT_DRIVER: Used in the vendor specific part of the driver: i915,
>* radeon, ... macro.
>*/
> - DRM_UT_DRIVER   = 0x02,
> + DRM_UT_DRIVER,
>   /**
>* @DRM_UT_KMS: Used in the modesetting code.
>*/
> - DRM_UT_KMS  = 0x04,
> + DRM_UT_KMS,
>   /**
>* @DRM_UT_PRIME: Used in the prime code.
>*/
> - DRM_UT_PRIME= 0x08,
> + DRM_UT_PRIME,
>   /**
>* @DRM_UT_ATOMIC: Used in the atomic code.
>*/
> - DRM_UT_ATOMIC   = 0x10,
> + DRM_UT_ATOMIC,
>   /**
>* @DRM_UT_VBL: Used for verbose debug message in the vblank code.
>*/
> - DRM_UT_VBL  = 0x20,
> + DRM_UT_VBL,
>   /**
>* @DRM_UT_STATE: Used for verbose atomic state debugging.
>*/
> - DRM_UT_STATE= 0x40,
> + DRM_UT_STATE,
>   /**
>* @DRM_UT_LEASE: Used in the lease code.
>*/
> - DRM_UT_LEASE= 0x80,
> + DRM_UT_LEASE,
>   /**
>* @DRM_UT_DP: Used in the DP code.
>*/
> - DRM_UT_DP   = 0x100,
> + DRM_UT_DP,
>   /**
>* @DRM_UT_DRMRES: Used in the drm managed resources code.
>*/
> - DRM_UT_DRMRES   = 0x200,
> + DRM_UT_DRMRES
>  };
>  
>  static inline bool drm_debug_enabled(enum drm_debug_category category)
>  {
> - return unlikely(__drm_debug & category);
> + return unlikely(__drm_debug & BIT(category));
>  }
>  
>  /*
> -- 
> 2.37.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 00/41] DYNDBG: opt-in class'd debug for modules, use in drm.

2022-09-07 Thread Daniel Vetter
On Wed, Sep 07, 2022 at 07:47:10AM +0200, Greg KH wrote:
> On Tue, Sep 06, 2022 at 09:18:09PM +0200, Daniel Vetter wrote:
> > On Fri, Aug 12, 2022 at 08:03:47AM +0200, Greg KH wrote:
> > > On Thu, Aug 11, 2022 at 06:52:40PM +0200, Daniel Vetter wrote:
> > > > On Wed, Aug 03, 2022 at 04:13:05PM -0400, Jason Baron wrote:
> > > > > 
> > > > > 
> > > > > On 8/3/22 15:56, jim.cro...@gmail.com wrote:
> > > > > > On Wed, Jul 20, 2022 at 9:32 AM Jim Cromie  
> > > > > > wrote:
> > > > > >>
> > > > > > 
> > > > > >> Hi Jason, Greg, DRM-folk,
> > > > > >>
> > > > > >> This adds 'typed' "class FOO" support to dynamic-debug, where 
> > > > > >> 'typed'
> > > > > >> means either DISJOINT (like drm debug categories), or VERBOSE (like
> > > > > >> nouveau debug-levels).  Use it in DRM modules: core, helpers, and 
> > > > > >> in
> > > > > >> drivers i915, amdgpu, nouveau.
> > > > > >>
> > > > > > 
> > > > > > This revision fell over, on a conflict with something in drm-MUMBLE
> > > > > > 
> > > > > > Error: patch 
> > > > > > https://urldefense.com/v3/__https://patchwork.freedesktop.org/api/1.0/series/106427/revisions/2/mbox/__;!!GjvTz_vk!UCPl5Uf32cDVwwysMTfaLwoGLWomargFXuR8HjBA3xsUOjxXHXC5hneAkP4iWK91yc-LjjJxWW89-51Z$
> > > > > >  
> > > > > > not applied
> > > > > > Applying: dyndbg: fix static_branch manipulation
> > > > > > Applying: dyndbg: fix module.dyndbg handling
> > > > > > Applying: dyndbg: show both old and new in change-info
> > > > > > Applying: dyndbg: reverse module walk in cat control
> > > > > > Applying: dyndbg: reverse module.callsite walk in cat control
> > > > > > Applying: dyndbg: use ESCAPE_SPACE for cat control
> > > > > > Applying: dyndbg: let query-modname override actual module name
> > > > > > Applying: dyndbg: add test_dynamic_debug module
> > > > > > Applying: dyndbg: drop EXPORTed dynamic_debug_exec_queries
> > > > > > 
> > > > > > Jason,
> > > > > > those above are decent maintenance patches, particularly the drop 
> > > > > > export.
> > > > > > It would be nice to trim this unused api this cycle.
> > > > > 
> > > > > Hi Jim,
> > > > > 
> > > > > Agreed - I was thinking the same thing. Feel free to add
> > > > > Acked-by: Jason Baron  to those first 9.
> > > > 
> > > > Does Greg KH usually pick up dyndbg patches or someone else or do I need
> > > > to do something? Would be great to get some movement here since -rc1 
> > > > goes
> > > > out and merging will restart next week.
> > > 
> > > Yes, I can take these into my tree after -rc1 is out.
> > 
> > [uncovering from an absolutely impressive cascade of holes :-(]
> > 
> > Did this happen and I can stop worrying here? I'd like to make sure these
> > drm debug infra improvements keep moving.
> 
> I didn't take these, and I think I saw a 6th series sent:
>   https://lore.kernel.org/r/20220904214134.408619-1-jim.cro...@gmail.com
> 
> If you ack them, I will pick them up.

Hmm, here we only talked about the first 9 or so patches from the series; do
you still want my ack on those?

Acked-by: Daniel Vetter 

Since yeah I do like the direction of this :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 1/6] drm/ttm: Add new callbacks to ttm res mgr

2022-09-06 Thread Daniel Vetter
s(struct ttm_device *bdev,
> > > + struct ttm_resource *res,
> > > + const struct ttm_place *place,
> > > + size_t size)
> > > +{
> > > +    struct ttm_resource_manager *man;
> > > +
> > > +    if (!res)
> > > +    return false;
> > > +
> > > +    if (!place)
> > > +    return true;
> > > +
> > > +    man = ttm_manager_type(bdev, res->mem_type);
> > > +    if (!man->func->intersects) {
> > > +    if (place->fpfn >= (res->start + res->num_pages) ||
> > > +    (place->lpfn && place->lpfn <= res->start))
> > > +    return false;
> > > +
> > > +    return true;
> > > +    }
> > > +
> > > +    return man->func->intersects(man, res, place, size);
> > > +}
> > > +
> > > +/**
> > > + * ttm_resource_compatible - test for compatibility
> > > + *
> > > + * @bdev: TTM device structure
> > > + * @res: The resource to test
> > > + * @place: The placement to test
> > > + * @size: How many bytes the new allocation needs.
> > > + *
> > > + * Test if @res is compatible with @place and @size.
> > > + *
> > > + * Returns true if the res placement is compatible with @place and @size.
> > > + */
> > > +bool ttm_resource_compatible(struct ttm_device *bdev,
> > > + struct ttm_resource *res,
> > > + const struct ttm_place *place,
> > > + size_t size)
> > > +{
> > > +    struct ttm_resource_manager *man;
> > > +
> > > +    if (!res || !place)
> > > +    return false;
> > > +
> > > +    man = ttm_manager_type(bdev, res->mem_type);
> > > +    if (!man->func->compatible) {
> > > +    if (res->start < place->fpfn ||
> > > +    (place->lpfn && (res->start + res->num_pages) > place->lpfn))
> > > +    return false;
> > > +
> > > +    return true;
> > > +    }
> > > +
> > > +    return man->func->compatible(man, res, place, size);
> > > +}
> > > +
> > >   static bool ttm_resource_places_compat(struct ttm_resource *res,
> > >  const struct ttm_place *places,
> > >  unsigned num_placement)
> > >   {
> > > +    struct ttm_buffer_object *bo = res->bo;
> > > +    struct ttm_device *bdev = bo->bdev;
> > >   unsigned i;
> > >     if (res->placement & TTM_PL_FLAG_TEMPORARY)
> > > @@ -265,8 +339,7 @@ static bool ttm_resource_places_compat(struct ttm_resource *res,
> > >   for (i = 0; i < num_placement; i++) {
> > >   const struct ttm_place *heap = &places[i];
> > >   -    if (res->start < heap->fpfn || (heap->lpfn &&
> > > -    (res->start + res->num_pages) > heap->lpfn))
> > > +    if (!ttm_resource_compatible(bdev, res, heap, bo->base.size))
> > >   continue;
> > >     if ((res->mem_type == heap->mem_type) &&
> > > diff --git a/include/drm/ttm/ttm_resource.h
> > > b/include/drm/ttm/ttm_resource.h
> > > index ca89a48c2460..5afc6d664fde 100644
> > > --- a/include/drm/ttm/ttm_resource.h
> > > +++ b/include/drm/ttm/ttm_resource.h
> > > @@ -88,6 +88,38 @@ struct ttm_resource_manager_func {
> > >   void (*free)(struct ttm_resource_manager *man,
> > >    struct ttm_resource *res);
> > >   +    /**
> > > + * struct ttm_resource_manager_func member intersects
> > > + *
> > > + * @man: Pointer to a memory type manager.
> > > + * @res: Pointer to a struct ttm_resource to be checked.
> > > + * @place: Placement to check against.
> > > + * @size: Size of the check.
> > > + *
> > > + * Test if @res intersects with @place + @size. Used to judge if
> > > + * evictions are valuable or not.
> > > + */
> > > +    bool (*intersects)(struct ttm_resource_manager *man,
> > > +   struct ttm_resource *res,
> > > +   const struct ttm_place *place,
> > > +   size_t size);
> > > +
> > > +    /**
> > > + * struct ttm_resource_manager_func member compatible
> > > + *
> > > + * @man: Pointer to a memory type manager.
> > > + * @res: Pointer to a struct ttm_resource to be checked.
> > > + * @place: Placement to check against.
> > > + * @size: Size of the check.
> > > + *
> > > + * Test if @res is compatible with @place + @size. Used to check whether
> > > + * the backing store needs to be moved or not.
> > > + */
> > > +    bool (*compatible)(struct ttm_resource_manager *man,
> > > +   struct ttm_resource *res,
> > > +   const struct ttm_place *place,
> > > +   size_t size);
> > > +
> > >   /**
> > >    * struct ttm_resource_manager_func member debug
> > >    *
> > > @@ -329,6 +361,14 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
> > >  const struct ttm_place *place,
> > >  struct ttm_resource **res);
> > >   void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
> > > +bool ttm_resource_intersects(struct ttm_device *bdev,
> > > + struct ttm_resource *res,
> > > + const struct ttm_place *place,
> > > + size_t size);
> > > +bool ttm_resource_compatible(struct ttm_device *bdev,
> > > + struct ttm_resource *res,
> > > + const struct ttm_place *place,
> > > + size_t size);
> > >   bool ttm_resource_compat(struct ttm_resource *res,
> > >    struct ttm_placement *placement);
> > >   void ttm_resource_set_bo(struct ttm_resource *res,
> > > 
> > > base-commit: 730c2bf4ad395acf0aa0820535fdb8ea6abe5df1
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 00/41] DYNDBG: opt-in class'd debug for modules, use in drm.

2022-09-06 Thread Daniel Vetter
On Fri, Aug 12, 2022 at 08:03:47AM +0200, Greg KH wrote:
> On Thu, Aug 11, 2022 at 06:52:40PM +0200, Daniel Vetter wrote:
> > On Wed, Aug 03, 2022 at 04:13:05PM -0400, Jason Baron wrote:
> > > 
> > > 
> > > On 8/3/22 15:56, jim.cro...@gmail.com wrote:
> > > > On Wed, Jul 20, 2022 at 9:32 AM Jim Cromie  wrote:
> > > >>
> > > > 
> > > >> Hi Jason, Greg, DRM-folk,
> > > >>
> > > >> This adds 'typed' "class FOO" support to dynamic-debug, where 'typed'
> > > >> means either DISJOINT (like drm debug categories), or VERBOSE (like
> > > >> nouveau debug-levels).  Use it in DRM modules: core, helpers, and in
> > > >> drivers i915, amdgpu, nouveau.
> > > >>
> > > > 
> > > > This revision fell over, on a conflict with something in drm-MUMBLE
> > > > 
> > > > Error: patch 
> > > > https://urldefense.com/v3/__https://patchwork.freedesktop.org/api/1.0/series/106427/revisions/2/mbox/__;!!GjvTz_vk!UCPl5Uf32cDVwwysMTfaLwoGLWomargFXuR8HjBA3xsUOjxXHXC5hneAkP4iWK91yc-LjjJxWW89-51Z$
> > > >  
> > > > not applied
> > > > Applying: dyndbg: fix static_branch manipulation
> > > > Applying: dyndbg: fix module.dyndbg handling
> > > > Applying: dyndbg: show both old and new in change-info
> > > > Applying: dyndbg: reverse module walk in cat control
> > > > Applying: dyndbg: reverse module.callsite walk in cat control
> > > > Applying: dyndbg: use ESCAPE_SPACE for cat control
> > > > Applying: dyndbg: let query-modname override actual module name
> > > > Applying: dyndbg: add test_dynamic_debug module
> > > > Applying: dyndbg: drop EXPORTed dynamic_debug_exec_queries
> > > > 
> > > > Jason,
> > > > those above are decent maintenance patches, particularly the drop 
> > > > export.
> > > > It would be nice to trim this unused api this cycle.
> > > 
> > > Hi Jim,
> > > 
> > > Agreed - I was thinking the same thing. Feel free to add
> > > Acked-by: Jason Baron  to those first 9.
> > 
> > Does Greg KH usually pick up dyndbg patches or someone else or do I need
> > to do something? Would be great to get some movement here since -rc1 goes
> > out and merging will restart next week.
> 
> Yes, I can take these into my tree after -rc1 is out.

[uncovering from an absolutely impressive cascade of holes :-(]

Did this happen and I can stop worrying here? I'd like to make sure these
drm debug infra improvements keep moving.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 00/41] DYNDBG: opt-in class'd debug for modules, use in drm.

2022-08-11 Thread Daniel Vetter
On Wed, Aug 03, 2022 at 04:13:05PM -0400, Jason Baron wrote:
> 
> 
> On 8/3/22 15:56, jim.cro...@gmail.com wrote:
> > On Wed, Jul 20, 2022 at 9:32 AM Jim Cromie  wrote:
> >>
> > 
> >> Hi Jason, Greg, DRM-folk,
> >>
> >> This adds 'typed' "class FOO" support to dynamic-debug, where 'typed'
> >> means either DISJOINT (like drm debug categories), or VERBOSE (like
> >> nouveau debug-levels).  Use it in DRM modules: core, helpers, and in
> >> drivers i915, amdgpu, nouveau.
> >>
> > 
> > This revision fell over, on a conflict with something in drm-MUMBLE
> > 
> > Error: patch 
> > https://urldefense.com/v3/__https://patchwork.freedesktop.org/api/1.0/series/106427/revisions/2/mbox/__;!!GjvTz_vk!UCPl5Uf32cDVwwysMTfaLwoGLWomargFXuR8HjBA3xsUOjxXHXC5hneAkP4iWK91yc-LjjJxWW89-51Z$
> >  
> > not applied
> > Applying: dyndbg: fix static_branch manipulation
> > Applying: dyndbg: fix module.dyndbg handling
> > Applying: dyndbg: show both old and new in change-info
> > Applying: dyndbg: reverse module walk in cat control
> > Applying: dyndbg: reverse module.callsite walk in cat control
> > Applying: dyndbg: use ESCAPE_SPACE for cat control
> > Applying: dyndbg: let query-modname override actual module name
> > Applying: dyndbg: add test_dynamic_debug module
> > Applying: dyndbg: drop EXPORTed dynamic_debug_exec_queries
> > 
> > Jason,
> > those above are decent maintenance patches, particularly the drop export.
> > It would be nice to trim this unused api this cycle.
> 
> Hi Jim,
> 
> Agreed - I was thinking the same thing. Feel free to add
> Acked-by: Jason Baron  to those first 9.

Does Greg KH usually pick up dyndbg patches or someone else or do I need
to do something? Would be great to get some movement here since -rc1 goes
out and merging will restart next week.
-Daniel

> 
> 
> 
> > 
> > Applying: dyndbg: add class_id to pr_debug callsites
> > Applying: dyndbg: add __pr_debug_cls for testing
> > Applying: dyndbg: add DECLARE_DYNDBG_CLASSMAP
> > Applying: kernel/module: add __dyndbg_classes section
> > Applying: dyndbg: add ddebug_attach_module_classes
> > Applying: dyndbg: validate class FOO by checking with module
> > Applying: dyndbg: add drm.debug style bitmap support
> > Applying: dyndbg: test DECLARE_DYNDBG_CLASSMAP, sysfs nodes
> > Applying: doc-dyndbg: describe "class CLASS_NAME" query support
> > Applying: doc-dyndbg: edit dynamic-debug-howto for brevity, audience
> > Applying: drm_print: condense enum drm_debug_category
> > Applying: drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
> > Applying: drm_print: interpose drm_*dbg with forwarding macros
> > Applying: drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
> > Using index info to reconstruct a base tree...
> > M drivers/gpu/drm/Kconfig
> > M drivers/gpu/drm/Makefile
> > Falling back to patching base and 3-way merge...
> > Auto-merging drivers/gpu/drm/Makefile
> > Auto-merging drivers/gpu/drm/Kconfig
> > CONFLICT (content): Merge conflict in drivers/gpu/drm/Kconfig
> > error: Failed to merge in the changes.
> > 
> > 
> > Before I resend, I should sort out that possible conflict
> > which tree is patchwork applied on?
> > 
> > or was it just transient ? after 5.19 I rebased a copy onto 
> > drm-next/drm-next,
> > and there was nothing to fix - I will revisit presently..
> 
> 
> Not sure, if it's a minor conflict maybe Greg KH can sort it when
> he pulls it in? If not yeah might be important to rebase first...Greg?
> 
> Thanks,
> 
> -Jason

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v5 00/33] DYNDBG: opt-in class'd debug for modules, use in drm.

2022-08-09 Thread Daniel Vetter
EXPORTed dynamic_debug_exec_queries
>   dyndbg: cleanup local vars in ddebug_init
>   dyndbg: create and use struct _ddebug_info
> 
> class FOO support  
> 
>   dyndbg: add class_id to pr_debug callsites
>   dyndbg: add __pr_debug_cls for testing
>   dyndbg: add DECLARE_DYNDBG_CLASSMAP macro
>   kernel/module: add __dyndbg_classes section
>   dyndbg: add ddebug_attach_module_classes
>   dyndbg: validate class FOO by checking with module
>   doc-dyndbg: describe "class CLASS_NAME" query support
>   doc-dyndbg: edit dynamic-debug-howto for brevity, audience
> 
> add dyndbg-class-param support
> 
>   dyndbg: add drm.debug style (drm/parameters/debug) bitmap support
>   dyndbg: test DECLARE_DYNDBG_CLASSMAP, sysfs nodes
> 
> drm.debug adaptation
> 
>   drm_print: condense enum drm_debug_category
>   drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
>   drm_print: interpose drm_*dbg with forwarding macros
>   drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
>   drm-print.h: include dyndbg header
>   drm-print: add drm_dbg_driver to improve namespace symmetry
>   drm_print: refine drm_debug_enabled for jump-label
>   drm_print: prefer bare printk KERN_DEBUG on generic fn
>   drm_print: add _ddebug descriptor to drm_*dbg prototypes
> 
> nouveau-LEVEL_NUM integration: WIP/exploratory.
> 
>   nouveau: change nvkm_debug/trace to use dev_dbg POC
>   nouveau: adapt NV_DEBUG, NV_ATOMIC to use DRM.debug
>   nouveau: WIP add 2 LEVEL_NUM classmaps for CLI, SUBDEV
> 
>  .../admin-guide/dynamic-debug-howto.rst   | 246 +-
>  MAINTAINERS   |   2 +
>  drivers/gpu/drm/Kconfig   |  12 +
>  drivers/gpu/drm/Makefile  |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  14 +
>  drivers/gpu/drm/display/drm_dp_helper.c   |  13 +
>  drivers/gpu/drm/drm_crtc_helper.c |  13 +
>  drivers/gpu/drm/drm_print.c   |  48 +-
>  drivers/gpu/drm/i915/i915_params.c|  12 +
>  .../gpu/drm/nouveau/include/nvkm/core/debug.h |  16 +
>  .../drm/nouveau/include/nvkm/core/subdev.h|  17 +-
>  drivers/gpu/drm/nouveau/nouveau_drm.c |  20 +
>  drivers/gpu/drm/nouveau/nouveau_drv.h |  16 +-
>  drivers/gpu/drm/nouveau/nvkm/core/subdev.c|  23 +
>  include/asm-generic/vmlinux.lds.h |   3 +
>  include/drm/drm_print.h   |  85 +++-
>  include/linux/dynamic_debug.h | 176 +--
>  kernel/module/internal.h  |   4 +-
>  kernel/module/main.c  |  20 +-
>  lib/Kconfig.debug |  10 +
>  lib/Makefile  |   1 +
>  lib/dynamic_debug.c   | 450 +++---
>  lib/test_dynamic_debug.c  | 165 +++
>  23 files changed, 1099 insertions(+), 269 deletions(-)
>  create mode 100644 lib/test_dynamic_debug.c
> 
> -- 
> 2.37.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH 4/5] drm/drm_color_mgmt: add 3D LUT to color mgmt properties

2022-06-24 Thread Daniel Vetter
);
> diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h
> index 81c298488b0c..a4f054e0108f 100644
> --- a/include/drm/drm_color_mgmt.h
> +++ b/include/drm/drm_color_mgmt.h
> @@ -59,6 +59,10 @@ void drm_crtc_enable_color_mgmt(struct drm_crtc *crtc,
>   bool has_ctm,
>   uint gamma_lut_size);
>  
> +void drm_crtc_enable_lut3d(struct drm_crtc *crtc,
> +uint shaper_lut_size,
> +uint lut3d_size);
> +
>  int drm_mode_crtc_set_gamma_size(struct drm_crtc *crtc,
>int gamma_size);
>  
> diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
> index a318d5feb44b..c22ffcc4d7aa 100644
> --- a/include/drm/drm_crtc.h
> +++ b/include/drm/drm_crtc.h
> @@ -165,7 +165,7 @@ struct drm_crtc_state {
>   bool zpos_changed : 1;
>   /**
>* @color_mgmt_changed: Color management properties have changed
> -  * (@shaper_lut, @gamma_lut, @degamma_lut or @ctm). Used by
> +  * (@shaper_lut, @lut3d, @gamma_lut, @degamma_lut or @ctm). Used by
>* the atomic helpers and drivers to steer the atomic commit control
>* flow.
>*/
> @@ -298,6 +298,16 @@ struct drm_crtc_state {
>*/
>   struct drm_property_blob *shaper_lut;
>  
> + /**
> +  * @lut3d:
> +  *
> +  * 3D Lookup table for converting pixel data. Position where it takes
> +  * place depends on hw design, after @ctm or @gamma_lut. See
> +  * drm_crtc_enable_color_mgmt(). The blob (if not NULL) is an array of
> +  *  drm_color_lut.
> +  */
> + struct drm_property_blob *lut3d;
> +
>   /**
>* @target_vblank:
>*
> diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
> index 2df7e171add9..87280694e70d 100644
> --- a/include/drm/drm_mode_config.h
> +++ b/include/drm/drm_mode_config.h
> @@ -812,6 +812,19 @@ struct drm_mode_config {
>*/
>   struct drm_property *shaper_lut_size_property;
>  
> + /**
> +  * @lut3d_property: Optional CRTC property to set the 3D LUT used to
> +  * convert colors; before or after gamma conversion depends on hw
> +  * design. A shaper LUT can be used to delinearize content before
> +  * applying the 3D LUT correction.
> +  */
> + struct drm_property *lut3d_property;
> + /**
> +  * @lut3d_size_property: Optional CRTC property for the size of the
> +  * 3D LUT as supported by the driver (read-only).
> +  */
> + struct drm_property *lut3d_size_property;
> +
>   /**
>* @gamma_lut_property: Optional CRTC property to set the LUT used to
>* convert the colors, after the CTM matrix, to the gamma space of the
> -- 
> 2.35.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-24 Thread Daniel Vetter
On Mon, Jun 20, 2022 at 08:18:04AM -0700, Rob Clark wrote:
> On Mon, Jun 20, 2022 at 7:09 AM Dmitry Osipenko
>  wrote:
> >
> > On 6/19/22 20:53, Rob Clark wrote:
> > ...
> > >> +static unsigned long
> > >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
> > >> +struct shrink_control *sc)
> > >> +{
> > >> +   struct drm_gem_shmem_shrinker *gem_shrinker = to_drm_shrinker(shrinker);
> > >> +   struct drm_gem_shmem_object *shmem;
> > >> +   unsigned long count = 0;
> > >> +
> > >> +   if (!mutex_trylock(&gem_shrinker->lock))
> > >> +   return 0;
> > >> +
> > >> +   list_for_each_entry(shmem, &gem_shrinker->lru_evictable, madv_list) {
> > >> +   count += shmem->base.size;
> > >> +
> > >> +   if (count >= SHRINK_EMPTY)
> > >> +   break;
> > >> +   }
> > >> +
> > >> +   mutex_unlock(&gem_shrinker->lock);
> > >
> > > As I mentioned on other thread, count_objects, being approximate but
> > > lockless and fast is the important thing.  Otherwise when you start
> > > hitting the shrinker on many threads, you end up serializing them all,
> > > even if you have no pages to return to the system at that point.
> >
> > Daniel's point for dropping the lockless variant was that we're already
> > in trouble if we're hitting shrinker too often and extra optimizations
> > won't bring much benefits to us.
> 
> At least with zram swap (which I highly recommend using even if you
> are not using a physical swap file/partition), swapin/out is actually
> quite fast.  And if you are leaning on zram swap to fit 8GB of chrome
> browser on a 4GB device, the shrinker gets hit quite a lot.  Lower
> spec (4GB RAM) chromebooks can be under constant memory pressure and
> can quite easily get into a situation where you are hitting the
> shrinker on many threads simultaneously.  So it is pretty important
> for all shrinkers in the system (not just drm driver) to be as
> concurrent as possible.  As long as you avoid serializing reclaim on
> all the threads, performance can still be quite good, but if you don't
> performance will fall off a cliff.
> 
> jfwiw, we are seeing pretty good results (iirc 40-70% increase in open
> tab counts) with the combination of eviction + multigen LRU[1] +
> sizing zram swap to be 2x physical RAM
> 
> [1] https://lwn.net/Articles/856931/
> 
> > Alright, I'll add back the lockless variant (or will use yours
> > drm_gem_lru) in the next revision. The code difference is very small
> > after all.
> >
> > ...
> > >> +   /* prevent racing with the dma-buf importing/exporting */
> > >> +   if (!mutex_trylock(&gem_shrinker->dev->object_name_lock)) {
> > >> +   *lock_contention |= true;
> > >> +   goto resv_unlock;
> > >> +   }
> > >
> > > I'm not sure this is a good idea to serialize on object_name_lock.
> > > Purgeable buffers should never be shared (imported or exported).  So
> > > at best you are avoiding evicting and immediately swapping back in, in
> > > a rare case, at the cost of serializing multiple threads trying to
> > > reclaim pages in parallel.
> >
> > The object_name_lock shouldn't cause contention in practice. But objects
> > are also pinned on attachment, hence maybe this lock is indeed
> > unnecessary.. I'll re-check it.
> 
> I'm not worried about contention with export/import/etc, but
> contention between multiple threads hitting the shrinker in parallel.
> I guess since you are using trylock, it won't *block* the other
> threads hitting shrinker, but they'll just end up looping in
> do_shrink_slab() because they are hitting contention.
> 
> I'd have to do some experiments to see how it works out in practice,
> but my gut feel is that it isn't a good idea

Yeah trylock on anything else than the object lock is No Good in the
shrinker. And it really shouldn't be needed, since import/export should
pin stuff as needed. Which should be protected by the dma_resv object
lock. If not, we need to fix that.

Picking a random drm-internal lock like this is definitely no good design.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-24 Thread Daniel Vetter
On Sun, Jun 19, 2022 at 10:53:03AM -0700, Rob Clark wrote:
> On Thu, May 26, 2022 at 4:55 PM Dmitry Osipenko
>  wrote:
> > +   mutex_unlock(&gem_shrinker->lock);
> 
> As I mentioned on other thread, count_objects, being approximate but
> lockless and fast is the important thing.  Otherwise when you start
> hitting the shrinker on many threads, you end up serializing them all,
> even if you have no pages to return to the system at that point.

Yeah agreed, seems like I was wrong here :-) An atomic counter or something
would also be in line with the lru_list stuff.

It would be good to record this in the kerneldoc for the shrinker structure
though, to make sure this is all understood.

> > +   /* prevent racing with the dma-buf importing/exporting */
> > +   if (!mutex_trylock(&gem_shrinker->dev->object_name_lock)) {
> > +   *lock_contention |= true;
> > +   goto resv_unlock;
> > +   }
> 
> I'm not sure this is a good idea to serialize on object_name_lock.
> Purgeable buffers should never be shared (imported or exported).  So
> at best you are avoiding evicting and immediately swapping back in, in
> a rare case, at the cost of serializing multiple threads trying to
> reclaim pages in parallel.

Yeah this sounds really bad. Plus this is a per-device lock, and doing
those with trylock means the shrinker will fail to find shrinkable memory
way too often. We need to engineer this out somehow.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH v2 00/27] DRM.debug on DYNAMIC_DEBUG, add trace events

2022-06-08 Thread Daniel Vetter
On Mon, Jun 06, 2022 at 08:59:36AM -0600, jim.cro...@gmail.com wrote:
> On Wed, May 25, 2022 at 9:02 AM Daniel Vetter  wrote:
> 
> > On Mon, May 16, 2022 at 04:56:13PM -0600, Jim Cromie wrote:
> > > DRM.debug API is 23 macros, issuing 10 exclusive categories of debug
> > > messages.  By rough count, they are used 5140 times in the kernel.
> > > These all call drm_dbg or drm_devdbg, which call drm_debug_enabled(),
> > > which checks bits in global __drm_debug.  Some of these are page-flips
> > > and vblanks, and get called often.
> > >
> > > DYNAMIC_DEBUG (with CONFIG_JUMP_LABEL) is built to avoid this kind of
> > > work, with NOOPd jump/callsites.
> > >
> > > This patchset is RFC because:
> > > - it touches 2.5 subsystems: dyndbg, drm, tracefs (new events)
> > > - dyndbg class support is built for drm, needs it for validation
> > > - new api, used by drm
> > > - big memory impact, with 5100 new pr-debug callsites.
> > > - drm class bikeshedding opportunities
> > > - others, names etc.
> >
> > Thanks a lot for keeping on pushing this!
> 
> 
> > >
> > > DYNAMIC_DEBUG:
> > >
> 
> 
> 
> > > RFC:
> > >
> > > dynamic_debug_register_classes() cannot act early enough to be in
> > > effect at module-load.  So this will not work as you'd reasonably
> > > expect:
> > >
> > >   modprobe test_dynamic_debug dyndbg='+pfm; class FOO +pfmlt'
> > >
> > > The 1st query:+pfm will be enabled during load, but in the 2nd query,
> > > "class FOO" will be unknown at load time.  Early class enablement
> > > would be nice.  DYNAMIC_DEBUG_CLASSES is a static initializer, which
> > > is certainly early enough, but Im missing a trick, suggestions?
> >
> > So maybe I'm just totally overloading this work here so feel free to
> > ignore or postpone, but: Could we do the dynamic_debug_register_classes()
> > automatically at module load as a new special section? And then throw in a
> > bit of kbuild so that in a given subsystem every driver gets the same
> > class names by default and everything would just work, without having to
> > sprinkle calls to dynamic_debug_register_classes() all over the place?
> >
> 
> This is now done; Ive added __dyndbg_classes section.
> load_module() now grabs it from the .ko
> and ddebug_add_module() attaches it to the module's ddebug_table record.
> for builtins, dynamic_debug_init feeds the builtin class-maps to
> ddebug_add_module
> 
> bash-5.1# modprobe test_dynamic_debug dyndbg="class FOO +p"
> [   88.374722] dyndbg: class[0]: nm:test_dynamic_debug base:20 len:7 ty:1
> [   88.375158] dyndbg:  0: EMERG
> [   88.375345] dyndbg:  1: DANGER
> [   88.375540] dyndbg:  2: ERROR
> [   88.375726] dyndbg:  3: WARNING
> [   88.375930] dyndbg:  4: NOTICE
> [   88.376130] dyndbg:  5: INFO
> [   88.376310] dyndbg:  6: DEBUG
> [   88.376499] dyndbg: class[1]: nm:test_dynamic_debug base:12 len:3 ty:1
> [   88.376903] dyndbg:  0: ONE
> [   88.377079] dyndbg:  1: TWO
> [   88.377253] dyndbg:  2: THREE
> [   88.377441] dyndbg: class[2]: nm:test_dynamic_debug base:8 len:3 ty:0
> [   88.377837] dyndbg:  0: bing
> [   88.378022] dyndbg:  1: bong
> [   88.378203] dyndbg:  2: boom
> [   88.378387] dyndbg: class[3]: nm:test_dynamic_debug base:4 len:3 ty:0
> [   88.378800] dyndbg:  0: Foo
> [   88.378986] dyndbg:  1: Bar
> [   88.379167] dyndbg:  2: Buzz
> [   88.379348] dyndbg: class[4]: nm:test_dynamic_debug base:0 len:3 ty:0
> [   88.379757] dyndbg:  0: FOO
> [   88.379938] dyndbg:  1: BAR
> [   88.380136] dyndbg:  2: BUZZ
> [   88.380410] dyndbg: module:test_dynamic_debug attached 5 classes
> [   88.380881] dyndbg:  24 debug prints in module test_dynamic_debug
> [   88.381315] dyndbg: module: test_dynamic_debug dyndbg="class FOO +p"
> [   88.381714] dyndbg: query 0: "class FOO +p" mod:test_dynamic_debug
> [   88.382109] dyndbg: split into words: "class" "FOO" "+p"
> [   88.382445] dyndbg: op='+'
> [   88.382616] dyndbg: flags=0x1
> [   88.382802] dyndbg: *flagsp=0x1 *maskp=0x
> [   88.383101] dyndbg: parsed: func="" file="" module="test_dynamic_debug"
> format="" lineno=0-0 class=FOO
> [   88.383740] dyndbg: applied: func="" file="" module="test_dynamic_debug"
> format="" lineno=0-0 class=FOO
> [   88.384324] dyndbg: processed 1 queries, with 2 matches, 0 errs
> bash-5.1#
> 
> so its working at module-load time.

Awesome!

> > For the entire class approach, di

Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-05 Thread Daniel Vetter
On Sun, 5 Jun 2022 at 20:32, Rob Clark  wrote:
>
> On Sun, Jun 5, 2022 at 9:47 AM Daniel Vetter  wrote:
> >
> > On Fri, 27 May 2022 at 01:55, Dmitry Osipenko
> >  wrote:
> > >
> > > Introduce a common DRM SHMEM shrinker framework that allows to reduce
> > > code duplication among DRM drivers by replacing theirs custom shrinker
> > > implementations with the generic shrinker.
> > >
> > > In order to start using DRM SHMEM shrinker drivers should:
> > >
> > > 1. Implement new evict() shmem object callback.
> > > 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> > > 3. Use drm_gem_shmem_set_purgeable(shmem) and alike API functions to
> > >activate shrinking of shmem GEMs.
> > >
> > > This patch is based on a ideas borrowed from Rob's Clark MSM shrinker,
> > > Thomas' Zimmermann variant of SHMEM shrinker and Intel's i915 shrinker.
> > >
> > > Signed-off-by: Daniel Almeida 
> > > Signed-off-by: Dmitry Osipenko 
> >
> > So I guess I get a price for being blind since forever, because this
> > thing existed since at least 2013. I just stumbled over
> > llist_lru.[hc], a purpose built list helper for shrinkers. I think we
> > should try to adopt that so that our gpu shrinkers look more like
> > shrinkers for everything else.
>
> followup from a bit of irc discussion w/ danvet about list_lru:
>
> * It seems to be missing a way to bail out of iteration before
>   nr_to_scan is hit.. which is going to be inconvenient if you
>   want to allow active bos on the LRU but bail scanning once
>   you encounter the first one.
>
> * Not sure if the numa node awareness is super useful for GEM
>   bos
>
> First issue is perhaps not too hard to fix.  But maybe a better
> idea is a drm_gem_lru helper type thing which is more tailored
> to GEM buffers?

Yeah I guess reusing list_lru isn't that good an idea. So just
open-coding it for now, and then drm_gem_bo_lru or so if we need to
share it separately from shmem helpers with other drivers. Maybe it will
be needed for ttm or so.
-Daniel

>
> BR,
> -R
>
> > Apologies for this, since I fear this might cause a bit of churn.
> > Hopefully it's all contained to the list manipulation code in shmem
> > helpers, I don't think this should leak any further.
> > -Daniel
> >
> > > ---
> > >  drivers/gpu/drm/drm_gem_shmem_helper.c| 540 --
> > >  .../gpu/drm/panfrost/panfrost_gem_shrinker.c  |   9 +-
> > >  drivers/gpu/drm/virtio/virtgpu_drv.h  |   3 +
> > >  include/drm/drm_device.h  |   4 +
> > >  include/drm/drm_gem_shmem_helper.h|  87 ++-
> > >  5 files changed, 594 insertions(+), 49 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > > index 555fe212bd98..4cd0b5913492 100644
> > > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > > @@ -126,6 +126,42 @@ struct drm_gem_shmem_object 
> > > *drm_gem_shmem_create(struct drm_device *dev, size_t
> > >  }
> > >  EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
> > >
> > > +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object 
> > > *shmem)
> > > +{
> > > +   return (shmem->madv >= 0) && shmem->evict &&
> > > +   shmem->eviction_enabled && shmem->pages_use_count &&
> > > +   !shmem->pages_pin_count && !shmem->base.dma_buf &&
> > > +   !shmem->base.import_attach && shmem->sgt && 
> > > !shmem->evicted;
> > > +}
> > > +
> > > +static void
> > > +drm_gem_shmem_update_pages_state(struct drm_gem_shmem_object *shmem)
> > > +{
> > > +   struct drm_gem_object *obj = &shmem->base;
> > > +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> > > obj->dev->shmem_shrinker;
> > > +
> > > +   dma_resv_assert_held(shmem->base.resv);
> > > +
> > > +   if (!gem_shrinker || obj->import_attach)
> > > +   return;
> > > +
> > > +   mutex_lock(&gem_shrinker->lock);
> > > +
> > > +   if (drm_gem_shmem_is_evictable(shmem) ||
> > > +   drm_gem_shmem_is_purgeable(shmem))
> > > +   list_move_tail(&shmem->madv_list, 
> > > &gem_shrinker->lru_evictable);
> > > +

Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

2022-06-05 Thread Daniel Vetter
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c 
> b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> index a4bedfeb2ec4..7cc32556f908 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> @@ -15,6 +15,13 @@
>  #include "panfrost_gem.h"
>  #include "panfrost_mmu.h"
>
> +static bool panfrost_gem_shmem_is_purgeable(struct drm_gem_shmem_object 
> *shmem)
> +{
> +   return (shmem->madv > 0) &&
> +   !shmem->pages_pin_count && shmem->sgt &&
> +   !shmem->base.dma_buf && !shmem->base.import_attach;
> +}
> +
>  static unsigned long
>  panfrost_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control 
> *sc)
>  {
> @@ -27,7 +34,7 @@ panfrost_gem_shrinker_count(struct shrinker *shrinker, 
> struct shrink_control *sc
> return 0;
>
> list_for_each_entry(shmem, >shrinker_list, madv_list) {
> -   if (drm_gem_shmem_is_purgeable(shmem))
> +   if (panfrost_gem_shmem_is_purgeable(shmem))
> count += shmem->base.size >> PAGE_SHIFT;
> }
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
> b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index b2d93cb12ebf..81bacc7e1873 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -89,6 +89,7 @@ struct virtio_gpu_object {
> uint32_t hw_res_handle;
> bool dumb;
> bool created;
> +   bool detached;
> bool host3d_blob, guest_blob;
> uint32_t blob_mem, blob_flags;
>
> @@ -453,6 +454,8 @@ int virtio_gpu_object_create(struct virtio_gpu_device 
> *vgdev,
>
>  bool virtio_gpu_is_shmem(struct virtio_gpu_object *bo);
>
> +int virtio_gpu_reattach_shmem_object(struct virtio_gpu_object *bo);
> +
>  int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
>uint32_t *resid);
>  /* virtgpu_prime.c */
> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> index 9923c7a6885e..929546cad894 100644
> --- a/include/drm/drm_device.h
> +++ b/include/drm/drm_device.h
> @@ -16,6 +16,7 @@ struct drm_vblank_crtc;
>  struct drm_vma_offset_manager;
>  struct drm_vram_mm;
>  struct drm_fb_helper;
> +struct drm_gem_shmem_shrinker;
>
>  struct inode;
>
> @@ -277,6 +278,9 @@ struct drm_device {
> /** @vram_mm: VRAM MM memory manager */
> struct drm_vram_mm *vram_mm;
>
> +   /** @shmem_shrinker: SHMEM GEM memory shrinker */
> +   struct drm_gem_shmem_shrinker *shmem_shrinker;
> +
> /**
>  * @switch_power_state:
>  *
> diff --git a/include/drm/drm_gem_shmem_helper.h 
> b/include/drm/drm_gem_shmem_helper.h
> index 9a8983ee8abe..62c640678a91 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -6,6 +6,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -15,6 +16,7 @@
>  struct dma_buf_attachment;
>  struct drm_mode_create_dumb;
>  struct drm_printer;
> +struct drm_device;
>  struct sg_table;
>
>  /**
> @@ -39,12 +41,21 @@ struct drm_gem_shmem_object {
>  */
> unsigned int pages_use_count;
>
> +   /**
> +* @pages_pin_count:
> +*
> +* Reference count on the pinned pages table.
> +* The pages can be evicted by memory shrinker
> +* when the count reaches zero.
> +*/
> +   unsigned int pages_pin_count;
> +
> /**
>  * @madv: State for madvise
>  *
>  * 0 is active/inuse.
> +* 1 is not-needed/can-be-purged
>  * A negative value is the object is purged.
> -* Positive values are driver specific and not used by the helpers.
>  */
> int madv;
>
> @@ -91,6 +102,39 @@ struct drm_gem_shmem_object {
>  * @map_wc: map object write-combined (instead of using shmem 
> defaults).
>  */
> bool map_wc;
> +
> +   /**
> +* @eviction_enabled:
> +*
> +* The shmem pages can be evicted only if @eviction_enabled is set to 
> true.
> +* Used internally by memory shrinker.
> +*/
> +   bool eviction_enabled;
> +
> +   /**
> +* @purge_enabled:
> +*
> +* The shmem pages can be purged only if @purge_enabled is set to 
> true.
> +* Used internally by memory shrinker.
> +*/
> +   bool purge_enabled;
> +
> +   /**
> +* @evicted: True if shmem pages are evicted by the memory shrinker.
> +* Used internally by memory shrinker.
> +*/
> +   bool evicted;
> +
> +   /**
> +* @evict:
> +*
> +* Invoked by shmem shrinker before evicting shmem GEM from memory.
> +* GEM's DMA reservation is kept locked by the shrinker. This is
> +* optional callback that should be specified by drivers.
> +*
> +* Returns 0 on success, or -errno on error.
> +*/
> +   int (*evict)(struct drm_gem_shmem_object *shmem);
>  };
>
>  #define to_drm_gem_shmem_obj(obj) \
> @@ -110,14 +154,21 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object 
> *shmem, struct vm_area_struct
>
>  int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv);
>
> +int drm_gem_shmem_set_purgeable(struct drm_gem_shmem_object *shmem);
> +int drm_gem_shmem_set_evictable(struct drm_gem_shmem_object *shmem);
> +
>  static inline bool drm_gem_shmem_is_purgeable(struct drm_gem_shmem_object 
> *shmem)
>  {
> -   return (shmem->madv > 0) &&
> -   !shmem->vmap_use_count && shmem->sgt &&
> -   !shmem->base.dma_buf && !shmem->base.import_attach;
> +   return (shmem->madv > 0) && shmem->evict &&
> +   shmem->purge_enabled && shmem->pages_use_count &&
> +   !shmem->pages_pin_count && !shmem->base.dma_buf &&
> +   !shmem->base.import_attach && (shmem->sgt || shmem->evicted);
>  }
>
> -void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem);
> +int drm_gem_shmem_swap_in(struct drm_gem_shmem_object *shmem);
> +
> +int drm_gem_shmem_evict(struct drm_gem_shmem_object *shmem);
> +int drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem);
>
>  struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object 
> *shmem);
>  struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object 
> *shmem);
> @@ -260,6 +311,32 @@ static inline int drm_gem_shmem_object_mmap(struct 
> drm_gem_object *obj, struct v
> return drm_gem_shmem_mmap(shmem, vma);
>  }
>
> +/**
> + * struct drm_gem_shmem_shrinker - Generic memory shrinker for shmem GEMs
> + */
> +struct drm_gem_shmem_shrinker {
> +   /** @base: Shrinker for purging shmem GEM objects */
> +   struct shrinker base;
> +
> +   /** @lock: Protects @lru_* */
> +   struct mutex lock;
> +
> +   /** @lru_pinned: List of pinned shmem GEM objects */
> +   struct list_head lru_pinned;
> +
> +   /** @lru_evictable: List of shmem GEM objects to be evicted */
> +   struct list_head lru_evictable;
> +
> +   /** @lru_evicted: List of evicted shmem GEM objects */
> +   struct list_head lru_evicted;
> +
> +   /** @dev: DRM device that uses this shrinker */
> +   struct drm_device *dev;
> +};
> +
> +int drm_gem_shmem_shrinker_register(struct drm_device *dev);
> +void drm_gem_shmem_shrinker_unregister(struct drm_device *dev);
> +
>  /*
>   * Driver ops
>   */
> --
> 2.35.3
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 00/14] drm/kms: Stop registering multiple /sys/class/backlight devs for a single display

2022-05-25 Thread Daniel Vetter
gets the right one and not merely the one
> > which happened to get registered first.
> >
> > And I believe that having the panel brightness be a drm_connector
> > property is the way to make it possible for userspace to deal
> > with the multiple panels which each have a separate brightness
> > control case.
> 
> Agreed.
> 
> Thanks for the explanations and recording them here.

Can we stuff a summary of all the things we've discussed here into
Documentation/gpu/todo.rst so that we have this recorded somewhere more
permanently? Also good place to make sure everyone who was part of these
discussions has a chance to ack the overall plan as part of merging such a
patch.

Just feels like this is big enough to justify a todo entry with
what's going on and what's still to be done.
-Daniel


> 
> BR,
> Jani.
> 
> >
> > Regards,
> >
> > Hans
> >
> >
> >
> >
> >
> >> 
> >> BR,
> >> Jani.
> >> 
> >> 
> >>>
> >>> This series implements my RFC describing my plan for these cleanups:
> >>> https://lore.kernel.org/dri-devel/98519ba0-7f18-201a-ea34-652f50343...@redhat.com/
> >>>
> >>> Specifically patches 1-6 implement the "Fixing kms driver unconditionally
> >>> register their "native" backlight dev" part.
> >>>
> >>> And patches 7-14 implement the "Fixing acpi_video0 getting registered for
> >>> a brief time" time.
> >>>
> >>> Note this series does not deal yet with the "Other issues" part, I plan
> >>> to do a follow up series for that.
> >>>
> >>> The changes in this series are good to have regardless of the further
> >>> "drm/kms: control display brightness through drm_connector properties"
> >>> plans. So I plan to push these upstream once they are ready (once
> >>> reviewed). Since this crosses various subsystems / touches multiple
> >>> kms drivers my plan is to provide an immutable branch based on say
> >>> 5.19-rc1 and then have that get merged into all the relevant trees.
> >>>
> >>> Please review.
> >>>
> >>> Regards,
> >>>
> >>> Hans
> >>>
> >>>
> >>> Hans de Goede (14):
> >>>   ACPI: video: Add a native function parameter to
> >>> acpi_video_get_backlight_type()
> >>>   drm/i915: Don't register backlight when another backlight should be
> >>> used
> >>>   drm/amdgpu: Don't register backlight when another backlight should be
> >>> used
> >>>   drm/radeon: Don't register backlight when another backlight should be
> >>> used
> >>>   drm/nouveau: Don't register backlight when another backlight should be
> >>> used
> >>>   ACPI: video: Drop backlight_device_get_by_type() call from
> >>> acpi_video_get_backlight_type()
> >>>   ACPI: video: Remove acpi_video_bus from list before tearing it down
> >>>   ACPI: video: Simplify acpi_video_unregister_backlight()
> >>>   ACPI: video: Make backlight class device registration a separate step
> >>>   ACPI: video: Remove code to unregister acpi_video backlight when a
> >>> native backlight registers
> >>>   drm/i915: Call acpi_video_register_backlight()
> >>>   drm/nouveau: Register ACPI video backlight when nv_backlight
> >>> registration fails
> >>>   drm/amdgpu: Register ACPI video backlight when skipping amdgpu
> >>> backlight registration
> >>>   drm/radeon: Register ACPI video backlight when skipping radeon
> >>> backlight registration
> >>>
> >>>  drivers/acpi/acpi_video.c | 69 ++-
> >>>  drivers/acpi/video_detect.c   | 53 +++---
> >>>  drivers/gpu/drm/Kconfig   |  2 +
> >>>  .../gpu/drm/amd/amdgpu/atombios_encoders.c| 14 +++-
> >>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  9 +++
> >>>  .../gpu/drm/i915/display/intel_backlight.c|  7 ++
> >>>  drivers/gpu/drm/i915/display/intel_display.c  |  1 +
> >>>  drivers/gpu/drm/i915/display/intel_opregion.c |  2 +-
> >>>  drivers/gpu/drm/nouveau/nouveau_backlight.c   | 14 
> >>>  drivers/gpu/drm/radeon/atombios_encoders.c|  7 ++
> >>>  drivers/gpu/drm/radeon/radeon_encoders.c  | 11 ++-
> >>>  .../gpu/drm/radeon/radeon_legacy_encoders.c   |  7 ++
> >>>  drivers/platform/x86/acer-wmi.c   |  2 +-
> >>>  drivers/platform/x86/asus-laptop.c|  2 +-
> >>>  drivers/platform/x86/asus-wmi.c   |  4 +-
> >>>  drivers/platform/x86/compal-laptop.c  |  2 +-
> >>>  drivers/platform/x86/dell/dell-laptop.c   |  2 +-
> >>>  drivers/platform/x86/eeepc-laptop.c   |  2 +-
> >>>  drivers/platform/x86/fujitsu-laptop.c |  4 +-
> >>>  drivers/platform/x86/ideapad-laptop.c |  2 +-
> >>>  drivers/platform/x86/intel/oaktrail.c |  2 +-
> >>>  drivers/platform/x86/msi-laptop.c |  2 +-
> >>>  drivers/platform/x86/msi-wmi.c|  2 +-
> >>>  drivers/platform/x86/samsung-laptop.c |  2 +-
> >>>  drivers/platform/x86/sony-laptop.c|  2 +-
> >>>  drivers/platform/x86/thinkpad_acpi.c  |  4 +-
> >>>  drivers/platform/x86/toshiba_acpi.c   |  2 +-
> >>>  include/acpi/video.h  |  8 ++-
> >>>  28 files changed, 156 insertions(+), 84 deletions(-)
> >> 
> >
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH v2 00/27] DRM.debug on DYNAMIC_DEBUG, add trace events

2022-05-25 Thread Daniel Vetter
> 15:TRACE_EVENT(drm_vblank_event,
> 35:TRACE_EVENT(drm_vblank_event_queued,
> 52:TRACE_EVENT(drm_vblank_event_delivered,
> 
> STATUS
> 
> kernel-test-robot tested this patchset (on 5.18-rc6).
> github:[jimc:blead] BUILD SUCCESS 6c59e52ac81dd81ac7da4522a5e15b7ac488d5b5
> May 15, 2022, 8:39 AM (1 day ago)
> 
> 
> Ive been testing, mostly on virtme, mostly with this:
> #!/bin/bash
> 
> # test class FOO handling of dynamic-debug
> 
> alias lmt='modprobe test_dynamic_debug dyndbg=+pmf'
> alias rmt='rmmod test_dynamic_debug'
> alias look='grep test_dynamic_debug /proc/dynamic_debug/control'
> 
> lookfor() {
> grep $1 /proc/dynamic_debug/control
> }
> 
> vx() {
> echo $* > /sys/module/dynamic_debug/parameters/verbose
> }
> 
> # amdgpu has ~2200 pr-debugs (before drm-debug-on-dyndbg),
> # use them to prove modprobe  dyndbg=+p works
> 
> test_param_dyndbg() {
> 
> modprobe amdgpu dyndbg=+pfm
> let count=$(grep =pmf /proc/dynamic_debug/control | grep amdgpu | wc -l)
> 
> if [ $count -gt 200 ] ; then
>   echo amdgpu has $count pr-dbgs
>   return 0
> else
>   echo failed $count
>   return -1
> fi
> }
> out=$(test_param_dyndbg)
> echo out:$out $?
> [ $? -eq 0 ] || exit $?
> 
> qry_cmd() {
> echo "QCMD: $*   >control" >&2
> echo $* > /proc/dynamic_debug/control
> }
> 
> # enable dyndbg tracing
> dynable() {
> grep -P \\d $SKT/events/dyndbg/{.,*}/enable
> echo 1 > $SKT/events/dyndbg/enable
> grep -P \\d $SKT/events/dyndbg/{.,*}/enable
> }
> 
> # enable drm tracing
> drmable() {
> grep -P \\d $SKT/events/drm/{.,*}/enable
> echo 1 > $SKT/events/drm/enable
> grep -P \\d $SKT/events/drm/{.,*}/enable
> }
> 
> function doit() {
> cat /sys/module/test_dynamic_debug/parameters/do_prints
> }
> 
> # test class FOO behavior of test_dynamic_debug module
> ttest_module__() {
> flg=$1
> dmesg -C
> modprobe test_dynamic_debug dyndbg=+pfm
> doit
> 
> for cls in FOO BAR BUZZ; do
>   qry_cmd module test_dynamic_debug class $cls $flg
>   doit
> done
> doit
> 
> for cls in Foo Bar Buzz; do
>   qry_cmd module test_dynamic_debug class $cls $flg
>   doit
> done
> doit
> 
> for cls in bing bong boom; do
>   qry_cmd module test_dynamic_debug class $cls $flg
>   doit
> done
> doit
> 
> dmesg | grep class
> }
> 
> ttest_module() {
> 
> ttest_module__ +p
> ttest_module__ -p
> 
> #ttest_module__ +T
> #ttest_module__ -T
> }
> 
> #dynable
> #drmable
> 
> ttest_module
> grep test_dyn /proc/dynamic_debug/control
> 
> 
> # use/test bitmaps
> 
> set_tddm_() {
> val=$1;
> knob=$2;
> echo "TDDM: $val >$knob" >&2
> echo $val > /sys/module/test_dynamic_debug/parameters/$knob
> cat /sys/module/test_dynamic_debug/parameters/$knob
> }
> 
> CLS_1="FOO -FOO +FOO -FOO BAR -BAR +BAR -BAR BUZZ -BUZZ +BUZZ -BUZZ"
> CLS_2=" Foo  Bar  Buzz -Foo -Bar -Buzz +Foo +Bar +Buzz -Foo -Bar -Buzz"
> CLS_3=" bing bong boom -bing -bong -boom +bing +bong +boom -bing -bong -boom"
> 
> tddm_sysnode_classes__() {
> targ=$1
> shift
> cls=$*
> for bitsym in $cls;
> do
>   set_tddm_ $bitsym $targ
> done
> }
> 
> # work all 3 sysfs bitmaps
> 
> for sysnode in c1_syslog_bits c2_syslog_bits c3_syslog_bits;
> do
> for val in 0 1 2 4 8 0;
> do
>   tddm_sysnode_classes__ $sysnode $val
> done
> done
> 
> tddm_sysnode_classes__ c1_syslog_bits $CLS_1
> tddm_sysnode_classes__ c2_syslog_bits $CLS_2
> tddm_sysnode_classes__ c3_syslog_bits $CLS_3
> 
> echo "show unknown: c3-names on c1-knob" >&2
> tddm_sysnode_classes__ c1_trace_bits $CLS_3
> 
> echo "flags look inverted" >&2
> tddm_sysnode_classes__ c1_syslog_bits $CLS_1
> 
> CLS_1_=FOO,-FOO,+FOO,-FOO,BAR,-BAR,+BAR,-BAR,BUZZ,-BUZZ,+BUZZ,-BUZZ
> CLS_2_=Foo,Bar,Buzz,-Foo,-Bar,-Buzz,+Foo,+Bar,+Buzz,-Foo,-Bar,-Buzz
> # leading err doesnt wreck the rest
> CLS_3_=,bing,bong,boom,-bing,-bong,-boom,+bing,+bong,+boom,-bing,-bong,-boom
> 
> tddm_sysnode_classes__ c1_syslog_bits $CLS_1_
> tddm_sysnode_classes__ c2_syslog_bits $CLS_2_
> tddm_sysnode_classes__ c3_syslog_bits $CLS_3_
> 
> 
> Cc: Sean Paul 
> Cc: Daniel Vetter 
> Cc: David Airlie 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Pekka Paalanen 
> Cc: Rob Clark 
> Cc: Steven Rostedt 
> Cc: Thomas Zimmermann 
> Cc: Vil

Re: [PATCH v2 00/19] DC/DM changes needed for amdgpu PSR-SU

2022-05-16 Thread Daniel Vetter
On Mon, 16 May 2022 at 18:23, Leo Li  wrote:
>
>
>
> On 2022-05-12 13:39, Daniel Vetter wrote:
> > On Thu, 12 May 2022 at 19:22, Zhang, Dingchen (David)
> >  wrote:
> >>
> >> [AMD Official Use Only - General]
> >>
> >> Hi Daniel
> >>
> >> Thanks for your comments and explanations. I replied in-line and look 
> >> forward to more discussion.
> >>
> >> regards
> >> David
> >>
> >>
> >> From: Daniel Vetter 
> >> Sent: Thursday, May 12, 2022 7:22 AM
> >> To: Alex Deucher 
> >> Cc: Zhang, Dingchen (David) ; amd-gfx list 
> >> ; Wang, Chao-kai (Stylon) 
> >> ; Li, Sun peng (Leo) ; Wentland, 
> >> Harry ; Zhuo, Qingqing (Lillian) 
> >> ; Siqueira, Rodrigo ; Li, 
> >> Roman ; Chiu, Solomon ; Zuo, Jerry 
> >> ; Pillai, Aurabindo ; Lin, 
> >> Wayne ; Lakha, Bhawanpreet ; 
> >> Gutierrez, Agustin ; Kotarac, Pavle 
> >> 
> >> Subject: Re: [PATCH v2 00/19] DC/DM changes needed for amdgpu PSR-SU
> >>
> >> On Wed, 11 May 2022 at 17:35, Alex Deucher  wrote:
> >>>
> >>> On Tue, May 10, 2022 at 4:45 PM David Zhang  
> >>> wrote:
> >>>>
> >>>> changes in v2:
> >>>> ---
> >>>> - set vsc_packet_rev2 for PSR1 which is safer
> >>>> - add exposure of AMD specific DPCD regs for PSR-SU-RC (rate-control)
> >>>> - add DC/DM change related to amdgpu PSR-SU-RC
> >>>>
> >>>>
> >>>> David Zhang (18):
> >>>>drm/amd/display: align dmub cmd header to latest dmub FW to support
> >>>>  PSR-SU
> >>>>drm/amd/display: feed PSR-SU as psr version to dmub FW
> >>>>drm/amd/display: combine dirty rectangles in DMUB FW
> >>>>drm/amd/display: update GSP1 generic info packet for PSRSU
> >>>>drm/amd/display: revise Start/End SDP data
> >>>>drm/amd/display: program PSR2 DPCD Configuration
> >>>>drm/amd/display: Passing Y-granularity to dmub fw
> >>>>drm/amd/display: Set default value of line_capture_indication
> >>>>drm/amd/display: add vline time in micro sec to PSR context
> >>>>drm/amd/display: fix system hang when PSR exits
> >>>>drm/amd/display: Set PSR level to enable ALPM by default
> >>>>drm/amd/display: use HW lock mgr for PSR-SU
> >>>>drm/amd/display: PSRSU+DSC WA for specific TCON
> >>>>drm/amd/display: add shared helpers to update psr config fields to
> >>>>  power module
> >>>>drm/amd/display: calculate psr config settings in runtime in DM
> >>>>drm/amd/display: update cursor position to DMUB FW
> >>>>drm/amd/display: expose AMD source specific DPCD for FreeSync PSR
> >>>>  support
> >>>>drm/amd/display: PSR-SU rate control support in DC
> >>>>
> >>>> Leo Li (1):
> >>>>drm/amd/display: Implement MPO PSR SU
> >>>
> >>> A couple of suggestions from Daniel on IRC:
> >>> 1.  Might be good to extract the "calculate total crtc damage" code
> >>> from i915 in intel_psr2_sel_fetch_update, stuff that into damage
> >>> helpers and reuse for i915 and amdgpu
> >>
> >> To expand a bit on this. There is currently a helper for total damage,
> >> but it's at the fb/plane level for drivers which need to upload
> >> buffers (usb/spi or virtual) drm_atomic_helper_damage_merged(). That
> >> one probably needs to be renamed to signify it's about the plane, and
> >> then we need a new drm_atomic_helper_crtc_damage_merged() which
> >> (extract from i915 code ideally) which computes total crtc damage for
> >> stuff like psr2/su or the command mode dsi panels (unfortunately none
> >> of the drivers for android for these panels have been upstreamed yet).
> >>
> >> <<<
> >> Checked the DRM comment for the helper `drm_atomic_helper_damage_merged()` 
> >> and
> >> quoted below:
> >> *
> >> Drivers might want to use the helper functions 
> >> drm_atomic_helper_damage_iter_init()
> >> and drm_atomic_helper_damage_iter_next() or 
> >> drm_atomic_helper_damage_merged()
> >> if the driver can only handle a single damage region at most.
> >> *
> >> Currently for amdgpu, the multiple damage clips com

Re: [PATCH v2 00/19] DC/DM changes needed for amdgpu PSR-SU

2022-05-12 Thread Daniel Vetter
On Thu, 12 May 2022 at 19:22, Zhang, Dingchen (David)
 wrote:
>
> [AMD Official Use Only - General]
>
> Hi Daniel
>
> Thanks for your comments and explanations. I replied in-line and look forward 
> to more discussion.
>
> regards
> David
>
>
> From: Daniel Vetter 
> Sent: Thursday, May 12, 2022 7:22 AM
> To: Alex Deucher 
> Cc: Zhang, Dingchen (David) ; amd-gfx list 
> ; Wang, Chao-kai (Stylon) 
> ; Li, Sun peng (Leo) ; Wentland, 
> Harry ; Zhuo, Qingqing (Lillian) 
> ; Siqueira, Rodrigo ; Li, 
> Roman ; Chiu, Solomon ; Zuo, Jerry 
> ; Pillai, Aurabindo ; Lin, Wayne 
> ; Lakha, Bhawanpreet ; 
> Gutierrez, Agustin ; Kotarac, Pavle 
> 
> Subject: Re: [PATCH v2 00/19] DC/DM changes needed for amdgpu PSR-SU
>
> On Wed, 11 May 2022 at 17:35, Alex Deucher  wrote:
> >
> > On Tue, May 10, 2022 at 4:45 PM David Zhang  wrote:
> > >
> > > changes in v2:
> > > ---
> > > - set vsc_packet_rev2 for PSR1 which is safer
> > > - add exposure of AMD specific DPCD regs for PSR-SU-RC (rate-control)
> > > - add DC/DM change related to amdgpu PSR-SU-RC
> > >
> > >
> > > David Zhang (18):
> > >   drm/amd/display: align dmub cmd header to latest dmub FW to support
> > > PSR-SU
> > >   drm/amd/display: feed PSR-SU as psr version to dmub FW
> > >   drm/amd/display: combine dirty rectangles in DMUB FW
> > >   drm/amd/display: update GSP1 generic info packet for PSRSU
> > >   drm/amd/display: revise Start/End SDP data
> > >   drm/amd/display: program PSR2 DPCD Configuration
> > >   drm/amd/display: Passing Y-granularity to dmub fw
> > >   drm/amd/display: Set default value of line_capture_indication
> > >   drm/amd/display: add vline time in micro sec to PSR context
> > >   drm/amd/display: fix system hang when PSR exits
> > >   drm/amd/display: Set PSR level to enable ALPM by default
> > >   drm/amd/display: use HW lock mgr for PSR-SU
> > >   drm/amd/display: PSRSU+DSC WA for specific TCON
> > >   drm/amd/display: add shared helpers to update psr config fields to
> > > power module
> > >   drm/amd/display: calculate psr config settings in runtime in DM
> > >   drm/amd/display: update cursor position to DMUB FW
> > >   drm/amd/display: expose AMD source specific DPCD for FreeSync PSR
> > > support
> > >   drm/amd/display: PSR-SU rate control support in DC
> > >
> > > Leo Li (1):
> > >   drm/amd/display: Implement MPO PSR SU
> >
> > A couple of suggestions from Daniel on IRC:
> > 1.  Might be good to extract the "calculate total crtc damage" code
> > from i915 in intel_psr2_sel_fetch_update, stuff that into damage
> > helpers and reuse for i915 and amdgpu
>
> To expand a bit on this. There is currently a helper for total damage,
> but it's at the fb/plane level for drivers which need to upload
> buffers (usb/spi or virtual) drm_atomic_helper_damage_merged(). That
> one probably needs to be renamed to signify it's about the plane, and
> then we need a new drm_atomic_helper_crtc_damage_merged()
> (extracted from the i915 code ideally) which computes total crtc damage for
> stuff like psr2/su or the command mode dsi panels (unfortunately none
> of the drivers for android for these panels have been upstreamed yet).
>
> <<<
> I checked the DRM comment for the helper `drm_atomic_helper_damage_merged()`,
> quoted below:
> *
> Drivers might want to use the helper functions 
> drm_atomic_helper_damage_iter_init()
> and drm_atomic_helper_damage_iter_next() or drm_atomic_helper_damage_merged()
> if the driver can only handle a single damage region at most.
> *
> Currently in amdgpu, the combination (merging) of multiple damage clips is
> handled in the DMUB firmware, and the DRM comment says the "damage_merged()"
> helper is for drivers which can only handle a single damage region at most.
>
> Since amdgpu is capable of handling multiple damage clips (in the DMUB FW),
> is my understanding correct that the `damage_merged()` group of helpers in
> DRM is optional rather than mandatory?

Ah I didn't see from a quick read that this was possible. How does
this work when the plane is enabled/disabled or resized or moved?
-Daniel

> I also think that the split between dc and kms is a bit funny, I'd put
> only the resulting damage rect into dc_pipe and do the computation of
> that in the drm/kms linux code outside of dc functions (or in the glue
> code for dc), since I'm assuming on windows it's completely different
> approach in how you compute damage. Especially 

Re: [PATCH v2 00/19] DC/DM changes needed for amdgpu PSR-SU

2022-05-12 Thread Daniel Vetter
On Wed, 11 May 2022 at 17:35, Alex Deucher  wrote:
>
> On Tue, May 10, 2022 at 4:45 PM David Zhang  wrote:
> >
> > changes in v2:
> > ---
> > - set vsc_packet_rev2 for PSR1 which is safer
> > - add exposure of AMD specific DPCD regs for PSR-SU-RC (rate-control)
> > - add DC/DM change related to amdgpu PSR-SU-RC
> >
> >
> > David Zhang (18):
> >   drm/amd/display: align dmub cmd header to latest dmub FW to support
> > PSR-SU
> >   drm/amd/display: feed PSR-SU as psr version to dmub FW
> >   drm/amd/display: combine dirty rectangles in DMUB FW
> >   drm/amd/display: update GSP1 generic info packet for PSRSU
> >   drm/amd/display: revise Start/End SDP data
> >   drm/amd/display: program PSR2 DPCD Configuration
> >   drm/amd/display: Passing Y-granularity to dmub fw
> >   drm/amd/display: Set default value of line_capture_indication
> >   drm/amd/display: add vline time in micro sec to PSR context
> >   drm/amd/display: fix system hang when PSR exits
> >   drm/amd/display: Set PSR level to enable ALPM by default
> >   drm/amd/display: use HW lock mgr for PSR-SU
> >   drm/amd/display: PSRSU+DSC WA for specific TCON
> >   drm/amd/display: add shared helpers to update psr config fields to
> > power module
> >   drm/amd/display: calculate psr config settings in runtime in DM
> >   drm/amd/display: update cursor position to DMUB FW
> >   drm/amd/display: expose AMD source specific DPCD for FreeSync PSR
> > support
> >   drm/amd/display: PSR-SU rate control support in DC
> >
> > Leo Li (1):
> >   drm/amd/display: Implement MPO PSR SU
>
> A couple of suggestions from Daniel on IRC:
> 1.  Might be good to extract the "calculate total crtc damage" code
> from i915 in intel_psr2_sel_fetch_update, stuff that into damage
> helpers and reuse for i915 and amdgpu

To expand a bit on this. There is currently a helper for total damage,
but it's at the fb/plane level for drivers which need to upload
buffers (usb/spi or virtual) drm_atomic_helper_damage_merged(). That
one probably needs to be renamed to signify it's about the plane, and
then we need a new drm_atomic_helper_crtc_damage_merged()
(extracted from the i915 code ideally) which computes total crtc damage for
stuff like psr2/su or the command mode dsi panels (unfortunately none
of the drivers for android for these panels have been upstreamed yet).

I also think that the split between dc and kms is a bit funny, I'd put
only the resulting damage rect into dc_pipe and do the computation of
that in the drm/kms linux code outside of dc functions (or in the glue
code for dc), since I'm assuming on windows it's completely different
approach in how you compute damage. Especially once we have the crtc
damage helper on linux.

> 2.  The commit message on "drm/amd/display: Implement MPO PSR SU" is a
> bit funny, since if you use the helpers right you always get damage
> information, just when it's from userspace that doesn't set explicit
> damage it's just always the entire plane.

Yeah so that one was just another reason to use the helpers more in
amdgpu for this.
-Daniel

>
> Alex
>
> >
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 142 +-
> >  .../drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c |  21 +-
> >  drivers/gpu/drm/amd/display/dc/core/dc.c  |  54 
> >  drivers/gpu/drm/amd/display/dc/core/dc_link.c |  47 +++-
> >  drivers/gpu/drm/amd/display/dc/dc_link.h  |   4 +
> >  drivers/gpu/drm/amd/display/dc/dc_stream.h|   5 +
> >  drivers/gpu/drm/amd/display/dc/dc_types.h |  23 +-
> >  .../drm/amd/display/dc/dce/dmub_hw_lock_mgr.c |   2 +
> >  drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |  64 +
> >  drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h |   2 +
> >  .../gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c |   2 +
> >  .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 131 +
> >  .../gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c |   2 +
> >  .../dc/dcn30/dcn30_dio_stream_encoder.c   |  15 ++
> >  drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h  |   1 +
> >  .../drm/amd/display/dc/inc/hw/link_encoder.h  |  21 +-
> >  .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 250 +-
> >  .../amd/display/include/ddc_service_types.h   |   1 +
> >  .../display/modules/info_packet/info_packet.c |  29 +-
> >  .../amd/display/modules/power/power_helpers.c |  84 ++
> >  .../amd/display/modules/power/power_helpers.h |   6 +
> >  21 files changed, 887 insertions(+), 19 deletions(-)
> >
> > --
> > 2.25.1
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/8] drm: execution context for GEM buffers v2

2022-05-11 Thread Daniel Vetter
On Mon, May 09, 2022 at 05:01:33PM +0200, Christian König wrote:
> On 09.05.22 at 16:31, Daniel Vetter wrote:
> > On Wed, May 04, 2022 at 09:47:32AM +0200, Christian König wrote:
> > > [SNIP]
> > > +/* Make sure we have enough room and add an object to the container */
> > > +static int drm_exec_objects_add(struct drm_exec_objects *container,
> > > + struct drm_gem_object *obj)
> > > +{
> > > + if (unlikely(container->num_objects == container->max_objects)) {
> > > + size_t size = container->max_objects * sizeof(void *);
> > > + void *tmp;
> > > +
> > > + tmp = kvrealloc(container->objects, size, size + PAGE_SIZE,
> > > + GFP_KERNEL);
> > > + if (!tmp)
> > > + return -ENOMEM;
> > > +
> > > + container->objects = tmp;
> > > + container->max_objects += PAGE_SIZE / sizeof(void *);
> > Might be worth it to inquire the actual allocation size here, since if
> > it's kmalloc the generic buckets only cover doubling of sizes, so once
> > it's big it goes up a lot quicker than PAGE_SIZE.
> > 
> > But also krealloc checks this internally already so maybe better to not
> > break the abstraction.
> 
> How can I actually do this? ksize() only works with kmalloc().
> 
> Or do we had a function to figure out if vmalloc or kmalloc was used by
> kvrealloc()?

kvfree() has an is_vmalloc_addr() check, so it would boil down to
open-coding that a bit.

Probably not worth the trouble really, otoh looking at kvrealloc it
doesn't use krealloc underneath, so it's not doing that check. Maybe we
should just push that check into kvrealloc for the !vmalloc_addr case.

> > > [SNIP]
> > > +/**
> > > + * drm_exec_cleanup - cleanup when contention is detected
> > > + * @exec: the drm_exec object to cleanup
> > > + *
> > > + * Cleanup the current state and return true if we should stay inside 
> > > the retry
> > > + * loop, false if there wasn't any contention detected and we can keep 
> > > the
> > > + * objects locked.
> > > + */
> > > +bool drm_exec_cleanup(struct drm_exec *exec)
> > > +{
> > > + if (likely(!exec->contended)) {
> > > + ww_acquire_done(&exec->ticket);
> > > + return false;
> > > + }
> > > +
> > > + if (likely(exec->contended == DRM_EXEC_DUMMY)) {
> > > + exec->contended = NULL;
> > > + ww_acquire_init(&exec->ticket, &reservation_ww_class);
> > Not sure why this is here instead of in _init()? I thought you're playing
> > some really dangerous tricks with re-initting the acquire ctx, which would
> > at least be questionable, but does not look like that.
> 
> That was my initial design, but the problem with this approach is that all
> locks taken between drm_exec_init() and the loop suddenly have a lockdep
> dependency on reservation_ww_class. And that in turn goes boom immediately.
> 
> Took me a moment to realize what's wrong with that as well.

Uh crap, indeed. I think minimally this needs to be documented, but
personally I'm leaning towards drm_exec_prepare_init(), which does this
explicitly.

I do agree we need this split, especially so we can eventually add helpers
for bo lookup, or maybe userptr/hmm prep and things like that, which all
has to be outside of the acquire_ctx.

> > [SNIP]
> > +/**
> > + * drm_exec_has_duplicates - check for duplicated GEM object
> > + * @exec: drm_exec object
> > + *
> > + * Return true if the drm_exec object has encountered some already locked 
> > GEM
> > + * objects while trying to lock them. This can happen if multiple GEM 
> > objects
> > + * share the same underlying resv object.
> > + */
> > +static inline bool drm_exec_has_duplicates(struct drm_exec *exec)
> > +{
> > +   return exec->duplicates.num_objects > 0;
> > Definitely an aside, but in our i915 efforts to get rid of temporary pins
> > we run into some fun where the eviction code couldn't differentiate between
> > memory we need reserved for the CS and memory we just keep locked because
> > we evicted it and fun stuff like that. So maybe we need a bit more
> > tracking here eventually, but that's only when we have this somehow glued
> > into ttm eviction code.
> 
> Hehe, yeah that's what I was thinking about as well. But then I though one
> step at a time.
> 
> > Also the even more massive step would be to glue this into dma-buf so you
> > can do dynamic dma-buf eviction and still keep track of all the buffers. I
>

Re: [PATCH 8/8] drm: move ttm_execbuf_util into vmwgfx

2022-05-09 Thread Daniel Vetter
On Wed, May 04, 2022 at 09:47:39AM +0200, Christian König wrote:
> VMWGFX is the only remaining user of this and should probably be moved over
> to drm_exec when it starts using GEM as well.
> 
> Signed-off-by: Christian König 

I guess this is a bit annoying since it means we can't require drm_exec in
ttm eviction, but we can make it an optional pointer in the ttm ctx. Needs
to be optional anyway since we won't roll this out to all drivers, and
then we can optionally use it to handle the locking in eviction instead of
the current lock dropping tricks.

I'm assuming at least that's your goal here, or is there a different one?
-Daniel

> ---
>  drivers/gpu/drm/ttm/Makefile  | 4 ++--
>  drivers/gpu/drm/vmwgfx/Makefile   | 2 +-
>  drivers/gpu/drm/{ttm => vmwgfx}/ttm_execbuf_util.c| 3 ++-
>  .../drm/ttm => drivers/gpu/drm/vmwgfx}/ttm_execbuf_util.h | 2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h   | 2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_validation.h| 2 +-
>  6 files changed, 8 insertions(+), 7 deletions(-)
>  rename drivers/gpu/drm/{ttm => vmwgfx}/ttm_execbuf_util.c (99%)
>  rename {include/drm/ttm => drivers/gpu/drm/vmwgfx}/ttm_execbuf_util.h (99%)
> 
> diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile
> index f906b22959cf..b05a8477d0d0 100644
> --- a/drivers/gpu/drm/ttm/Makefile
> +++ b/drivers/gpu/drm/ttm/Makefile
> @@ -3,8 +3,8 @@
>  # Makefile for the drm device driver.  This driver provides support for the
>  
>  ttm-y := ttm_tt.o ttm_bo.o ttm_bo_util.o ttm_bo_vm.o ttm_module.o \
> - ttm_execbuf_util.o ttm_range_manager.o ttm_resource.o ttm_pool.o \
> - ttm_device.o ttm_sys_manager.o
> + ttm_range_manager.o ttm_resource.o ttm_pool.o ttm_device.o \
> + ttm_sys_manager.o
>  ttm-$(CONFIG_AGP) += ttm_agp_backend.o
>  
>  obj-$(CONFIG_DRM_TTM) += ttm.o
> diff --git a/drivers/gpu/drm/vmwgfx/Makefile b/drivers/gpu/drm/vmwgfx/Makefile
> index eee73b9aa404..c2c836103b23 100644
> --- a/drivers/gpu/drm/vmwgfx/Makefile
> +++ b/drivers/gpu/drm/vmwgfx/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_hashtab.o vmwgfx_kms.o 
> vmwgfx_drv.o \
> - vmwgfx_ioctl.o vmwgfx_resource.o vmwgfx_ttm_buffer.o \
> + vmwgfx_ioctl.o vmwgfx_resource.o vmwgfx_ttm_buffer.o 
> ttm_execbuf_util.o \
>   vmwgfx_cmd.o vmwgfx_irq.o vmwgfx_ldu.o vmwgfx_ttm_glue.o \
>   vmwgfx_overlay.o vmwgfx_gmrid_manager.o vmwgfx_fence.o \
>   vmwgfx_bo.o vmwgfx_scrn.o vmwgfx_context.o \
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c 
> b/drivers/gpu/drm/vmwgfx/ttm_execbuf_util.c
> similarity index 99%
> rename from drivers/gpu/drm/ttm/ttm_execbuf_util.c
> rename to drivers/gpu/drm/vmwgfx/ttm_execbuf_util.c
> index dbee34a058df..1030f263ba07 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/vmwgfx/ttm_execbuf_util.c
> @@ -26,13 +26,14 @@
>   *
>   **/
>  
> -#include <drm/ttm/ttm_execbuf_util.h>
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  
> +#include "ttm_execbuf_util.h"
> +
>  static void ttm_eu_backoff_reservation_reverse(struct list_head *list,
> struct ttm_validate_buffer *entry)
>  {
> diff --git a/include/drm/ttm/ttm_execbuf_util.h 
> b/drivers/gpu/drm/vmwgfx/ttm_execbuf_util.h
> similarity index 99%
> rename from include/drm/ttm/ttm_execbuf_util.h
> rename to drivers/gpu/drm/vmwgfx/ttm_execbuf_util.h
> index a99d7fdf2964..47553bf31c73 100644
> --- a/include/drm/ttm/ttm_execbuf_util.h
> +++ b/drivers/gpu/drm/vmwgfx/ttm_execbuf_util.h
> @@ -33,7 +33,7 @@
>  
>  #include 
>  
> -#include "ttm_bo_api.h"
> +#include <drm/ttm/ttm_bo_api.h>
>  
>  /**
>   * struct ttm_validate_buffer
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> index be19aa6e1f13..cae306c60af9 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> @@ -37,8 +37,8 @@
>  #include 
>  
>  #include 
> -#include <drm/ttm/ttm_execbuf_util.h>
>  
> +#include "ttm_execbuf_util.h"
>  #include "ttm_object.h"
>  
>  #include "vmwgfx_fence.h"
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> index f21df053882b..3613a3d52528 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> @@ -31,7 +31,7 @@
>  #include 
>  #include 
>  
> -#include <drm/ttm/ttm_execbuf_util.h>
> +#include "ttm_execbuf_util.h"
>  
>  #include "vmwgfx_hashtab.h"
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/8] drm: execution context for GEM buffers v2

2022-05-09 Thread Daniel Vetter
> container
> + */
> +#define drm_exec_objects_for_each(array, index, obj) \
> + for (index = 0, obj = (array)->objects[0];  \
> +  index < (array)->num_objects;  \
> +  ++index, obj = (array)->objects[index])
> +
> +/**
> + * struct drm_exec - Execution context
> + */
> +struct drm_exec {
> + /**
> +  * @interruptible: If locks should be taken interruptible
> +  */
> + boolinterruptible;
> +
> + /**
> +  * @ticket: WW ticket used for acquiring locks
> +  */
> + struct ww_acquire_ctx   ticket;
> +
> + /**
> +  * @locked: container for the locked GEM objects
> +  */
> + struct drm_exec_objects locked;
> +
> + /**
> +  * @duplicates: container for the duplicated GEM objects
> +  */
> + struct drm_exec_objects duplicates;
> +
> + /**
> +  * @contended: contended GEM object we backed off for.
> +  */
> + struct drm_gem_object   *contended;
> +};
> +
> +/**
> + * drm_exec_for_each_locked_object - iterate over all the locked objects
> + * @exec: drm_exec object
> + * @index: unsigned long index for the iteration
> + * @obj: the current GEM object
> + *
> + * Iterate over all the locked GEM objects inside the drm_exec object.
> + */
> +#define drm_exec_for_each_locked_object(exec, index, obj)\
> + drm_exec_objects_for_each(&(exec)->locked, index, obj)
> +
> +/**
> + * drm_exec_for_each_duplicate_object - iterate over all the duplicate 
> objects
> + * @exec: drm_exec object
> + * @index: unsigned long index for the iteration
> + * @obj: the current GEM object
> + *
> + * Iterate over all the duplicate GEM objects inside the drm_exec object.
> + */
> +#define drm_exec_for_each_duplicate_object(exec, index, obj) \
> + drm_exec_objects_for_each(&(exec)->duplicates, index, obj)
> +
> +/**
> + * drm_exec_while_not_all_locked - loop until all GEM objects are prepared
> + * @exec: drm_exec object
> + *
> + * Core functionality of the drm_exec object. Loops until all GEM objects are
> + * prepared and no more contention exists.
> + *
> + * At the beginning of the loop it is guaranteed that no GEM object is 
> locked.
> + */
> +#define drm_exec_while_not_all_locked(exec)  \
> + while (drm_exec_cleanup(exec))
> +
> +/**
> + * drm_exec_continue_on_contention - continue the loop when we need to 
> cleanup
> + * @exec: drm_exec object
> + *
> + * Control flow helper to continue when a contention was detected and we 
> need to
> + * clean up and re-start the loop to prepare all GEM objects.
> + */
> +#define drm_exec_continue_on_contention(exec)\
> + if (unlikely(drm_exec_is_contended(exec)))  \
> + continue
> +
> +/**
> + * drm_exec_break_on_contention - break a subordinal loop on contention
> + * @exec: drm_exec object
> + *
> + * Control flow helper to break a subordinal loop when a contention was 
> detected
> + * and we need to clean up and re-start the loop to prepare all GEM objects.
> + */
> +#define drm_exec_break_on_contention(exec)   \
> + if (unlikely(drm_exec_is_contended(exec)))  \
> + break
> +
> +/**
> + * drm_exec_is_contended - check for contention
> + * @exec: drm_exec object
> + *
> + * Returns true if the drm_exec object has run into some contention while
> + * locking a GEM object and needs to clean up.
> + */
> +static inline bool drm_exec_is_contended(struct drm_exec *exec)
> +{
> + return !!exec->contended;
> +}
> +
> +/**
> + * drm_exec_has_duplicates - check for duplicated GEM object
> + * @exec: drm_exec object
> + *
> + * Return true if the drm_exec object has encountered some already locked GEM
> + * objects while trying to lock them. This can happen if multiple GEM objects
> + * share the same underlying resv object.
> + */
> +static inline bool drm_exec_has_duplicates(struct drm_exec *exec)
> +{
> + return exec->duplicates.num_objects > 0;

Definitely an aside, but in our i915 efforts to get rid of temporary pins
we run into some fun where the eviction code couldn't differentiate between
memory we need reserved for the CS and memory we just keep locked because
we evicted it and fun stuff like that. So maybe we need a bit more
tracking here eventually, but that's only when we have this somehow glued
into ttm eviction code.

Also the even more massive step would be to glue this into dma-buf so you
can do dynamic dma-buf eviction and still keep track of all the buffers. I
think with some clever pointer tagging and a bit more indirection we could
nest drm_exec structures (so that a driver could insert its entire
drm_exec structure with a drm_exec-level callback for handling refcounting
and stuff like that).

So anyway I think this all looks good, just one more thing before I think
we should land this:

gem helpers in drm_gem_lock_reservations() has something which is
practically compatible already, except that you bulk-add the entire set of
objects. I think if you add a bulk-prepare function then we could also
replace all those. Maybe even nicer if the bulk-prepare takes the array of
handles and does the handle lookup too, but at least something which can
substitute drm_gem_lock_reservations with drm_exec would be nice to
validate the helpers a bit more and really make sure we only have one of
them left.

Thoughts?
-Daniel

> +}
> +
> +void drm_exec_init(struct drm_exec *exec, bool interruptible);
> +void drm_exec_fini(struct drm_exec *exec);
> +bool drm_exec_cleanup(struct drm_exec *exec);
> +int drm_exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
> +  unsigned int num_fences);
> +
> +#endif
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/2] Revert "drm/amdgpu: disable runpm if we are the primary adapter"

2022-05-04 Thread Daniel Vetter
On Wed, May 04, 2022 at 09:48:32AM -0400, Alex Deucher wrote:
> This reverts commit b95dc06af3e683d6b7ddbbae178b2b2a21ee8b2b.
> 
> This workaround is no longer necessary.  We have a better workaround
> in commit f95af4a9236695 ("drm/amdgpu: don't runtime suspend if there are 
> displays attached (v3)").

I looked at this patch here quickly, and you still have a bit of a design
issue. The trouble is that this is a pretty nasty locking inversion
compared to any other drivers, because you check modeset locks within
runtime pm callbacks.

The way this is meant to work with atomic is that in your atomic commit
you grab/drop runtime pm references as needed (simple for pci devices, but
the arm-soc drivers have an rpm domain pretty much per plane/crtc/encoder
sometimes), in conjunction with drm_atomic_helper_commit_tail_rpm - if
you're using the default commit functions at least, so that ordering is
correct. Which doesn't apply to amdgpu.

I think in general it's an antipattern to check whether you're in use in
your suspend callback - it has gone boom wrt locking in a few places, and
also once you reject, I think nothing really tries again. The
autosuspend (if enabled) only kicks in when the refcount drops to zero.

Anyway nothing terrible, just more work to do here I guess, it's good to
drop the earlier approaches still.

On the series:

Acked-by: Daniel Vetter 
> 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 28 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  6 --
>  3 files changed, 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index d557f4db2565..682ec660f2c4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -981,7 +981,6 @@ struct amdgpu_device {
>   boolrunpm;
>   boolin_runpm;
>   boolhas_pr3;
> - boolis_fw_fb;
>  
>   boolpm_sysfs_en;
>   boolucode_sysfs_en;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index ebd37fb19cdb..3c198b2a86db 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -38,7 +38,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  
>  #include "amdgpu.h"
>  #include "amdgpu_irq.h"
> @@ -1950,26 +1949,6 @@ MODULE_DEVICE_TABLE(pci, pciidlist);
>  
>  static const struct drm_driver amdgpu_kms_driver;
>  
> -static bool amdgpu_is_fw_framebuffer(resource_size_t base,
> -  resource_size_t size)
> -{
> - bool found = false;
> -#if IS_REACHABLE(CONFIG_FB)
> - struct apertures_struct *a;
> -
> - a = alloc_apertures(1);
> - if (!a)
> - return false;
> -
> - a->ranges[0].base = base;
> - a->ranges[0].size = size;
> -
> - found = is_firmware_framebuffer(a);
> - kfree(a);
> -#endif
> - return found;
> -}
> -
>  static void amdgpu_get_secondary_funcs(struct amdgpu_device *adev)
>  {
>   struct pci_dev *p = NULL;
> @@ -2000,8 +1979,6 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
>   unsigned long flags = ent->driver_data;
>   int ret, retry = 0, i;
>   bool supports_atomic = false;
> - bool is_fw_fb;
> - resource_size_t base, size;
>  
>   /* skip devices which are owned by radeon */
>   for (i = 0; i < ARRAY_SIZE(amdgpu_unsupported_pciidlist); i++) {
> @@ -2068,10 +2045,6 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
>   }
>  #endif
>  
> - base = pci_resource_start(pdev, 0);
> - size = pci_resource_len(pdev, 0);
> - is_fw_fb = amdgpu_is_fw_framebuffer(base, size);
> -
>   /* Get rid of things like offb */
>   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
> &amdgpu_kms_driver);
>   if (ret)
> @@ -2084,7 +2057,6 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
>   adev->dev  = &pdev->dev;
>   adev->pdev = pdev;
>   ddev = adev_to_drm(adev);
> - adev->is_fw_fb = is_fw_fb;
>  
>   if (!supports_atomic)
>   ddev->driver_features &= ~DRIVER_ATOMIC;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 51bb977154eb..497478f8a5d3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -185,12 +185,6 @@ int amdgpu_d

Re: [PATCH] drm/amdgpu: Fix one use-after-free of VM

2022-04-13 Thread Daniel Vetter
refcounting would
probably be needed) or making it look like there's a guarantee that
really doesn't hold when you try to use it. The wait/wake_up function
pair really should not provide more ordering than just the barrier
(and that barrier is even conditional on "an actual wake-up has
happened").

I'm not exactly sure how to best fix this here, but I guess you either
want your own spinlock to protect the link between the fence and the
vm, or some other refcounting scheme changes (like you have the gpu
ctx that run on top of a vm hold the refence on the fence, and the
fence itself holds a full reference on the vm) to make sure there's
not use after free here.

I don't think the spinlock fence you propose below is enough, I think
you also need to protect any vm dereference from under that spinlock
(i.e. set some vm pointer to NULL while holding that spinlock, or
whatever you need to do to unlink the fence from the vm).
-Daniel

>
> Thanks,
> Christian.
>
> > + (void)dma_fence_wait(vm->last_delayed_tlb_flush, false);
> > + (void)dma_fence_get_status(vm->last_delayed_tlb_flush);
> > + dma_fence_put(vm->last_delayed_tlb_flush);
> > + }
> > +
> >   list_for_each_entry_safe(mapping, tmp, &vm->freed, list) {
> >   if (mapping->flags & AMDGPU_PTE_PRT && prt_fini_needed) {
> >   amdgpu_vm_prt_fini(adev, vm);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > index 1a814fb8..c1a48f5c1019 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > @@ -286,6 +286,7 @@ struct amdgpu_vm {
> >
> >   /* Last finished delayed update */
> >   atomic64_t  tlb_seq;
> > + struct dma_fence*last_delayed_tlb_flush;
> >
> >   /* Last unlocked submission to the scheduler entities */
> >   struct dma_fence*last_unlocked;
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround

2022-04-06 Thread Daniel Vetter
On Wed, Apr 06, 2022 at 09:51:20AM +0200, Christian König wrote:
> Rework the internals of the dma_resv object to allow adding more than one
> write fence and remember for each fence what purpose it had.
> 
> This allows removing the workaround from amdgpu which used a container for
> this instead.
> 
> Signed-off-by: Christian König 
> Cc: amd-gfx@lists.freedesktop.org

It is honestly all getting rather blurry, I think when it's all landed I
need to audit the entire tree and see what we missed. Anyway:

Reviewed-by: Daniel Vetter 

> ---
>  drivers/dma-buf/dma-resv.c  | 353 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h |   1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  53 +--
>  include/linux/dma-resv.h|  47 +--
>  4 files changed, 157 insertions(+), 297 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 543dae6566d2..378d47e1cfea 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -44,12 +44,12 @@
>  /**
>   * DOC: Reservation Object Overview
>   *
> - * The reservation object provides a mechanism to manage shared and
> - * exclusive fences associated with a buffer.  A reservation object
> - * can have attached one exclusive fence (normally associated with
> - * write operations) or N shared fences (read operations).  The RCU
> - * mechanism is used to protect read access to fences from locked
> - * write-side updates.
> + * The reservation object provides a mechanism to manage a container of
> + * dma_fence objects associated with a resource. A reservation object
> + * can have any number of fences attached to it. Each fence carries a usage
> + * parameter determining how the operation represented by the fence is using 
> the
> + * resource. The RCU mechanism is used to protect read access to fences from
> + * locked write-side updates.
>   *
>   * See struct dma_resv for more details.
>   */
> @@ -57,39 +57,59 @@
>  DEFINE_WD_CLASS(reservation_ww_class);
>  EXPORT_SYMBOL(reservation_ww_class);
>  
> +/* Mask for the lower fence pointer bits */
> +#define DMA_RESV_LIST_MASK   0x3
> +
>  struct dma_resv_list {
>   struct rcu_head rcu;
> - u32 shared_count, shared_max;
> - struct dma_fence __rcu *shared[];
> + u32 num_fences, max_fences;
> + struct dma_fence __rcu *table[];
>  };
>  
> -/**
> - * dma_resv_list_alloc - allocate fence list
> - * @shared_max: number of fences we need space for
> - *
> +/* Extract the fence and usage flags from an RCU protected entry in the 
> list. */
> +static void dma_resv_list_entry(struct dma_resv_list *list, unsigned int 
> index,
> + struct dma_resv *resv, struct dma_fence **fence,
> + enum dma_resv_usage *usage)
> +{
> + long tmp;
> +
> + tmp = (long)rcu_dereference_check(list->table[index],
> +   resv ? dma_resv_held(resv) : true);
> + *fence = (struct dma_fence *)(tmp & ~DMA_RESV_LIST_MASK);
> + if (usage)
> + *usage = tmp & DMA_RESV_LIST_MASK;
> +}
> +
> +/* Set the fence and usage flags at the specific index in the list. */
> +static void dma_resv_list_set(struct dma_resv_list *list,
> +   unsigned int index,
> +   struct dma_fence *fence,
> +   enum dma_resv_usage usage)
> +{
> + long tmp = ((long)fence) | usage;
> +
> + RCU_INIT_POINTER(list->table[index], (struct dma_fence *)tmp);
> +}
> +
> +/*
>   * Allocate a new dma_resv_list and make sure to correctly initialize
> - * shared_max.
> + * max_fences.
>   */
> -static struct dma_resv_list *dma_resv_list_alloc(unsigned int shared_max)
> +static struct dma_resv_list *dma_resv_list_alloc(unsigned int max_fences)
>  {
>   struct dma_resv_list *list;
>  
> - list = kmalloc(struct_size(list, shared, shared_max), GFP_KERNEL);
> + list = kmalloc(struct_size(list, table, max_fences), GFP_KERNEL);
>   if (!list)
>   return NULL;
>  
> - list->shared_max = (ksize(list) - offsetof(typeof(*list), shared)) /
> - sizeof(*list->shared);
> + list->max_fences = (ksize(list) - offsetof(typeof(*list), table)) /
> + sizeof(*list->table);
>  
>   return list;
>  }
>  
> -/**
> - * dma_resv_list_free - free fence list
> - * @list: list to free
> - *
> - * Free a dma_resv_list and make sure to drop all references.
> - */
> +/* Free a dma_resv_list and make sure to drop all references. */
>  static void dma_resv_list_free(struct dma_resv_list *list)
&

Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-30 Thread Daniel Vetter
On Tue, Mar 29, 2022 at 12:25:55PM -0400, Marek Olšák wrote:
> I don't know what iris does, but I would guess that the same problems as
> with AMD GPUs apply, making GPUs resets very fragile.

iris_batch_check_for_reset -> replace_kernel_ctx -> iris_lost_context_state

is I think the main call chain of how this is handled/detected. There's
also a side-chain which handles -EIO from execbuf.

Also this is using non-recoverable contexts, i.e. any time they suffer
a gpu reset (either because they're guilty themselves, or because they're
collateral damage of a reset that shot more than just the guilty context)
the context stops entirely and refuses any further execbuf with -EIO.

Cheers, Daniel

> 
> Marek
> 
> On Tue., Mar. 29, 2022, 08:14 Christian König, 
> wrote:
> 
> > My main question is what does the iris driver better than radeonsi when
> > the client doesn't support the robustness extension?
> >
> > From Daniels description it sounds like they have at least a partial
> > recovery mechanism in place.
> >
> > Apart from that I completely agree to what you said below.
> >
> > Christian.
> >
> > Am 26.03.22 um 01:53 schrieb Olsak, Marek:
> >
> > [AMD Official Use Only]
> >
> > amdgpu has 2 resets: soft reset and hard reset.
> >
> > The soft reset is able to recover from an infinite loop and even some GPU
> > hangs due to bad shaders or bad states. The soft reset uses a signal that
> > kills all currently-running shaders of a certain process (VM context),
> > which unblocks the graphics pipeline, so draws and command buffers finish
> > but are not correctly. This can then cause a hard hang if the shader was
> > supposed to signal work completion through a shader store instruction and a
> > non-shader consumer is waiting for it (skipping the store instruction by
> > killing the shader won't signal the work, and thus the consumer will be
> > stuck, requiring a hard reset).
> >
> > The hard reset can recover from other hangs, which is great, but it may
> > use a PCI reset, which erases VRAM on dGPUs. APUs don't lose memory
> > contents, but we should assume that any process that had running jobs on
> > the GPU during a GPU reset has its memory resources in an inconsistent
> > state, and thus following command buffers can cause another GPU hang. The
> > shader store example above is enough to cause another hard hang due to
> > incorrect content in memory resources, which can contain synchronization
> > primitives that are used internally by the hardware.
> >
> > Asking the driver to replay a command buffer that caused a hang is a sure
> > way to hang it again. Unrelated processes can be affected due to lost VRAM
> > or the misfortune of using the GPU while the GPU hang occurred. The window
> > system should recreate GPU resources and redraw everything without
> > affecting applications. If apps use GL, they should do the same. Processes
> > that can't recover by redrawing content can be terminated or left alone,
> > but they shouldn't be allowed to submit work to the GPU anymore.
> >
> > dEQP only exercises the soft reset. I think WebGL is only able to trigger
> > a soft reset at this point, but Vulkan can also trigger a hard reset.
> >
> > Marek
> > --
> > *From:* Koenig, Christian 
> > 
> > *Sent:* March 23, 2022 11:25
> > *To:* Daniel Vetter  ; Daniel Stone
> >  ; Olsak, Marek
> >  ; Grodzovsky, Andrey
> >  
> > *Cc:* Rob Clark  ; Rob Clark
> >  ; Sharma, Shashank
> >  ; Christian König
> >  ;
> > Somalapuram, Amaranath 
> > ; Abhinav Kumar 
> > ; dri-devel 
> > ; amd-gfx list
> >  ; Deucher,
> > Alexander  ;
> > Shashank Sharma 
> > 
> > *Subject:* Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
> >
> > [Adding Marek and Andrey as well]
> >
> > Am 23.03.22 um 16:14 schrieb Daniel Vetter:
> > > On Wed, 23 Mar 2022 at 15:07, Daniel Stone 
> >  wrote:
> > >> Hi,
> > >>
> > >> On Mon, 21 Mar 2022 at 16:02, Rob Clark 
> >  wrote:
> > >>> On Mon, Mar 21, 2022 at 2:30 AM Christian König
> > >>>   wrote:
> > >>>> Well you can, it just means that their contexts are lost as well.
> > >>> Which is rather inconvenient when deqp-egl reset tests, for example,
> > >>> take down your compositor ;-)
> > >> Yeah. Or anything WebGL.
> > >>
> > >> System-wide collateral damage is definitely a non-starter. If that
> > >> means that the userspace driver has to do what iris does and en

Re: [PATCH 18/23] drm/amdgpu: remove dma_resv workaround

2022-03-29 Thread Daniel Vetter
On Mon, Mar 21, 2022 at 02:58:51PM +0100, Christian König wrote:
> We can now add multiple writers to the dma_resv object.
> 
> Also enable the check for not adding containers in dma_resv.c again.
> 
> Signed-off-by: Christian König 
> Cc: amd-gfx@lists.freedesktop.org

It's a bit much magic, but that's the entire point of your huge prep
series to be able to have all the fences on a dma-resv :-)

Reviewed-by: Daniel Vetter 

> ---
>  drivers/dma-buf/dma-resv.c  |  6 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 51 ++---
>  3 files changed, 8 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 26257ba1527e..10d70812373c 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -308,10 +308,10 @@ void dma_resv_add_fence(struct dma_resv *obj, struct 
> dma_fence *fence,
>  
>   dma_resv_assert_held(obj);
>  
> - /* TODO: Drivers should not add containers here, instead add each fence
> -  * individually. Disabled for now until we cleaned up amdgpu/ttm.
> + /* Drivers should not add containers here, instead add each fence
> +  * individually.
>*/
> - /* WARN_ON(dma_fence_is_container(fence)); */
> + WARN_ON(dma_fence_is_container(fence));
>  
>   fobj = dma_resv_fences_list(obj);
>   count = fobj->num_fences;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> index 044b41f0bfd9..529d52a204cf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> @@ -34,7 +34,6 @@ struct amdgpu_fpriv;
>  struct amdgpu_bo_list_entry {
>   struct ttm_validate_buffer  tv;
>   struct amdgpu_bo_va *bo_va;
> - struct dma_fence_chain  *chain;
>   uint32_tpriority;
>   struct page **user_pages;
>   booluser_invalidated;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 1c039db976a9..88009833f523 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -575,14 +575,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
> *p,
>   struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
>  
>   e->bo_va = amdgpu_vm_bo_find(vm, bo);
> -
> - if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> - e->chain = dma_fence_chain_alloc();
> - if (!e->chain) {
> - r = -ENOMEM;
> - goto error_validate;
> - }
> - }
>   }
>  
>   amdgpu_cs_get_threshold_for_moves(p->adev, >bytes_moved_threshold,
> @@ -633,13 +625,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
> *p,
>   }
>  
>  error_validate:
> - if (r) {
> - amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> - dma_fence_chain_free(e->chain);
> - e->chain = NULL;
> - }
> + if (r)
>   ttm_eu_backoff_reservation(>ticket, >validated);
> - }
>  out:
>   return r;
>  }
> @@ -679,17 +666,9 @@ static void amdgpu_cs_parser_fini(struct 
> amdgpu_cs_parser *parser, int error,
>  {
>   unsigned i;
>  
> - if (error && backoff) {
> - struct amdgpu_bo_list_entry *e;
> -
> - amdgpu_bo_list_for_each_entry(e, parser->bo_list) {
> - dma_fence_chain_free(e->chain);
> - e->chain = NULL;
> - }
> -
> + if (error && backoff)
>   ttm_eu_backoff_reservation(>ticket,
>  >validated);
> - }
>  
>   for (i = 0; i < parser->num_post_deps; i++) {
>   drm_syncobj_put(parser->post_deps[i].syncobj);
> @@ -1264,29 +1243,9 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>  
>   amdgpu_vm_move_to_lru_tail(p->adev, >vm);
>  
> - amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> - struct dma_resv *resv = e->tv.bo->base.resv;
> - struct dma_fence_chain *chain = e->chain;
> - struct dma_resv_iter cursor;
> - struct dma_fence *fence;
> -
> - if (!chain)
> - continue;
> -
> - /*
> -

Re: [PATCH next, v2] kernel: Add 1 ms delay to init handler to fix s3 resume hang

2022-03-29 Thread Daniel Vetter
On Tue, Mar 29, 2022 at 08:20:24AM +0200, Christian König wrote:
> Am 29.03.22 um 05:05 schrieb Zhenneng Li:
> > This is a workaround for s3 resume hang for r7 340(amdgpu).
> > When we test s3 with r7 340 on arm64 platform, graphics card will hang up,
> > the error message are as follows:
> > Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.599374][ 7] [  T291] 
> > amdgpu :02:00.0: fb0: amdgpudrmfb frame buffer device
> > Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.612869][ 7] [  T291] 
> > [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block 
> >  failed -22
> > Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.623392][ 7] [  T291] 
> > amdgpu :02:00.0: amdgpu_device_ip_late_init failed
> > Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.630696][ 7] [  T291] 
> > amdgpu :02:00.0: Fatal error during GPU init
> > Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.637477][ 7] [  T291] 
> > [drm] amdgpu: finishing device.
> > 
> > On the following hardware:
> > lspci -nn -s 05:00.0
> > 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
> > [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 / Radeon 520 OEM] [1002:6611] 
> > (rev 87)
> 
> Well that's rather funny and certainly a NAK. To recap you are adding a
> delay to a delayed work handler. In other words you could delay the work
> handler in the first place :)
> 
> But this is not the reason why that here is a NAK. The more obvious problem
> is that we seem to have a race between the DPM code kicking in to save power
> after driver load and the asynchronous testing if userspace command
> submission works.
> 
> Adding the delay here works around that for the IB submission, but there can
> be other things going on in parallel which can fail as well.

Yeah standard pattern for this is to refcount your dpm code (using power
domains or runtime pm ideally or hand-rolled if you have to). And then
grabbing a dpm reference before you launch that work, and dropping that
when the work has finished.

That gives you a nice clean way to handle all these problems around "right
now I'm really not ready to allow low power states" in a very clean
fashion. arm-soc drivers go totally overboard on this with runtime pm on
all the chip components, that's maybe a bit much but afaiui we could do it
on big pci drivers with power domains too :-)

Also with power domains you get autosuspend delay timers for free and
tunable in sysfs ...

Cheers, Daniel

> 
> Please rather open up a bug report instead.
> 
> Regards,
> Christian.
> 
> > 
> > Signed-off-by: Zhenneng Li 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 3987ecb24ef4..1eced991b5b2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2903,6 +2903,8 @@ static void 
> > amdgpu_device_delayed_init_work_handler(struct work_struct *work)
> > container_of(work, struct amdgpu_device, 
> > delayed_init_work.work);
> > int r;
> > +   mdelay(1);
> > +
> > r = amdgpu_ib_ring_tests(adev);
> > if (r)
> > DRM_ERROR("ib ring test failed (%d).\n", r);
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 3/25] drm/amdgpu: Disable ABM when AC mode

2022-03-25 Thread Daniel Vetter
stem_supplied() > 0)
> > >   DRM_DEBUG_DRIVER("pm: AC\n");
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index abfcc1304ba0c..3a0afe7602727 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -3454,6 +3454,7 @@ int amdgpu_device_init(struct amdgpu_device
> > > *adev,
> > >   adev->gfx.gfx_off_req_count = 1;
> > >   adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> > > + adev->pm.old_ac_power = true;
> > >   atomic_set(>throttling_logging_enabled, 1);
> > >   /*
> > > diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
> > > b/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
> > > index 54a1408c8015c..478a734b66926 100644
> > > --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
> > > +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_abm.c
> > > @@ -23,6 +23,8 @@
> > > *
> > > */
> > > +#include 
> > > +#include "amdgpu.h"
> > >#include "dmub_abm.h"
> > >#include "dce_abm.h"
> > >#include "dc.h"
> > > @@ -51,6 +53,7 @@
> > >#define DISABLE_ABM_IMMEDIATELY 255
> > > +extern uint amdgpu_dm_abm_level;
> > >static void dmub_abm_enable_fractional_pwm(struct dc_context *dc)
> > >{
> > > @@ -117,28 +120,6 @@ static void dmub_abm_init(struct abm *abm, uint32_t 
> > > backlight)
> > >   dmub_abm_enable_fractional_pwm(abm->ctx);
> > >}
> > > -static unsigned int dmub_abm_get_current_backlight(struct abm *abm)
> > > -{
> > > - struct dce_abm *dce_abm = TO_DMUB_ABM(abm);
> > > - unsigned int backlight = REG_READ(BL1_PWM_CURRENT_ABM_LEVEL);
> > > -
> > > - /* return backlight in hardware format which is unsigned 17 bits, with
> > > -  * 1 bit integer and 16 bit fractional
> > > -  */
> > > - return backlight;
> > > -}
> > > -
> > > -static unsigned int dmub_abm_get_target_backlight(struct abm *abm) -{
> > > - struct dce_abm *dce_abm = TO_DMUB_ABM(abm);
> > > - unsigned int backlight = REG_READ(BL1_PWM_TARGET_ABM_LEVEL);
> > > -
> > > - /* return backlight in hardware format which is unsigned 17 bits, with
> > > -  * 1 bit integer and 16 bit fractional
> > > -  */
> > > - return backlight;
> > > -}
> > > -
> > >static bool dmub_abm_set_level(struct abm *abm, uint32_t level)
> > >{
> > >   union dmub_rb_cmd cmd;
> > > @@ -148,6 +129,9 @@ static bool dmub_abm_set_level(struct abm *abm, 
> > > uint32_t level)
> > >   int edp_num;
> > >   uint8_t panel_mask = 0;
> > > + if (power_supply_is_system_supplied() > 0)
> > > + level = 0;
> > > +
> > >   get_edp_links(dc->dc, edp_links, _num);
> > >   for (i = 0; i < edp_num; i++) {
> > > @@ -170,6 +154,36 @@ static bool dmub_abm_set_level(struct abm *abm, 
> > > uint32_t level)
> > >   return true;
> > >}
> > > +static unsigned int dmub_abm_get_current_backlight(struct abm *abm) {
> > > + struct dce_abm *dce_abm = TO_DMUB_ABM(abm);
> > > + unsigned int backlight = REG_READ(BL1_PWM_CURRENT_ABM_LEVEL);
> > > + struct dc_context *dc = abm->ctx;
> > > + struct amdgpu_device *adev = dc->driver_context;
> > > +
> > > + if (adev->pm.ac_power != adev->pm.old_ac_power) {
> > > + dmub_abm_set_level(abm, amdgpu_dm_abm_level);
> > > + adev->pm.ac_power = power_supply_is_system_supplied() > 0;
> > > + adev->pm.old_ac_power = adev->pm.ac_power;
> > > + }
> > > +
> > > + /* return backlight in hardware format which is unsigned 17 bits, with
> > > +  * 1 bit integer and 16 bit fractional
> > > +  */
> > > + return backlight;
> > > +}
> > > +
> > > +static unsigned int dmub_abm_get_target_backlight(struct abm *abm) {
> > > + struct dce_abm *dce_abm = TO_DMUB_ABM(abm);
> > > + unsigned int backlight = REG_READ(BL1_PWM_TARGET_ABM_LEVEL);
> > > +
> > > + /* return backlight in hardware format which is unsigned 17 bits, with
> > > +  * 1 bit integer and 16 bit fractional
> > > +  */
> > > + return backlight;
> > > +}
> > > +
> > >static bool dmub_abm_init_config(struct abm *abm,
> > >   const char *src,
> > >   unsigned int bytes,
> > > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> > > b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> > > index f6e0e7d8a0077..de459411a0e83 100644
> > > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> > > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
> > > @@ -445,6 +445,7 @@ struct amdgpu_pm {
> > >   uint32_tsmu_prv_buffer_size;
> > >   struct amdgpu_bo*smu_prv_buffer;
> > >   bool ac_power;
> > > + bool old_ac_power;
> > >   /* powerplay feature */
> > >   uint32_t pp_feature;
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] Commit messages (was: [PATCH v11] drm/amdgpu: add drm buddy support to amdgpu)

2022-03-24 Thread Daniel Vetter
On Wed, 23 Mar 2022 at 16:32, Christian König  wrote:
>
> Am 23.03.22 um 16:24 schrieb Daniel Stone:
> > On Wed, 23 Mar 2022 at 15:14, Alex Deucher  wrote:
> >> On Wed, Mar 23, 2022 at 11:04 AM Daniel Stone  wrote:
> >>> That's not what anyone's saying here ...
> >>>
> >>> No-one's demanding AMD publish RTL, or internal design docs, or
> >>> hardware specs, or URLs to JIRA tickets no-one can access.
> >>>
> >>> This is a large and invasive commit with pretty big ramifications;
> >>> containing exactly two lines of commit message, one of which just
> >>> duplicates the subject.
> >>>
> >>> It cannot be the case that it's completely impossible to provide any
> >>> justification, background, or details, about this commit being made.
> >>> Unless, of course, it's to fix a non-public security issue, that is
> >>> reasonable justification for eliding some of the details. But then
> >>> again, 'huge change which is very deliberately opaque' is a really
> >>> good way to draw a lot of attention to the commit, and it would be
> >>> better to provide more detail about the change to help it slip under
> >>> the radar.
> >>>
> >>> If dri-devel@ isn't allowed to inquire about patches which are posted,
> >>> then CCing the list is just a façade; might as well just do it all
> >>> internally and periodically dump out pull requests.
> >> I think we are in agreement. I think the withheld information
> >> Christian was referring to was on another thread with Christian and
> >> Paul discussing a workaround for a hardware bug:
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Famd-gfx%2Fmsg75908.htmldata=04%7C01%7Cchristian.koenig%40amd.com%7C6a3f2815d83b4872577008da0ce1347a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637836458652370599%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=QtNB0XHMhTgH%2FNHMwF23Qn%2BgSdYyHJSenbpP%2FHG%2BkxE%3Dreserved=0
> > Right, that definitely seems like some crossed wires. I don't see
> > anything wrong with that commit at all: the commit message and a
> > comment notes that there is a hardware issue preventing Raven from
> > being able to do TMZ+GTT, and the code does the very straightforward
> > and obvious thing to ensure that on VCN 1.0, any TMZ buffer must be
> > VRAM-placed.
> >
> > This one, on the other hand, is much less clear ...
>
> Yes, completely agree. I mean a good bunch of comments on commit
> messages are certainly valid and we could improve them.
>
> But this patch here was worked on by both AMD and Intel developers.
> Where both sides and I think even people from other companies perfectly
> understands why, what, how etc...
>
> When now somebody comes along and asks for a whole explanation of the
> context why we do it then that sounds really strange to me.

Yeah gpus are using pages a lot more like the cpu does (bigger pages are
a benefit, but not required, hence the buddy allocator to coalesce
them), and extremely funny contig allocations with bonkers
requirements aren't needed anymore (which was the speciality of
drm_mm.c). Hence why both i915 and amdgpu move over to this new buddy
allocator for managing vram.

I guess that could be added to the commit message, but also it's kinda
well known - the i915 patches also didn't explain why we want to
manage our vram with a buddy allocator (I think some of the earlier
versions explained it a bit, but the version with ttm integration that
landed didn't).

But yeah the confusing comments about hiding stuff that somehow
spilled over from other discussions into this didn't help :-/
-Daniel

> Thanks for jumping in here,
> Christian.
>
> >
> > Cheers,
> > Daniel
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-23 Thread Daniel Vetter
On Wed, 23 Mar 2022 at 15:07, Daniel Stone  wrote:
>
> Hi,
>
> On Mon, 21 Mar 2022 at 16:02, Rob Clark  wrote:
> > On Mon, Mar 21, 2022 at 2:30 AM Christian König
> >  wrote:
> > > Well you can, it just means that their contexts are lost as well.
> >
> > Which is rather inconvenient when deqp-egl reset tests, for example,
> > take down your compositor ;-)
>
> Yeah. Or anything WebGL.
>
> System-wide collateral damage is definitely a non-starter. If that
> means that the userspace driver has to do what iris does and ensure
> everything's recreated and resubmitted, that works too, just as long
> as the response to 'my adblocker didn't detect a crypto miner ad'  is
> something better than 'shoot the entire user session'.

Not sure where that idea came from, I thought at least I made it clear
that legacy gl _has_ to recover. It's only vk and arb_robustness gl
which should die without recovery attempt.

The entire discussion here is who should be responsible for replay and
at least if you can decide the uapi, then punting that entirely to
userspace is a good approach.

Ofc it'd be nice if the collateral damage is limited, i.e. requests
not currently on the gpu, or on different engines and all that
shouldn't be nuked, if possible.

Also ofc since the msm uapi is that the kernel tries to recover, there's
not much we can do there, contexts cannot be shot. But still trying to
replay them as much as possible feels a bit like overkill.
-Daniel

> Cheers,
> Daniel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-21 Thread Daniel Vetter
On Fri, Mar 18, 2022 at 08:12:54AM -0700, Rob Clark wrote:
> On Fri, Mar 18, 2022 at 12:42 AM Christian König
>  wrote:
> >
> > Am 17.03.22 um 18:31 schrieb Rob Clark:
> > > On Thu, Mar 17, 2022 at 10:27 AM Daniel Vetter  wrote:
> > >> [SNIP]
> > >>> (At some point, I'd like to use scheduler for the replay, and actually
> > >>> use drm_sched_stop()/etc.. but last time I looked there were still
> > >>> some sched bugs in that area which prevented me from deleting a bunch
> > >>> of code ;-))
> > >> Not sure about your hw, but at least on intel replaying tends to just
> > >> result in follow-on fun. And that holds even more so the more complex a
> > >> workload is. This is why vk just dies immediately and does not try to
> > >> replay anything, offloading it to the app. Same with arb robusteness.
> > >> Afaik it's really only media and classic gl which insist that the driver
> > >> stack somehow recover.
> > > At least for us, each submit must be self-contained (ie. not rely on
> > > previous GPU hw state), so in practice replay works out pretty well.
> > > The worst case is subsequent submits from same process fail as well
> > > (if they depended on something that crashing submit failed to write
> > > back to memory.. but in that case they just crash as well and we move
> > > on to the next one.. the recent gens (a5xx+ at least) are pretty good
> > > about quickly detecting problems and giving us an error irq.
> >
> > Well I absolutely agree with Daniel.
> >
> > The whole replay thing AMD did in the scheduler is an absolutely mess
> > and should probably be killed with fire.
> >
> > I strongly recommend not to do the same mistake in other drivers.
> >
> > If you want to have some replay feature then please make it driver
> > specific and don't use anything from the infrastructure in the DRM
> > scheduler.
> 
> hmm, perhaps I was not clear, but I'm only talking about re-emitting
> jobs *following* the faulting one (which could be from other contexts,
> etc).. not trying to restart the faulting job.

You absolutely can drop jobs on the floor, this is what both anv and iris
expect. They use what we call non-recoverable context, meaning when any
gpu hang happens and the context is affected (whether as the guilty one, or
because it was a multi-engine reset and it was victimized) we kill it
entirely. No replaying, and any further execbuf ioctl fails with -EIO.

Userspace then gets to sort out the mess, which for vk is
VK_ERROR_DEVICE_LOST, for robust gl it's the same, and for non-robust gl
iris re-creates a pile of things.

Anything in-between _is_ dropped on the floor completely.

Also note that this is obviously uapi; if you have a userspace which
expects contexts to survive, then replaying makes some sense.

> You *absolutely* need to replay jobs following the faulting one, they
> could be from unrelated contexts/processes.  You can't just drop them
> on the floor.
> 
> Currently it is all driver specific, but I wanted to delete a lot of
> code and move to using scheduler to handle faults/timeouts (but
> blocked on that until [1] is resolved)

Yeah for the drivers where the uapi is "you can safely replay after a
hang, and you're supposed to", then sharing the code is ofc a good idea.

Just wanted to make it clear that this is only one of many uapi flavours
you can pick from, dropping it all on the floor is a perfectly legit
approach :-) And imo it's the more robust one, and also better fits with
latest apis like gl_arb_robustness or vk.

Cheers, Daniel


> 
> [1] 
> https://patchwork.kernel.org/project/dri-devel/patch/1630457207-13107-2-git-send-email-monk....@amd.com/
> 
> BR,
> -R
> 
> > Thanks,
> > Christian.
> >
> > >
> > > BR,
> > > -R
> > >
> > >> And recovering from a mess in userspace is a lot simpler than trying to
> > >> pull of the same magic in the kernel. Plus it also helps with a few of 
> > >> the
> > >> dma_fence rules, which is a nice bonus.
> > >> -Daniel
> > >>
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Daniel Vetter
On Thu, Mar 17, 2022 at 08:40:51AM -0700, Rob Clark wrote:
> On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
> >
> > On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > > [SNIP]
> > > > just one point of clarification.. in the msm and i915 case it is
> > > > purely for debugging and telemetry (ie. sending crash logs back to
> > > > distro for analysis if user has crash reporting enabled).. it isn't
> > > > used for triggering any action like killing app or compositor.
> > >
> > > By the way, how does msm it's memory management for the devcoredumps?
> >
> > GFP_NORECLAIM all the way. It's purely best effort.
> >
> > Note that the fancy new plan for i915 discrete gpu is to only support gpu
> > crash dumps on non-recoverable gpu contexts, i.e. those that do not
> > continue to the next batch when something bad happens. This is what vk
> > wants and also what iris now uses (we do context recovery in userspace in
> > all cases), and non-recoverable contexts greatly simplify the crash dump
> > gather: Only thing you need to gather is the register state from hw
> > (before you reset it), all the batchbuffer bo and indirect state bo (in
> > i915 you can mark which bo to capture in the CS ioctl) can be captured in
> > a worker later on. Which for non-recoverable context is no issue, since
> > subsequent batchbuffers won't trample over any of these things.
> 
> fwiw, we snapshot everything (cmdstream and bo's marked with dump
> flag, in addition to hw state) before resuming the GPU, so there is no
> danger of things being trampled.  After state is captured and GPU
> reset, we "replay" the submits that were written into the ringbuffer
> after the faulting submit.  GPU crashes should be a thing you don't
> need to try to optimize.

Not sure why you think we optimize anything here?

> (At some point, I'd like to use scheduler for the replay, and actually
> use drm_sched_stop()/etc.. but last time I looked there were still
> some sched bugs in that area which prevented me from deleting a bunch
> of code ;-))

Not sure about your hw, but at least on intel replaying tends to just
result in follow-on fun. And that holds even more so the more complex a
workload is. This is why vk just dies immediately and does not try to
replay anything, offloading it to the app. Same with arb robusteness.
Afaik it's really only media and classic gl which insist that the driver
stack somehow recover.

And recovering from a mess in userspace is a lot simpler than trying to
pull off the same magic in the kernel. Plus it also helps with a few of the
dma_fence rules, which is a nice bonus.
-Daniel

> 
> BR,
> -R
> 
> >
> > And that way you can record the crashdump (or at least the big pieces like
> > all the indirect state stuff) with GFP_KERNEL.
> >
> > msm probably gets it wrong since embedded drivers have much less shrinker
> > and generally no mmu notifiers going on :-)
> >
> > > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > > path.
> > >
> > > > I would however *strongly* recommend devcoredump support in other GPU
> > > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > > to debug and fix a couple obscure issues that I was not able to
> > > > reproduce by myself.
> > >
> > > Yes, completely agree as well.
> >
> > +1
> >
> > Cheers, Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Daniel Vetter
On Thu, Mar 17, 2022 at 08:34:21AM -0700, Rob Clark wrote:
> On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
> >
> > On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > > [SNIP]
> > > > just one point of clarification.. in the msm and i915 case it is
> > > > purely for debugging and telemetry (ie. sending crash logs back to
> > > > distro for analysis if user has crash reporting enabled).. it isn't
> > > > used for triggering any action like killing app or compositor.
> > >
> > > By the way, how does msm it's memory management for the devcoredumps?
> >
> > GFP_NORECLAIM all the way. It's purely best effort.
> 
> We do one GEM obj allocation in the snapshot path (the hw has a
> mechanism to snapshot it's own state into a gpu buffer.. not sure if
> nice debugging functionality like that is a commentary on the blob
> driver quality, but I'm not complaining)
> 
> I suppose we could pre-allocate this buffer up-front.. but it doesn't
> seem like a problem, ie. if allocation fails we just skip snapshotting
> stuff that needs the hw crashdumper.  I guess since vram is not
> involved, perhaps that makes the situation a bit more straightforward.

The problem is that you need to allocate with GFP_ATOMIC, instead of
GFP_KERNEL, or things go very bad.

The scheduler dma-fence annotations I've had (well still have them here)
would catch this stuff, but thus far they got nowhere.

> > Note that the fancy new plan for i915 discrete gpu is to only support gpu
> > crash dumps on non-recoverable gpu contexts, i.e. those that do not
> > continue to the next batch when something bad happens. This is what vk
> > wants and also what iris now uses (we do context recovery in userspace in
> > all cases), and non-recoverable contexts greatly simplify the crash dump
> > gather: Only thing you need to gather is the register state from hw
> > (before you reset it), all the batchbuffer bo and indirect state bo (in
> > i915 you can mark which bo to capture in the CS ioctl) can be captured in
> > a worker later on. Which for non-recoverable context is no issue, since
> > subsequent batchbuffers won't trample over any of these things.
> >
> > And that way you can record the crashdump (or at least the big pieces like
> > all the indirect state stuff) with GFP_KERNEL.
> >
> > msm probably gets it wrong since embedded drivers have much less shrinker
> > and generally no mmu notifiers going on :-)
> 
> Note that the bo's associated with the batch are still pinned at this
> point, from the bo lifecycle the batch is still active.  So from the
> point of view of shrinker, there should be no interaction.  We aren't
> doing anything with mmu notifiers (yet), so not entirely sure offhand
> the concern there.
> 
> Currently we just use GFP_KERNEL and bail if allocation fails.

Yeah you have a simple enough shrinker for this not to be a problem. The
issue is that sooner or later things tend to not stay like that, and we're
trying to have common rules for dma_fence to make sure everyone follows
the same rules.
-Daniel

> 
> BR,
> -R
> 
> > > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > > path.
> > >
> > > > I would however *strongly* recommend devcoredump support in other GPU
> > > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > > to debug and fix a couple obscure issues that I was not able to
> > > > reproduce by myself.
> > >
> > > Yes, completely agree as well.
> >
> > +1
> >
> > Cheers, Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

