Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization

2024-03-26 Thread Qiang Ma
On Tue, 26 Mar 2024 23:51:45 -0400
Alex Deucher  wrote:

> On Tue, Mar 26, 2024 at 11:41 PM Qiang Ma 
> wrote:
> >
> > On Thu, 14 Mar 2024 14:40:40 +
> > "Deucher, Alexander"  wrote:
> >  
> > > [Public]
> > >  
> > > > -Original Message-
> > > > From: Qiang Ma 
> > > > Sent: Wednesday, March 13, 2024 2:18 AM
> > > > To: Deucher, Alexander ; Koenig,
> > > > Christian ; Pan, Xinhui
> > > > ; airl...@gmail.com; dan...@ffwll.ch;
> > > > SHANMUGAM, SRINIVASAN ;
> > > > sunran...@208suo.com Cc: amd-...@lists.freedesktop.org;
> > > > dri-devel@lists.freedesktop.org; linux- ker...@vger.kernel.org
> > > > Subject: Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt
> > > > ack bit before hpd initialization
> > > >
> > > > On Wed, 31 Jan 2024 15:57:03 +0800
> > > > Qiang Ma  wrote:
> > > >
> > > > Hello everyone, please help review this patch.  
> > >
> > > > This was applied back in January, sorry if I forgot to reply.
> > >
> > > Alex  
> >
> > Hi, Alex, it doesn't matter, please take some time to help review
> > this patch.
> >
> > This patch mainly solves the problem that after unplugging the HDMI
> > display during bios initialization, the display does not light up
> > after the system starts.
> >  
> 
> I already reviewed and applied the patch.  It's in mainline:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aeaf3e6cf84282500b6fa03621b0c225ce1af18a
> 
> Alex

Thank you.

Qiang Ma

> 
> > Qiang Ma  
> > >  
> > > >
> > > >   Qiang Ma
> > > >  
> > > > > > Problem:
> > > > > > If the HDMI display is unplugged while the BIOS is initializing
> > > > > > and plugged back in once the system is up, the hotplug
> > > > > > interrupt function is never entered and the display stays
> > > > > > dark.
> > > > > >
> > > > > > Fix:
> > > > > > When this happens the hpd interrupt ack bit is left set to 1,
> > > > > > so the interrupt should be cleared during hpd_init. Once the
> > > > > > driver is ready, it can then respond to the hpd interrupt
> > > > > > normally.
> > > > >
> > > > > Signed-off-by: Qiang Ma 
> > > > > ---
> > > > > v2:
> > > > >  - Remove unused variable 'tmp'
> > > > >  - Fixed function spelling errors
> > > > >
> > > > > drivers/gpu/drm/amd/amdgpu/dce_v10_0.c |  2 ++
> > > > > drivers/gpu/drm/amd/amdgpu/dce_v11_0.c |  2 ++
> > > > > drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 22
> > > > > ++---  
> > > > -  
> > > > > drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 22
> > > > > ++---  
> > > > -  
> > > > >  4 files changed, 40 insertions(+), 8 deletions(-)
> > > > >
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > > > index bb666cb7522e..12a8ba929a72 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > > > @@ -51,6 +51,7 @@
> > > > > >
> > > > > >  static void dce_v10_0_set_display_funcs(struct amdgpu_device *adev);
> > > > > >  static void dce_v10_0_set_irq_funcs(struct amdgpu_device *adev);
> > > > > > +static void dce_v10_0_hpd_int_ack(struct amdgpu_device *adev, int hpd);
> > > > > >  static const u32 crtc_offsets[] = {
> > > > > >     CRTC0_REGISTER_OFFSET,
> > > > > > @@ -363,6 +364,7 @@ static void dce_v10_0_hpd_init(struct amdgpu_device *adev)
> > > > > >     AMDGPU_HPD_DISCONNECT_INT_DELAY_IN_MS);
> > > > > >     WREG32(mmDC_HPD_TOGGLE_FILT_CNTL + hpd_offsets[amdgpu_connector->hpd.hpd], tmp);
> > > > > > +   dce_v10_0_hpd_int_ack(adev, amdgpu_connector->hpd.hpd);
> > > > > >     dce_v10_0_hpd_set_polarity(adev, amdgpu_connector->hpd.hpd);
> > > > > >     amdgpu_irq_get(adev, &adev->hpd_irq, amdgpu_connector->hpd.hpd);
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > > > index 7af277f61cca..745e4fdffade 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > > > @@ -51,6 +51,7 @@
> > > > > >
> > > > > >  static void dce_v11_0_set_display_funcs(struct amdgpu_device *adev);
> > > > > >  static void dce_v11_0_set_irq_funcs(struct amdgpu_device *adev);
> > > > > > +static void dce_v11_0_hpd_int_ack(struct amdgpu_device *adev, int hpd);
> > > > > >  static const u32 crtc_offsets[] =
> > > > > >  {
> > > > > > @@ -387,6 +388,7 @@ static void dce_v11_0_hpd_init(struct amdgpu_device *adev)
> > > > > >     AMDGPU_HPD_DISCONNECT_INT_DELAY_IN_MS);
> > > > > >     WREG32(mmDC_HPD_TOGGLE_FILT_CNTL + hpd_offsets[amdgpu_connector->hpd.hpd], tmp);
> > > > > > +   dce_v11_0_hpd_int_ack(adev, amdgpu_connector->hpd.hpd);
> > > > > >     dce_v11_0_hpd_set_polarity(adev, amdgpu_connector->hpd.hpd);
> > > > > >     amdgpu_irq_get(adev, &adev->hpd_irq, amdgpu_connector->hpd.hpd);
> > > > > >  }
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > > > index 143efc37a17f..28c4a735716b 100644
> > > > > > ---
> > > > > 

Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization

2024-03-26 Thread Alex Deucher
On Tue, Mar 26, 2024 at 11:41 PM Qiang Ma  wrote:
>
> On Thu, 14 Mar 2024 14:40:40 +
> "Deucher, Alexander"  wrote:
>
> > [Public]
> >
> > > -Original Message-
> > > From: Qiang Ma 
> > > Sent: Wednesday, March 13, 2024 2:18 AM
> > > To: Deucher, Alexander ; Koenig,
> > > Christian ; Pan, Xinhui
> > > ; airl...@gmail.com; dan...@ffwll.ch;
> > > SHANMUGAM, SRINIVASAN ;
> > > sunran...@208suo.com Cc: amd-...@lists.freedesktop.org;
> > > dri-devel@lists.freedesktop.org; linux- ker...@vger.kernel.org
> > > Subject: Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack
> > > bit before hpd initialization
> > >
> > > On Wed, 31 Jan 2024 15:57:03 +0800
> > > Qiang Ma  wrote:
> > >
> > > Hello everyone, please help review this patch.
> >
> > This was applied back in January, sorry if I forgot to reply.
> >
> > Alex
>
> Hi, Alex, it doesn't matter, please take some time to help review this
> patch.
>
> This patch mainly solves the problem that after unplugging the HDMI
> display during bios initialization, the display does not light up after
> the system starts.
>

I already reviewed and applied the patch.  It's in mainline:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aeaf3e6cf84282500b6fa03621b0c225ce1af18a

Alex

> Qiang Ma
> >
> > >
> > >   Qiang Ma
> > >
> > > > > Problem:
> > > > > If the HDMI display is unplugged while the BIOS is initializing
> > > > > and plugged back in once the system is up, the hotplug interrupt
> > > > > function is never entered and the display stays dark.
> > > > >
> > > > > Fix:
> > > > > When this happens the hpd interrupt ack bit is left set to 1, so
> > > > > the interrupt should be cleared during hpd_init. Once the driver
> > > > > is ready, it can then respond to the hpd interrupt normally.
> > > >
> > > > Signed-off-by: Qiang Ma 
> > > > ---
> > > > v2:
> > > >  - Remove unused variable 'tmp'
> > > >  - Fixed function spelling errors
> > > >
> > > > drivers/gpu/drm/amd/amdgpu/dce_v10_0.c |  2 ++
> > > > drivers/gpu/drm/amd/amdgpu/dce_v11_0.c |  2 ++
> > > > drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 22
> > > > ++---
> > > -
> > > > drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 22
> > > > ++---
> > > -
> > > >  4 files changed, 40 insertions(+), 8 deletions(-)
> > > >
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > > index bb666cb7522e..12a8ba929a72 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > > @@ -51,6 +51,7 @@
> > > > >
> > > > >  static void dce_v10_0_set_display_funcs(struct amdgpu_device *adev);
> > > > >  static void dce_v10_0_set_irq_funcs(struct amdgpu_device *adev);
> > > > > +static void dce_v10_0_hpd_int_ack(struct amdgpu_device *adev, int hpd);
> > > > >  static const u32 crtc_offsets[] = {
> > > > >     CRTC0_REGISTER_OFFSET,
> > > > > @@ -363,6 +364,7 @@ static void dce_v10_0_hpd_init(struct amdgpu_device *adev)
> > > > >     AMDGPU_HPD_DISCONNECT_INT_DELAY_IN_MS);
> > > > >     WREG32(mmDC_HPD_TOGGLE_FILT_CNTL + hpd_offsets[amdgpu_connector->hpd.hpd], tmp);
> > > > > +   dce_v10_0_hpd_int_ack(adev, amdgpu_connector->hpd.hpd);
> > > > >     dce_v10_0_hpd_set_polarity(adev, amdgpu_connector->hpd.hpd);
> > > > >     amdgpu_irq_get(adev, &adev->hpd_irq, amdgpu_connector->hpd.hpd);
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > > index 7af277f61cca..745e4fdffade 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > > @@ -51,6 +51,7 @@
> > > > >
> > > > >  static void dce_v11_0_set_display_funcs(struct amdgpu_device *adev);
> > > > >  static void dce_v11_0_set_irq_funcs(struct amdgpu_device *adev);
> > > > > +static void dce_v11_0_hpd_int_ack(struct amdgpu_device *adev, int hpd);
> > > > >  static const u32 crtc_offsets[] =
> > > > >  {
> > > > > @@ -387,6 +388,7 @@ static void dce_v11_0_hpd_init(struct amdgpu_device *adev)
> > > > >     AMDGPU_HPD_DISCONNECT_INT_DELAY_IN_MS);
> > > > >     WREG32(mmDC_HPD_TOGGLE_FILT_CNTL + hpd_offsets[amdgpu_connector->hpd.hpd], tmp);
> > > > > +   dce_v11_0_hpd_int_ack(adev, amdgpu_connector->hpd.hpd);
> > > > >     dce_v11_0_hpd_set_polarity(adev, amdgpu_connector->hpd.hpd);
> > > > >     amdgpu_irq_get(adev, &adev->hpd_irq, amdgpu_connector->hpd.hpd);
> > > > >  }
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > > index 143efc37a17f..28c4a735716b 100644
> > > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > > @@ -272,6 +272,21 @@ static void dce_v6_0_hpd_set_polarity(struct amdgpu_device *adev,
> > > > >     WREG32(mmDC_HPD1_INT_CONTROL + hpd_offsets[hpd], tmp);
> > > > >  }
> > > > >
> > > > > +static void dce_v6_0_hpd_int_ack(struct amdgpu_device *adev,
> 

Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization

2024-03-26 Thread Qiang Ma
On Thu, 14 Mar 2024 14:40:40 +
"Deucher, Alexander"  wrote:

> [Public]
> 
> > -Original Message-
> > From: Qiang Ma 
> > Sent: Wednesday, March 13, 2024 2:18 AM
> > To: Deucher, Alexander ; Koenig,
> > Christian ; Pan, Xinhui
> > ; airl...@gmail.com; dan...@ffwll.ch;
> > SHANMUGAM, SRINIVASAN ;
> > sunran...@208suo.com Cc: amd-...@lists.freedesktop.org;
> > dri-devel@lists.freedesktop.org; linux- ker...@vger.kernel.org
> > Subject: Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack
> > bit before hpd initialization
> >
> > On Wed, 31 Jan 2024 15:57:03 +0800
> > Qiang Ma  wrote:
> >
> > Hello everyone, please help review this patch.  
> 
> This was applied back in January, sorry if I forgot to reply.
> 
> Alex

Hi, Alex, it doesn't matter, please take some time to help review this
patch.

This patch mainly solves the problem that after unplugging the HDMI
display during bios initialization, the display does not light up after
the system starts.

Qiang Ma
> 
> >
> >   Qiang Ma
> >  
> > > > Problem:
> > > > If the HDMI display is unplugged while the BIOS is initializing
> > > > and plugged back in once the system is up, the hotplug interrupt
> > > > function is never entered and the display stays dark.
> > > >
> > > > Fix:
> > > > When this happens the hpd interrupt ack bit is left set to 1, so
> > > > the interrupt should be cleared during hpd_init. Once the driver
> > > > is ready, it can then respond to the hpd interrupt normally.
> > >
> > > Signed-off-by: Qiang Ma 
> > > ---
> > > v2:
> > >  - Remove unused variable 'tmp'
> > >  - Fixed function spelling errors
> > >
> > > drivers/gpu/drm/amd/amdgpu/dce_v10_0.c |  2 ++
> > > drivers/gpu/drm/amd/amdgpu/dce_v11_0.c |  2 ++
> > > drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 22
> > > ++---  
> > -  
> > > drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 22
> > > ++---  
> > -  
> > >  4 files changed, 40 insertions(+), 8 deletions(-)
> > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > index bb666cb7522e..12a8ba929a72 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> > > > @@ -51,6 +51,7 @@
> > > >
> > > >  static void dce_v10_0_set_display_funcs(struct amdgpu_device *adev);
> > > >  static void dce_v10_0_set_irq_funcs(struct amdgpu_device *adev);
> > > > +static void dce_v10_0_hpd_int_ack(struct amdgpu_device *adev, int hpd);
> > > >  static const u32 crtc_offsets[] = {
> > > >     CRTC0_REGISTER_OFFSET,
> > > > @@ -363,6 +364,7 @@ static void dce_v10_0_hpd_init(struct amdgpu_device *adev)
> > > >     AMDGPU_HPD_DISCONNECT_INT_DELAY_IN_MS);
> > > >     WREG32(mmDC_HPD_TOGGLE_FILT_CNTL + hpd_offsets[amdgpu_connector->hpd.hpd], tmp);
> > > > +   dce_v10_0_hpd_int_ack(adev, amdgpu_connector->hpd.hpd);
> > > >     dce_v10_0_hpd_set_polarity(adev, amdgpu_connector->hpd.hpd);
> > > >     amdgpu_irq_get(adev, &adev->hpd_irq, amdgpu_connector->hpd.hpd);
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > index 7af277f61cca..745e4fdffade 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
> > > > @@ -51,6 +51,7 @@
> > > >
> > > >  static void dce_v11_0_set_display_funcs(struct amdgpu_device *adev);
> > > >  static void dce_v11_0_set_irq_funcs(struct amdgpu_device *adev);
> > > > +static void dce_v11_0_hpd_int_ack(struct amdgpu_device *adev, int hpd);
> > > >  static const u32 crtc_offsets[] =
> > > >  {
> > > > @@ -387,6 +388,7 @@ static void dce_v11_0_hpd_init(struct amdgpu_device *adev)
> > > >     AMDGPU_HPD_DISCONNECT_INT_DELAY_IN_MS);
> > > >     WREG32(mmDC_HPD_TOGGLE_FILT_CNTL + hpd_offsets[amdgpu_connector->hpd.hpd], tmp);
> > > > +   dce_v11_0_hpd_int_ack(adev, amdgpu_connector->hpd.hpd);
> > > >     dce_v11_0_hpd_set_polarity(adev, amdgpu_connector->hpd.hpd);
> > > >     amdgpu_irq_get(adev, &adev->hpd_irq, amdgpu_connector->hpd.hpd);
> > > >  }
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > index 143efc37a17f..28c4a735716b 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
> > > > @@ -272,6 +272,21 @@ static void dce_v6_0_hpd_set_polarity(struct amdgpu_device *adev,
> > > >     WREG32(mmDC_HPD1_INT_CONTROL + hpd_offsets[hpd], tmp);
> > > >  }
> > > >
> > > +static void dce_v6_0_hpd_int_ack(struct amdgpu_device *adev,
> > > +int hpd)
> > > +{
> > > +   u32 tmp;
> > > +
> > > +   if (hpd >= adev->mode_info.num_hpd) {
> > > +   DRM_DEBUG("invalid hdp %d\n", hpd);
> > > +   return;
> > > +   }
> > > +
> > > +   tmp = RREG32(mmDC_HPD1_INT_CONTROL + hpd_offsets[hpd]);
> > > +   tmp |= DC_HPD1_INT_CONTROL__DC_HPD1_INT_ACK_MASK;
> > > > +   WREG32(mmDC_HPD1_INT_CONTROL + hpd_offsets[hpd], tmp);
> > > > +}
> > > +
> > >  /**
> > >   * 
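
For readers following the thread outside the kernel tree, the read-modify-write done by the quoted dce_v6_0_hpd_int_ack(), and the reason for calling it from hpd_init, can be illustrated in user space. This is only a sketch under stated assumptions: every fake_* name, the register array, the FAKE_INT_ACK_MASK value and the modelled behaviour (writing the ack bit clears a latched pending flag and the bit reads back as 0) are hypothetical stand-ins, not the amdgpu register interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_NUM_HPD      6     /* hypothetical number of hotplug pins */
#define FAKE_INT_ACK_MASK 0x1u  /* hypothetical position of the ack bit */

/* Hypothetical model of one HPD interrupt-control register per pin. */
struct fake_hpd_block {
	uint32_t int_control[FAKE_NUM_HPD];
	bool int_pending[FAKE_NUM_HPD]; /* interrupt latched before the driver loaded */
};

static uint32_t fake_rreg32(const struct fake_hpd_block *hw, int hpd)
{
	return hw->int_control[hpd];
}

/* Assumed hardware behaviour: writing the ack bit clears the latched interrupt. */
static void fake_wreg32(struct fake_hpd_block *hw, int hpd, uint32_t val)
{
	if (val & FAKE_INT_ACK_MASK)
		hw->int_pending[hpd] = false;
	hw->int_control[hpd] = val & ~FAKE_INT_ACK_MASK;
}

/* Same shape as the quoted helper: bounds check, read, set ack bit, write back. */
static bool fake_hpd_int_ack(struct fake_hpd_block *hw, int hpd)
{
	uint32_t tmp;

	if (hpd < 0 || hpd >= FAKE_NUM_HPD)
		return false;

	tmp = fake_rreg32(hw, hpd);
	tmp |= FAKE_INT_ACK_MASK;
	fake_wreg32(hw, hpd, tmp);
	return true;
}

/* Clear anything latched during firmware init before interrupts are enabled. */
static void fake_hpd_init(struct fake_hpd_block *hw)
{
	for (int hpd = 0; hpd < FAKE_NUM_HPD; hpd++) {
		bool ok = fake_hpd_int_ack(hw, hpd);

		assert(ok);
		(void)ok;
	}
}
```

In this model a pin whose interrupt was latched before fake_hpd_init() ran ends up with no pending interrupt afterwards, while the other bits of its control register are left untouched.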

Re: [PATCH 07/11] drm/dp: Add drm_dp_uhbr_channel_coding_supported()

2024-03-26 Thread Manasi Navare
Reviewed-by: Manasi Navare 

Manasi

On Tue, Mar 26, 2024 at 5:54 AM Nautiyal, Ankit K
 wrote:
>
>
> On 3/21/2024 1:41 AM, Imre Deak wrote:
> > Factor out a function to check for UHBR channel coding support used by a
> > follow-up patch in the patchset.
> >
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: Imre Deak 
>
> LGTM.
>
> Reviewed-by: Ankit Nautiyal 
>
> > ---
> >   drivers/gpu/drm/i915/display/intel_dp.c | 2 +-
> >   include/drm/display/drm_dp_helper.h | 6 ++
> >   2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index dbe65651bf277..1d13a1ba2b97d 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -217,7 +217,7 @@ static void intel_dp_set_dpcd_sink_rates(struct intel_dp *intel_dp)
> >* Sink rates for 128b/132b. If set, sink should support all 8b/10b
> >* rates and 10 Gbps.
> >*/
> > - if (intel_dp->dpcd[DP_MAIN_LINK_CHANNEL_CODING] & DP_CAP_ANSI_128B132B) {
> > + if (drm_dp_uhbr_channel_coding_supported(intel_dp->dpcd)) {
> >   u8 uhbr_rates = 0;
> >
> >   BUILD_BUG_ON(ARRAY_SIZE(intel_dp->sink_rates) < ARRAY_SIZE(dp_rates) + 3);
> > diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h
> > index a62fcd051d4d4..150c37a99a16f 100644
> > --- a/include/drm/display/drm_dp_helper.h
> > +++ b/include/drm/display/drm_dp_helper.h
> > @@ -221,6 +221,12 @@ drm_dp_channel_coding_supported(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
> >   return dpcd[DP_MAIN_LINK_CHANNEL_CODING] & DP_CAP_ANSI_8B10B;
> >   }
> >
> > +static inline bool
> > +drm_dp_uhbr_channel_coding_supported(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
> > +{
> > + return dpcd[DP_MAIN_LINK_CHANNEL_CODING] & DP_CAP_ANSI_128B132B;
> > +}
> > +
> >   static inline bool
> >   drm_dp_alternate_scrambler_reset_cap(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
> >   {
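
The helper added in the quoted hunk is a single bit test on the channel coding capability byte. The following user-space sketch shows the same pattern next to the existing 8b/10b check. The SKETCH_* constants are assumptions that follow my understanding of the DPCD receiver capability layout, and the sketch_* function names are hypothetical; the real definitions live in drm_dp_helper.h and drm_dp.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, modelled on the DPCD receiver capability layout. */
#define SKETCH_RECEIVER_CAP_SIZE        0xf
#define SKETCH_MAIN_LINK_CHANNEL_CODING 0x006
#define SKETCH_CAP_ANSI_8B10B           (1u << 0)
#define SKETCH_CAP_ANSI_128B132B        (1u << 1)

/* Mirrors the existing 8b/10b capability check. */
static bool sketch_channel_coding_supported(const uint8_t dpcd[SKETCH_RECEIVER_CAP_SIZE])
{
	assert(dpcd != NULL);
	return (dpcd[SKETCH_MAIN_LINK_CHANNEL_CODING] & SKETCH_CAP_ANSI_8B10B) != 0;
}

/* Mirrors the new helper: 128b/132b (UHBR) channel coding capability. */
static bool sketch_uhbr_channel_coding_supported(const uint8_t dpcd[SKETCH_RECEIVER_CAP_SIZE])
{
	assert(dpcd != NULL);
	return (dpcd[SKETCH_MAIN_LINK_CHANNEL_CODING] & SKETCH_CAP_ANSI_128B132B) != 0;
}
```

Factoring the test out this way lets callers ask the question by name instead of repeating the index and mask, which is what the intel_dp.c hunk above does.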


Re: [PATCH v2] Fix duplicate C declaration warnings

2024-03-26 Thread Donald Hunter
Akira Yokosawa  writes:
>
> That message of mine just pointed out that the Sphinx bug of false
> duplicate C declaration warning first reported by Mauro (+CC'd) at:
> https://github.com/sphinx-doc/sphinx/issues/8241 --
> "C domain issues when building the Linux Kernel documentation".
> It had not been resolved despite Mauro's recognition of the issue at the
> time.
>
> It was closed without fixing the bug, delegating the issue to an earlier
> one of the same nature at: https://github.com/sphinx-doc/sphinx/issues/7819 --
> "C, distinguish between ordinary identifiers and tag names", which was
> opened on Jun 12, 2020 and has not been resolved.  (almost 4 years ago!)
>
> There are two pull requests attempting to resolve the issue at:
> https://github.com/sphinx-doc/sphinx/pull/8313 --
> "C, distinguish between tag names and ordinary names" and
> https://github.com/sphinx-doc/sphinx/pull/8929 --
> "Intersphinx delegation to domains".
> PR #8313 needs #8929 as its prerequisite.
>
> Unfortunately, both PRs are still open as well as the issue #7819.
> Honestly speaking, I don't have any idea what prevents those pulls,
> give or take the need of rebasing with conflict resolution.
>
>>  So changing the
>> function name to something like "query_drm_format_info(u32 format)" is
>> a possible fix. Question is what should I rename this function to, that
>> aligns with the coding standards? Also suggest a new function name for
>> "drm_modeset_lock" that causes the second warning.
>
> So, I would rather not rename valid identifiers for the sake of working
> around a bug of Sphinx.  Rather, I'd appreciate if you'd send a message
> encouraging Sphinx devs to resolve the issue sooner rather than later.
>
> Thanks, Akira

Agreed, we should try and get the bug resolved in Sphinx. This same
issue came up in relation to this PR that I am working on so hopefully
we can work together to get fixes merged upstream:

https://github.com/sphinx-doc/sphinx/pull/12162

Thanks,
Donald.


Re: [PATCH 01/11] drm/i915/dp: Fix DSC line buffer depth programming

2024-03-26 Thread Manasi Navare
Hi Imre,

Thanks for the DSC fixes.
Would the line buf depth calculation that was getting set to 0 impact
DSC on all platforms, or was this issue specific to MTL, with the value
being set correctly on older platforms?
We didn't notice any DSC issues/corruptions with ADL based systems.

The actual change makes sense, I just want to confirm: does this apply to
all platforms or only to particular ones?
With that clarification:

Reviewed-by: Manasi Navare 

Regards
Manasi

On Tue, Mar 26, 2024 at 3:01 AM Nautiyal, Ankit K
 wrote:
>
>
> On 3/21/2024 1:41 AM, Imre Deak wrote:
> > Fix the calculation of the DSC line buffer depth. This is limited both
> > by the source's and sink's maximum line buffer depth, but the former one
> > was not taken into account. On all Intel platforms the source's maximum
> > buffer depth is 13, so the overall limit is simply the minimum of the
> > source/sink's limit, regardless of the DSC version.
> >
> > This leaves the DSI DSC line buffer depth calculation as-is, trusting
> > VBT.
> >
> > On DSC version 1.2 for sinks reporting a maximum line buffer depth of 16
> > the line buffer depth was incorrectly programmed as 0, leading to a
> > corruption in color gradients / lines on the decompressed screen image.
> >
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: Imre Deak 
>
> LGTM.
>
> Reviewed-by: Ankit Nautiyal 
>
> > ---
> >   drivers/gpu/drm/i915/display/intel_dp.c | 16 ++--
> >   include/drm/display/drm_dsc.h   |  3 ---
> >   2 files changed, 6 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index af7ca00e9bc0a..dbe65651bf277 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -89,6 +89,9 @@
> >   #define DP_DSC_MAX_ENC_THROUGHPUT_0 34
> >   #define DP_DSC_MAX_ENC_THROUGHPUT_1 40
> >
> > +/* Max DSC line buffer depth supported by HW. */
> > +#define INTEL_DP_DSC_MAX_LINE_BUF_DEPTH  13
> > +
> >   /* DP DSC FEC Overhead factor in ppm = 1/(0.972261) = 1.028530 */
> >   #define DP_DSC_FEC_OVERHEAD_FACTOR  1028530
> >
> > @@ -1703,7 +1706,6 @@ static int intel_dp_dsc_compute_params(const struct intel_connector *connector,
> >   {
> >   struct drm_i915_private *i915 = to_i915(connector->base.dev);
> >   struct drm_dsc_config *vdsc_cfg = &crtc_state->dsc.config;
> > - u8 line_buf_depth;
> >   int ret;
> >
> >   /*
> > @@ -1732,20 +1734,14 @@ static int intel_dp_dsc_compute_params(const struct intel_connector *connector,
> >   connector->dp.dsc_dpcd[DP_DSC_DEC_COLOR_FORMAT_CAP - DP_DSC_SUPPORT] &
> >   DP_DSC_RGB;
> >
> > - line_buf_depth = drm_dp_dsc_sink_line_buf_depth(connector->dp.dsc_dpcd);
> > - if (!line_buf_depth) {
> > + vdsc_cfg->line_buf_depth = min(INTEL_DP_DSC_MAX_LINE_BUF_DEPTH,
> > +                                drm_dp_dsc_sink_line_buf_depth(connector->dp.dsc_dpcd));
> > + if (!vdsc_cfg->line_buf_depth) {
> >   drm_dbg_kms(&i915->drm,
> >   "DSC Sink Line Buffer Depth invalid\n");
> >   return -EINVAL;
> >   }
> >
> > - if (vdsc_cfg->dsc_version_minor == 2)
> > - vdsc_cfg->line_buf_depth = (line_buf_depth == DSC_1_2_MAX_LINEBUF_DEPTH_BITS) ?
> > - DSC_1_2_MAX_LINEBUF_DEPTH_VAL : line_buf_depth;
> > - else
> > - vdsc_cfg->line_buf_depth = (line_buf_depth > DSC_1_1_MAX_LINEBUF_DEPTH_BITS) ?
> > - DSC_1_1_MAX_LINEBUF_DEPTH_BITS : line_buf_depth;
> > -
> >   vdsc_cfg->block_pred_enable =
> >   connector->dp.dsc_dpcd[DP_DSC_BLK_PREDICTION_SUPPORT - DP_DSC_SUPPORT] &
> >   DP_DSC_BLK_PREDICTION_IS_SUPPORTED;
> > diff --git a/include/drm/display/drm_dsc.h b/include/drm/display/drm_dsc.h
> > index bc90273d06a62..bbbe7438473d3 100644
> > --- a/include/drm/display/drm_dsc.h
> > +++ b/include/drm/display/drm_dsc.h
> > @@ -40,9 +40,6 @@
> >   #define DSC_PPS_RC_RANGE_MINQP_SHIFT11
> >   #define DSC_PPS_RC_RANGE_MAXQP_SHIFT6
> >   #define DSC_PPS_NATIVE_420_SHIFT1
> > -#define DSC_1_2_MAX_LINEBUF_DEPTH_BITS   16
> > -#define DSC_1_2_MAX_LINEBUF_DEPTH_VAL0
> > -#define DSC_1_1_MAX_LINEBUF_DEPTH_BITS   13
> >
> >   /**
> >* struct drm_dsc_rc_range_parameters - DSC Rate Control range parameters
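
The arithmetic described in the quoted commit message can be checked in isolation. The sketch below assumes the sink depth has already been decoded to a bit count (0 meaning invalid) and uses hypothetical sketch_* names. The first function reproduces only the DSC 1.2 branch that the patch removes, the second the min()-based replacement with the source limit of 13 taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Source limit stated in the patch: max DSC line buffer depth supported by HW. */
#define SKETCH_SOURCE_MAX_LINE_BUF_DEPTH 13

/* Removed DSC 1.2 branch: a sink depth of 16 was programmed as 0. */
static int sketch_old_dsc12_line_buf_depth(uint8_t sink_depth)
{
	return sink_depth == 16 ? 0 : sink_depth;
}

/* Replacement logic: clamp to the source limit, reject an invalid sink depth. */
static int sketch_new_line_buf_depth(uint8_t sink_depth)
{
	int depth = sink_depth < SKETCH_SOURCE_MAX_LINE_BUF_DEPTH ?
		    sink_depth : SKETCH_SOURCE_MAX_LINE_BUF_DEPTH;

	if (depth == 0)
		return -EINVAL;

	assert(depth >= 1 && depth <= SKETCH_SOURCE_MAX_LINE_BUF_DEPTH);
	return depth;
}
```

For a sink reporting 16, the old branch yields 0 while the new logic yields 13, which matches the corruption case the commit message describes. Sinks reporting 13 or less are passed through unchanged.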


Re: [PATCH 0/9] enabled -Wformat-truncation for clang

2024-03-26 Thread Jakub Kicinski
On Tue, 26 Mar 2024 23:37:59 +0100 Arnd Bergmann wrote:
> I hope that the patches can get picked up by platform maintainers
> directly, so the final patch can go in later on.

platform == subsystem? :)


Re: [RFC PATCH net-next v7 13/14] net: add devmem TCP documentation

2024-03-26 Thread Randy Dunlap
Hi,

On 3/26/24 15:50, Mina Almasry wrote:
> Add documentation outlining the usage and details of devmem TCP.
> 
> Signed-off-by: Mina Almasry 
> 
> ---
> 
> v7:
> - Applied docs suggestions (Jakub).
> 
> v2:
> 
> - Missing spdx (simon)
> - add to index.rst (simon)
> 
> ---
>  Documentation/networking/devmem.rst | 256 
>  Documentation/networking/index.rst  |   1 +
>  2 files changed, 257 insertions(+)
>  create mode 100644 Documentation/networking/devmem.rst
> 
> diff --git a/Documentation/networking/devmem.rst 
> b/Documentation/networking/devmem.rst
> new file mode 100644
> index ..b0899e8e9e83
> --- /dev/null
> +++ b/Documentation/networking/devmem.rst
> @@ -0,0 +1,256 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=================
> +Device Memory TCP
> +=================
> +
> +
> +Intro
> +=====
> +
> +Device memory TCP (devmem TCP) enables receiving data directly into device
> +memory (dmabuf). The feature is currently implemented for TCP sockets.
> +
> +
> +Opportunity
> +-----------
> +
> +A large number of data transfers have device memory as the source and/or
> +destination. Accelerators drastically increased the prevalence of such
> +transfers.  Some examples include:
> +
> +- Distributed training, where ML accelerators, such as GPUs on different hosts,
> +  exchange data.
> +
> +- Distributed raw block storage applications transfer large amounts of data with
> +  remote SSDs, much of this data does not require host processing.

SSDs. Much

> +
> +Typically the Device-to-Device data transfers the network are implemented as the

 in the network
?

> +following low level operations: Device-to-Host copy, Host-to-Host network

 low-level

> +transfer, and Host-to-Device copy.
> +
> +The flow involving host copies is suboptimal, especially for bulk data transfers,
> +and can put significant strains on system resources such as host memory
> +bandwidth and PCIe bandwidth.
> +
> +Devmem TCP optimizes this use case by implementing socket APIs that enable
> +the user to receive incoming network packets directly into device memory.
> +
> +Packet payloads go directly from the NIC to device memory.
> +
> +Packet headers go to host memory and are processed by the TCP/IP stack
> +normally. The NIC must support header split to achieve this.
> +
> +Advantages:
> +
> +- Alleviate host memory bandwidth pressure, compared to existing
> +  network-transfer + device-copy semantics.
> +
> +- Alleviate PCIe bandwidth pressure, by limiting data transfer to the lowest
> +  level of the PCIe tree, compared to traditional path which sends data through

  to the

> +  the root complex.
> +
> +
> +More Info
> +---------
> +
> +  slides, video
> +https://netdevconf.org/0x17/sessions/talk/device-memory-tcp.html
> +
> +  patchset
> +[RFC PATCH v6 00/12] Device Memory TCP
> +https://lore.kernel.org/netdev/20240305020153.2787423-1-almasrym...@google.com/
> +
> +
> +Interface
> +=========
> +
> +Example
> +-------
> +
> +tools/testing/selftests/net/ncdevmem.c:do_server shows an example of setting up
> +the RX path of this API.
> +
> +NIC Setup
> +---------
> +
> +Header split, flow steering, & RSS are required features for devmem TCP.
> +
> +Header split is used to split incoming packets into a header buffer in host
> +memory, and a payload buffer in device memory.
> +
> +Flow steering & RSS are used to ensure that only flows targeting devmem land on
> +RX queue bound to devmem.

   an RX queue
?

> +
> +Enable header split & flow steering::
> +
> + # enable header split
> + ethtool -G eth1 tcp-data-split on
> +
> +
> + # enable flow steering
> + ethtool -K eth1 ntuple on
> +
> +Configure RSS to steer all traffic away from the target RX queue (queue 15 in
> +this example)::
> +
> + ethtool --set-rxfh-indir eth1 equal 15
> +
> +
> +The user must bind a dmabuf to any number of RX queues on a given NIC using
> +netlink API::

   the netlink API::

> +
> + /* Bind dmabuf to NIC RX queue 15 */
> + struct netdev_queue *queues;
> + queues = malloc(sizeof(*queues) * 1);
> +
> + queues[0]._present.type = 1;
> + queues[0]._present.idx = 1;
> + queues[0].type = NETDEV_RX_QUEUE_TYPE_RX;
> + queues[0].idx = 15;
> +
> + *ys = ynl_sock_create(_netdev_family, );
> +
> + req = netdev_bind_rx_req_alloc();
> + netdev_bind_rx_req_set_ifindex(req, 1 /* ifindex */);
> + netdev_bind_rx_req_set_dmabuf_fd(req, dmabuf_fd);
> + __netdev_bind_rx_req_set_queues(req, queues, n_queue_index);
> +
> + rsp = netdev_bind_rx(*ys, req);
> +
> + dmabuf_id = rsp->dmabuf_id;
> +
> +
> +The netlink API returns a dmabuf_id: a unique ID that refers to this dmabuf
> +that has been bound.
> +
> +Socket Setup
> +------------
> +
> +The socket must be flow steering to the dmabuf bound RX queue::

 

Re: [PATCH v4 10/16] drm/msm: generate headers on the fly

2024-03-26 Thread Dmitry Baryshkov
On Wed, 27 Mar 2024 at 01:49, Abhinav Kumar  wrote:
>
>
>
> On 3/22/2024 3:57 PM, Dmitry Baryshkov wrote:
> > Generate DRM/MSM headers on the fly during kernel build. This removes a
> > need to push register changes to Mesa with the following manual
> > synchronization step. Existing headers will be removed in the following
> > commits (split away to ease reviews).
> >
>
> This change does two things:
>
> 1) move adreno folder compilation under "adreno-y", move display related
> files compilation under "msm-display-y", move common files under "msm-y"
>
> 2) changes to generate the header using gen_header.py
>
> Why not split it into two changes?

Basically because there is no difference between object files before
we start moving headers.

>
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >   drivers/gpu/drm/msm/.gitignore |  1 +
> >   drivers/gpu/drm/msm/Makefile   | 97 +-
> >   drivers/gpu/drm/msm/msm_drv.c  |  3 +-
> >   drivers/gpu/drm/msm/msm_gpu.c  |  2 +-
> >   4 files changed, 80 insertions(+), 23 deletions(-)
> >
>
> 
>
> Are below two changes related to this patch?

Ack, I'll move it to a separate patch.

>
> > +targets += $(ADRENO_HEADERS) $(DISPLAY_HEADERS)
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 97790faffd23..9c33f4e3f822 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -17,8 +17,9 @@
> >
> >   #include "msm_drv.h"
> >   #include "msm_debugfs.h"
> > +#include "msm_gem.h"
> > +#include "msm_gpu.h"
> >   #include "msm_kms.h"
> > -#include "adreno/adreno_gpu.h"
> >
> >   /*
> >* MSM driver version:
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index 655002b21b0d..cd185b9636d2 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -11,7 +11,7 @@
> >   #include "msm_mmu.h"
> >   #include "msm_fence.h"
> >   #include "msm_gpu_trace.h"
> > -#include "adreno/adreno_gpu.h"
> > +//#include "adreno/adreno_gpu.h"
>
> you can just drop this line

Ack

>
> >
> >   #include 
> >   #include 
> >



-- 
With best wishes
Dmitry


Re: [PATCH v4 09/16] drm/msm: import gen_header.py script from Mesa

2024-03-26 Thread Dmitry Baryshkov
On Wed, 27 Mar 2024 at 00:34, Abhinav Kumar  wrote:
>
>
>
> On 3/26/2024 3:25 PM, Dmitry Baryshkov wrote:
> > On Wed, 27 Mar 2024 at 00:19, Abhinav Kumar  
> > wrote:
> >>
> >>
> >>
> >> On 3/22/2024 3:57 PM, Dmitry Baryshkov wrote:
> >>> Import the gen_headers.py script from Mesa, commit FIXME. This script
> >>> will be used to generate MSM register files on the fly during
> >>> compilation.
> >>>
> >>> Signed-off-by: Dmitry Baryshkov 
> >>> ---
> >>>drivers/gpu/drm/msm/registers/gen_header.py | 957 
> >>> 
> >>>1 file changed, 957 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/registers/gen_header.py b/drivers/gpu/drm/msm/registers/gen_header.py
> >>> new file mode 100644
> >>> index ..ae39b7e6cde8
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/msm/registers/gen_header.py
> >>> @@ -0,0 +1,957 @@
> >>> +#!/usr/bin/python3
> >>> +
> >>
> >> We need a licence and copyright here.
> >
> > Yes, this is going to be fixed in the next revision. Mesa already got
> > the proper SPDX header here.
> >
> >>
> >> Also is something like a "based on" applicable here?
> >>
> >> 
> >>
> >>> +import xml.parsers.expat
> >>> +import sys
> >>> +import os
> >>> +import collections
> >>> +import argparse
> >>> +import time
> >>> +import datetime
> >>> +
> >>> +class Error(Exception):
> >>> +This file was generated by the rules-ng-ng gen_header.py tool in this 
> >>> git repository:
> >>> +http://gitlab.freedesktop.org/mesa/mesa/
> >>> +git clone https://gitlab.freedesktop.org/mesa/mesa.git
> >>> +
> >>> +The rules-ng-ng source files this header was generated from are:
> >>
> >> Is this still applicable ?
> >>
> >> Now gen_header.py is moved to kernel.
> >>
> >
> > Copied, not moved. So Mesa remains the primary source for Adreno
> > headers and gen_header.py
> >
>
> But all future development and code review on gen_header.py will be done
> in kernel itself OR periodically we will sync it up with mesa?

We'd sync from kernel.


-- 
With best wishes
Dmitry


Re: [PATCH v4 10/16] drm/msm: generate headers on the fly

2024-03-26 Thread Abhinav Kumar




On 3/22/2024 3:57 PM, Dmitry Baryshkov wrote:

Generate DRM/MSM headers on the fly during kernel build. This removes a
need to push register changes to Mesa with the following manual
synchronization step. Existing headers will be removed in the following
commits (split away to ease reviews).



This change does two things:

1) move adreno folder compilation under "adreno-y", move display related
files compilation under "msm-display-y", move common files under "msm-y"


2) changes to generate the header using gen_header.py

Why not split it into two changes?


Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/.gitignore |  1 +
  drivers/gpu/drm/msm/Makefile   | 97 +-
  drivers/gpu/drm/msm/msm_drv.c  |  3 +-
  drivers/gpu/drm/msm/msm_gpu.c  |  2 +-
  4 files changed, 80 insertions(+), 23 deletions(-)





Are below two changes related to this patch?


+targets += $(ADRENO_HEADERS) $(DISPLAY_HEADERS)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 97790faffd23..9c33f4e3f822 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -17,8 +17,9 @@
  
  #include "msm_drv.h"

  #include "msm_debugfs.h"
+#include "msm_gem.h"
+#include "msm_gpu.h"
  #include "msm_kms.h"
-#include "adreno/adreno_gpu.h"
  
  /*

   * MSM driver version:
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 655002b21b0d..cd185b9636d2 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -11,7 +11,7 @@
  #include "msm_mmu.h"
  #include "msm_fence.h"
  #include "msm_gpu_trace.h"
-#include "adreno/adreno_gpu.h"
+//#include "adreno/adreno_gpu.h" 


you can just drop this line

  
  #include 

  #include 



Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS

2024-03-26 Thread Andi Shyti
Hi Matt,

On Tue, Mar 26, 2024 at 02:30:33PM -0700, Matt Roper wrote:
> On Tue, Mar 26, 2024 at 07:42:34PM +0100, Andi Shyti wrote:
> > On Tue, Mar 26, 2024 at 09:03:10AM -0700, Matt Roper wrote:
> > > On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> > > > +   /*
> > > > +* Do not create the command streamer for CCS 
> > > > slices
> > > > +* beyond the first. All the workload submitted 
> > > > to the
> > > > +* first engine will be shared among all the 
> > > > slices.
> > > > +*
> > > > +* Once the user will be allowed to customize 
> > > > the CCS
> > > > +* mode, then this check needs to be removed.
> > > > +*/
> > > > +   if (IS_DG2(i915) &&
> > > > +   class == COMPUTE_CLASS &&
> > > > +   ccs_instance++)
> > > > +   continue;
> > > 
> > > Wouldn't it be more intuitive to drop the non-lowest CCS engines in
> > > init_engine_mask() since that's the function that's dedicated to
> > > building the list of engines we'll use?  Then we don't need to kill the
> > > assertion farther down either.
> > 
> > Because we don't check the result of init_engine_mask() while
> > creating the engine's structure. We check it only after and
> > indeed I removed the drm_WARN_ON() check.
> > 
> > I think the whole process of creating the engine's structure in
> > the intel_engines_init_mmio() can be simplified, but this goes
> > beyond the scope of the series.
> > 
> > Or am I missing something?
> 
> The important part of init_engine_mask isn't the return value, but
> rather that it's what sets up gt->info.engine_mask.  The HAS_ENGINE()
> check that intel_engines_init_mmio() uses is based on the value stored
> there, so updating that function will also ensure that we skip the
> engines we don't want in the loop.

Yes, can do like this, as well. After all this is done I'm going
to do some cleanup here, as well.

Thanks,
Andi


Re: [PATCH 01/12] kbuild: make -Woverride-init warnings more consistent

2024-03-26 Thread Andrew Jeffery
On Tue, 2024-03-26 at 15:47 +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The -Woverride-init option warns about code that may be intentional or not,
> but the unintentional ones tend to be real bugs, so there is a bit of
> disagreement on whether this warning option should be enabled by default
> and we have multiple settings in scripts/Makefile.extrawarn as well as
> individual subsystems.
> 
> Older versions of clang only supported -Wno-initializer-overrides with
> the same meaning as gcc's -Woverride-init, though all supported versions
> now work with both. Because of this difference, an earlier cleanup of
> mine accidentally turned the clang warning off for W=1 builds and only
> left it on for W=2, while it's still enabled for gcc with W=1.
> 
> There is also one driver that only turns the warning off for newer
> versions of gcc but not other compilers, and some but not all the
> Makefiles still use a cc-disable-warning conditional that is no
> longer needed with supported compilers here.
> 
> Address all of the above by removing the special cases for clang
> and always turning the warning off unconditionally where it got
> in the way, using the syntax that is supported by both compilers.
> 
> Fixes: 2cd3271b7a31 ("kbuild: avoid duplicate warning options")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/amd/display/dc/dce110/Makefile |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce112/Makefile |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce120/Makefile |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce60/Makefile  |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce80/Makefile  |  2 +-
>  drivers/gpu/drm/i915/Makefile  |  6 +++---
>  drivers/gpu/drm/xe/Makefile|  4 ++--
>  drivers/net/ethernet/renesas/sh_eth.c  |  2 +-
>  drivers/pinctrl/aspeed/Makefile|  2 +-

For the Aspeed change:

Acked-by: Andrew Jeffery 

Thanks!


Re: [RESEND v3 2/2] drm: Add CONFIG_DRM_WERROR

2024-03-26 Thread Nathan Chancellor
On Tue, Mar 05, 2024 at 11:07:36AM +0200, Jani Nikula wrote:
> Add kconfig to enable -Werror subsystem wide. This is useful for
> development and CI to keep the subsystem warning free, while avoiding
> issues outside of the subsystem that kernel wide CONFIG_WERROR=y might
> hit.
> 
> v2: Don't depend on COMPILE_TEST
> 
> Reviewed-by: Hamza Mahfooz  # v1
> Signed-off-by: Jani Nikula 
> ---
>  drivers/gpu/drm/Kconfig  | 13 +
>  drivers/gpu/drm/Makefile |  3 +++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 6e853acf15da..c08e18108c2a 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -416,3 +416,16 @@ config DRM_LIB_RANDOM
>  config DRM_PRIVACY_SCREEN
>   bool
>   default n
> +
> +config DRM_WERROR
> + bool "Compile the drm subsystem with warnings as errors"
> + depends on EXPERT
> + default n
> + help
> +   A kernel build should not cause any compiler warnings, and this
> +   enables the '-Werror' flag to enforce that rule in the drm subsystem.
> +
> +   The drm subsystem enables more warnings than the kernel default, so
> +   this config option is disabled by default.
> +
> +   If in doubt, say N.

While I understand the desire for an easy switch that maintainers and
developers can use to ensure that their changes are warning free for the
drm subsystem specifically, I think subsystem specific configuration
options like this are actively detrimental to developers and continuous
integration systems that build test the entire kernel. For example, we
turned off CONFIG_WERROR for our Hexagon builds because of warnings that
appear with -Wextra that are legitimate but require treewide changes to
resolve in a manner sufficient for Linus:

https://github.com/ClangBuiltLinux/linux/issues/1285
https://lore.kernel.org/all/CAHk-=wg80je=k7madf4e7wrrnp37e3qh6y10svhdc7o8sz_...@mail.gmail.com/
https://lore.kernel.org/all/20230522105049.1467313-1-schne...@linux.ibm.com/

But now, due to CONFIG_DRM_WERROR getting enabled by all{mod,yes}config
and -Wextra being unconditionally enabled for DRM, those warnings hard
break the build despite CONFIG_WERROR=n...

https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2eEBDGEqfmMZjGg3ZvDx2af2pde/build.log

Same thing with PowerPC allmodconfig because we see -Wframe-larger-than
that appears because allmodconfig enables CONFIG_KASAN or CONFIG_KCSAN
usually:

https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2eE2HDsODudQGqkMKAPQnId7pRd/build.log

I don't know what the solution for this conflict is though. I guess it
is just the nature of the kernel being a federation of independent
subsystems that want to have their own policies. I suppose we can just
set CONFIG_DRM_WERROR=n and be done with it but I would like to avoid
this issue from spreading to other subsystems because it does not scale
for folks like us who do many builds across many trees.

It would be nice if there was something like CONFIG_WERROR_DIRS or
something that could take a set of directories that should have -Werror
enabled so that you could do something like

  CONFIG_WERROR_DIRS="drivers/gpu/drm"

and have -Werror automatically added to all commands within that
directory like subdir-ccflags-y but it is explicitly opt in on the part
of the developer/tester, rather than just happening to get enabled due
to all{mod,yes}config. No idea if that is feasible or not though.

> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index ea456f057e8a..a73c04d2d7a3 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -30,6 +30,9 @@ subdir-ccflags-y += -Wno-sign-compare
>  endif
>  # --- end copy-paste
>  
> +# Enable -Werror in CI and development
> +subdir-ccflags-$(CONFIG_DRM_WERROR) += -Werror
> +
>  drm-y := \
>   drm_aperture.o \
>   drm_atomic.o \
> -- 
> 2.39.2
> 


[RFC PATCH net-next v7 04/14] netdev: support binding dma-buf to netdevice

2024-03-26 Thread Mina Almasry
Add a netdev_dmabuf_binding struct which represents the
dma-buf-to-netdevice binding. The netlink API will bind the dma-buf to
rx queues on the netdevice. On the binding, the dma_buf_attach
& dma_buf_map_attachment will occur. The entries in the sg_table from
mapping will be inserted into a genpool to make it ready
for allocation.

The chunks in the genpool are owned by a dmabuf_chunk_owner struct which
holds the dma-buf offset of the base of the chunk and the dma_addr of
the chunk. Both are needed to use allocations that come from this chunk.

We create a new type that represents an allocation from the genpool:
net_iov. We set the net_iov allocation size in the genpool to PAGE_SIZE
for simplicity, matching the PAGE_SIZE normally allocated by the page
pool and given to the drivers.
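
As a rough illustration of why the chunk owner keeps both the dma-buf
offset and the dma_addr of the chunk base, here is a minimal userspace
sketch of the per-slot address arithmetic. The names demo_chunk_owner,
demo_slot_dma_addr, demo_slot_dmabuf_offset and DEMO_PAGE_SHIFT are
hypothetical and not taken from the patch, and a 4 KiB page size is
assumed.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the per-chunk owner described above. */
struct demo_chunk_owner {
	uint64_t base_dmabuf_offset; /* dma-buf offset of the chunk base */
	uint64_t base_dma_addr;      /* dma_addr of the chunk base */
};

/* Device-visible address of the idx-th PAGE_SIZE slot in the chunk. */
static inline uint64_t demo_slot_dma_addr(const struct demo_chunk_owner *owner,
					  uint32_t idx)
{
	return owner->base_dma_addr + ((uint64_t)idx << DEMO_PAGE_SHIFT);
}

/* Offset of the same slot measured from the start of the dma-buf. */
static inline uint64_t demo_slot_dmabuf_offset(const struct demo_chunk_owner *owner,
					       uint32_t idx)
{
	return owner->base_dmabuf_offset + ((uint64_t)idx << DEMO_PAGE_SHIFT);
}
```

The first value is what a driver would program into the device; the
second is what userspace needs in order to locate the payload inside the
dma-buf.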

The user can unbind the dmabuf from the netdevice by closing the netlink
socket that established the binding. We do this so that the binding is
automatically unbound even if the userspace process crashes.

The binding and unbinding leave an indicator in struct netdev_rx_queue
that the given queue is bound, but the binding doesn't take effect until
the driver actually reconfigures its queues, and re-initializes its page
pool.

The netdev_dmabuf_binding struct is refcounted, and releases its
resources only when all the refs are released.

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v7:
- Use IS_ERR() instead of IS_ERR_OR_NULL() for the dma_buf_get() return
  value.
- Changes netdev_* naming in devmem.c to net_devmem_* (Yunsheng).
- DMA_BIDIRECTIONAL -> DMA_FROM_DEVICE (Yunsheng).
- Added a comment around recovering of the old rx queue in
  net_devmem_restart_rx_queue(), and added freeing of old_mem if the
  restart of the old queue fails. (Yunsheng).
- Use kernel-family sock-priv (Jakub).
- Put pp_memory_provider_params in netdev_rx_queue instead of the
  dma-buf specific binding (Pavel & David).
- Move queue management ops to queue_mgmt_ops instead of netdev_ops
  (Jakub).
- Remove excess whitespaces (Jakub).
- Use genlmsg_iput (Jakub).

v6:
- Validate rx queue index
- Refactor new functions into devmem.c (Pavel)

v5:
- Renamed page_pool_iov to net_iov, and moved that support to devmem.h
  or netmem.h.

v1:

- Introduce devmem.h instead of bloating netdevice.h (Jakub)
- ENOTSUPP -> EOPNOTSUPP (checkpatch.pl I think)
- Remove unneeded rcu protection for binding->list (rtnl protected)
- Removed extraneous err_binding_put: label.
- Removed dma_addr += len (Paolo).
- Don't override err on netdev_bind_dmabuf_to_queue failure.
- Rename devmem -> dmabuf (David).
- Add id to dmabuf binding (David/Stan).
- Fix missing xa_destroy bound_rq_list.
- Use queue api to reset bound RX queues (Jakub).
- Update netlink API for rx-queue type (tx/rx) (Jakub).

RFC v3:
- Support multi rx-queue binding

---
 Documentation/netlink/specs/netdev.yaml |   4 +
 include/net/devmem.h| 111 +
 include/net/netdev_rx_queue.h   |   2 +
 include/net/netmem.h|  10 +
 include/net/page_pool/types.h   |   5 +
 net/core/Makefile   |   2 +-
 net/core/dev.c  |   3 +
 net/core/devmem.c   | 304 
 net/core/netdev-genl-gen.c  |   4 +
 net/core/netdev-genl-gen.h  |   4 +
 net/core/netdev-genl.c  | 105 +++-
 11 files changed, 551 insertions(+), 3 deletions(-)
 create mode 100644 include/net/devmem.h
 create mode 100644 net/core/devmem.c

diff --git a/Documentation/netlink/specs/netdev.yaml 
b/Documentation/netlink/specs/netdev.yaml
index 275d1faa87a6..bf4e58dfe9dd 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -550,6 +550,10 @@ operations:
 - tx-packets
 - tx-bytes
 
+kernel-family:
+  headers: [ "linux/list.h"]
+  sock-priv: struct list_head
+
 mcast-groups:
   list:
 -
diff --git a/include/net/devmem.h b/include/net/devmem.h
new file mode 100644
index ..fa03bdabdffd
--- /dev/null
+++ b/include/net/devmem.h
@@ -0,0 +1,111 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Device memory TCP support
+ *
+ * Authors:Mina Almasry 
+ * Willem de Bruijn 
+ * Kaiyuan Zhang 
+ *
+ */
+#ifndef _NET_DEVMEM_H
+#define _NET_DEVMEM_H
+
+struct net_devmem_dmabuf_binding {
+   struct dma_buf *dmabuf;
+   struct dma_buf_attachment *attachment;
+   struct sg_table *sgt;
+   struct net_device *dev;
+   struct gen_pool *chunk_pool;
+
+   /* The user holds a ref (via the netlink API) for as long as they want
+* the binding to remain alive. Each page pool using this binding holds
+* a ref to keep the binding alive. Each allocated net_iov holds a
+* ref.
+*
+* The binding undos itself and unmaps the underlying dmabuf once all
+* those refs are 

[RFC PATCH net-next v7 11/14] tcp: RX path for devmem TCP

2024-03-26 Thread Mina Almasry
In tcp_recvmsg_locked(), detect if the skb being received by the user
is a devmem skb. In this case - if the user provided the MSG_SOCK_DEVMEM
flag - pass it to tcp_recvmsg_devmem() for custom handling.

tcp_recvmsg_devmem() copies any data in the skb header to the linear
buffer, and returns a cmsg to the user indicating the number of bytes
returned in the linear buffer.

tcp_recvmsg_devmem() then loops over the inaccessible devmem skb frags,
and returns to the user a cmsg_devmem indicating the location of the
data in the dmabuf device memory. cmsg_devmem contains this information:

1. the offset into the dmabuf where the payload starts. 'frag_offset'.
2. the size of the frag. 'frag_size'.
3. an opaque token 'frag_token' to return to the kernel when the buffer
is to be released.
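
For illustration only, this is roughly how a consumer might turn those
three fields into a payload pointer. The struct demo_devmem_frag layout
and the helper demo_frag_payload() are hypothetical; the real uapi
struct may order, size, or pad the fields differently. The sketch also
assumes the dma-buf happens to be mapped somewhere the process can
address (for example a test buffer), which is not generally true of
device memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three fields listed above. */
struct demo_devmem_frag {
	uint64_t frag_offset; /* where the payload starts inside the dma-buf */
	uint32_t frag_size;   /* payload length in bytes */
	uint32_t frag_token;  /* opaque handle to hand back to the kernel */
};

/*
 * Returns a pointer to the payload inside a mapping of the dma-buf, or
 * NULL if the frag does not fit inside the mapping.
 */
static const uint8_t *demo_frag_payload(const uint8_t *map, size_t map_len,
					const struct demo_devmem_frag *frag)
{
	if (frag->frag_offset > map_len ||
	    frag->frag_size > map_len - frag->frag_offset)
		return NULL;

	return map + frag->frag_offset;
}
```

The token is deliberately not interpreted here; it only needs to be
remembered until the buffer is handed back to the kernel.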

The pages awaiting freeing are stored in the newly added
sk->sk_user_frags, and each page passed to userspace is get_page()'d.
This reference is dropped once the userspace indicates that it is
done reading this page.  All pages are released when the socket is
destroyed.

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v7:
- Updated the SO_DEVMEM_* uapi to use the next available entries (Arnd).
- Updated dmabuf_cmsg struct to be __u64 padded (Arnd).
- Squashed fix from Eric to initialize sk_user_frags for passive
  sockets (Eric).

v6
- skb->dmabuf -> skb->readable (Pavel)
- Fixed asm definitions of SO_DEVMEM_LINEAR/SO_DEVMEM_DMABUF not found
  on some archs.
- Squashed in locking optimizations from eduma...@google.com. With this
  change we lock the xarray once per tcp_recvmsg_dmabuf() rather
  than once per frag in xa_alloc().

Changes in v1:
- Added dmabuf_id to dmabuf_cmsg (David/Stan).
- Devmem -> dmabuf (David).
- Change tcp_recvmsg_dmabuf() check to skb->dmabuf (Paolo).
- Use __skb_frag_ref() & napi_pp_put_page() for refcounting (Yunsheng).

RFC v3:
- Fixed issue with put_cmsg() failing silently.

---
 arch/alpha/include/uapi/asm/socket.h  |   5 +
 arch/mips/include/uapi/asm/socket.h   |   5 +
 arch/parisc/include/uapi/asm/socket.h |   5 +
 arch/sparc/include/uapi/asm/socket.h  |   5 +
 include/linux/socket.h|   1 +
 include/net/netmem.h  |  13 ++
 include/net/sock.h|   2 +
 include/uapi/asm-generic/socket.h |   5 +
 include/uapi/linux/uio.h  |  13 ++
 net/ipv4/tcp.c| 248 +-
 net/ipv4/tcp_ipv4.c   |   9 +
 net/ipv4/tcp_minisocks.c  |   2 +
 12 files changed, 308 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/socket.h 
b/arch/alpha/include/uapi/asm/socket.h
index e94f621903fe..ef4656a41058 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -140,6 +140,11 @@
 #define SO_PASSPIDFD   76
 #define SO_PEERPIDFD   77
 
+#define SO_DEVMEM_LINEAR   78
+#define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
+#define SO_DEVMEM_DMABUF   79
+#define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/mips/include/uapi/asm/socket.h 
b/arch/mips/include/uapi/asm/socket.h
index 60ebaed28a4c..414807d55e33 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -151,6 +151,11 @@
 #define SO_PASSPIDFD   76
 #define SO_PEERPIDFD   77
 
+#define SO_DEVMEM_LINEAR   78
+#define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
+#define SO_DEVMEM_DMABUF   79
+#define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/parisc/include/uapi/asm/socket.h 
b/arch/parisc/include/uapi/asm/socket.h
index be264c2b1a11..2b817efd4544 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -132,6 +132,11 @@
 #define SO_PASSPIDFD   0x404A
 #define SO_PEERPIDFD   0x404B
 
+#define SO_DEVMEM_LINEAR   78
+#define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
+#define SO_DEVMEM_DMABUF   79
+#define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/sparc/include/uapi/asm/socket.h 
b/arch/sparc/include/uapi/asm/socket.h
index 682da3714686..00248fc68977 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -133,6 +133,11 @@
 #define SO_PASSPIDFD 0x0055
 #define SO_PEERPIDFD 0x0056
 
+#define SO_DEVMEM_LINEAR 0x0057
+#define SCM_DEVMEM_LINEARSO_DEVMEM_LINEAR
+#define SO_DEVMEM_DMABUF 0x0058
+#define SCM_DEVMEM_DMABUFSO_DEVMEM_DMABUF
+
 #if !defined(__KERNEL__)
 
 
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 139c330ccf2c..f11ab541439e 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -326,6 +326,7 @@ struct ucred {
  

[RFC PATCH net-next v7 12/14] net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags

2024-03-26 Thread Mina Almasry
Add an interface for the user to notify the kernel that it is done
reading the devmem dmabuf frags returned as cmsg. The kernel will
drop the reference on the frags to make them available for reuse.
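
A possible userspace-side sketch of preparing the argument: since each
entry carries a start token and a count, consecutive tokens can be
packed into a single range. The helper demo_coalesce_tokens() and the
struct name demo_dmabuf_token are hypothetical; only the two field names
come from the struct dmabuf_token added by this patch. The sketch
assumes the tokens arrive in ascending order.

```c
#include <stddef.h>
#include <stdint.h>

/* Same two fields as the struct dmabuf_token this patch adds to uio.h. */
struct demo_dmabuf_token {
	uint32_t token_start;
	uint32_t token_count;
};

/*
 * Packs ascending frag tokens into (start, count) ranges.
 * Returns the number of ranges written, which is at most max_out.
 */
static size_t demo_coalesce_tokens(const uint32_t *tokens, size_t n,
				   struct demo_dmabuf_token *out,
				   size_t max_out)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		if (used > 0) {
			struct demo_dmabuf_token *last = &out[used - 1];

			/* Extend the current range if this token follows it. */
			if (tokens[i] == last->token_start + last->token_count) {
				last->token_count++;
				continue;
			}
		}
		if (used == max_out)
			break; /* no room for another range */

		out[used].token_start = tokens[i];
		out[used].token_count = 1;
		used++;
	}
	return used;
}
```

The resulting array would then be passed as the option value of the new
setsockopt. Note that sock_devmem_dontneed() in this patch rejects more
than 128 entries per call.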

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v7:
- Updated SO_DEVMEM_* uapi to use the next available entry (Arnd).

v6:
- Squash in locking optimizations from eduma...@google.com. With his
  changes we lock the xarray once per sock_devmem_dontneed operation
  rather than once per frag.

Changes in v1:
- devmemtoken -> dmabuf_token (David).
- Use napi_pp_put_page() for refcounting (Yunsheng).
- Fix build error with missing socket options on other asms.

---
 arch/alpha/include/uapi/asm/socket.h  |  1 +
 arch/mips/include/uapi/asm/socket.h   |  1 +
 arch/parisc/include/uapi/asm/socket.h |  1 +
 arch/sparc/include/uapi/asm/socket.h  |  1 +
 include/uapi/asm-generic/socket.h |  1 +
 include/uapi/linux/uio.h  |  4 ++
 net/core/sock.c   | 61 +++
 7 files changed, 70 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/socket.h 
b/arch/alpha/include/uapi/asm/socket.h
index ef4656a41058..251b73c5481e 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -144,6 +144,7 @@
 #define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF   79
 #define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED 80
 
 #if !defined(__KERNEL__)
 
diff --git a/arch/mips/include/uapi/asm/socket.h 
b/arch/mips/include/uapi/asm/socket.h
index 414807d55e33..8ab7582291ab 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -155,6 +155,7 @@
 #define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF   79
 #define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED 80
 
 #if !defined(__KERNEL__)
 
diff --git a/arch/parisc/include/uapi/asm/socket.h 
b/arch/parisc/include/uapi/asm/socket.h
index 2b817efd4544..38fc0b188e08 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -136,6 +136,7 @@
 #define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF   79
 #define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED 80
 
 #if !defined(__KERNEL__)
 
diff --git a/arch/sparc/include/uapi/asm/socket.h 
b/arch/sparc/include/uapi/asm/socket.h
index 00248fc68977..57084ed2f3c4 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -137,6 +137,7 @@
 #define SCM_DEVMEM_LINEARSO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF 0x0058
 #define SCM_DEVMEM_DMABUFSO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED   0x0059
 
 #if !defined(__KERNEL__)
 
diff --git a/include/uapi/asm-generic/socket.h 
b/include/uapi/asm-generic/socket.h
index 25a2f5255f52..1acb77780f10 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -135,6 +135,7 @@
 #define SO_PASSPIDFD   76
 #define SO_PEERPIDFD   77
 
+#define SO_DEVMEM_DONTNEED 97
 #define SO_DEVMEM_LINEAR   98
 #define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF   99
diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h
index 3a22ddae376a..d17f8fcd93ec 100644
--- a/include/uapi/linux/uio.h
+++ b/include/uapi/linux/uio.h
@@ -33,6 +33,10 @@ struct dmabuf_cmsg {
 */
 };
 
+struct dmabuf_token {
+   __u32 token_start;
+   __u32 token_count;
+};
 /*
  * UIO_MAXIOV shall be at least 16 1003.1g (5.4.1.1)
  */
diff --git a/net/core/sock.c b/net/core/sock.c
index 43bf3818c19e..b589610cbe4a 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1049,6 +1049,63 @@ static int sock_reserve_memory(struct sock *sk, int 
bytes)
return 0;
 }
 
+#ifdef CONFIG_PAGE_POOL
+static noinline_for_stack int
+sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+   unsigned int num_tokens, i, j, k, netmem_num = 0;
+   struct dmabuf_token *tokens;
+   netmem_ref netmems[16];
+   int ret;
+
+   if (sk->sk_type != SOCK_STREAM || sk->sk_protocol != IPPROTO_TCP)
+   return -EBADF;
+
+   if (optlen % sizeof(struct dmabuf_token) ||
+   optlen > sizeof(*tokens) * 128)
+   return -EINVAL;
+
+   tokens = kvmalloc_array(128, sizeof(*tokens), GFP_KERNEL);
+   if (!tokens)
+   return -ENOMEM;
+
+   num_tokens = optlen / sizeof(struct dmabuf_token);
+   if (copy_from_sockptr(tokens, optval, optlen))
+   return -EFAULT;
+
+   ret = 0;
+
+   xa_lock_bh(&sk->sk_user_frags);
+   for (i = 0; i < num_tokens; i++) {
+   for (j = 0; j < tokens[i].token_count; j++) {
+   netmem_ref netmem = (__force netmem_ref)__xa_erase(
+   

[RFC PATCH net-next v7 09/14] net: support non paged skb frags

2024-03-26 Thread Mina Almasry
Make skb_frag_page() fail in the case where the frag is not backed
by a page, and fix its relevant callers to handle this case.

Signed-off-by: Mina Almasry 


---

v6:
- Rebased on top of the merged netmem changes.

Changes in v1:
- Fix illegal_highdma() (Yunsheng).
- Rework napi_pp_put_page() slightly to reduce code churn (Willem).

---
 include/linux/skbuff.h | 53 +++---
 net/core/dev.c |  3 ++-
 net/core/gro.c |  3 ++-
 net/core/skbuff.c  | 11 +
 net/ipv4/esp4.c|  2 +-
 net/ipv4/tcp.c |  3 +++
 net/ipv6/esp6.c|  2 +-
 7 files changed, 65 insertions(+), 12 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 78659c8efa4e..8143aee8d911 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3483,17 +3483,53 @@ static inline void skb_frag_off_copy(skb_frag_t *fragto,
fragto->offset = fragfrom->offset;
 }
 
+/* Returns true if the skb_frag contains a net_iov. */
+static inline bool skb_frag_is_net_iov(const skb_frag_t *frag)
+{
+   return netmem_is_net_iov(frag->netmem);
+}
+
+/**
+ * skb_frag_net_iov - retrieve the net_iov referred to by fragment
+ * @frag: the fragment
+ *
+ * Returns the &struct net_iov associated with @frag. Returns NULL if this
+ * frag has no associated net_iov.
+ */
+static inline struct net_iov *skb_frag_net_iov(const skb_frag_t *frag)
+{
+   if (!skb_frag_is_net_iov(frag))
+   return NULL;
+
+   return netmem_to_net_iov(frag->netmem);
+}
+
 /**
  * skb_frag_page - retrieve the page referred to by a paged fragment
  * @frag: the paged fragment
  *
- * Returns the &struct page associated with @frag.
+ * Returns the &struct page associated with @frag. Returns NULL if this frag
+ * has no associated page.
  */
 static inline struct page *skb_frag_page(const skb_frag_t *frag)
 {
+   if (skb_frag_is_net_iov(frag))
+   return NULL;
+
return netmem_to_page(frag->netmem);
 }
 
+/**
+ * skb_frag_netmem - retrieve the netmem referred to by a fragment
+ * @frag: the fragment
+ *
+ * Returns the &netmem_ref associated with @frag.
+ */
+static inline netmem_ref skb_frag_netmem(const skb_frag_t *frag)
+{
+   return frag->netmem;
+}
+
 /**
  * __skb_frag_ref - take an addition reference on a paged fragment.
  * @frag: the paged fragment
@@ -3524,25 +3560,23 @@ int skb_cow_data_for_xdp(struct page_pool *pool, struct 
sk_buff **pskb,
 bool napi_pp_put_page(netmem_ref netmem, bool napi_safe);
 
 static inline void
-skb_page_unref(const struct sk_buff *skb, struct page *page, bool napi_safe)
+skb_page_unref(const struct sk_buff *skb, netmem_ref netmem, bool napi_safe)
 {
 #ifdef CONFIG_PAGE_POOL
-   if (skb->pp_recycle && napi_pp_put_page(page, napi_safe))
+   if (skb->pp_recycle && napi_pp_put_page(netmem, napi_safe))
return;
 #endif
-   put_page(page);
+   put_page(netmem_to_page(netmem));
 }
 
 static inline void
 napi_frag_unref(skb_frag_t *frag, bool recycle, bool napi_safe)
 {
-   struct page *page = skb_frag_page(frag);
-
 #ifdef CONFIG_PAGE_POOL
-   if (recycle && napi_pp_put_page(page_to_netmem(page), napi_safe))
+   if (recycle && napi_pp_put_page(skb_frag_netmem(frag), napi_safe))
return;
 #endif
-   put_page(page);
+   put_page(skb_frag_page(frag));
 }
 
 /**
@@ -3582,6 +3616,9 @@ static inline void skb_frag_unref(struct sk_buff *skb, 
int f)
  */
 static inline void *skb_frag_address(const skb_frag_t *frag)
 {
+   if (!skb_frag_page(frag))
+   return NULL;
+
return page_address(skb_frag_page(frag)) + skb_frag_off(frag);
 }
 
diff --git a/net/core/dev.c b/net/core/dev.c
index e10610698a0a..8228432cb600 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3411,8 +3411,9 @@ static int illegal_highdma(struct net_device *dev, struct 
sk_buff *skb)
if (!(dev->features & NETIF_F_HIGHDMA)) {
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+   struct page *page = skb_frag_page(frag);
 
-   if (PageHighMem(skb_frag_page(frag)))
+   if (page && PageHighMem(page))
return 1;
}
}
diff --git a/net/core/gro.c b/net/core/gro.c
index ee30d4f0c038..eef20c82c5c3 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -380,7 +380,8 @@ static inline void skb_gro_reset_offset(struct sk_buff 
*skb, u32 nhoff)
pinfo = skb_shinfo(skb);
frag0 = &pinfo->frags[0];
 
-   if (pinfo->nr_frags && !PageHighMem(skb_frag_page(frag0)) &&
+   if (pinfo->nr_frags && skb_frag_page(frag0) &&
+   !PageHighMem(skb_frag_page(frag0)) &&
(!NET_IP_ALIGN || !((skb_frag_off(frag0) + nhoff) & 3))) {
NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
diff --git 

[RFC PATCH net-next v7 07/14] page_pool: devmem support

2024-03-26 Thread Mina Almasry
Convert netmem to be a union of struct page and struct netmem. Overload
the LSB of struct netmem* to indicate that it's a net_iov, otherwise
it's a page.
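
The LSB overloading can be sketched in plain C as follows. This is only
an illustration of the tagging idea, not the code in this patch: every
demo_* name is hypothetical, and it relies on both structs being at
least 2-byte aligned so that bit 0 of a valid pointer is always clear.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct page and struct net_iov. */
struct demo_page { unsigned long flags; };
struct demo_net_iov { unsigned long dma_addr; };

/* Opaque handle: a pointer value whose lowest bit is the type tag. */
typedef uintptr_t demo_netmem_ref;

#define DEMO_NET_IOV_TAG ((uintptr_t)0x01)

static inline demo_netmem_ref demo_page_to_netmem(struct demo_page *page)
{
	return (uintptr_t)page;
}

static inline demo_netmem_ref demo_net_iov_to_netmem(struct demo_net_iov *niov)
{
	return (uintptr_t)niov | DEMO_NET_IOV_TAG;
}

static inline bool demo_netmem_is_net_iov(demo_netmem_ref netmem)
{
	return (netmem & DEMO_NET_IOV_TAG) != 0;
}

/* Returns NULL when the handle is tagged as a net_iov. */
static inline struct demo_page *demo_netmem_to_page(demo_netmem_ref netmem)
{
	if (demo_netmem_is_net_iov(netmem))
		return NULL;
	return (struct demo_page *)netmem;
}

/* Returns NULL when the handle is an untagged page pointer. */
static inline struct demo_net_iov *demo_netmem_to_net_iov(demo_netmem_ref netmem)
{
	if (!demo_netmem_is_net_iov(netmem))
		return NULL;
	return (struct demo_net_iov *)(netmem & ~DEMO_NET_IOV_TAG);
}
```

Because the tag lives in the handle itself, a caller can tell the two
types apart without dereferencing anything, which is what lets helpers
refuse to hand a net_iov to code that expects a page.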

Currently these entries in struct page are rented by the page_pool and
used exclusively by the net stack:

struct {
unsigned long pp_magic;
struct page_pool *pp;
unsigned long _pp_mapping_pad;
unsigned long dma_addr;
atomic_long_t pp_ref_count;
};

Mirror these (and only these) entries into struct net_iov and implement
netmem helpers that can access these common fields regardless of
whether the underlying type is page or net_iov.

Implement checks for net_iov in netmem helpers which delegate to mm
APIs, to ensure net_iov are never passed to the mm stack.

Signed-off-by: Mina Almasry 

---

v7:
- Remove static_branch_unlikely from netmem_to_net_iov(). We're getting
  better results from the fast path in bench_page_pool_simple tests
  without the static_branch_unlikely, and the addition of
  static_branch_unlikely doesn't improve performance of devmem TCP.

  Additionally only check netmem_to_net_iov() if
  CONFIG_DMA_SHARED_BUFFER is enabled, otherwise dmabuf net_iovs cannot
  exist anyway.

  net-next base: 8 cycle fast path.
  with static_branch_unlikely: 10 cycle fast path.
  without static_branch_unlikely: 9 cycle fast path.
  CONFIG_DMA_SHARED_BUFFER disabled: 8 cycle fast path as baseline.

  Performance of devmem TCP is at 95% of line rate regardless of whether
  static_branch_unlikely is used.

v6:
- Rebased on top of the merged netmem_ref type.
- Rebased on top of the merged skb_pp_frag_ref() changes.

v5:
- Use netmem instead of page* with LSB set.
- Use pp_ref_count for refcounting net_iov.
- Removed many of the custom checks for netmem.

v1:
- Disable fragmentation support for iov properly.
- fix napi_pp_put_page() path (Yunsheng).
- Use pp_frag_count for devmem refcounting.

To: linux...@kvack.org
Cc: Matthew Wilcox 

---
 include/net/netmem.h| 143 ++--
 include/net/page_pool/helpers.h |  25 +++---
 include/net/page_pool/types.h   |   1 +
 net/core/page_pool.c|  26 +++---
 net/core/skbuff.c   |  23 +++--
 5 files changed, 171 insertions(+), 47 deletions(-)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index 21f53b29e5fe..74eeaa34883e 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -9,14 +9,51 @@
 #define _NET_NETMEM_H
 
 #include 
+#include 
 
 /* net_iov */
 
+DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
+
+/*  We overload the LSB of the struct page pointer to indicate whether it's
+ *  a page or net_iov.
+ */
+#define NET_IOV 0x01UL
+
 struct net_iov {
+   unsigned long __unused_padding;
+   unsigned long pp_magic;
+   struct page_pool *pp;
struct dmabuf_genpool_chunk_owner *owner;
unsigned long dma_addr;
+   atomic_long_t pp_ref_count;
 };
 
+/* These fields in struct page are used by the page_pool and net stack:
+ *
+ * struct {
+ * unsigned long pp_magic;
+ * struct page_pool *pp;
+ * unsigned long _pp_mapping_pad;
+ * unsigned long dma_addr;
+ * atomic_long_t pp_ref_count;
+ * };
+ *
+ * We mirror the page_pool fields here so the page_pool can access these fields
+ * without worrying whether the underlying fields belong to a page or net_iov.
+ *
+ * The non-net stack fields of struct page are private to the mm stack and must
+ * never be mirrored to net_iov.
+ */
+#define NET_IOV_ASSERT_OFFSET(pg, iov) \
+   static_assert(offsetof(struct page, pg) == \
+ offsetof(struct net_iov, iov))
+NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
+NET_IOV_ASSERT_OFFSET(pp, pp);
+NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
+NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
+#undef NET_IOV_ASSERT_OFFSET
+
 static inline struct dmabuf_genpool_chunk_owner *
 net_iov_owner(const struct net_iov *niov)
 {
@@ -50,7 +87,7 @@ static inline dma_addr_t net_iov_dma_addr(const struct 
net_iov *niov)
   ((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
 }
 
-static inline struct netdev_dmabuf_binding *
+static inline struct net_devmem_dmabuf_binding *
 net_iov_binding(const struct net_iov *niov)
 {
return net_iov_owner(niov)->binding;
@@ -69,20 +106,26 @@ net_iov_binding(const struct net_iov *niov)
  */
 typedef unsigned long __bitwise netmem_ref;
 
+static inline bool netmem_is_net_iov(const netmem_ref netmem)
+{
+#if defined(CONFIG_PAGE_POOL) && defined(CONFIG_DMA_SHARED_BUFFER)
+   return (__force unsigned long)netmem & NET_IOV;
+#else
+   return false;
+#endif
+}
+
 /* This conversion fails (returns NULL) if the netmem_ref is not struct page
  * backed.
- *
- * Currently struct page is the only possible netmem, and this helper never
- * fails.
  */
 static inline struct page *netmem_to_page(netmem_ref netmem)
 {
+   if (WARN_ON_ONCE(netmem_is_net_iov(netmem)))
+   

[RFC PATCH net-next v7 10/14] net: add support for skbs with unreadable frags

2024-03-26 Thread Mina Almasry
For device memory TCP, we expect the skb headers to be available in host
memory for access, and we expect the skb frags to be in device memory
and inaccessible to the host. We expect there to be no mixing and
matching of device memory frags (inaccessible) with host memory frags
(accessible) in the same skb.

Add a skb->devmem flag which indicates whether the frags in this skb
are device memory frags or not.

__skb_fill_netmem_desc() now checks frags added to skbs for net_iov,
and marks the skb as skb->devmem accordingly.

Add checks through the network stack to avoid accessing the frags of
devmem skbs and avoid coalescing devmem skbs with non devmem skbs.
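
The coalescing rule can be summarised with the userspace sketch below. It is
a simplified model, not the kernel code: struct skb_demo and the *_demo
functions are hypothetical, and the mptcp and zerocopy conditions of the real
tcp_skb_can_collapse() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sk_buff bits involved. */
struct skb_demo {
	bool eor;      /* end of record: nothing may be appended */
	bool readable; /* all frags live in host-accessible memory */
};

static bool skb_frags_readable_demo(const struct skb_demo *skb)
{
	return skb->readable;
}

/* An skb can only be appended to if it is not EOR and its frags are readable. */
static bool can_collapse_to_demo(const struct skb_demo *to)
{
	return !to->eor && skb_frags_readable_demo(to);
}

/* Two skbs may be merged only if the target accepts data and both agree
 * on readability, so device frags never mix with host frags.
 */
static bool can_collapse_demo(const struct skb_demo *to,
			      const struct skb_demo *from)
{
	assert(to != NULL && from != NULL);
	return can_collapse_to_demo(to) &&
	       skb_frags_readable_demo(to) == skb_frags_readable_demo(from);
}
```

In this model an unreadable skb is never a collapse target, and a readable
skb never absorbs an unreadable one.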

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 


---

v6
- skb->dmabuf -> skb->readable (Pavel). Pavel's original suggestion was
  to remove the skb->dmabuf flag entirely, but when I looked into it
  closely, I found the issue that if we remove the flag we have to
  dereference the shinfo(skb) pointer to obtain the first frag, which
  can cause a performance regression if it dirties the cache line when
  the shinfo(skb) was not really needed. Instead, I converted the
  skb->dmabuf flag into a generic skb->readable flag which can be
  re-used by io_uring.

Changes in v1:
- Rename devmem -> dmabuf (David).
- Flip skb_frags_not_readable (Jakub).

---
 include/linux/skbuff.h | 18 --
 include/net/tcp.h  |  5 +--
 net/core/datagram.c|  6 
 net/core/gro.c |  5 ++-
 net/core/skbuff.c  | 75 +++---
 net/ipv4/tcp.c |  3 ++
 net/ipv4/tcp_input.c   | 13 ++--
 net/ipv4/tcp_output.c  |  5 ++-
 net/packet/af_packet.c |  4 +--
 9 files changed, 112 insertions(+), 22 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8143aee8d911..d7245540e67a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -818,6 +818,7 @@ typedef unsigned char *sk_buff_data_t;
  * @csum_level: indicates the number of consecutive checksums found in
  * the packet minus one that have been verified as
  * CHECKSUM_UNNECESSARY (max 3)
+ * @readable: indicates that all the fragments in this skb are readable.
  * @dst_pending_confirm: need to confirm neighbour
  * @decrypted: Decrypted SKB
  * @slow_gro: state present at GRO time, slower prepare step required
@@ -1004,7 +1005,7 @@ struct sk_buff {
 #if IS_ENABLED(CONFIG_IP_SCTP)
__u8csum_not_inet:1;
 #endif
-
+   __u8readable:1;
 #if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS)
__u16   tc_index;   /* traffic control index */
 #endif
@@ -1796,6 +1797,12 @@ static inline void skb_zcopy_downgrade_managed(struct 
sk_buff *skb)
__skb_zcopy_downgrade_managed(skb);
 }
 
+/* Return true if frags in this skb are readable by the host. */
+static inline bool skb_frags_readable(const struct sk_buff *skb)
+{
+   return skb->readable;
+}
+
 static inline void skb_mark_not_on_list(struct sk_buff *skb)
 {
skb->next = NULL;
@@ -2512,10 +2519,17 @@ static inline void skb_len_add(struct sk_buff *skb, int 
delta)
 static inline void __skb_fill_netmem_desc(struct sk_buff *skb, int i,
  netmem_ref netmem, int off, int size)
 {
-   struct page *page = netmem_to_page(netmem);
+   struct page *page;
 
__skb_fill_netmem_desc_noacc(skb_shinfo(skb), i, netmem, off, size);
 
+   if (netmem_is_net_iov(netmem)) {
+   skb->readable = false;
+   return;
+   }
+
+   page = netmem_to_page(netmem);
+
/* Propagate page pfmemalloc to the skb if we can. The problem is
 * that not all callers have unique ownership of the page but rely
 * on page_is_pfmemalloc doing the right thing(tm).
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6ae35199d3b3..8f086e14b21d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1062,7 +1062,7 @@ static inline int tcp_skb_mss(const struct sk_buff *skb)
 
 static inline bool tcp_skb_can_collapse_to(const struct sk_buff *skb)
 {
-   return likely(!TCP_SKB_CB(skb)->eor);
+   return likely(!TCP_SKB_CB(skb)->eor && skb_frags_readable(skb));
 }
 
 static inline bool tcp_skb_can_collapse(const struct sk_buff *to,
@@ -1070,7 +1070,8 @@ static inline bool tcp_skb_can_collapse(const struct 
sk_buff *to,
 {
return likely(tcp_skb_can_collapse_to(to) &&
  mptcp_skb_can_collapse(to, from) &&
- skb_pure_zcopy_same(to, from));
+ skb_pure_zcopy_same(to, from) &&
+ skb_frags_readable(to) == skb_frags_readable(from));
 }
 
 /* Events passed to congestion control interface */
diff --git a/net/core/datagram.c b/net/core/datagram.c
index e614cfd8e14a..b29f881df0e8 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -407,6 

[RFC PATCH net-next v7 14/14] selftests: add ncdevmem, netcat for devmem TCP

2024-03-26 Thread Mina Almasry
ncdevmem is a devmem TCP netcat. It works similarly to netcat, but it
sends and receives data using the devmem TCP APIs. It uses udmabuf as
the dmabuf provider. It is compatible with a regular netcat running on
a peer, or a ncdevmem running on a peer.

In addition to normal netcat support, ncdevmem has a validation mode,
where it sends a specific pattern and validates this pattern on the
receiver side to ensure data integrity.
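
As a rough illustration of what the validation mode checks, the sketch below
counts bytes that deviate from a repeating pattern. It is an assumption-laden
stand-in: validate_pattern_demo() and its parameters are hypothetical, and
the pattern 1, 2, ..., period - 1, 0 is inferred from the usage example in
the source comment ("-v 7" paired with a sender emitting 01..06 followed by
a NUL byte), not taken from the actual validate_buffer() body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count bytes that deviate from the repeating
 * pattern 1, 2, ..., period - 1, 0. *seed holds the next expected byte
 * and is carried across calls, so a stream can be checked chunk by chunk.
 */
static size_t validate_pattern_demo(const unsigned char *buf, size_t size,
				    unsigned char *seed, unsigned char period)
{
	size_t errors = 0;
	size_t i;

	assert(period > 0);

	for (i = 0; i < size; i++) {
		if (buf[i] != *seed)
			errors++;
		/* Advance even on a mismatch so one bad byte counts once. */
		*seed = (unsigned char)((*seed + 1) % period);
	}

	return errors;
}
```

Keeping the expected value in caller-owned state matters because recvmsg()
may return the payload in arbitrarily sized pieces.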

Suggested-by: Stanislav Fomichev 
Signed-off-by: Mina Almasry 

---

v6:
- Updated to bind 8 queues.
- Added RSS configuration.
- Added some more tests for the netlink API.

Changes in v1:
- Many more general cleanups (Willem).
- Removed driver reset (Jakub).
- Removed hardcoded if index (Paolo).

RFC v2:
- General cleanups (Willem).

---
 tools/testing/selftests/net/.gitignore |   1 +
 tools/testing/selftests/net/Makefile   |   5 +
 tools/testing/selftests/net/ncdevmem.c | 546 +
 3 files changed, 552 insertions(+)
 create mode 100644 tools/testing/selftests/net/ncdevmem.c

diff --git a/tools/testing/selftests/net/.gitignore 
b/tools/testing/selftests/net/.gitignore
index 2f9d378edec3..b644dbae58b7 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -17,6 +17,7 @@ ipv6_flowlabel
 ipv6_flowlabel_mgr
 log.txt
 msg_zerocopy
+ncdevmem
 nettest
 psock_fanout
 psock_snd
diff --git a/tools/testing/selftests/net/Makefile 
b/tools/testing/selftests/net/Makefile
index 7b6918d5f4af..c9853573e60c 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -5,6 +5,10 @@ CFLAGS =  -Wall -Wl,--no-as-needed -O2 -g
 CFLAGS += -I../../../../usr/include/ $(KHDR_INCLUDES)
 # Additional include paths needed by kselftest.h
 CFLAGS += -I../
+CFLAGS += -I../../../net/ynl/generated/
+CFLAGS += -I../../../net/ynl/lib/
+
+LDLIBS += ../../../net/ynl/lib/ynl.a ../../../net/ynl/generated/protos.a
 
 TEST_PROGS := run_netsocktests run_afpackettests test_bpf.sh netdevice.sh \
  rtnetlink.sh xfrm_policy.sh test_blackhole_dev.sh
@@ -93,6 +97,7 @@ TEST_PROGS += test_bridge_backup_port.sh
 TEST_PROGS += fdb_flush.sh
 TEST_PROGS += fq_band_pktlimit.sh
 TEST_PROGS += vlan_hw_filter.sh
+TEST_GEN_FILES += ncdevmem
 
 TEST_FILES := settings
 TEST_FILES += in_netns.sh lib.sh net_helper.sh setup_loopback.sh setup_veth.sh
diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
new file mode 100644
index ..11bfe3e1125b
--- /dev/null
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -0,0 +1,546 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#define __EXPORTED_HEADERS__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#define __iovec_defined
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "netdev-user.h"
+#include 
+
+#define PAGE_SHIFT 12
+#define TEST_PREFIX "ncdevmem"
+#define NUM_PAGES 16000
+
+#ifndef MSG_SOCK_DEVMEM
+#define MSG_SOCK_DEVMEM 0x200
+#endif
+
+/*
+ * tcpdevmem netcat. Works similarly to netcat but does device memory TCP
+ * instead of regular TCP. Uses udmabuf to mock a dmabuf provider.
+ *
+ * Usage:
+ *
+ * On server:
+ * ncdevmem -s  -c  -f eth1 -d 3 -n :06:00.0 -l \
+ * -p 5201 -v 7
+ *
+ * On client:
+ * yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \
+ * tr \\n \\0 | \
+ * head -c 5G | \
+ * nc  5201 -p 5201
+ *
+ * Note this is compatible with regular netcat. i.e. the sender or receiver can
+ * be replaced with regular netcat to test the RX or TX path in isolation.
+ */
+
+static char *server_ip = "192.168.1.4";
+static char *client_ip = "192.168.1.2";
+static char *port = "5201";
+static size_t do_validation;
+static int start_queue = 8;
+static int num_queues = 8;
+static char *ifname = "eth1";
+static unsigned int ifindex = 3;
+static char *nic_pci_addr = ":06:00.0";
+static unsigned int iterations;
+static unsigned int dmabuf_id;
+
+void print_bytes(void *ptr, size_t size)
+{
+   unsigned char *p = ptr;
+   int i;
+
+   for (i = 0; i < size; i++)
+   printf("%02hhX ", p[i]);
+   printf("\n");
+}
+
+void print_nonzero_bytes(void *ptr, size_t size)
+{
+   unsigned char *p = ptr;
+   unsigned int i;
+
+   for (i = 0; i < size; i++)
+   putchar(p[i]);
+   printf("\n");
+}
+
+void validate_buffer(void *line, size_t size)
+{
+   static unsigned char seed = 1;
+   unsigned char *ptr = line;
+   int errors = 0;
+   size_t i;
+
+   for (i = 0; i < size; i++) {
+   if (ptr[i] != seed) {
+   fprintf(stderr,
+   "Failed validation: expected=%u, actual=%u, 
index=%lu\n",
+   seed, ptr[i], i);
+   

[RFC PATCH net-next v7 13/14] net: add devmem TCP documentation

2024-03-26 Thread Mina Almasry
Add documentation outlining the usage and details of devmem TCP.

Signed-off-by: Mina Almasry 

---

v7:
- Applied docs suggestions (Jakub).

v2:

- Missing spdx (simon)
- add to index.rst (simon)

---
 Documentation/networking/devmem.rst | 256 
 Documentation/networking/index.rst  |   1 +
 2 files changed, 257 insertions(+)
 create mode 100644 Documentation/networking/devmem.rst

diff --git a/Documentation/networking/devmem.rst 
b/Documentation/networking/devmem.rst
new file mode 100644
index ..b0899e8e9e83
--- /dev/null
+++ b/Documentation/networking/devmem.rst
@@ -0,0 +1,256 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=
+Device Memory TCP
+=
+
+
+Intro
+=
+
+Device memory TCP (devmem TCP) enables receiving data directly into device
+memory (dmabuf). The feature is currently implemented for TCP sockets.
+
+
+Opportunity
+---
+
+A large number of data transfers have device memory as the source and/or
+destination. Accelerators drastically increased the prevalence of such
+transfers.  Some examples include:
+
+- Distributed training, where ML accelerators, such as GPUs on different hosts,
+  exchange data.
+
+- Distributed raw block storage applications transfer large amounts of data with
+  remote SSDs, much of this data does not require host processing.
+
+Typically the Device-to-Device data transfers over the network are implemented as the
+following low level operations: Device-to-Host copy, Host-to-Host network
+transfer, and Host-to-Device copy.
+
+The flow involving host copies is suboptimal, especially for bulk data transfers,
+and can put significant strains on system resources such as host memory
+bandwidth and PCIe bandwidth.
+
+Devmem TCP optimizes this use case by implementing socket APIs that enable
+the user to receive incoming network packets directly into device memory.
+
+Packet payloads go directly from the NIC to device memory.
+
+Packet headers go to host memory and are processed by the TCP/IP stack
+normally. The NIC must support header split to achieve this.
+
+Advantages:
+
+- Alleviate host memory bandwidth pressure, compared to existing
+  network-transfer + device-copy semantics.
+
+- Alleviate PCIe bandwidth pressure, by limiting data transfer to the lowest
+  level of the PCIe tree, compared to traditional path which sends data through
+  the root complex.
+
+
+More Info
+-
+
+  slides, video
+https://netdevconf.org/0x17/sessions/talk/device-memory-tcp.html
+
+  patchset
+[RFC PATCH v6 00/12] Device Memory TCP
+
https://lore.kernel.org/netdev/20240305020153.2787423-1-almasrym...@google.com/
+
+
+Interface
+=
+
+Example
+---
+
+tools/testing/selftests/net/ncdevmem.c:do_server shows an example of setting up
+the RX path of this API.
+
+NIC Setup
+-
+
+Header split, flow steering, & RSS are required features for devmem TCP.
+
+Header split is used to split incoming packets into a header buffer in host
+memory, and a payload buffer in device memory.
+
+Flow steering & RSS are used to ensure that only flows targeting devmem land on
+RX queue bound to devmem.
+
+Enable header split & flow steering::
+
+   # enable header split
+   ethtool -G eth1 tcp-data-split on
+
+
+   # enable flow steering
+   ethtool -K eth1 ntuple on
+
+Configure RSS to steer all traffic away from the target RX queue (queue 15 in
+this example)::
+
+   ethtool --set-rxfh-indir eth1 equal 15
+
+
+The user must bind a dmabuf to any number of RX queues on a given NIC using
+netlink API::
+
+   /* Bind dmabuf to NIC RX queue 15 */
+   struct netdev_queue *queues;
+   queues = malloc(sizeof(*queues) * 1);
+
+   queues[0]._present.type = 1;
+   queues[0]._present.idx = 1;
+   queues[0].type = NETDEV_RX_QUEUE_TYPE_RX;
+   queues[0].idx = 15;
+
+   *ys = ynl_sock_create(_netdev_family, );
+
+   req = netdev_bind_rx_req_alloc();
+   netdev_bind_rx_req_set_ifindex(req, 1 /* ifindex */);
+   netdev_bind_rx_req_set_dmabuf_fd(req, dmabuf_fd);
+   __netdev_bind_rx_req_set_queues(req, queues, n_queue_index);
+
+   rsp = netdev_bind_rx(*ys, req);
+
+   dmabuf_id = rsp->dmabuf_id;
+
+
+The netlink API returns a dmabuf_id: a unique ID that refers to this dmabuf
+that has been bound.
+
+Socket Setup
+
+
+The socket must be flow steered to the dmabuf-bound RX queue::
+
+   ethtool -N eth1 flow-type tcp4 ... queue 15,
+
+
+Receiving data
+--
+
+The user application must signal to the kernel that it is capable of receiving
+devmem data by passing the MSG_SOCK_DEVMEM flag to recvmsg::
+
+   ret = recvmsg(fd, , MSG_SOCK_DEVMEM);
+
+Applications that do not specify the MSG_SOCK_DEVMEM flag will receive an EFAULT
+on devmem data.
+
+Devmem data is received directly into the dmabuf bound to the NIC in 'NIC
+Setup', and the kernel signals such to the user via the SCM_DEVMEM_* cmsgs::
+
+   for (cm 

[RFC PATCH net-next v7 03/14] net: netdev netlink api to bind dma-buf to a net device

2024-03-26 Thread Mina Almasry
API takes the dma-buf fd as input, and binds it to the netdevice. The
user can specify the rx queues to bind the dma-buf to.

Suggested-by: Stanislav Fomichev 
Signed-off-by: Mina Almasry 

---

v7:
- Use flags: [ admin-perm ] instead of a CAP_NET_ADMIN check.

Changes in v1:
- Add rx-queue-type to distinguish rx from tx (Jakub)
- Return dma-buf ID from netlink API (David, Stan)

Changes in RFC-v3:
- Support binding multiple rx-queues

---
 Documentation/netlink/specs/netdev.yaml | 53 +
 include/uapi/linux/netdev.h | 19 +
 net/core/netdev-genl-gen.c  | 19 +
 net/core/netdev-genl-gen.h  |  2 +
 net/core/netdev-genl.c  |  6 +++
 tools/include/uapi/linux/netdev.h   | 19 +
 6 files changed, 118 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml 
b/Documentation/netlink/specs/netdev.yaml
index 76352dbd2be4..275d1faa87a6 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -268,6 +268,45 @@ attribute-sets:
 name: napi-id
 doc: ID of the NAPI instance which services this queue.
 type: u32
+  -
+name: queue-dmabuf
+attributes:
+  -
+name: type
+doc: rx or tx queue
+type: u8
+enum: queue-type
+  -
+name: idx
+doc: queue index
+type: u32
+
+  -
+name: bind-dmabuf
+attributes:
+  -
+name: ifindex
+doc: netdev ifindex to bind the dma-buf to.
+type: u32
+checks:
+  min: 1
+  -
+name: queues
+doc: receive queues to bind the dma-buf to.
+type: nest
+nested-attributes: queue-dmabuf
+multi-attr: true
+  -
+name: dmabuf-fd
+doc: dmabuf file descriptor to bind.
+type: u32
+  -
+name: dmabuf-id
+doc: id of the dmabuf binding
+type: u32
+checks:
+  min: 1
+
 
   -
 name: qstats
@@ -457,6 +496,20 @@ operations:
   attributes:
 - ifindex
 reply: *queue-get-op
+-
+  name: bind-rx
+  doc: Bind dmabuf to netdev
+  attribute-set: bind-dmabuf
+  flags: [ admin-perm ]
+  do:
+request:
+  attributes:
+- ifindex
+- dmabuf-fd
+- queues
+reply:
+  attributes:
+- dmabuf-id
 -
   name: napi-get
   doc: Get information about NAPI instances configured on the system.
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index bb65ee840cda..c5b959a0ed6c 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -136,6 +136,24 @@ enum {
NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
 };
 
+enum {
+   NETDEV_A_QUEUE_DMABUF_TYPE = 1,
+   NETDEV_A_QUEUE_DMABUF_IDX,
+
+   __NETDEV_A_QUEUE_DMABUF_MAX,
+   NETDEV_A_QUEUE_DMABUF_MAX = (__NETDEV_A_QUEUE_DMABUF_MAX - 1)
+};
+
+enum {
+   NETDEV_A_BIND_DMABUF_IFINDEX = 1,
+   NETDEV_A_BIND_DMABUF_QUEUES,
+   NETDEV_A_BIND_DMABUF_DMABUF_FD,
+   NETDEV_A_BIND_DMABUF_DMABUF_ID,
+
+   __NETDEV_A_BIND_DMABUF_MAX,
+   NETDEV_A_BIND_DMABUF_MAX = (__NETDEV_A_BIND_DMABUF_MAX - 1)
+};
+
 enum {
NETDEV_A_QSTATS_IFINDEX = 1,
NETDEV_A_QSTATS_QUEUE_TYPE,
@@ -162,6 +180,7 @@ enum {
NETDEV_CMD_PAGE_POOL_CHANGE_NTF,
NETDEV_CMD_PAGE_POOL_STATS_GET,
NETDEV_CMD_QUEUE_GET,
+   NETDEV_CMD_BIND_RX,
NETDEV_CMD_NAPI_GET,
NETDEV_CMD_QSTATS_GET,
 
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index 8d8ace9ef87f..bbaaa1b36b5b 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -27,6 +27,11 @@ const struct nla_policy 
netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFIND
[NETDEV_A_PAGE_POOL_IFINDEX] = NLA_POLICY_FULL_RANGE(NLA_U32, 
_a_page_pool_ifindex_range),
 };
 
+const struct nla_policy 
netdev_queue_dmabuf_nl_policy[NETDEV_A_QUEUE_DMABUF_IDX + 1] = {
+   [NETDEV_A_QUEUE_DMABUF_TYPE] = NLA_POLICY_MAX(NLA_U8, 1),
+   [NETDEV_A_QUEUE_DMABUF_IDX] = { .type = NLA_U32, },
+};
+
 /* NETDEV_CMD_DEV_GET - do */
 static const struct nla_policy netdev_dev_get_nl_policy[NETDEV_A_DEV_IFINDEX + 
1] = {
[NETDEV_A_DEV_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
@@ -58,6 +63,13 @@ static const struct nla_policy 
netdev_queue_get_dump_nl_policy[NETDEV_A_QUEUE_IF
[NETDEV_A_QUEUE_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
 };
 
+/* NETDEV_CMD_BIND_RX - do */
+static const struct nla_policy 
netdev_bind_rx_nl_policy[NETDEV_A_BIND_DMABUF_DMABUF_FD + 1] = {
+   [NETDEV_A_BIND_DMABUF_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
+   [NETDEV_A_BIND_DMABUF_DMABUF_FD] = { .type = NLA_U32, },
+   [NETDEV_A_BIND_DMABUF_QUEUES] = 
NLA_POLICY_NESTED(netdev_queue_dmabuf_nl_policy),
+};
+
 /* NETDEV_CMD_NAPI_GET - do */
 static const struct nla_policy 

[RFC PATCH net-next v7 08/14] memory-provider: dmabuf devmem memory provider

2024-03-26 Thread Mina Almasry
Implement a memory provider that allocates dmabuf devmem in the form of
net_iov.

The provider receives a reference to the struct netdev_dmabuf_binding
via the pool->mp_priv pointer. The driver needs to set this pointer for
the provider in the net_iov.

The provider obtains a reference on the netdev_dmabuf_binding which
guarantees the binding and the underlying mapping remains alive until
the provider is destroyed.

Usage of PP_FLAG_DMA_MAP is required for this memory provider so that
the page_pool can provide the driver with the dma-addrs of the devmem.

For simplicity, PP_FLAG_DMA_SYNC_DEV and p.order != 0 are not
supported.
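
The requirements above amount to a small validation step at pool creation.
The sketch below models it in userspace under stated assumptions: the struct,
the function and the flag values with a _DEMO suffix are hypothetical, and
only the order of checks and the error codes follow mp_dmabuf_devmem_init()
in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag values, for this sketch only. */
#define PP_FLAG_DMA_MAP_DEMO      (1u << 0)
#define PP_FLAG_DMA_SYNC_DEV_DEMO (1u << 1)

struct pool_params_demo {
	unsigned int flags;
	unsigned int order;
	void *mp_priv; /* stands in for the dmabuf binding */
};

/* Returns 0 if the pool configuration is usable with the provider, or a
 * negative errno for the first requirement that is not met.
 */
static int devmem_provider_check_demo(const struct pool_params_demo *p)
{
	assert(p != NULL);

	if (!p->mp_priv)
		return -EINVAL;     /* no binding supplied */
	if (!(p->flags & PP_FLAG_DMA_MAP_DEMO))
		return -EOPNOTSUPP; /* DMA mapping is mandatory */
	if (p->flags & PP_FLAG_DMA_SYNC_DEV_DEMO)
		return -EOPNOTSUPP; /* device sync is not supported */
	if (p->order != 0)
		return -E2BIG;      /* only order-0 allocations */

	return 0;
}
```

In the real provider a successful check is followed by taking a reference
on the binding, which this sketch leaves out.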

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v6:
- refactor new memory provider functions into net/core/devmem.c (Pavel)

v2:
- Disable devmem for p.order != 0

v1:
- static_branch check in page_is_page_pool_iov() (Willem & Paolo).
- PP_DEVMEM -> PP_IOV (David).
- Require PP_FLAG_DMA_MAP (Jakub).

---
 include/net/netmem.h| 15 ++
 include/net/page_pool/helpers.h | 22 +
 include/net/page_pool/types.h   |  2 +
 net/core/devmem.c   | 82 +
 net/core/page_pool.c| 38 +++
 5 files changed, 137 insertions(+), 22 deletions(-)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index 74eeaa34883e..34aa1c80c1ca 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -126,6 +126,21 @@ static inline struct page *netmem_to_page(netmem_ref 
netmem)
return (__force struct page *)netmem;
 }
 
+static inline struct net_iov *netmem_to_net_iov(netmem_ref netmem)
+{
+   if (netmem_is_net_iov(netmem))
+   return (struct net_iov *)((__force unsigned long)netmem &
+ ~NET_IOV);
+
+   DEBUG_NET_WARN_ON_ONCE(true);
+   return NULL;
+}
+
+static inline netmem_ref net_iov_to_netmem(struct net_iov *niov)
+{
+   return (__force netmem_ref)((unsigned long)niov | NET_IOV);
+}
+
 static inline netmem_ref page_to_netmem(struct page *page)
 {
return (__force netmem_ref)page;
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index c6a55eddefae..eb736506c3ce 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -453,4 +453,26 @@ static inline void page_pool_nid_changed(struct page_pool 
*pool, int new_nid)
page_pool_update_nid(pool, new_nid);
 }
 
+static inline void page_pool_set_pp_info(struct page_pool *pool,
+netmem_ref netmem)
+{
+   netmem_set_pp(netmem, pool);
+   netmem_or_pp_magic(netmem, PP_SIGNATURE);
+
+   /* Ensuring all pages have been split into one fragment initially:
+* page_pool_set_pp_info() is only called once for every page when it
+* is allocated from the page allocator and page_pool_fragment_page()
+* is dirtying the same cache line as the page->pp_magic above, so
+* the overhead is negligible.
+*/
+   page_pool_fragment_netmem(netmem, 1);
+   if (pool->has_init_callback)
+   pool->slow.init_callback(netmem, pool->slow.init_arg);
+}
+
+static inline void page_pool_clear_pp_info(netmem_ref netmem)
+{
+   netmem_clear_pp_magic(netmem);
+   netmem_set_pp(netmem, NULL);
+}
 #endif /* _NET_PAGE_POOL_HELPERS_H */
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index f04af1613f59..5b58c9e185a4 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -141,6 +141,8 @@ struct pp_memory_provider_params {
void *mp_priv;
 };
 
+extern const struct memory_provider_ops dmabuf_devmem_ops;
+
 struct page_pool {
struct page_pool_params_fast p;
 
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 84e88955ff2d..01337de7d6a4 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -341,3 +341,85 @@ int net_devmem_bind_dmabuf(struct net_device *dev, 
unsigned int dmabuf_fd,
return err;
 }
 #endif
+
+/*** "Dmabuf devmem memory provider" ***/
+
+static int mp_dmabuf_devmem_init(struct page_pool *pool)
+{
+   struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
+
+   if (!binding)
+   return -EINVAL;
+
+   if (!(pool->p.flags & PP_FLAG_DMA_MAP))
+   return -EOPNOTSUPP;
+
+   if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
+   return -EOPNOTSUPP;
+
+   if (pool->p.order != 0)
+   return -E2BIG;
+
+   net_devmem_dmabuf_binding_get(binding);
+   return 0;
+}
+
+static netmem_ref mp_dmabuf_devmem_alloc_pages(struct page_pool *pool,
+  gfp_t gfp)
+{
+   struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
+   netmem_ref netmem;
+   struct net_iov *niov;
+   dma_addr_t dma_addr;
+
+   niov = net_devmem_alloc_dmabuf(binding);
+   if (!niov)
+   return 

[RFC PATCH net-next v7 05/14] netdev: netdevice devmem allocator

2024-03-26 Thread Mina Almasry
Implement netdev devmem allocator. The allocator takes a given struct
netdev_dmabuf_binding as input and allocates net_iov from that
binding.

The allocation simply delegates to the binding's genpool for the
allocation logic and wraps the returned memory region in a net_iov
struct.
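
The address arithmetic behind that wrapping can be sketched in userspace as
follows. This is an illustration only: the *_demo types and functions are
hypothetical, a 4 KiB page is assumed, and the bounds checks are additions
for the sketch (the patch relies on the genpool handing back an address
inside the chunk).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_DEMO 12
#define PAGE_SIZE_DEMO  ((uint64_t)1 << PAGE_SHIFT_DEMO)

struct niov_demo { int unused; };

/* Hypothetical stand-in for the genpool chunk owner. */
struct chunk_owner_demo {
	uint64_t base_dma_addr;  /* DMA address of the first page in the chunk */
	struct niov_demo *niovs; /* one entry per page in the chunk */
	size_t num_niovs;
};

/* Map an allocated DMA address to the net_iov that describes it. */
static struct niov_demo *addr_to_niov_demo(struct chunk_owner_demo *owner,
					   uint64_t dma_addr)
{
	uint64_t offset, index;

	assert(owner != NULL);
	if (dma_addr < owner->base_dma_addr)
		return NULL;

	offset = dma_addr - owner->base_dma_addr;
	index = offset >> PAGE_SHIFT_DEMO;
	if (index >= owner->num_niovs)
		return NULL;

	return &owner->niovs[index];
}

/* Inverse mapping: recover the absolute DMA address from the array index. */
static uint64_t niov_to_addr_demo(const struct chunk_owner_demo *owner,
				  const struct niov_demo *niov)
{
	size_t index = (size_t)(niov - owner->niovs);

	return owner->base_dma_addr + ((uint64_t)index << PAGE_SHIFT_DEMO);
}
```

Because the index is derived from the position in the array, the free path
can recompute the address from the net_iov alone, without storing it twice.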

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---
v7:
- netdev_ -> net_devmem_* naming (Yunsheng).

v6:
- Add comment on net_iov_dma_addr to explain why we don't use
  niov->dma_addr (Pavel)
- Refactor new functions into net/core/devmem.c (Pavel)

v1:
- Rename devmem -> dmabuf (David).

---
 include/net/devmem.h | 13 +
 include/net/netmem.h | 40 
 net/core/devmem.c| 39 +++
 3 files changed, 92 insertions(+)

diff --git a/include/net/devmem.h b/include/net/devmem.h
index fa03bdabdffd..cd3186f5d1fb 100644
--- a/include/net/devmem.h
+++ b/include/net/devmem.h
@@ -68,7 +68,20 @@ int net_devmem_bind_dmabuf(struct net_device *dev, unsigned 
int dmabuf_fd,
 void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding);
 int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
struct net_devmem_dmabuf_binding *binding);
+struct net_iov *
+net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding);
+void net_devmem_free_dmabuf(struct net_iov *ppiov);
 #else
+static inline struct net_iov *
+net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
+{
+   return NULL;
+}
+
+static inline void net_devmem_free_dmabuf(struct net_iov *ppiov)
+{
+}
+
 static inline void
 __net_devmem_dmabuf_binding_free(struct net_devmem_dmabuf_binding *binding)
 {
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 72e932a1a948..ca17ea1d33f8 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -14,8 +14,48 @@
 
 struct net_iov {
struct dmabuf_genpool_chunk_owner *owner;
+   unsigned long dma_addr;
 };
 
+static inline struct dmabuf_genpool_chunk_owner *
+net_iov_owner(const struct net_iov *niov)
+{
+   return niov->owner;
+}
+
+static inline unsigned int net_iov_idx(const struct net_iov *niov)
+{
+   return niov - net_iov_owner(niov)->niovs;
+}
+
+/* This returns the absolute dma_addr_t calculated from
+ * net_iov_owner(niov)->owner->base_dma_addr, not the page_pool-owned
+ * niov->dma_addr.
+ *
+ * The absolute dma_addr_t is a dma_addr_t that is always uncompressed.
+ *
+ * The page_pool-owner niov->dma_addr is the absolute dma_addr compressed into
+ * an unsigned long. Special handling is done when the unsigned long is 32-bit
+ * but the dma_addr_t is 64-bit.
+ *
+ * In general code looking for the dma_addr_t should use net_iov_dma_addr(),
+ * while page_pool code looking for the unsigned long dma_addr which mirrors
+ * the field in struct page should use niov->dma_addr.
+ */
+static inline dma_addr_t net_iov_dma_addr(const struct net_iov *niov)
+{
+   struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
+
+   return owner->base_dma_addr +
+  ((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
+}
+
+static inline struct netdev_dmabuf_binding *
+net_iov_binding(const struct net_iov *niov)
+{
+   return net_iov_owner(niov)->binding;
+}
+
 /* netmem */
 
 /**
diff --git a/net/core/devmem.c b/net/core/devmem.c
index e49f9ca74f67..84e88955ff2d 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -103,6 +103,45 @@ static int net_devmem_restart_rx_queue(struct net_device 
*dev, int rxq_idx)
return err;
 }
 
+struct net_iov *
+net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
+{
+   struct dmabuf_genpool_chunk_owner *owner;
+   unsigned long dma_addr;
+   struct net_iov *niov;
+   ssize_t offset;
+   ssize_t index;
+
+   dma_addr = gen_pool_alloc_owner(binding->chunk_pool, PAGE_SIZE,
+   (void **)&owner);
+   if (!dma_addr)
+   return NULL;
+
+   offset = dma_addr - owner->base_dma_addr;
+   index = offset / PAGE_SIZE;
+   niov = &owner->niovs[index];
+
+   niov->pp_magic = 0;
+   niov->pp = NULL;
+   niov->dma_addr = 0;
+   atomic_long_set(&niov->pp_ref_count, 0);
+
+   net_devmem_dmabuf_binding_get(binding);
+
+   return niov;
+}
+
+void net_devmem_free_dmabuf(struct net_iov *niov)
+{
+   struct net_devmem_dmabuf_binding *binding = net_iov_binding(niov);
+   unsigned long dma_addr = net_iov_dma_addr(niov);
+
+   if (gen_pool_has_addr(binding->chunk_pool, dma_addr, PAGE_SIZE))
+   gen_pool_free(binding->chunk_pool, dma_addr, PAGE_SIZE);
+
+   net_devmem_dmabuf_binding_put(binding);
+}
+
 /* Protected by rtnl_lock() */
 static DEFINE_XARRAY_FLAGS(net_devmem_dmabuf_bindings, XA_FLAGS_ALLOC1);
 
-- 
2.44.0.396.g6e790dbe36-goog



[RFC PATCH net-next v7 06/14] page_pool: convert to use netmem

2024-03-26 Thread Mina Almasry
Abstract the memory type from the page_pool so we can later add support
for new memory types. Convert the page_pool to use the new netmem type
abstraction, rather than use struct page directly.

As of this patch the netmem type is a no-op abstraction: it's always a
struct page underneath. All the page pool internals are converted to
use struct netmem instead of struct page, and the page pool now exports
2 APIs:

1. The existing struct page API.
2. The new struct netmem API.

Keeping the existing API is transitional; we do not want to refactor all
the current drivers using the page pool at once.

The netmem abstraction is currently a no-op. The page_pool uses
page_to_netmem() to convert allocated pages to netmem, and uses
netmem_to_page() to convert the netmem back to pages to pass to mm APIs.

Follow up patches to this series add non-paged netmem support to the
page_pool. This change is factored out on its own to limit the code
churn to this 1 patch, for ease of code review.

Signed-off-by: Mina Almasry 

---

v6:

- Rebased on top of the merged netmem_ref type.

To: linux...@kvack.org
Cc: Matthew Wilcox 

---
 include/linux/skbuff.h   |   4 +-
 include/net/netmem.h |  15 ++
 include/net/page_pool/helpers.h  | 122 +
 include/net/page_pool/types.h|  17 +-
 include/trace/events/page_pool.h |  29 +--
 net/bpf/test_run.c   |   5 +-
 net/core/page_pool.c | 303 +--
 net/core/skbuff.c|   7 +-
 8 files changed, 302 insertions(+), 200 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b945af8a6208..78659c8efa4e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3521,7 +3521,7 @@ int skb_pp_cow_data(struct page_pool *pool, struct 
sk_buff **pskb,
unsigned int headroom);
 int skb_cow_data_for_xdp(struct page_pool *pool, struct sk_buff **pskb,
 struct bpf_prog *prog);
-bool napi_pp_put_page(struct page *page, bool napi_safe);
+bool napi_pp_put_page(netmem_ref netmem, bool napi_safe);
 
 static inline void
 skb_page_unref(const struct sk_buff *skb, struct page *page, bool napi_safe)
@@ -3539,7 +3539,7 @@ napi_frag_unref(skb_frag_t *frag, bool recycle, bool 
napi_safe)
struct page *page = skb_frag_page(frag);
 
 #ifdef CONFIG_PAGE_POOL
-   if (recycle && napi_pp_put_page(page, napi_safe))
+   if (recycle && napi_pp_put_page(page_to_netmem(page), napi_safe))
return;
 #endif
put_page(page);
diff --git a/include/net/netmem.h b/include/net/netmem.h
index ca17ea1d33f8..21f53b29e5fe 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -88,4 +88,19 @@ static inline netmem_ref page_to_netmem(struct page *page)
return (__force netmem_ref)page;
 }
 
+static inline int netmem_ref_count(netmem_ref netmem)
+{
+   return page_ref_count(netmem_to_page(netmem));
+}
+
+static inline unsigned long netmem_to_pfn(netmem_ref netmem)
+{
+   return page_to_pfn(netmem_to_page(netmem));
+}
+
+static inline netmem_ref netmem_compound_head(netmem_ref netmem)
+{
+   return page_to_netmem(compound_head(netmem_to_page(netmem)));
+}
+
 #endif /* _NET_NETMEM_H */
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 1d397c1a0043..61814f91a458 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -53,6 +53,8 @@
 #define _NET_PAGE_POOL_HELPERS_H
 
 #include 
+#include 
+#include 
 
 #ifdef CONFIG_PAGE_POOL_STATS
 /* Deprecated driver-facing API, use netlink instead */
@@ -101,7 +103,7 @@ static inline struct page *page_pool_dev_alloc_pages(struct 
page_pool *pool)
  * Get a page fragment from the page allocator or page_pool caches.
  *
  * Return:
- * Return allocated page fragment, otherwise return NULL.
+ * Return allocated page fragment, otherwise return 0.
  */
 static inline struct page *page_pool_dev_alloc_frag(struct page_pool *pool,
unsigned int *offset,
@@ -112,22 +114,22 @@ static inline struct page 
*page_pool_dev_alloc_frag(struct page_pool *pool,
return page_pool_alloc_frag(pool, offset, size, gfp);
 }
 
-static inline struct page *page_pool_alloc(struct page_pool *pool,
-  unsigned int *offset,
-  unsigned int *size, gfp_t gfp)
+static inline netmem_ref page_pool_alloc(struct page_pool *pool,
+unsigned int *offset,
+unsigned int *size, gfp_t gfp)
 {
unsigned int max_size = PAGE_SIZE << pool->p.order;
-   struct page *page;
+   netmem_ref netmem;
 
if ((*size << 1) > max_size) {
*size = max_size;
*offset = 0;
-   return page_pool_alloc_pages(pool, gfp);
+   return page_pool_alloc_netmem(pool, gfp);
}
 
-   

[RFC PATCH net-next v7 02/14] net: page_pool: create hooks for custom page providers

2024-03-26 Thread Mina Almasry
From: Jakub Kicinski 

Page providers which try to reuse the same pages will need to hold
onto the ref even if the page gets released from the pool: releasing
the page from the pp just transfers the "ownership" reference from
the pp to the provider, and the provider will wait for other
references to be gone before feeding this page back into the pool.
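
The ownership hand-off described above can be sketched in plain C. This
is only a toy model under stated assumptions: struct toy_pool,
struct toy_provider_ops and every toy_* name are hypothetical, and the
convention that release() returns whether the pool should drop the
reference itself is an assumption of this sketch, not something
confirmed by the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pool;

/* Hypothetical provider hooks, loosely modelled on memory_provider_ops. */
struct toy_provider_ops {
	void *(*alloc)(struct toy_pool *pool);
	/* Return true if the pool should drop its reference itself. */
	bool (*release)(struct toy_pool *pool, void *buf);
};

struct toy_pool {
	const struct toy_provider_ops *ops;	/* NULL means default path */
	int default_allocs;
	int pool_puts;
};

static char toy_default_buf[64];

static void *toy_pool_alloc(struct toy_pool *pool)
{
	assert(pool != NULL);
	if (pool->ops)
		return pool->ops->alloc(pool);
	pool->default_allocs++;
	return toy_default_buf;
}

static void toy_pool_release(struct toy_pool *pool, void *buf)
{
	bool put = true;

	if (pool->ops)
		put = pool->ops->release(pool, buf);
	if (put)
		pool->pool_puts++;	/* stands in for dropping the reference */
}

/* Example provider that keeps the reference so it can reuse the buffer. */
static char toy_provider_buf[64];
static int toy_provider_held;

static void *toy_provider_alloc(struct toy_pool *pool)
{
	(void)pool;
	return toy_provider_buf;
}

static bool toy_provider_release(struct toy_pool *pool, void *buf)
{
	(void)pool;
	(void)buf;
	toy_provider_held++;	/* ownership moves to the provider */
	return false;
}

static const struct toy_provider_ops toy_provider = {
	.alloc = toy_provider_alloc,
	.release = toy_provider_release,
};
```

A pool without a provider takes the default path and drops the
reference itself; a pool with toy_provider never does, because the
provider now owns that reference.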

Signed-off-by: Jakub Kicinski 
Signed-off-by: Mina Almasry 

---

This is implemented by Jakub in his RFC:
https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168...@redhat.com/T/

I take no credit for the idea or implementation; I only added minor
edits to make this workable with device memory TCP, and removed some
hacky test code. This is a critical dependency of device memory TCP
and thus I'm pulling it into this series to make it reviewable and
mergeable.

RFC v3 -> v1
- Removed unused mem_provider. (Yunsheng).
- Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub).

---
 include/net/page_pool/types.h | 12 ++
 net/core/page_pool.c  | 43 +++
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index 5e43a08d3231..ffe5f31fb0da 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -52,6 +52,7 @@ struct pp_alloc_cache {
  * @dev:   device, for DMA pre-mapping purposes
  * @netdev:netdev this pool will serve (leave as NULL if none or multiple)
  * @napi:  NAPI which is the sole consumer of pages, otherwise NULL
+ * @queue: struct netdev_rx_queue this page_pool is being created for.
  * @dma_dir:   DMA mapping direction
  * @max_len:   max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
  * @offset:DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
@@ -64,6 +65,7 @@ struct page_pool_params {
int nid;
struct device   *dev;
struct napi_struct *napi;
+   struct netdev_rx_queue *queue;
enum dma_data_direction dma_dir;
unsigned intmax_len;
unsigned intoffset;
@@ -126,6 +128,13 @@ struct page_pool_stats {
 };
 #endif
 
+struct memory_provider_ops {
+   int (*init)(struct page_pool *pool);
+   void (*destroy)(struct page_pool *pool);
+   struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp);
+   bool (*release_page)(struct page_pool *pool, struct page *page);
+};
+
 struct page_pool {
struct page_pool_params_fast p;
 
@@ -176,6 +185,9 @@ struct page_pool {
 */
struct ptr_ring ring;
 
+   void *mp_priv;
+   const struct memory_provider_ops *mp_ops;
+
 #ifdef CONFIG_PAGE_POOL_STATS
/* recycle stats are per-cpu to avoid locking */
struct page_pool_recycle_stats __percpu *recycle_stats;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index dd364d738c00..795b7ff1c01f 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -25,6 +25,8 @@
 
 #include "page_pool_priv.h"
 
+static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
 
@@ -177,6 +179,7 @@ static int page_pool_init(struct page_pool *pool,
  int cpuid)
 {
unsigned int ring_qsize = 1024; /* Default */
+   int err;
 
memcpy(&pool->p, &params->fast, sizeof(pool->p));
memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
@@ -248,10 +251,25 @@ static int page_pool_init(struct page_pool *pool,
/* Driver calling page_pool_create() also call page_pool_destroy() */
refcount_set(&pool->user_cnt, 1);
 
+   if (pool->mp_ops) {
+   err = pool->mp_ops->init(pool);
+   if (err) {
+   pr_warn("%s() mem-provider init failed %d\n", __func__,
+   err);
+   goto free_ptr_ring;
+   }
+
+   static_branch_inc(&page_pool_mem_providers);
+   }
+
if (pool->p.flags & PP_FLAG_DMA_MAP)
get_device(pool->p.dev);
 
return 0;
+
+free_ptr_ring:
+   ptr_ring_cleanup(&pool->ring, NULL);
+   return err;
 }
 
 static void page_pool_uninit(struct page_pool *pool)
@@ -546,7 +564,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, 
gfp_t gfp)
return page;
 
/* Slow-path: cache empty, do real allocation */
-   page = __page_pool_alloc_pages_slow(pool, gfp);
+   if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+   page = pool->mp_ops->alloc_pages(pool, gfp);
+   else
+   page = __page_pool_alloc_pages_slow(pool, gfp);
return page;
 }
 EXPORT_SYMBOL(page_pool_alloc_pages);
@@ -603,10 +624,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, 
struct page *page)
 void page_pool_return_page(struct page_pool *pool, struct page *page)
 {
int count;
+   bool put;
 
-   

[RFC PATCH net-next v7 01/14] queue_api: define queue api

2024-03-26 Thread Mina Almasry
This API enables the net stack to reset the queues used for devmem TCP.
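
One way the four operations could be combined to reset a queue is
sketched below. This is a guess at a plausible call order, not the
series' implementation: struct toy_dev and the toy_* functions are
hypothetical userspace stand-ins that only mirror the shape of the ops
introduced in this patch.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NUM_QUEUES 4

/* Hypothetical device; tracks the memory backing each running RX queue. */
struct toy_dev {
	void *active_mem[TOY_NUM_QUEUES];
};

static void *toy_queue_mem_alloc(struct toy_dev *dev, int idx)
{
	(void)dev;
	(void)idx;
	return malloc(64);
}

static void toy_queue_mem_free(struct toy_dev *dev, void *mem)
{
	(void)dev;
	free(mem);
}

static int toy_queue_start(struct toy_dev *dev, int idx, void *mem)
{
	assert(idx >= 0 && idx < TOY_NUM_QUEUES);
	dev->active_mem[idx] = mem;
	return 0;
}

static int toy_queue_stop(struct toy_dev *dev, int idx, void **out_mem)
{
	assert(idx >= 0 && idx < TOY_NUM_QUEUES);
	*out_mem = dev->active_mem[idx];
	dev->active_mem[idx] = NULL;
	return 0;
}

/* Allocate first, so an allocation failure leaves the queue running. */
static int toy_queue_restart(struct toy_dev *dev, int idx)
{
	void *new_mem, *old_mem;
	int err;

	new_mem = toy_queue_mem_alloc(dev, idx);
	if (!new_mem)
		return -1;

	err = toy_queue_stop(dev, idx, &old_mem);
	if (err) {
		toy_queue_mem_free(dev, new_mem);
		return err;
	}

	err = toy_queue_start(dev, idx, new_mem);
	if (err) {
		/* Try to bring the old memory back, then drop the new one. */
		toy_queue_start(dev, idx, old_mem);
		toy_queue_mem_free(dev, new_mem);
		return err;
	}

	toy_queue_mem_free(dev, old_mem);
	return 0;
}
```

Splitting allocation from start is what makes this ordering possible:
the only step that can fail for lack of memory happens before the
running queue is touched.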

Signed-off-by: Mina Almasry 

---
 include/linux/netdevice.h   |  3 +++
 include/net/netdev_queues.h | 27 +++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e41d30ebaca6..3d3af8f7f9c9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1956,6 +1956,7 @@ enum netdev_reg_state {
  * @sysfs_rx_queue_group:  Space for optional per-rx queue attributes
  * @rtnl_link_ops: Rtnl_link_ops
  * @stat_ops:  Optional ops for queue-aware statistics
+ * @queue_mgmt_ops:Optional ops for queue management
  *
  * @gso_max_size:  Maximum size of generic segmentation offload
  * @tso_max_size:  Device (as in HW) limit on the max TSO request size
@@ -2338,6 +2339,8 @@ struct net_device {
 
const struct netdev_stat_ops *stat_ops;
 
+   const struct netdev_queue_mgmt_ops *queue_mgmt_ops;
+
/* for setting kernel sock attribute on TCP connection setup */
 #define GSO_MAX_SEGS   65535u
 #define GSO_LEGACY_MAX_SIZE65536u
diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
index 1ec408585373..337df0860ae6 100644
--- a/include/net/netdev_queues.h
+++ b/include/net/netdev_queues.h
@@ -60,6 +60,33 @@ struct netdev_stat_ops {
   struct netdev_queue_stats_tx *tx);
 };
 
+/**
+ * struct netdev_queue_mgmt_ops - netdev ops for queue management
+ *
+ * @ndo_queue_mem_alloc: Allocate memory for an RX queue. The memory returned
+ *  in the form of a void* can be passed to
+ *  ndo_queue_mem_free() for freeing or to ndo_queue_start
+ *  to create an RX queue with this memory.
+ *
+ * @ndo_queue_mem_free:Free memory from an RX queue.
+ *
+ * @ndo_queue_start:   Start an RX queue at the specified index.
+ *
+ * @ndo_queue_stop:Stop the RX queue at the specified index.
+ */
+struct netdev_queue_mgmt_ops {
+   void *  (*ndo_queue_mem_alloc)(struct net_device *dev,
+  int idx);
+   void(*ndo_queue_mem_free)(struct net_device *dev,
+ void *queue_mem);
+   int (*ndo_queue_start)(struct net_device *dev,
+  int idx,
+  void *queue_mem);
+   int (*ndo_queue_stop)(struct net_device *dev,
+ int idx,
+ void **out_queue_mem);
+};
+
 /**
  * DOC: Lockless queue stopping / waking helpers.
  *
-- 
2.44.0.396.g6e790dbe36-goog



[RFC PATCH net-next v7 00/14] Device Memory TCP

2024-03-26 Thread Mina Almasry
RFC v7:
=======

Major Changes:
--------------

This revision largely rebases on top of net-next and addresses the feedback
RFCv6 received from folks, namely Jakub, Yunsheng, Arnd, David, & Pavel.

The series remains in RFC because the queue-API ndos defined in this
series are not yet implemented. I have a GVE implementation I carry out
of tree for my testing. An upstreamable GVE implementation is in the
works. Aside from that, in my estimation all the patches are ready for
review/merge. Please do take a look.

As usual the full devmem TCP changes including the full GVE driver
implementation are here:

https://github.com/mina/linux/commits/tcpdevmem-v7/

Detailed changelog:

- Use admin-perm in netlink API.
- Addressed feedback from Jakub with regards to netlink API
  implementation.
- Renamed devmem.c functions to something more appropriate for that
  file.
- Improve the performance seen through the page_pool benchmark.
- Fix the value definition of all the SO_DEVMEM_* uapi.
- Various fixes to documentation.

Perf - page-pool benchmark:
---------------------------

Improved performance of bench_page_pool_simple.ko tests compared to v6:

https://pastebin.com/raw/v5dYRg8L

  net-next base: 8 cycle fast path.
  RFC v6: 10 cycle fast path.
  RFC v7: 9 cycle fast path.
  RFC v7 with CONFIG_DMA_SHARED_BUFFER disabled: 8 cycle fast path,
 same as baseline.

Perf - Devmem TCP benchmark:
----------------------------

Perf is about the same regardless of the changes in v7, namely the
removal of the static_branch_unlikely to improve the page_pool benchmark
performance:

189/200gbps bi-directional throughput with RX devmem TCP and regular TCP
TX i.e. ~95% line rate.

RFC v6:
=======

Major Changes:
--------------

This revision largely rebases on top of net-next and addresses the little
feedback RFCv5 received.

The series remains in RFC because the queue-API ndos defined in this
series are not yet implemented. I have a GVE implementation I carry out
of tree for my testing. An upstreamable GVE implementation is in the
works. Aside from that, in my estimation all the patches are ready for
review/merge. Please do take a look.

As usual the full devmem TCP changes including the full GVE driver
implementation are here:

https://github.com/mina/linux/commits/tcpdevmem-v6/

This version also comes with some performance data recorded in the cover
letter (see below changelog).

Detailed changelog:

- Rebased on top of the merged netmem_ref changes.

- Converted skb->dmabuf to skb->readable (Pavel). Pavel's original
  suggestion was to remove the skb->dmabuf flag entirely, but when I
  looked into it closely, I found the issue that if we remove the flag
  we have to dereference the shinfo(skb) pointer to obtain the first
  frag to tell whether an skb is readable or not. This can cause a
  performance regression if it dirties the cache line when the
  shinfo(skb) was not really needed. Instead, I converted the skb->dmabuf
  flag into a generic skb->readable flag which can be re-used by io_uring
  0-copy RX.

- Squashed a few locking optimizations from Eric Dumazet in the RX path
  and the DEVMEM_DONTNEED setsockopt.

- Expanded the tests a bit. Added validation for invalid scenarios and
  added some more coverage.

Perf - page-pool benchmark:
---------------------------

bench_page_pool_simple.ko tests with and without these changes:
https://pastebin.com/raw/ncHDwAbn

AFAIK the number that really matters in the perf tests is the
'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8
cycles without the changes but there is some 1 cycle noise in some
results.

With the patches this regresses to 9 cycles, again with occasional
1 cycle noise when running the test repeatedly.

Lastly I tried disabling the static_branch_unlikely() in
netmem_is_net_iov() check. To my surprise disabling the
static_branch_unlikely() check reduces the fast path back to 8 cycles,
but the 1 cycle noise remains.

Perf - Devmem TCP benchmark:
----------------------------

189/200gbps bi-directional throughput with RX devmem TCP and regular TCP
TX i.e. ~95% line rate.

Major changes in RFC v5:


1. Rebased on top of 'Abstract page from net stack' series and used the
   new netmem type to refer to LSB set pointers instead of re-using
   struct page.

2. Downgraded this series back to RFC and called it RFC v5. This is
   because this series is now dependent on 'Abstract page from net
   stack'[1] and the queue API. Both are removed from the series to
   reduce the patch # and those bits are fairly independent or
   pre-requisite work.

3. Reworked the page_pool devmem support to use netmem and for some
   more unified handling.

4. Reworked the reference counting of net_iov (renamed from
   page_pool_iov) to use pp_ref_count for refcounting.

The full changes including the dependent series and GVE page pool
support is here:


Re: [PATCH 1/9] fbdev: shmobile: fix snprintf truncation

2024-03-26 Thread Laurent Pinchart
Hi Arnd,

Thank you for the patch.

On Tue, Mar 26, 2024 at 11:38:00PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The name of the overlay does not fit into the fixed-length field:
> 
> drivers/video/fbdev/sh_mobile_lcdcfb.c:1577:2: error: 'snprintf' will always 
> be truncated; specified size is 16, but format string expands to at least 25
> 
> Make it short enough by changing the string.
> 
> Fixes: c5deac3c9b22 ("fbdev: sh_mobile_lcdc: Implement overlays support")
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/video/fbdev/sh_mobile_lcdcfb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/video/fbdev/sh_mobile_lcdcfb.c 
> b/drivers/video/fbdev/sh_mobile_lcdcfb.c
> index eb2297b37504..d35d2cf8 100644
> --- a/drivers/video/fbdev/sh_mobile_lcdcfb.c
> +++ b/drivers/video/fbdev/sh_mobile_lcdcfb.c
> @@ -1575,7 +1575,7 @@ sh_mobile_lcdc_overlay_fb_init(struct 
> sh_mobile_lcdc_overlay *ovl)
>*/
>   info->fix = sh_mobile_lcdc_overlay_fix;
>   snprintf(info->fix.id, sizeof(info->fix.id),
> -  "SH Mobile LCDC Overlay %u", ovl->index);
> +  "SHMobile ovl %u", ovl->index);
>   info->fix.smem_start = ovl->dma_handle;
>   info->fix.smem_len = ovl->fb_size;
>   info->fix.line_length = ovl->pitch;

-- 
Regards,

Laurent Pinchart


[PATCH 1/9] fbdev: shmobile: fix snprintf truncation

2024-03-26 Thread Arnd Bergmann
From: Arnd Bergmann 

The name of the overlay does not fit into the fixed-length field:

drivers/video/fbdev/sh_mobile_lcdcfb.c:1577:2: error: 'snprintf' will always be 
truncated; specified size is 16, but format string expands to at least 25

Make it short enough by changing the string.
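
For illustration only (not part of the patch), the truncation can be
reproduced in userspace with a 16-byte buffer. format_id() and
was_truncated() are hypothetical helper names; the format is passed as
a parameter only so that both strings can be tried with the same code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an overlay name into a fixed-size field such as info->fix.id.
 * Returns the length snprintf() wanted to write, excluding the NUL.
 */
static int format_id(char *id, size_t size, const char *fmt,
		     unsigned int index)
{
	assert(size > 0);
	return snprintf(id, size, fmt, index);
}

/* snprintf() truncated the output if the wanted length does not fit. */
static int was_truncated(int wanted, size_t size)
{
	return wanted < 0 || (size_t)wanted >= size;
}
```

With index 0 the old string needs 24 characters plus the terminating
NUL, so a 16-byte field keeps only "SH Mobile LCDC ". The new string
needs 14 characters plus the NUL for a single-digit index and fits.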

Fixes: c5deac3c9b22 ("fbdev: sh_mobile_lcdc: Implement overlays support")
Signed-off-by: Arnd Bergmann 
---
 drivers/video/fbdev/sh_mobile_lcdcfb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/sh_mobile_lcdcfb.c 
b/drivers/video/fbdev/sh_mobile_lcdcfb.c
index eb2297b37504..d35d2cf8 100644
--- a/drivers/video/fbdev/sh_mobile_lcdcfb.c
+++ b/drivers/video/fbdev/sh_mobile_lcdcfb.c
@@ -1575,7 +1575,7 @@ sh_mobile_lcdc_overlay_fb_init(struct 
sh_mobile_lcdc_overlay *ovl)
 */
info->fix = sh_mobile_lcdc_overlay_fix;
snprintf(info->fix.id, sizeof(info->fix.id),
-"SH Mobile LCDC Overlay %u", ovl->index);
+"SHMobile ovl %u", ovl->index);
info->fix.smem_start = ovl->dma_handle;
info->fix.smem_len = ovl->fb_size;
info->fix.line_length = ovl->pitch;
-- 
2.39.2



[PATCH 0/9] enabled -Wformat-truncation for clang

2024-03-26 Thread Arnd Bergmann
From: Arnd Bergmann 

With randconfig build testing, I found only eight files that produce
warnings with clang when -Wformat-truncation is enabled. This means
we can just turn it on by default rather than only enabling it for
"make W=1".

Unfortunately, gcc produces a lot more warnings when the option
is enabled, so it's not yet possible to turn it on for both
compilers.

I hope that the patches can get picked up by platform maintainers
directly, so the final patch can go in later on.

 Arnd

Arnd Bergmann (9):
  fbdev: shmobile: fix snprintf truncation
  enetc: avoid truncating error message
  qed: avoid truncating work queue length
  mlx5: avoid truncating error message
  surface3_power: avoid format string truncation warning
  Input: IMS: fix printf string overflow
  scsi: mylex: fix sysfs buffer lengths
  ALSA: aoa: avoid false-positive format truncation warning
  kbuild: enable -Wformat-truncation on clang

 drivers/input/misc/ims-pcu.c  |  4 ++--
 drivers/net/ethernet/freescale/enetc/enetc.c  |  2 +-
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c|  9 ---
 drivers/platform/surface/surface3_power.c |  2 +-
 drivers/scsi/myrb.c   | 20 
 drivers/scsi/myrs.c   | 24 +--
 drivers/video/fbdev/sh_mobile_lcdcfb.c|  2 +-
 scripts/Makefile.extrawarn|  2 ++
 sound/aoa/soundbus/i2sbus/core.c  |  2 +-
 10 files changed, 35 insertions(+), 34 deletions(-)

-- 
2.39.2

Cc: Dmitry Torokhov 
Cc: Claudiu Manoil 
Cc: Vladimir Oltean 
Cc: Jakub Kicinski 
Cc: Saeed Mahameed 
Cc: Leon Romanovsky 
Cc: Ariel Elior 
Cc: Manish Chopra 
Cc: Hans de Goede 
Cc: "Ilpo Järvinen" 
Cc: Maximilian Luz 
Cc: Hannes Reinecke 
Cc: "Martin K. Petersen" 
Cc: Helge Deller 
Cc: Masahiro Yamada 
Cc: Nathan Chancellor 
Cc: Nicolas Schier 
Cc: Johannes Berg 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Nick Desaulniers 
Cc: Bill Wendling 
Cc: Justin Stitt 
Cc: linux-in...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: net...@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Cc: platform-driver-...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linux-fb...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kbu...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: alsa-de...@alsa-project.org
Cc: linux-so...@vger.kernel.org
Cc: l...@lists.linux.dev



Re: [PATCH v4 09/16] drm/msm: import gen_header.py script from Mesa

2024-03-26 Thread Abhinav Kumar




On 3/26/2024 3:25 PM, Dmitry Baryshkov wrote:
> On Wed, 27 Mar 2024 at 00:19, Abhinav Kumar  wrote:
> >
> > On 3/22/2024 3:57 PM, Dmitry Baryshkov wrote:
> > > Import the gen_header.py script from Mesa, commit FIXME. This script
> > > will be used to generate MSM register files on the fly during
> > > compilation.
> > >
> > > Signed-off-by: Dmitry Baryshkov 
> > > ---
> > >   drivers/gpu/drm/msm/registers/gen_header.py | 957 
> > >   1 file changed, 957 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/msm/registers/gen_header.py 
> > > b/drivers/gpu/drm/msm/registers/gen_header.py
> > > new file mode 100644
> > > index ..ae39b7e6cde8
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/msm/registers/gen_header.py
> > > @@ -0,0 +1,957 @@
> > > +#!/usr/bin/python3
> > > +
> >
> > We need a licence and copyright here.
>
> Yes, this is going to be fixed in the next revision. Mesa already got
> the proper SPDX header here.
>
> >
> > Also is something like a "based on" applicable here?
> >
> > > +import xml.parsers.expat
> > > +import sys
> > > +import os
> > > +import collections
> > > +import argparse
> > > +import time
> > > +import datetime
> > > +
> > > +class Error(Exception):
> > > +This file was generated by the rules-ng-ng gen_header.py tool in this git 
> > > repository:
> > > +http://gitlab.freedesktop.org/mesa/mesa/
> > > +git clone https://gitlab.freedesktop.org/mesa/mesa.git
> > > +
> > > +The rules-ng-ng source files this header was generated from are:
> >
> > Is this still applicable ?
> >
> > Now gen_header.py is moved to kernel.
> >
>
> Copied, not moved. So Mesa remains the primary source for Adreno
> headers and gen_header.py
>

But all future development and code review on gen_header.py will be done 
in kernel itself OR periodically we will sync it up with mesa?






Re: [PATCH v4 09/16] drm/msm: import gen_header.py script from Mesa

2024-03-26 Thread Dmitry Baryshkov
On Wed, 27 Mar 2024 at 00:19, Abhinav Kumar  wrote:
>
>
>
> On 3/22/2024 3:57 PM, Dmitry Baryshkov wrote:
> > Import the gen_header.py script from Mesa, commit FIXME. This script
> > will be used to generate MSM register files on the fly during
> > compilation.
> >
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >   drivers/gpu/drm/msm/registers/gen_header.py | 957 
> > 
> >   1 file changed, 957 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/msm/registers/gen_header.py 
> > b/drivers/gpu/drm/msm/registers/gen_header.py
> > new file mode 100644
> > index ..ae39b7e6cde8
> > --- /dev/null
> > +++ b/drivers/gpu/drm/msm/registers/gen_header.py
> > @@ -0,0 +1,957 @@
> > +#!/usr/bin/python3
> > +
>
> We need a licence and copyright here.

Yes, this is going to be fixed in the next revision. Mesa already got
the proper SPDX header here.

>
> Also is something like a "based on" applicable here?
>
> 
>
> > +import xml.parsers.expat
> > +import sys
> > +import os
> > +import collections
> > +import argparse
> > +import time
> > +import datetime
> > +
> > +class Error(Exception):
> > +This file was generated by the rules-ng-ng gen_header.py tool in this git 
> > repository:
> > +http://gitlab.freedesktop.org/mesa/mesa/
> > +git clone https://gitlab.freedesktop.org/mesa/mesa.git
> > +
> > +The rules-ng-ng source files this header was generated from are:
>
> Is this still applicable ?
>
> Now gen_header.py is moved to kernel.
>

Copied, not moved. So Mesa remains the primary source for Adreno
headers and gen_header.py


-- 
With best wishes
Dmitry


Re: [PATCH v4 01/16] drm/msm/mdp5: add writeback block bases

2024-03-26 Thread Abhinav Kumar




On 3/26/2024 2:52 PM, Dmitry Baryshkov wrote:
> On Tue, 26 Mar 2024 at 23:39, Abhinav Kumar  wrote:
> >
> > On 3/22/2024 3:56 PM, Dmitry Baryshkov wrote:
> > > In order to stop patching the mdp5 headers, import definitions for the
> > > writeback blocks. This part is extracted from Rob's old patch.
> > >
> > > Co-developed-by: Rob Clark 
> > > Signed-off-by: Rob Clark 
> > > Signed-off-by: Dmitry Baryshkov 
> > > ---
> > >   drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h | 11 +++
> > >   1 file changed, 11 insertions(+)
> > >
> >
> > This is unused today right?
> >
> > Is it just being migrated now in advance as all the mesa mdp5 headers
> > are moving to kernel?
> >
>
> Exactly. I had three options: pick up this patch, implement applying
> 'fixup' patches or drop corresponding doffsets from the mdp5.xml. I've
> chosen the first option.
>

Yes, this is fine

Reviewed-by: Abhinav Kumar 

> --
> With best wishes
> Dmitry


Re: [PATCH v4 09/16] drm/msm: import gen_header.py script from Mesa

2024-03-26 Thread Abhinav Kumar




On 3/22/2024 3:57 PM, Dmitry Baryshkov wrote:
> Import the gen_header.py script from Mesa, commit FIXME. This script
> will be used to generate MSM register files on the fly during
> compilation.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>   drivers/gpu/drm/msm/registers/gen_header.py | 957 
>   1 file changed, 957 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/registers/gen_header.py 
> b/drivers/gpu/drm/msm/registers/gen_header.py
> new file mode 100644
> index ..ae39b7e6cde8
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/registers/gen_header.py
> @@ -0,0 +1,957 @@
> +#!/usr/bin/python3
> +

We need a licence and copyright here.

Also is something like a "based on" applicable here?

> +import xml.parsers.expat
> +import sys
> +import os
> +import collections
> +import argparse
> +import time
> +import datetime
> +
> +class Error(Exception):
> +This file was generated by the rules-ng-ng gen_header.py tool in this git 
> repository:
> +http://gitlab.freedesktop.org/mesa/mesa/
> +git clone https://gitlab.freedesktop.org/mesa/mesa.git
> +
> +The rules-ng-ng source files this header was generated from are:

Is this still applicable ?

Now gen_header.py is moved to kernel.



Re: [PATCH v4 01/16] drm/msm/mdp5: add writeback block bases

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 23:39, Abhinav Kumar  wrote:
>
>
>
> On 3/22/2024 3:56 PM, Dmitry Baryshkov wrote:
> > In order to stop patching the mdp5 headers, import definitions for the
> > writeback blocks. This part is extracted from Rob's old patch.
> >
> > Co-developed-by: Rob Clark 
> > Signed-off-by: Rob Clark 
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >   drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h | 11 +++
> >   1 file changed, 11 insertions(+)
> >
>
> This is unused today right?
>
> Is it just being migrated now in advance as all the mesa mdp5 headers
> are moving to kernel?
>

Exactly. I had three options: pick up this patch, implement applying
'fixup' patches or drop corresponding doffsets from the mdp5.xml. I've
chosen the first option.


--
With best wishes
Dmitry


Re: [PATCH v4 03/16] drm/msm/dsi: drop mmss_cc.xml.h

2024-03-26 Thread Abhinav Kumar




On 3/22/2024 3:56 PM, Dmitry Baryshkov wrote:
> The mmss_cc.xml.h file describes bits of the MMSS clock controller on
> APQ8064 / MSM8960 platforms. They are not used by the driver and do not
> belong to the DRM MSM driver. Drop the file.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>   drivers/gpu/drm/msm/dsi/mmss_cc.xml.h | 131 --
>   1 file changed, 131 deletions(-)
>

Reviewed-by: Abhinav Kumar 


Re: [PATCH v4 02/16] drm/msm/hdmi: drop qfprom.xml.h

2024-03-26 Thread Abhinav Kumar




On 3/22/2024 3:56 PM, Dmitry Baryshkov wrote:
> The qfprom.xml.h contains definitions for the nvmem code. They are not
> used in the existing code. Also if we were to use them later, we should
> have used nvmem cell API instead of using these defs. Drop the file.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>   drivers/gpu/drm/msm/hdmi/qfprom.xml.h | 61 ---
>   1 file changed, 61 deletions(-)
>

Reviewed-by: Abhinav Kumar 


Re: [PATCH v4 01/16] drm/msm/mdp5: add writeback block bases

2024-03-26 Thread Abhinav Kumar




On 3/22/2024 3:56 PM, Dmitry Baryshkov wrote:
> In order to stop patching the mdp5 headers, import definitions for the
> writeback blocks. This part is extracted from Rob's old patch.
>
> Co-developed-by: Rob Clark 
> Signed-off-by: Rob Clark 
> Signed-off-by: Dmitry Baryshkov 
> ---
>   drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h | 11 +++
>   1 file changed, 11 insertions(+)
>

This is unused today right?

Is it just being migrated now in advance as all the mesa mdp5 headers 
are moving to kernel?

> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h
> index 26c5d8b4ab46..4b988e69fbfc 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cfg.h
> @@ -69,6 +69,16 @@ struct mdp5_mdp_block {
> uint32_t caps;  /* MDP capabilities: MDP_CAP_xxx bits */
>   };
>   
> +struct mdp5_wb_instance {
> +   int id;
> +   int lm;
> +};
> +
> +struct mdp5_wb_block {
> +   MDP5_SUB_BLOCK_DEFINITION;
> +   struct mdp5_wb_instance instances[MAX_BASES];
> +};
> +
>   #define MDP5_INTF_NUM_MAX 5
>   
>   struct mdp5_intf_block {
> @@ -98,6 +108,7 @@ struct mdp5_cfg_hw {
> struct mdp5_sub_block pp;
> struct mdp5_sub_block dsc;
> struct mdp5_sub_block cdm;
> +   struct mdp5_wb_block wb;
> struct mdp5_intf_block intf;
> struct mdp5_perf_block perf;
  



Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Miguel Ojeda
On Tue, Mar 26, 2024 at 8:56 PM Abhinav Kumar  wrote:
>
> Alright, in that case, Miguel can you please repost this with the Fixes
> tags and in a patch form.

Done at https://lore.kernel.org/lkml/20240326212324.185832-1-oj...@kernel.org/

Thanks all!

Cheers,
Miguel


Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS

2024-03-26 Thread Matt Roper
On Tue, Mar 26, 2024 at 07:42:34PM +0100, Andi Shyti wrote:
> Hi Matt,
> 
> On Tue, Mar 26, 2024 at 09:03:10AM -0700, Matt Roper wrote:
> > On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> > > + /*
> > > +  * Do not create the command streamer for CCS slices
> > > +  * beyond the first. All the workload submitted to the
> > > +  * first engine will be shared among all the slices.
> > > +  *
> > > +  * Once the user will be allowed to customize the CCS
> > > +  * mode, then this check needs to be removed.
> > > +  */
> > > + if (IS_DG2(i915) &&
> > > + class == COMPUTE_CLASS &&
> > > + ccs_instance++)
> > > + continue;
> > 
> > Wouldn't it be more intuitive to drop the non-lowest CCS engines in
> > init_engine_mask() since that's the function that's dedicated to
> > building the list of engines we'll use?  Then we don't need to kill the
> > assertion farther down either.
> 
> Because we don't check the result of init_engine_mask() while
> creating the engine's structure. We check it only after and
> indeed I removed the drm_WARN_ON() check.
> 
> I think the whole process of creating the engine's structure in
> the intel_engines_init_mmio() can be simplified, but this goes
> beyond the scope of the series.
> 
> Or am I missing something?

The important part of init_engine_mask isn't the return value, but
rather that it's what sets up gt->info.engine_mask.  The HAS_ENGINE()
check that intel_engines_init_mmio() uses is based on the value stored
there, so updating that function will also ensure that we skip the
engines we don't want in the loop.
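
To make the suggestion concrete, here is a small userspace sketch. The
enum, the TOY_* macros and both functions are hypothetical and do not
match the real i915 definitions; the idea shown is only that filtering
the mask once is enough, because the setup loop consults nothing else.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical engine IDs; the real i915 enum is larger. */
enum toy_engine {
	TOY_RCS0, TOY_CCS0, TOY_CCS1, TOY_CCS2, TOY_CCS3, TOY_NUM_ENGINES
};

#define TOY_BIT(id)	(1u << (id))
#define TOY_CCS_MASK	(TOY_BIT(TOY_CCS0) | TOY_BIT(TOY_CCS1) | \
			 TOY_BIT(TOY_CCS2) | TOY_BIT(TOY_CCS3))

/* Keep only the lowest-numbered CCS engine present in the mask. */
static uint32_t toy_init_engine_mask(uint32_t hw_mask)
{
	uint32_t ccs = hw_mask & TOY_CCS_MASK;

	if (ccs) {
		uint32_t first = ccs & (~ccs + 1);	/* lowest set bit */

		assert((first & (first - 1)) == 0);	/* exactly one bit */
		hw_mask &= ~TOY_CCS_MASK;
		hw_mask |= first;
	}
	return hw_mask;
}

/* The setup loop checks the mask only, so filtered engines are skipped. */
static int toy_count_created(uint32_t engine_mask)
{
	int id, created = 0;

	for (id = 0; id < TOY_NUM_ENGINES; id++) {
		if (!(engine_mask & TOY_BIT(id)))
			continue;
		created++;
	}
	return created;
}
```

With this shape the loop needs no special case for the compute class,
and any sanity check that compares the created engines against the
mask keeps holding.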


Matt

> 
> Thanks,
> Andi

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


[PATCH] drm/msm: fix the `CRASHDUMP_READ` target of `a6xx_get_shader_block()`

2024-03-26 Thread Miguel Ojeda
Clang 14 in an (essentially) defconfig arm64 build for next-20240326
reports [1]:

drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error:
variable 'out' set but not used [-Werror,-Wunused-but-set-variable]

The variable `out` in these functions is meant to compute the `target` of
`CRASHDUMP_READ()`, but in this case only the initial value (`dumper->iova
+ A6XX_CD_DATA_OFFSET`) was being passed.

Thus use `out` as it was intended by Connor [2].

There was an alternative patch at [3] that removed the variable
altogether, but that would only use the initial value.
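
The difference between the two behaviours can be sketched outside the
driver. toy_fill_targets() and TOY_DATA_OFFSET are hypothetical names
used only for this illustration; the sketch just records which target
address each read would be given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DATA_OFFSET 0x1000u	/* stand-in for A6XX_CD_DATA_OFFSET */

/*
 * Record the target address for each of nr_banks reads of `size` dwords.
 * With use_out == 0 every read gets the initial address (old behaviour);
 * with use_out != 0 each read gets its own slot (behaviour after the fix).
 */
static void toy_fill_targets(uint64_t iova, size_t nr_banks, size_t size,
			     int use_out, uint64_t *targets)
{
	uint64_t out = iova + TOY_DATA_OFFSET;
	size_t i;

	assert(targets != NULL);
	for (i = 0; i < nr_banks; i++) {
		targets[i] = use_out ? out : iova + TOY_DATA_OFFSET;
		out += size * sizeof(uint32_t);
	}
}
```

With the initial value only, every bank is written to the same
location, so later banks overwrite earlier ones; with `out`, bank i
lands at an offset of i * size * sizeof(u32) from the start.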

Fixes: 64d6255650d4 ("drm/msm: More fully implement devcoredump for a7xx")
Closes: https://lore.kernel.org/lkml/caniq72mjc5t4n25sqvysroehxxpxypz4ppznesjhenc3qap...@mail.gmail.com/ [1]
Link: https://lore.kernel.org/lkml/cacu1e7hhckmjd6fixzspinaz6ekoznkmthtclfvmbz-9vol...@mail.gmail.com/ [2]
Link: https://lore.kernel.org/lkml/20240307093727.1978126-1-colin.i.k...@gmail.com/ [3]
Signed-off-by: Miguel Ojeda 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 1f5245fc2cdc..a847a0f7a73c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
(block->type << 8) | i);
 
in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
-   block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
+   block->size, out);
 
out += block->size * sizeof(u32);
}

base-commit: 084c8e315db34b59d38d06e684b1a0dd07d30287
-- 
2.44.0
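
To illustrate why the read target has to advance, here is a minimal
user-space sketch of the loop in the hunk above. It is not the driver code:
record_read() is a hypothetical stand-in for CRASHDUMP_READ() that only
records the target address, and MAX_BANKS, the base address and the sizes
are made-up values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BANKS 8	/* illustrative limit, not a driver constant */

/* Hypothetical stand-in for CRASHDUMP_READ(): it only records the target. */
static void record_read(uint64_t *targets, size_t idx, uint64_t target)
{
	assert(idx < MAX_BANKS);
	targets[idx] = target;
}

/*
 * Queue one read per bank. With use_out set, each bank gets its own target
 * (the behaviour after the fix). Without it, every bank reuses the base
 * address (the behaviour before the fix), so later banks would land on top
 * of earlier ones.
 */
static void queue_bank_reads(uint64_t base, uint32_t size_dwords,
			     size_t banks, int use_out, uint64_t *targets)
{
	uint64_t out = base;
	size_t i;

	for (i = 0; i < banks; i++) {
		record_read(targets, i, use_out ? out : base);
		out += (uint64_t)size_dwords * sizeof(uint32_t);
	}
}
```

With a base of 0x1000 and 4 dwords per bank, the fixed variant targets
0x1000, 0x1010, 0x1020 for three banks, while the unfixed variant targets
0x1000 three times.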



Re: [PATCH 01/12] kbuild: make -Woverride-init warnings more consistent

2024-03-26 Thread Arnd Bergmann
On Tue, Mar 26, 2024, at 21:24, Jani Nikula wrote:
> On Tue, 26 Mar 2024, Arnd Bergmann  wrote:
>> From: Arnd Bergmann 
>> index 475e1e8c1d35..0786eb0da391 100644
>> --- a/drivers/net/ethernet/renesas/sh_eth.c
>> +++ b/drivers/net/ethernet/renesas/sh_eth.c
>> @@ -50,7 +50,7 @@
>>   * the macros available to do this only define GCC 8.
>>   */
>>  __diag_push();
>> -__diag_ignore(GCC, 8, "-Woverride-init",
>> +__diag_ignore_all("-Woverride-init",
>>"logic to initialize all and then override some is OK");
>
> This is nice because it's more localized than the per-file
> disable. However, we tried to do this in i915, but this doesn't work for
> GCC versions < 8, and some defconfigs enabling -Werror forced us to
> revert. See commit 290d16104575 ("Revert "drm/i915: use localized
> __diag_ignore_all() instead of per file"").

It works now.

The original __diag_ignore_all() only did it for gcc-8 and above
because that was initially needed to suppress warnings that
got added in that version, but this was always a mistake.

689b097a06ba ("compiler-gcc: Suppress -Wmissing-prototypes
warning for all supported GCC") made it work correctly.

 Arnd
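
For readers unfamiliar with the pattern being discussed, the following is a
rough stand-alone sketch of a localized suppression. It uses the compiler
pragmas directly rather than the kernel's __diag_push()/__diag_ignore_all()
macros, which as far as I understand wrap similar pragmas. The table name,
its size and the register values are invented for illustration and do not
come from sh_eth.c.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 8		/* illustrative table size */
#define REG_UNSUPPORTED 0xffffu	/* illustrative "not present" marker */

/*
 * "Initialize all, then override some": the range initializer (a GNU C
 * extension accepted by gcc and clang) sets every slot to the default and
 * the later designators override individual slots. -Woverride-init flags
 * exactly this pattern, so it is silenced for this one table only.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverride-init"
static const unsigned short reg_offset[NUM_REGS] = {
	[0 ... NUM_REGS - 1] = REG_UNSUPPORTED,
	[1] = 0x0010,
	[5] = 0x0048,
};
#pragma GCC diagnostic pop

/* Bounds-checked lookup into the table above. */
static unsigned short lookup_reg(size_t idx)
{
	assert(idx < NUM_REGS);
	return reg_offset[idx];
}
```

The point of the localized form is that the warning stays active for the
rest of the file, unlike a per-file -Wno-override-init in the Makefile.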


Re: [PATCH 3/4] arm64: dts: qcom: sc8180x: Drop flags for mdss irqs

2024-03-26 Thread Konrad Dybcio
On 26.03.2024 9:02 PM, Dmitry Baryshkov wrote:
> The number of interrupt cells for the mdss interrupt controller is 1,
> meaning there should only be one cell for the interrupt number, not two.
> Drop the second cell containing (unused) irq flags.
> 
> Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
> Signed-off-by: Dmitry Baryshkov 
> ---

Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH 2/4] arm64: dts: qcom: sc8180x: drop legacy property #stream-id-cells

2024-03-26 Thread Konrad Dybcio
On 26.03.2024 9:02 PM, Dmitry Baryshkov wrote:
> The property #stream-id-cells is legacy, it is not documented as valid
> for the GPU. Drop it now.
> 
> Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
> Signed-off-by: Dmitry Baryshkov 
> ---

Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH 2/5] dt-bindings: display: Add GameForce Chi Panel

2024-03-26 Thread Rob Herring


On Mon, 25 Mar 2024 08:49:56 -0500, Chris Morgan wrote:
> From: Chris Morgan 
> 
> The GameForce Chi panel is a panel specific to the GameForce Chi
> handheld device that measures 3.5" diagonally with a resolution of
> 640x480.
> 
> Signed-off-by: Chris Morgan 
> ---
>  .../devicetree/bindings/display/panel/rocktech,jh057n00900.yaml | 2 ++
>  1 file changed, 2 insertions(+)
> 

Acked-by: Rob Herring 



Re: [PATCH 1/5] dt-bindings: vendor-prefix: Add prefix for GameForce

2024-03-26 Thread Rob Herring


On Mon, 25 Mar 2024 08:49:55 -0500, Chris Morgan wrote:
> From: Chris Morgan 
> 
> GameForce is a company that produces handheld game consoles.
> 
> https://gameforce.fun/
> 
> Signed-off-by: Chris Morgan 
> ---
>  Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
>  1 file changed, 2 insertions(+)
> 

Acked-by: Rob Herring 



Re: [PATCH 01/12] kbuild: make -Woverride-init warnings more consistent

2024-03-26 Thread Jani Nikula
On Tue, 26 Mar 2024, Arnd Bergmann  wrote:
> From: Arnd Bergmann 
>
> The -Woverride-init warnings flag code that may be intentional or not,
> but the unintentional ones tend to be real bugs, so there is a bit of
> disagreement on whether this warning option should be enabled by default
> and we have multiple settings in scripts/Makefile.extrawarn as well as
> individual subsystems.
>
> Older versions of clang only supported -Wno-initializer-overrides with
> the same meaning as gcc's -Woverride-init, though all supported versions
> now work with both. Because of this difference, an earlier cleanup of
> mine accidentally turned the clang warning off for W=1 builds and only
> left it on for W=2, while it's still enabled for gcc with W=1.
>
> There is also one driver that only turns the warning off for newer
> versions of gcc but not other compilers, and some but not all the
> Makefiles still use a cc-disable-warning conditional that is no
> longer needed with supported compilers here.
>
> Address all of the above by removing the special cases for clang
> and always turning the warning off unconditionally where it got
> in the way, using the syntax that is supported by both compilers.
>
> Fixes: 2cd3271b7a31 ("kbuild: avoid duplicate warning options")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/amd/display/dc/dce110/Makefile |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce112/Makefile |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce120/Makefile |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce60/Makefile  |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce80/Makefile  |  2 +-
>  drivers/gpu/drm/i915/Makefile  |  6 +++---
>  drivers/gpu/drm/xe/Makefile|  4 ++--
>  drivers/net/ethernet/renesas/sh_eth.c  |  2 +-
>  drivers/pinctrl/aspeed/Makefile|  2 +-
>  fs/proc/Makefile   |  2 +-
>  kernel/bpf/Makefile|  2 +-
>  mm/Makefile|  3 +--
>  scripts/Makefile.extrawarn | 10 +++---
>  13 files changed, 18 insertions(+), 23 deletions(-)
>

[snip]

> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 3ef6ed41e62b..4c2f85632391 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -33,9 +33,9 @@ endif
>  subdir-ccflags-$(CONFIG_DRM_I915_WERROR) += -Werror
>  
>  # Fine grained warnings disable
> -CFLAGS_i915_pci.o = $(call cc-disable-warning, override-init)
> -CFLAGS_display/intel_display_device.o = $(call cc-disable-warning, override-init)
> -CFLAGS_display/intel_fbdev.o = $(call cc-disable-warning, override-init)
> +CFLAGS_i915_pci.o = -Wno-override-init
> +CFLAGS_display/intel_display_device.o = -Wno-override-init
> +CFLAGS_display/intel_fbdev.o = -Wno-override-init
>  
>  # Support compiling the display code separately for both i915 and xe
>  # drivers. Define I915 when building i915.
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 5a428ca00f10..c29a850859ad 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -172,8 +172,8 @@ subdir-ccflags-$(CONFIG_DRM_XE_DISPLAY) += \
>   -Ddrm_i915_gem_object=xe_bo \
>   -Ddrm_i915_private=xe_device
>  
> -CFLAGS_i915-display/intel_fbdev.o = $(call cc-disable-warning, override-init)
> -CFLAGS_i915-display/intel_display_device.o = $(call cc-disable-warning, override-init)
> +CFLAGS_i915-display/intel_fbdev.o = -Wno-override-init
> +CFLAGS_i915-display/intel_display_device.o = -Wno-override-init

For i915 and xe parts,

Acked-by: Jani Nikula 

>  # Rule to build SOC code shared with i915
>  $(obj)/i915-soc/%.o: $(srctree)/drivers/gpu/drm/i915/soc/%.c FORCE
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 475e1e8c1d35..0786eb0da391 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -50,7 +50,7 @@
>   * the macros available to do this only define GCC 8.
>   */
>  __diag_push();
> -__diag_ignore(GCC, 8, "-Woverride-init",
> +__diag_ignore_all("-Woverride-init",
> "logic to initialize all and then override some is OK");

This is nice because it's more localized than the per-file
disable. However, we tried to do this in i915, but this doesn't work for
GCC versions < 8, and some defconfigs enabling -Werror forced us to
revert. See commit 290d16104575 ("Revert "drm/i915: use localized
__diag_ignore_all() instead of per file"").

BR,
Jani.


-- 
Jani Nikula, Intel


Re: [RFC PATCH net-next v6 02/15] net: page_pool: create hooks for custom page providers

2024-03-26 Thread Mina Almasry
On Sun, Mar 24, 2024 at 4:37 PM Christoph Hellwig  wrote:
>
> On Fri, Mar 22, 2024 at 10:54:54AM -0700, Mina Almasry wrote:
> > Sorry I don't mean to argue but as David mentioned, there are some
> > plans in the works and ones not in the works to extend this to other
> > memory types. David mentioned io_uring & Jakub's huge page use cases
> > which may want to re-use this design. I have an additional one in
> > mind, which is extending devmem TCP for storage devices. Currently
> > storage devices do not support dmabuf and my understanding is that
> > it's very hard to do so, and NVMe uses pci_p2pdma instead. I wonder if
> > it's possible to extend devmem TCP in the future to support pci_p2pdma
> > to support nvme devices in the future.
>
> The block layer needs to support dmabuf for this kind of I/O.
> Any special netdev to block side channel will be NAKed before you can
> even send it out.

Thanks, a few questions if you have time to help me understand the
potential of extending this to storage devices.

Are you envisioning that dmabuf support would be added to the block
layer (which I understand is part of the VFS and not driver specific),
or as part of the specific storage driver (like nvme for example)? If
we can add dmabuf support to the block layer itself that sounds
awesome. We may then be able to do devmem TCP on all/most storage
devices without having to modify each individual driver.

In your estimation, is adding dmabuf support to the block layer
something technically feasible & acceptable upstream? I notice you
suggested it so I'm guessing yes to both, but I thought I'd confirm.

Worthy of note this is all pertaining to potential follow up use
cases, nothing in this particular proposal is trying to do any of this
yet.

-- 
Thanks,
Mina


Re: [RFC PATCH net-next v6 00/15] Device Memory TCP

2024-03-26 Thread Mina Almasry
On Tue, Mar 26, 2024 at 5:47 AM Yunsheng Lin  wrote:
>
> On 2024/3/26 8:28, Mina Almasry wrote:
> > On Tue, Mar 5, 2024 at 11:38 AM Mina Almasry  wrote:
> >>
> >> On Tue, Mar 5, 2024 at 4:54 AM Yunsheng Lin  wrote:
> >>>
> >>> On 2024/3/5 10:01, Mina Almasry wrote:
> >>>
> >>> ...
> >>>
> >>>>
> >>>> Perf - page-pool benchmark:
> >>>> ---
> >>>>
> >>>> bench_page_pool_simple.ko tests with and without these changes:
> >>>> https://pastebin.com/raw/ncHDwAbn
> >>>>
> >>>> AFAIK the number that really matters in the perf tests is the
> >>>> 'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8
> >>>> cycles without the changes but there is some 1 cycle noise in some
> >>>> results.
> >>>>
> >>>> With the patches this regresses to 9 cycles with the changes but there
> >>>> is 1 cycle noise occasionally running this test repeatedly.
> >>>>
> >>>> Lastly I tried disable the static_branch_unlikely() in
> >>>> netmem_is_net_iov() check. To my surprise disabling the
> >>>> static_branch_unlikely() check reduces the fast path back to 8 cycles,
> >>>> but the 1 cycle noise remains.
> >>>>
> >>>
> >>> The last sentence seems to be suggesting the above 1 ns regression is caused
> >>> by the static_branch_unlikely() checking?
> >>
> >> Note it's not a 1ns regression, it looks like maybe a 1 cycle
> >> regression (slightly less than 1ns if I'm reading the output of the
> >> test correctly):
> >>
> >> # clean net-next
> >> time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc)
> >> 2.993 ns (step:0)
> >>
> >> # with patches
> >> time_bench: Type:tasklet_page_pool01_fast_path Per elem: 9 cycles(tsc)
> >> 3.679 ns (step:0)
> >>
> >> # with patches and with diff that disables static branching:
> >> time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc)
> >> 3.248 ns (step:0)
> >>
> >> I do see noise in the test results between run and run, and any
> >> regression (if any) is slightly obfuscated by the noise, so it's a bit
> >> hard to make confident statements. So far it looks like a ~0.25ns
> >> regression without static branch and about ~0.65ns with static branch.
> >>
> >> Honestly when I saw all 3 results were within some noise I did not
> >> investigate more, but if this looks concerning to you I can dig
> >> further. I likely need to gather a few test runs to filter out the
> >> noise and maybe investigate the assembly my compiler is generating to
> >> maybe narrow down what changes there.
> >>
> >
> > I did some more investigation here to gather more data to filter out
> > the noise, and recorded the summary here:
> >
> > https://pastebin.com/raw/v5dYRg8L
> >
> > Long story short, the page_pool benchmark results are consistent with
> > some outlier noise results that I'm discounting here. Currently
> > page_pool fast path is at 8 cycles
> >
> > [ 2115.724510] time_bench: Type:tasklet_page_pool01_fast_path Per
> > elem: 8 cycles(tsc) 3.187 ns (step:0) - (measurement period
> > time:0.031870585 sec time_interval:31870585) - (invoke count:1000
> > tsc_interval:86043192)
> >
> > and with this patch series it degrades to 10 cycles, or about a 0.7ns
> > degradation or so:
>
> Even if the absolute value for the overhead is small, we seem to have a
> degradation of about 20% for tasklet_page_pool01_fast_path testcase,
> which seems scary.
>
> I am assuming that every page is recyclable for tasklet_page_pool01_fast_path
> testcase, and that code path matters for page_pool, it would be good to
> remove any additional checking for that code path.
>

We can remove the usage of static_branch_unlikely in the net_iov
check, which reduces the overhead to 1 cycle (8->9), only 12.5%
overhead. The addition of the static_branch_unlikely is not improving
the performance of devmem TCP anyway. From previous discussions with
Jesper he deemed a 1 cycle degradation acceptable, but he hasn't
commented in a while, he may have changed his mind but so far no
complaints.

We can additionally add the check only if
CONFIG_SHARED_DMA_BUFFER is enabled. I've tested that and the fast
path goes back to 8 cycles (0 overhead). If CONFIG_SHARED_DMA_BUFFER
is not enabled then netmem can't be dmabuf anyway, so no reason to
check.

> And we already have pool->has_init_callback checking when we have to use
> a new page, it may make sense to refactor that to share the same checking
> for provider to avoid the overhead as much as possible.
>
> Also, I am not sure if it really matters that much, as with the introduction
> of netmem_is_net_iov() checking spreading in the networking, the overhead
> might add up for other cases too.


-- 
Thanks,
Mina
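
As a side note on the percentages quoted in this thread, the arithmetic can
be checked with a small helper. This is only an illustration;
overhead_percent() is a name made up here and is not part of the page_pool
benchmark.

```c
#include <assert.h>

/*
 * Relative overhead, in percent of the baseline, of going from
 * base cycles to patched cycles per element.
 */
static double overhead_percent(unsigned int base, unsigned int patched)
{
	assert(base > 0 && patched >= base);
	return 100.0 * (double)(patched - base) / (double)base;
}
```

Going from 8 to 9 cycles gives the 12.5% mentioned above. Going from 8 to
10 cycles gives 25% relative to the baseline; the "about 20%" figure
corresponds to measuring the same 2 cycles against the patched value of 10.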


Re: [PATCH] drm: DRM_WERROR should depend on DRM

2024-03-26 Thread Jani Nikula
On Tue, 26 Mar 2024, Geert Uytterhoeven  wrote:
> There is no point in asking the user about enforcing the DRM compiler
> warning policy when configuring a kernel without DRM support.
>
> Fixes: f89632a9e5fa6c47 ("drm: Add CONFIG_DRM_WERROR")
> Signed-off-by: Geert Uytterhoeven 

D'oh! My bad.

Reviewed-by: Jani Nikula 

> ---
>  drivers/gpu/drm/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index f2bcf5504aa77679..2e1b23ccf30423a9 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -423,7 +423,7 @@ config DRM_PRIVACY_SCREEN
>  
>  config DRM_WERROR
>   bool "Compile the drm subsystem with warnings as errors"
> - depends on EXPERT
> + depends on DRM && EXPERT
>   default n
>   help
> A kernel build should not cause any compiler warnings, and this

-- 
Jani Nikula, Intel


[RFC PATCH 1/1] dt-bindings: display/msm: gpu: Split Adreno schemas into separate files

2024-03-26 Thread Adam Skladowski
Split shared schema into per-gen and group adrenos by clocks used.

Signed-off-by: Adam Skladowski 
---
 .../devicetree/bindings/display/msm/gpu.yaml  | 317 ++
 .../bindings/display/msm/qcom,adreno-306.yaml | 115 +++
 .../bindings/display/msm/qcom,adreno-330.yaml | 111 ++
 .../bindings/display/msm/qcom,adreno-405.yaml | 135 
 .../bindings/display/msm/qcom,adreno-506.yaml | 184 ++
 .../bindings/display/msm/qcom,adreno-530.yaml | 161 +
 .../bindings/display/msm/qcom,adreno-540.yaml | 154 +
 .../bindings/display/msm/qcom,adreno-6xx.yaml | 160 +
 .../display/msm/qcom,adreno-common.yaml   | 112 +++
 9 files changed, 1157 insertions(+), 292 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-306.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-330.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-405.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-506.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-530.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-540.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-6xx.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-common.yaml

diff --git a/Documentation/devicetree/bindings/display/msm/gpu.yaml b/Documentation/devicetree/bindings/display/msm/gpu.yaml
index 40b5c6bd11f8..be29d85e597c 100644
--- a/Documentation/devicetree/bindings/display/msm/gpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gpu.yaml
@@ -5,7 +5,7 @@
 $id: http://devicetree.org/schemas/display/msm/gpu.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
-title: Adreno or Snapdragon GPUs
+title: Imageon 200 GPU
 
 maintainers:
   - Rob Clark 
@@ -13,18 +13,6 @@ maintainers:
 properties:
   compatible:
 oneOf:
-  - description: |
-  The driver is parsing the compat string for Adreno to
-  figure out the chip-id.
-items:
-  - pattern: '^qcom,adreno-[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]$'
-  - const: qcom,adreno
-  - description: |
-  The driver is parsing the compat string for Adreno to
-  figure out the gpu-id and patch level.
-items:
-  - pattern: '^qcom,adreno-[3-7][0-9][0-9]\.[0-9]+$'
-  - const: qcom,adreno
   - description: |
   The driver is parsing the compat string for Imageon to
   figure out the gpu-id and patch level.
@@ -32,88 +20,31 @@ properties:
   - pattern: '^amd,imageon-200\.[0-1]$'
   - const: amd,imageon
 
-  clocks: true
+  clocks:
+items:
+  - description: GPU Core clock
+  - description: GPU Memory Interface clock
 
-  clock-names: true
+  clock-names:
+items:
+  - const: core_clk
+  - const: mem_iface_clk
 
   reg:
-minItems: 1
-maxItems: 3
+items:
+  - description: base address of GPU device
 
   reg-names:
-minItems: 1
-maxItems: 3
+items:
+  - const: kgsl_3d0_reg_memory
 
   interrupts:
-maxItems: 1
-
-  interrupt-names:
-maxItems: 1
-
-  interconnects:
-minItems: 1
-maxItems: 2
-
-  interconnect-names:
-minItems: 1
 items:
-  - const: gfx-mem
-  - const: ocmem
+  - description: interrupt of GPU device
 
-  iommus:
-minItems: 1
-maxItems: 64
-
-  sram:
-$ref: /schemas/types.yaml#/definitions/phandle-array
-minItems: 1
-maxItems: 4
+  interrupt-names:
 items:
-  maxItems: 1
-description: |
-  phandles to one or more reserved on-chip SRAM regions.
-  phandle to the On Chip Memory (OCMEM) that's present on some a3xx and
-  a4xx Snapdragon SoCs. See
-  Documentation/devicetree/bindings/sram/qcom,ocmem.yaml
-
-  operating-points-v2: true
-  opp-table:
-type: object
-
-  power-domains:
-maxItems: 1
-
-  zap-shader:
-type: object
-additionalProperties: false
-description: |
-  For a5xx and a6xx devices this node contains a memory-region that
-  points to reserved memory to store the zap shader that can be used to
-  help bring the GPU out of secure mode.
-properties:
-  memory-region:
-maxItems: 1
-
-  firmware-name:
-description: |
-  Default name of the firmware to load to the remote processor.
-
-  "#cooling-cells":
-const: 2
-
-  nvmem-cell-names:
-maxItems: 1
-
-  nvmem-cells:
-description: efuse registers
-maxItems: 1
-
-  qcom,gmu:
-$ref: /schemas/types.yaml#/definitions/phandle
-description: |
-  For GMU attached devices a phandle to the GMU device that will
-  control the power for the GPU.
-
+  - const: kgsl_3d0_irq
 
 required:
   - compatible
@@ -122,222 +53,24 @@ required:
 
 additionalProperties: false
 

[RFC PATCH 0/1] Split Adreno schemas

2024-03-26 Thread Adam Skladowski
Following a recommendation from Dmitry Baryshkov, this series splits the
schema into separate schemas per GPU family. As I don't really understand
much of YAML and dt-schema, I decided to send this as an RFC; if any
changes are suggested, I will be glad if they can be explained to me in
ELI5 format.

Adam Skladowski (1):
  dt-bindings: display/msm: gpu: Split Adreno schemas into separate
files

 .../devicetree/bindings/display/msm/gpu.yaml  | 317 ++
 .../bindings/display/msm/qcom,adreno-306.yaml | 115 +++
 .../bindings/display/msm/qcom,adreno-330.yaml | 111 ++
 .../bindings/display/msm/qcom,adreno-405.yaml | 135 
 .../bindings/display/msm/qcom,adreno-506.yaml | 184 ++
 .../bindings/display/msm/qcom,adreno-530.yaml | 161 +
 .../bindings/display/msm/qcom,adreno-540.yaml | 154 +
 .../bindings/display/msm/qcom,adreno-6xx.yaml | 160 +
 .../display/msm/qcom,adreno-common.yaml   | 112 +++
 9 files changed, 1157 insertions(+), 292 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-306.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-330.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-405.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-506.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-530.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-540.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-6xx.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/qcom,adreno-common.yaml

-- 
2.44.0



[PATCH 4/4] arm64: dts: qcom: sc8180x: add dp_p1 register blocks to DP nodes

2024-03-26 Thread Dmitry Baryshkov
DisplayPort nodes must declare the dp_p1 register space in addition to
dp_p0. Add corresponding resource to DisplayPort DT nodes.

Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sc8180x.dtsi | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sc8180x.dtsi b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
index 6d74867d3b61..019104bd70fb 100644
--- a/arch/arm64/boot/dts/qcom/sc8180x.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
@@ -3029,7 +3029,8 @@ mdss_dp0: displayport-controller@ae9 {
reg = <0 0xae9 0 0x200>,
  <0 0xae90200 0 0x200>,
  <0 0xae90400 0 0x600>,
- <0 0xae90a00 0 0x400>;
+ <0 0xae90a00 0 0x400>,
+ <0 0xae91000 0 0x400>;
interrupt-parent = <>;
interrupts = <12>;
clocks = < DISP_CC_MDSS_AHB_CLK>,
@@ -3105,7 +3106,8 @@ mdss_dp1: displayport-controller@ae98000 {
reg = <0 0xae98000 0 0x200>,
  <0 0xae98200 0 0x200>,
  <0 0xae98400 0 0x600>,
- <0 0xae98a00 0 0x400>;
+ <0 0xae98a00 0 0x400>,
+ <0 0xae99000 0 0x400>;
interrupt-parent = <>;
interrupts = <13>;
clocks = < DISP_CC_MDSS_AHB_CLK>,

-- 
2.39.2



[PATCH 1/4] dt-bindings: display/msm: sm8150-mdss: add DP node

2024-03-26 Thread Dmitry Baryshkov
As Qualcomm SM8150 got support for the DisplayPort, add displayport@
node as a valid child to the MDSS node.

Signed-off-by: Dmitry Baryshkov 
---
 .../devicetree/bindings/display/msm/qcom,sm8150-mdss.yaml  | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm8150-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm8150-mdss.yaml
index c0d6a4fdff97..40b077fb20aa 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sm8150-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm8150-mdss.yaml
@@ -53,6 +53,16 @@ patternProperties:
   compatible:
 const: qcom,sm8150-dpu
 
+  "^displayport-controller@[0-9a-f]+$":
+type: object
+additionalProperties: true
+
+properties:
+  compatible:
+items:
+  - const: qcom,sm8150-dp
+  - const: qcom,sm8350-dp
+
   "^dsi@[0-9a-f]+$":
 type: object
 additionalProperties: true

-- 
2.39.2



[PATCH 3/4] arm64: dts: qcom: sc8180x: Drop flags for mdss irqs

2024-03-26 Thread Dmitry Baryshkov
The number of interrupt cells for the mdss interrupt controller is 1,
meaning there should only be one cell for the interrupt number, not two.
Drop the second cell containing (unused) irq flags.

Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sc8180x.dtsi | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sc8180x.dtsi b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
index 99462b42cfc5..6d74867d3b61 100644
--- a/arch/arm64/boot/dts/qcom/sc8180x.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
@@ -2804,7 +2804,7 @@ mdss_mdp: mdp@ae01000 {
power-domains = < SC8180X_MMCX>;
 
interrupt-parent = <>;
-   interrupts = <0 IRQ_TYPE_LEVEL_HIGH>;
+   interrupts = <0>;
 
ports {
#address-cells = <1>;
@@ -2877,7 +2877,7 @@ mdss_dsi0: dsi@ae94000 {
reg-names = "dsi_ctrl";
 
interrupt-parent = <>;
-   interrupts = <4 IRQ_TYPE_LEVEL_HIGH>;
+   interrupts = <4>;
 
clocks = < DISP_CC_MDSS_BYTE0_CLK>,
 < DISP_CC_MDSS_BYTE0_INTF_CLK>,
@@ -2963,7 +2963,7 @@ mdss_dsi1: dsi@ae96000 {
reg-names = "dsi_ctrl";
 
interrupt-parent = <>;
-   interrupts = <5 IRQ_TYPE_LEVEL_HIGH>;
+   interrupts = <5>;
 
clocks = < DISP_CC_MDSS_BYTE1_CLK>,
 < DISP_CC_MDSS_BYTE1_INTF_CLK>,

-- 
2.39.2
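
To make the effect of the extra cell concrete, here is a small sketch of how
an "interrupts" property is split into specifiers according to the parent's
#interrupt-cells. It is a simplified model, not the kernel's OF parsing
code, and both function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of interrupt specifiers encoded in an "interrupts" property of
 * prop_cells 32-bit cells, for a parent with the given #interrupt-cells.
 */
static size_t irq_specifier_count(size_t prop_cells, size_t interrupt_cells)
{
	assert(interrupt_cells > 0);
	return prop_cells / interrupt_cells;
}

/* First cell of the index-th specifier, which is the interrupt number here. */
static uint32_t irq_number(const uint32_t *prop, size_t interrupt_cells,
			   size_t index)
{
	return prop[index * interrupt_cells];
}
```

With #interrupt-cells = <1>, a two-cell property such as
<4 IRQ_TYPE_LEVEL_HIGH> (IRQ_TYPE_LEVEL_HIGH is 4) reads as two specifiers,
interrupt 4 twice, rather than one interrupt with flags. Dropping the second
cell leaves exactly one specifier.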



[PATCH 2/4] arm64: dts: qcom: sc8180x: drop legacy property #stream-id-cells

2024-03-26 Thread Dmitry Baryshkov
The property #stream-id-cells is legacy, it is not documented as valid
for the GPU. Drop it now.

Fixes: 494dec9b6f54 ("arm64: dts: qcom: sc8180x: Add display and gpu nodes")
Signed-off-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/sc8180x.dtsi | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sc8180x.dtsi b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
index 32afc78d5b76..99462b42cfc5 100644
--- a/arch/arm64/boot/dts/qcom/sc8180x.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
@@ -2225,7 +2225,6 @@ tcsr_mutex: hwlock@1f4 {
 
gpu: gpu@2c0 {
compatible = "qcom,adreno-680.1", "qcom,adreno";
-   #stream-id-cells = <16>;
 
reg = <0 0x02c0 0 0x4>;
reg-names = "kgsl_3d0_reg_memory";

-- 
2.39.2



[PATCH 0/4] arm64: dts: fix several display-related schema warnings

2024-03-26 Thread Dmitry Baryshkov
Fix several warnings produced by the display nodes.

Signed-off-by: Dmitry Baryshkov 
---
Dmitry Baryshkov (4):
  dt-bindings: display/msm: sm8150-mdss: add DP node
  arm64: dts: qcom: sc8180x: drop legacy property #stream-id-cells
  arm64: dts: qcom: sc8180x: Drop flags for mdss irqs
  arm64: dts: qcom: sc8180x: add dp_p1 register blocks to DP nodes

 .../devicetree/bindings/display/msm/qcom,sm8150-mdss.yaml   | 10 ++
 arch/arm64/boot/dts/qcom/sc8180x.dtsi   | 13 +++--
 2 files changed, 17 insertions(+), 6 deletions(-)
---
base-commit: 13ee4a7161b6fd938aef6688ff43b163f6d83e37
change-id: 20240326-fd-fix-schema-b91f94a95135

Best regards,
-- 
Dmitry Baryshkov 



Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Abhinav Kumar




On 3/26/2024 12:47 PM, Dmitry Baryshkov wrote:

On Tue, 26 Mar 2024 at 21:32, Abhinav Kumar  wrote:




On 3/26/2024 12:10 PM, Dmitry Baryshkov wrote:

On Tue, 26 Mar 2024 at 20:31, Abhinav Kumar  wrote:




On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:

On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
 wrote:


Hi,

In today's next, I got:

   drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
'out' set but not used [-Werror,-Wunused-but-set-variable]

`out` seems to be there since commit 64d6255650d4 ("drm/msm: More
fully implement devcoredump for a7xx").

Untested diff below assuming `dumper->iova` is constant -- if you want
a formal patch, please let me know.


Please send a proper patch that we can pick up.



This should be fixed with https://patchwork.freedesktop.org/patch/581853/.


Is that a correct fix? If you check other usage locations for
CRASHDUMP_READ, you'll see that `out` is the last parameter and it is
being incremented.



Right but in this function out is not the last parameter of CRASHDUMP_READ.


Yes. I think in this case the patch from this email is more correct.



Alright, in that case, Miguel can you please repost this with the Fixes 
tags and in a patch form.




Maybe you or Rob can correct me but I thought the fix looked sane
although no one commented on that patch.






We can pickup that one with a Fixes tag applied.



Cheers,
Miguel

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 1f5245fc2cdc..a847a0f7a73c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
(block->type << 8) | i);

in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
-block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
+block->size, out);

out += block->size * sizeof(u32);
}



Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Connor Abbott
On Tue, Mar 26, 2024 at 7:47 PM Dmitry Baryshkov
 wrote:
>
> On Tue, 26 Mar 2024 at 21:32, Abhinav Kumar  wrote:
> >
> >
> >
> > On 3/26/2024 12:10 PM, Dmitry Baryshkov wrote:
> > > > On Tue, 26 Mar 2024 at 20:31, Abhinav Kumar wrote:
> > >>
> > >>
> > >>
> > >> On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:
> > >>> On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
> > >>>  wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> In today's next, I got:
> > >>>>
> > >>>>   drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
> > >>>> 'out' set but not used [-Werror,-Wunused-but-set-variable]
> > >>>>
> > >>>> `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
> > >>>> fully implement devcoredump for a7xx").
> > >>>>
> > >>>> Untested diff below assuming `dumper->iova` is constant -- if you want
> > >>>> a formal patch, please let me know.
> > >>>
> > >>> Please send a proper patch that we can pick up.
> > >>>
> > >>
> > >> This should be fixed with 
> > >> https://patchwork.freedesktop.org/patch/581853/.
> > >
> > > Is that a correct fix? If you check other usage locations for
> > > CRASHDUMP_READ, you'll see that `out` is the last parameter and it is
> > > being incremented.
> > >
> >
> > Right but in this function out is not the last parameter of CRASHDUMP_READ.
>
> Yes. I think in this case the patch from this email is more correct.

Yes, this patch is more correct than the other one. I tried to fix a
bug with a6xx that I noticed while adding support for a7xx, which I
forgot to split out from "drm/msm: More fully implement devcoredump
for a7xx" into a separate commit, and this hunk was missing. Sorry
about that.

Connor


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 21:32, Abhinav Kumar  wrote:
>
>
>
> On 3/26/2024 12:10 PM, Dmitry Baryshkov wrote:
> > On Tue, 26 Mar 2024 at 20:31, Abhinav Kumar wrote:
> >>
> >>
> >>
> >> On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:
> >>> On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
> >>>  wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> In today's next, I got:
> >>>>
> >>>>   drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
> >>>> 'out' set but not used [-Werror,-Wunused-but-set-variable]
> >>>>
> >>>> `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
> >>>> fully implement devcoredump for a7xx").
> >>>>
> >>>> Untested diff below assuming `dumper->iova` is constant -- if you want
> >>>> a formal patch, please let me know.
> >>>
> >>> Please send a proper patch that we can pick up.
> >>>
> >>
> >> This should be fixed with https://patchwork.freedesktop.org/patch/581853/.
> >
> > Is that a correct fix? If you check other usage locations for
> > CRASHDUMP_READ, you'll see that `out` is the last parameter and it is
> > being incremented.
> >
>
> Right but in this function out is not the last parameter of CRASHDUMP_READ.

Yes. I think in this case the patch from this email is more correct.

>
> Maybe you or Rob can correct me, but I thought the fix looked sane,
> although no one commented on that patch.

>
> >>
> >> We can pick up that one with a Fixes tag applied.
> >>
> >>>>
> >>>> Cheers,
> >>>> Miguel
> >>>>
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >>>> index 1f5245fc2cdc..a847a0f7a73c 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >>>> @@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
> >>>>    (block->type << 8) | i);
> >>>>
> >>>>    in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
> >>>> -block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
> >>>> +block->size, out);
> >>>>
> >>>>    out += block->size * sizeof(u32);
> >>>>    }
> >>>
> >>>
> >>>
> >
> >
> >



-- 
With best wishes
Dmitry


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Abhinav Kumar

On 3/26/2024 12:10 PM, Dmitry Baryshkov wrote:
> On Tue, 26 Mar 2024 at 20:31, Abhinav Kumar  wrote:
>>
>> On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:
>>> On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
>>>  wrote:
>>>>
>>>> Hi,
>>>>
>>>> In today's next, I got:
>>>>
>>>>   drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
>>>> 'out' set but not used [-Werror,-Wunused-but-set-variable]
>>>>
>>>> `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
>>>> fully implement devcoredump for a7xx").
>>>>
>>>> Untested diff below assuming `dumper->iova` is constant -- if you want
>>>> a formal patch, please let me know.
>>>
>>> Please send a proper patch that we can pick up.
>>>
>>
>> This should be fixed with https://patchwork.freedesktop.org/patch/581853/.
>
> Is that a correct fix? If you check other usage locations for
> CRASHDUMP_READ, you'll see that `out` is the last parameter and it is
> being incremented.
>

Right, but in this function out is not the last parameter of CRASHDUMP_READ.

Maybe you or Rob can correct me, but I thought the fix looked sane,
although no one commented on that patch.

>>
>> We can pick up that one with a Fixes tag applied.
>>
>>>>
>>>> Cheers,
>>>> Miguel
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>>>> index 1f5245fc2cdc..a847a0f7a73c 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>>>> @@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
>>>>    (block->type << 8) | i);
>>>>
>>>>    in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
>>>> -block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
>>>> +block->size, out);
>>>>
>>>>    out += block->size * sizeof(u32);
>>>>    }


Re: [PATCH] dt-bindings: display: rockchip: add missing #sound-dai-cells to dw-hdmi

2024-03-26 Thread Heiko Stübner
Am Dienstag, 26. März 2024, 18:50:37 CET schrieb Krzysztof Kozlowski:
> On 26/03/2024 18:50, Krzysztof Kozlowski wrote:
> > On 26/03/2024 18:28, Heiko Stuebner wrote:
> >> The #sound-dai-cells DT property is required to describe link between
> >> the HDMI IP block and the SoC's audio subsystem.
> >>
> >> Signed-off-by: Heiko Stuebner 
> >> ---
> >>  .../devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git 
> >> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml 
> >> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
> >> index af638b6c0d21..3768df80ca7a 100644
> >> --- 
> >> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
> >> +++ 
> >> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
> >> @@ -124,6 +124,9 @@ properties:
> >>  description:
> >>phandle to the GRF to mux vopl/vopb.
> >>  
> >> +  "#sound-dai-cells":
> >> +const: 0
> >> +
> > 
> > Then you miss $ref in allOf to /schemas/sound/dai-common.yaml
> 
> I meant that, besides your change, you should also add the above $ref.

sorry about that, will fix that.

Thanks for the pointer
Heiko





Re: [PATCH v4 2/2] drm/msm/dp: Add support for the X1E80100

2024-03-26 Thread Bjorn Andersson
On Sun, Mar 24, 2024 at 08:56:52PM +0200, Abel Vesa wrote:
> Add the X1E80100 DP descs and compatible. This platform will be using
> a single compatible for both eDP and DP mode. The actual mode will
> be set based on the presence of the panel node in DT.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Abel Vesa 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
> b/drivers/gpu/drm/msm/dp/dp_display.c
> index 9169a739cc54..521cba76d2a0 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -171,6 +171,14 @@ static const struct msm_dp_desc sm8650_dp_descs[] = {
>   {}
>  };
>  
> +static const struct msm_dp_desc x1e80100_dp_descs[] = {
> + { .io_start = 0x0ae9, .id = MSM_DP_CONTROLLER_0, 
> .wide_bus_supported = true },
> + { .io_start = 0x0ae98000, .id = MSM_DP_CONTROLLER_1, 
> .wide_bus_supported = true },
> + { .io_start = 0x0ae9a000, .id = MSM_DP_CONTROLLER_2, 
> .wide_bus_supported = true },
> + { .io_start = 0x0aea, .id = MSM_DP_CONTROLLER_3, 
> .wide_bus_supported = true },
> + {}
> +};
> +
>  static const struct of_device_id dp_dt_match[] = {
>   { .compatible = "qcom,sc7180-dp", .data = _dp_descs },
>   { .compatible = "qcom,sc7280-dp", .data = _dp_descs },
> @@ -182,6 +190,7 @@ static const struct of_device_id dp_dt_match[] = {
>   { .compatible = "qcom,sdm845-dp", .data = _dp_descs },
>   { .compatible = "qcom,sm8350-dp", .data = _dp_descs },
>   { .compatible = "qcom,sm8650-dp", .data = _dp_descs },
> + { .compatible = "qcom,x1e80100-dp", .data = &x1e80100_dp_descs },
>   {}
>  };
>  
> 
> -- 
> 2.34.1
> 


Re: [PATCH v4 1/2] drm/msm/dp: Add support for determining the eDP/DP mode from DT

2024-03-26 Thread Bjorn Andersson
On Sun, Mar 24, 2024 at 08:56:51PM +0200, Abel Vesa wrote:
> Instead of relying on different compatibles for eDP and DP, lookup
> the panel node in devicetree to figure out the connector type and
> then pass on that information to the PHY. External DP doesn't have
> a panel described in DT, therefore, assume it's eDP if panel node
> is present.
> 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Abel Vesa 
> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 29 -
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
> b/drivers/gpu/drm/msm/dp/dp_display.c
> index c4cb82af5c2f..9169a739cc54 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -726,6 +726,14 @@ static int dp_init_sub_modules(struct dp_display_private 
> *dp)
>   if (IS_ERR(phy))
>   return PTR_ERR(phy);
>  
> + rc = phy_set_mode_ext(phy, PHY_MODE_DP,
> +   dp->dp_display.is_edp ? PHY_SUBMODE_EDP : 
> PHY_SUBMODE_DP);
> + if (rc) {
> + DRM_ERROR("failed to set phy submode, rc = %d\n", rc);
> + dp->catalog = NULL;
> + goto error;
> + }
> +
>   dp->catalog = dp_catalog_get(dev);
>   if (IS_ERR(dp->catalog)) {
>   rc = PTR_ERR(dp->catalog);
> @@ -1241,6 +1249,25 @@ static int dp_auxbus_done_probe(struct drm_dp_aux *aux)
>   return dp_display_probe_tail(aux->dev);
>  }
>  
> +static int dp_display_get_connector_type(struct platform_device *pdev,
> +  const struct msm_dp_desc *desc)
> +{
> + struct device_node *node = pdev->dev.of_node;
> + struct device_node *aux_bus = of_get_child_by_name(node, "aux-bus");
> + struct device_node *panel = of_get_child_by_name(aux_bus, "panel");
> + int connector_type;
> +
> + if (panel)
> + connector_type = DRM_MODE_CONNECTOR_eDP;
> + else
> + connector_type = DRM_MODE_SUBCONNECTOR_DisplayPort;
> +
> + of_node_put(panel);
> + of_node_put(aux_bus);
> +
> + return connector_type;
> +}
> +
>  static int dp_display_probe(struct platform_device *pdev)
>  {
>   int rc = 0;
> @@ -1263,7 +1290,7 @@ static int dp_display_probe(struct platform_device 
> *pdev)
>   dp->dp_display.pdev = pdev;
>   dp->name = "drm_dp";
>   dp->id = desc->id;
> - dp->dp_display.connector_type = desc->connector_type;
> + dp->dp_display.connector_type = dp_display_get_connector_type(pdev, 
> desc);
>   dp->wide_bus_supported = desc->wide_bus_supported;
>   dp->dp_display.is_edp =
>   (dp->dp_display.connector_type == DRM_MODE_CONNECTOR_eDP);
> 
> -- 
> 2.34.1
> 


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 20:31, Abhinav Kumar  wrote:
>
>
>
> On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:
> > On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
> >  wrote:
> >>
> >> Hi,
> >>
> >> In today's next, I got:
> >>
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
> >> 'out' set but not used [-Werror,-Wunused-but-set-variable]
> >>
> >> `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
> >> fully implement devcoredump for a7xx").
> >>
> >> Untested diff below assuming `dumper->iova` is constant -- if you want
> >> a formal patch, please let me know.
> >
> > Please send a proper patch that we can pick up.
> >
>
> This should be fixed with https://patchwork.freedesktop.org/patch/581853/.

Is that a correct fix? If you check other usage locations for
CRASHDUMP_READ, you'll see that `out` is the last parameter and it is
being incremented.

>
> We can pick up that one with a Fixes tag applied.
>
> >>
> >> Cheers,
> >> Miguel
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >> index 1f5245fc2cdc..a847a0f7a73c 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> >> @@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
> >>   (block->type << 8) | i);
> >>
> >>   in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
> >> -block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
> >> +block->size, out);
> >>
> >>   out += block->size * sizeof(u32);
> >>   }
> >
> >
> >



-- 
With best wishes
Dmitry


Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS

2024-03-26 Thread Andi Shyti
Hi Matt,

On Tue, Mar 26, 2024 at 09:03:10AM -0700, Matt Roper wrote:
> On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> > +   /*
> > +* Do not create the command streamer for CCS slices
> > +* beyond the first. All the workload submitted to the
> > +* first engine will be shared among all the slices.
> > +*
> > +* Once the user will be allowed to customize the CCS
> > +* mode, then this check needs to be removed.
> > +*/
> > +   if (IS_DG2(i915) &&
> > +   class == COMPUTE_CLASS &&
> > +   ccs_instance++)
> > +   continue;
> 
> Wouldn't it be more intuitive to drop the non-lowest CCS engines in
> init_engine_mask() since that's the function that's dedicated to
> building the list of engines we'll use?  Then we don't need to kill the
> assertion farther down either.

Because we don't check the result of init_engine_mask() while
creating the engine's structure. We only check it afterwards, and
indeed I removed the drm_WARN_ON() check.

I think the whole process of creating the engine's structure in
intel_engines_init_mmio() can be simplified, but this goes
beyond the scope of the series.

Or am I missing something?

Thanks,
Andi
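
The `ccs_instance++` test in the quoted hunk relies on post-increment: the expression yields 0 for the first compute engine (so it is kept) and a non-zero value for every later one (so it is skipped). The following is a minimal userspace sketch of that idiom only. The enum, count_created() and the `single_ccs` flag are hypothetical stand-ins for the driver's engine loop and its IS_DG2() check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical engine classes, standing in for the driver's own. */
enum toy_engine_class { TOY_RENDER, TOY_COPY, TOY_COMPUTE };

/*
 * Count how many engines would be created. When single_ccs is set,
 * only the first compute engine is kept.
 */
static unsigned int count_created(const enum toy_engine_class *classes,
				  size_t n, int single_ccs)
{
	unsigned int ccs_instance = 0;
	unsigned int created = 0;
	size_t i;

	assert(classes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/*
		 * Short-circuit evaluation: ccs_instance is only
		 * incremented for compute engines. The post-increment
		 * returns the old value, so the first compute engine
		 * sees 0 and is not skipped.
		 */
		if (single_ccs &&
		    classes[i] == TOY_COMPUTE &&
		    ccs_instance++)
			continue;

		created++;
	}

	return created;
}
```

With one render, one copy and four compute engines, the sketch creates three engines when single_ccs is set and six otherwise.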


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Miguel Ojeda
On Tue, Mar 26, 2024 at 7:31 PM Abhinav Kumar  wrote:
>
> This should be fixed with https://patchwork.freedesktop.org/patch/581853/.

Ah, so in that case the `CRASHDUMP_READ` target should really be
constant, unlike in other cases in that file?

> We can pick up that one with a Fixes tag applied.

Thanks!

Cheers,
Miguel


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Abhinav Kumar

On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:
> On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
>  wrote:
>>
>> Hi,
>>
>> In today's next, I got:
>>
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
>> 'out' set but not used [-Werror,-Wunused-but-set-variable]
>>
>> `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
>> fully implement devcoredump for a7xx").
>>
>> Untested diff below assuming `dumper->iova` is constant -- if you want
>> a formal patch, please let me know.
>
> Please send a proper patch that we can pick up.
>

This should be fixed with https://patchwork.freedesktop.org/patch/581853/.

We can pick up that one with a Fixes tag applied.

>>
>> Cheers,
>> Miguel
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> index 1f5245fc2cdc..a847a0f7a73c 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
>> @@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
>>   (block->type << 8) | i);
>>
>>   in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
>> -block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
>> +block->size, out);
>>
>>   out += block->size * sizeof(u32);
>>   }


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
 wrote:
>
> Hi,
>
> In today's next, I got:
>
> drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
> 'out' set but not used [-Werror,-Wunused-but-set-variable]
>
> `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
> fully implement devcoredump for a7xx").
>
> Untested diff below assuming `dumper->iova` is constant -- if you want
> a formal patch, please let me know.

Please send a proper patch that we can pick up.

>
> Cheers,
> Miguel
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> index 1f5245fc2cdc..a847a0f7a73c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> @@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
>  (block->type << 8) | i);
>
>  in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
> -block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
> +block->size, out);
>
>  out += block->size * sizeof(u32);
>  }



-- 
With best wishes
Dmitry


Re: Build regressions/improvements in v6.9-rc1

2024-03-26 Thread Sam Ravnborg
Hi all.

>   + error: arch/sparc/kernel/process_32.o: relocation truncated to fit: 
> R_SPARC_WDISP22 against `.text':  => (.fixup+0xc), (.fixup+0x4)
>   + error: arch/sparc/kernel/signal_32.o: relocation truncated to fit: 
> R_SPARC_WDISP22 against `.text':  => (.fixup+0x18), (.fixup+0x8), 
> (.fixup+0x0), (.fixup+0x20), (.fixup+0x10)
>   + error: relocation truncated to fit: R_SPARC_WDISP22 against `.init.text': 
>  => (.head.text+0x5100), (.head.text+0x5040)
>   + error: relocation truncated to fit: R_SPARC_WDISP22 against symbol 
> `leon_smp_cpu_startup' defined in .text section in 
> arch/sparc/kernel/trampoline_32.o:  => (.init.text+0xa4)

Looks like something is too big for the available space here.
Any hints on how to dig into this would be welcome.

Note: this is a sparc32 allmodconfig build

Sam
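
As background on the error text: R_SPARC_WDISP22 encodes a branch displacement as a signed 22-bit count of 4-byte words, so the target must lie within roughly +/- 8 MiB of the branch. The sketch below only works through that arithmetic. wdisp22_fits() is a hypothetical helper, not part of any toolchain, and it is not a diagnosis of this particular build.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if a branch located at `place` can reach `target` with a
 * signed 22-bit word displacement, 0 otherwise.
 */
static int wdisp22_fits(uint32_t place, uint32_t target)
{
	int64_t delta = (int64_t)target - (int64_t)place;
	int64_t words;

	/* Branch targets are word aligned on SPARC. */
	if (delta % 4 != 0)
		return 0;

	words = delta / 4;

	/* Signed 22-bit range: -2^21 .. 2^21 - 1 words. */
	return words >= -(1 << 21) && words < (1 << 21);
}
```

In bytes, the reachable window is -0x800000 up to +0x7ffffc relative to the branch. A `.fixup` or `.head.text` entry that jumps into a `.text` or `.init.text` section placed further away than that produces the "relocation truncated to fit" message.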


Re: [PATCH v9 1/3] drm/buddy: Implement tracking clear page feature

2024-03-26 Thread Matthew Auld

On 18/03/2024 21:40, Arunpravin Paneer Selvam wrote:

- Add tracking clear page feature.

- Driver should enable the DRM_BUDDY_CLEARED flag if it
   successfully clears the blocks in the free path. On the other hand,
   DRM buddy marks each block as cleared.

- Track the available cleared pages size

- If driver requests cleared memory we prefer cleared memory
   but fall back to uncleared if we can't find the cleared blocks.
   When driver requests uncleared memory we try to use uncleared but
   fall back to cleared memory if necessary.

- When a block gets freed we clear it and mark the freed block as cleared,
   when there are buddies which are cleared as well we can merge them.
   Otherwise, we prefer to keep the blocks as separated.

- Add a function to support defragmentation.
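
The preference and merge rules listed above can be illustrated with a small userspace sketch. It is not the drm_buddy implementation: struct toy_block, pick_block() and can_merge() are hypothetical names, and a flat array stands in for the per-order free lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified block descriptor. */
struct toy_block {
	bool free;
	bool cleared;
};

/*
 * Return the index of a free block, preferring one whose clear state
 * matches the request and falling back to any other free block.
 * Returns -1 when nothing is free.
 */
static int pick_block(const struct toy_block *blocks, size_t n,
		      bool want_cleared)
{
	int fallback = -1;
	size_t i;

	assert(blocks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!blocks[i].free)
			continue;
		if (blocks[i].cleared == want_cleared)
			return (int)i;
		if (fallback < 0)
			fallback = (int)i;
	}

	return fallback;
}

/* Two free buddies are merged only when their clear state matches. */
static bool can_merge(const struct toy_block *a, const struct toy_block *b)
{
	return a->free && b->free && a->cleared == b->cleared;
}
```

A forced merge, as described for the defragmentation path, would deliberately bypass the can_merge() check. The sketch does not model that case.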

v1:
   - Depends on the flag check DRM_BUDDY_CLEARED, enable the block as
 cleared. Else, reset the clear flag for each block in the list(Christian)
   - For merging the 2 cleared blocks compare as below,
 drm_buddy_is_clear(block) != drm_buddy_is_clear(buddy)(Christian)
   - Defragment the memory beginning from min_order
 till the required memory space is available.

v2: (Matthew)
   - Add a wrapper drm_buddy_free_list_internal for the freeing of blocks
 operation within drm buddy.
   - Write a macro block_incompatible() to allocate the required blocks.
   - Update the xe driver for the drm_buddy_free_list change in arguments.
   - add a warning if the two blocks are incompatible on
 defragmentation
   - call full defragmentation in the fini() function
   - place a condition to test if min_order is equal to 0
   - replace the list with safe_reverse() variant as we might
 remove the block from the list.

v3:
   - fix Gitlab user reported lockup issue.
   - Keep DRM_BUDDY_HEADER_CLEAR define sorted(Matthew)
   - modify to pass the root order instead max_order in fini()
 function(Matthew)
   - change bool 1 to true(Matthew)
   - add check if min_block_size is power of 2(Matthew)
   - modify the min_block_size datatype to u64(Matthew)

v4:
   - rename the function drm_buddy_defrag with __force_merge.
   - Include __force_merge directly in drm buddy file and remove
 the defrag use in amdgpu driver.
   - Remove list_empty() check(Matthew)
   - Remove unnecessary space, headers and placement of new variables(Matthew)
   - Add a unit test case(Matthew)

Signed-off-by: Arunpravin Paneer Selvam 
Signed-off-by: Matthew Auld 
Suggested-by: Christian König 
Suggested-by: Matthew Auld 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  |   6 +-
  drivers/gpu/drm/drm_buddy.c   | 427 ++
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |   6 +-
  drivers/gpu/drm/tests/drm_buddy_test.c|  18 +-
  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c  |   4 +-
  include/drm/drm_buddy.h   |  16 +-
  6 files changed, 360 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 8db880244324..c0c851409241 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -571,7 +571,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
return 0;
  
  error_free_blocks:

-   drm_buddy_free_list(mm, >blocks);
+   drm_buddy_free_list(mm, >blocks, 0);
mutex_unlock(>lock);
  error_fini:
ttm_resource_fini(man, >base);
@@ -604,7 +604,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager 
*man,
  
  	amdgpu_vram_mgr_do_reserve(man);
  
-	drm_buddy_free_list(mm, >blocks);

+   drm_buddy_free_list(mm, >blocks, 0);
mutex_unlock(>lock);
  
  	atomic64_sub(vis_usage, >vis_usage);

@@ -912,7 +912,7 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
kfree(rsv);
  
  	list_for_each_entry_safe(rsv, temp, >reserved_pages, blocks) {

-   drm_buddy_free_list(>mm, >allocated);
+   drm_buddy_free_list(>mm, >allocated, 0);
kfree(rsv);
}
if (!adev->gmc.is_app_apu)
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index c4222b886db7..625a30a6b855 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -38,8 +38,8 @@ static void drm_block_free(struct drm_buddy *mm,
kmem_cache_free(slab_blocks, block);
  }
  
-static void list_insert_sorted(struct drm_buddy *mm,

-  struct drm_buddy_block *block)
+static void list_insert(struct drm_buddy *mm,
+   struct drm_buddy_block *block)
  {
struct drm_buddy_block *node;
struct list_head *head;
@@ -57,6 +57,16 @@ static void list_insert_sorted(struct drm_buddy *mm,
__list_add(>link, node->link.prev, >link);
  }
  
+static void clear_reset(struct drm_buddy_block *block)

+{
+   block->header &= ~DRM_BUDDY_HEADER_CLEAR;
+}
+
+static void 

drivers/gpu/drm/qxl/qxl_cmd.c:424:6: error: variable 'count' set but not used

2024-03-26 Thread Miguel Ojeda
Hi,

In today's next, I got:

drivers/gpu/drm/qxl/qxl_cmd.c:424:6: error: variable 'count' set
but not used [-Werror,-Wunused-but-set-variable]

`count` seems to be there since commit f64122c1f6ad ("drm: add new QXL
driver. (v1.4)").

Untested diff below -- if you want a formal patch, please let me know.

Cheers,
Miguel

diff --git a/drivers/gpu/drm/qxl/qxl_cmd.c b/drivers/gpu/drm/qxl/qxl_cmd.c
index 281edab518cd..d6ea01f3797b 100644
--- a/drivers/gpu/drm/qxl/qxl_cmd.c
+++ b/drivers/gpu/drm/qxl/qxl_cmd.c
@@ -421,7 +421,6 @@ int qxl_surface_id_alloc(struct qxl_device *qdev,
 {
uint32_t handle;
int idr_ret;
-   int count = 0;
 again:
idr_preload(GFP_ATOMIC);
spin_lock(>surf_id_idr_lock);
@@ -433,7 +432,6 @@ int qxl_surface_id_alloc(struct qxl_device *qdev,
handle = idr_ret;

if (handle >= qdev->rom->n_surfaces) {
-   count++;
spin_lock(>surf_id_idr_lock);
idr_remove(>surf_id_idr, handle);
spin_unlock(>surf_id_idr_lock);


drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Miguel Ojeda
Hi,

In today's next, I got:

drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable
'out' set but not used [-Werror,-Wunused-but-set-variable]

`out` seems to be there since commit 64d6255650d4 ("drm/msm: More
fully implement devcoredump for a7xx").

Untested diff below assuming `dumper->iova` is constant -- if you want
a formal patch, please let me know.

Cheers,
Miguel

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 1f5245fc2cdc..a847a0f7a73c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -852,7 +852,7 @@ static void a6xx_get_shader_block(struct msm_gpu *gpu,
 (block->type << 8) | i);

 in += CRASHDUMP_READ(in, REG_A6XX_HLSQ_DBG_AHB_READ_APERTURE,
-block->size, dumper->iova + A6XX_CD_DATA_OFFSET);
+block->size, out);

 out += block->size * sizeof(u32);
 }


Re: Library and interfaces for GPU offloading

2024-03-26 Thread Werner Sembach

Am 25.03.24 um 11:41 schrieb Werner Sembach:

Hello everyone,

Currently, GPU offloading on Linux is handled via environment variables, which
is a subpar experience for desktop files and might not be possible when using
launchers (e.g. Steam, Lutris, Heroic) that have no explicit support for
it without running the whole launcher permanently on the dGPU.


A proof of concept for a better solution is posted here: 
https://gitlab.freedesktop.org/glvnd/libglvnd/-/merge_requests/224 + 
https://gitlab.freedesktop.org/glvnd/libglvnd/-/merge_requests/228 , but it
has been stale for 3 years, so I wanted to make a push for it.


Is there currently active work on this?

What is Mesa's take on the PoCs? 
https://gitlab.freedesktop.org/glvnd/libglvnd/-/merge_requests/228#note_1364162


Best regards,

Werner Sembach


I was pointed to dri-devel with this to find the correct people.

Best Regards,

Werner Sembach



Re: [PATCH 0/2] drm/fourcc.h: Add libcamera to Open Source Waiver

2024-03-26 Thread Kieran Bingham
Quoting Jacopo Mondi (2024-03-14 10:12:47)
> Hello
> 
> gentle nudge for
> 
> *) libcamera: are we ok being listed here ?

I think it's fine ...

Acked-by: Kieran Bingham 

> *) DRM/KMS: is it ok splitting the list of projects in the way I've
>done ?
> 
> Thanks
>j
> 
> On Wed, Feb 28, 2024 at 11:22:42AM +0100, Jacopo Mondi wrote:
> > As suggested by Sima, add libcamera to the list of projects to which the
> > Open Source Waiver notice applies.
> >
> > To maintain the paragraph readable, make a list out of the projects to which
> > such notice applies.
> >
> > Jacopo Mondi (2):
> >   drm/fourcc.h: List of Open Source Waiver projects
> >   drm/fourcc.h: Add libcamera to Open Source Waiver
> >
> >  include/uapi/drm/drm_fourcc.h | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > --
> > 2.43.2
> >


Re: [PATCH 6/6] drm/msm/dp: Use function arguments for audio operations

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 17:06, Bjorn Andersson  wrote:
>
> From: Bjorn Andersson 
>
> The dp_audio read and write operations use members in struct dp_catalog
> for passing arguments and return values. This adds unnecessary
> complexity to the implementation, as it turns out after detangling the
> logic that no state is actually held in these variables.
>
> Clean this up by using function arguments and return values for passing
> the data.
>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_audio.c   | 20 +--
>  drivers/gpu/drm/msm/dp/dp_catalog.c | 39 
> +
>  drivers/gpu/drm/msm/dp/dp_catalog.h | 18 +
>  3 files changed, 28 insertions(+), 49 deletions(-)

Reviewed-by: Dmitry Baryshkov 

Thanks a lot for the cleanup!

--
With best wishes
Dmitry
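
The cleanup described in the commit message, replacing struct members used as a side channel with ordinary function arguments and return values, can be sketched as a before/after pair. All names here (toy_catalog_before, toy_audio_write_after and so on) are hypothetical and the register file is a plain array. This is not the dp_catalog code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_REGS 4

/* Before: the caller stores the arguments in the struct, then calls. */
struct toy_catalog_before {
	uint32_t regs[TOY_NUM_REGS];
	uint32_t audio_sel;  /* argument smuggled through the struct */
	uint32_t audio_data; /* argument and return value, likewise */
};

static void toy_audio_write_before(struct toy_catalog_before *c)
{
	assert(c->audio_sel < TOY_NUM_REGS);
	c->regs[c->audio_sel] = c->audio_data;
}

static void toy_audio_read_before(struct toy_catalog_before *c)
{
	assert(c->audio_sel < TOY_NUM_REGS);
	c->audio_data = c->regs[c->audio_sel];
}

/* After: no per-call state lives in the struct any more. */
struct toy_catalog_after {
	uint32_t regs[TOY_NUM_REGS];
};

static void toy_audio_write_after(struct toy_catalog_after *c,
				  uint32_t sel, uint32_t data)
{
	assert(sel < TOY_NUM_REGS);
	c->regs[sel] = data;
}

static uint32_t toy_audio_read_after(const struct toy_catalog_after *c,
				     uint32_t sel)
{
	assert(sel < TOY_NUM_REGS);
	return c->regs[sel];
}
```

In the "after" form the data flow is visible at each call site, and the struct no longer carries fields whose values are only meaningful for the duration of one call.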


Re: [PATCH] dt-bindings: display: rockchip: add missing #sound-dai-cells to dw-hdmi

2024-03-26 Thread Krzysztof Kozlowski
On 26/03/2024 18:50, Krzysztof Kozlowski wrote:
> On 26/03/2024 18:28, Heiko Stuebner wrote:
>> The #sound-dai-cells DT property is required to describe link between
>> the HDMI IP block and the SoC's audio subsystem.
>>
>> Signed-off-by: Heiko Stuebner 
>> ---
>>  .../devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git 
>> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml 
>> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
>> index af638b6c0d21..3768df80ca7a 100644
>> --- 
>> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
>> +++ 
>> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
>> @@ -124,6 +124,9 @@ properties:
>>  description:
>>phandle to the GRF to mux vopl/vopb.
>>  
>> +  "#sound-dai-cells":
>> +const: 0
>> +
> 
> Then you miss $ref in allOf to /schemas/sound/dai-common.yaml

I meant that, besides your change, you should also add the above $ref.

Best regards,
Krzysztof



Re: [PATCH] dt-bindings: display: rockchip: add missing #sound-dai-cells to dw-hdmi

2024-03-26 Thread Krzysztof Kozlowski
On 26/03/2024 18:28, Heiko Stuebner wrote:
> The #sound-dai-cells DT property is required to describe link between
> the HDMI IP block and the SoC's audio subsystem.
> 
> Signed-off-by: Heiko Stuebner 
> ---
>  .../devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml 
> b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
> index af638b6c0d21..3768df80ca7a 100644
> --- a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
> +++ b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
> @@ -124,6 +124,9 @@ properties:
>  description:
>phandle to the GRF to mux vopl/vopb.
>  
> +  "#sound-dai-cells":
> +const: 0
> +

Then you miss $ref in allOf to /schemas/sound/dai-common.yaml

Best regards,
Krzysztof



Re: [PATCH v9 3/3] drm/tests: Add a test case for drm buddy clear allocation

2024-03-26 Thread Matthew Auld

On 18/03/2024 21:40, Arunpravin Paneer Selvam wrote:

Add a new test case for the drm buddy clear and dirty
allocation.

Signed-off-by: Arunpravin Paneer Selvam 
Suggested-by: Matthew Auld 
---
  drivers/gpu/drm/tests/drm_buddy_test.c | 127 +
  1 file changed, 127 insertions(+)

diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c 
b/drivers/gpu/drm/tests/drm_buddy_test.c
index 454ad9952f56..d355a6e61893 100644
--- a/drivers/gpu/drm/tests/drm_buddy_test.c
+++ b/drivers/gpu/drm/tests/drm_buddy_test.c
@@ -19,6 +19,132 @@ static inline u64 get_size(int order, u64 chunk_size)
return (1 << order) * chunk_size;
  }
  
+static void drm_test_buddy_alloc_clear(struct kunit *test)

+{
+   unsigned long n_pages, total, i = 0;
+   const unsigned long ps = SZ_4K;
+   struct drm_buddy_block *block;
+   const int max_order = 12;
+   LIST_HEAD(allocated);
+   struct drm_buddy mm;
+   unsigned int order;
+   u64 mm_size, size;


Maybe just make these two u32 or unsigned long. That should be big 
enough, plus avoids any kind of 32b compilation bugs below.



+   LIST_HEAD(dirty);
+   LIST_HEAD(clean);
+
+   mm_size = PAGE_SIZE << max_order;


s/PAGE_SIZE/SZ_4K/ below also.


+   KUNIT_EXPECT_FALSE(test, drm_buddy_init(, mm_size, ps));
+
+   KUNIT_EXPECT_EQ(test, mm.max_order, max_order);
+
+   /**


Drop the extra *, since this is not actual kernel-doc. Below also.


+* Idea is to allocate and free some random portion of the address 
space,
+* returning those pages as non-dirty and randomly alternate between
+* requesting dirty and non-dirty pages (not going over the limit
+* we freed as non-dirty), putting that into two separate lists.
+* Loop over both lists at the end checking that the dirty list
+* is indeed all dirty pages and vice versa. Free it all again,
+* keeping the dirty/clear status.
+*/
+   KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(, 0, mm_size,
+   5 * ps, ps, 
,
+   
DRM_BUDDY_TOPDOWN_ALLOCATION),
+   "buddy_alloc hit an error size=%u\n", 5 * ps);
+   drm_buddy_free_list(, , DRM_BUDDY_CLEARED);
+
+   n_pages = 10;
+   do {
+   unsigned long flags;
+   struct list_head *list;
+   int slot = i % 2;
+
+   if (slot == 0) {
+   list = 
+   flags = 0;
+   } else if (slot == 1) {


Could just be else {


+   list = 
+   flags = DRM_BUDDY_CLEAR_ALLOCATION;
+   }
+
+   KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(, 0, 
mm_size,
+   ps, ps, 
list,
+   flags),
+   "buddy_alloc hit an error size=%u\n", 
ps);
+   } while (++i < n_pages);
+
+   list_for_each_entry(block, , link)
+   KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), true);
+
+   list_for_each_entry(block, , link)
+   KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), false);
+
+   drm_buddy_free_list(, , DRM_BUDDY_CLEARED);
+
+   /**
+* Trying to go over the clear limit for some allocation.
+* The allocation should never fail with reasonable page-size.
+*/
+   KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(, 0, mm_size,
+   10 * ps, ps, ,
+   
DRM_BUDDY_CLEAR_ALLOCATION),
+   "buddy_alloc hit an error size=%u\n", 10 * ps);
+
+   drm_buddy_free_list(, , DRM_BUDDY_CLEARED);
+   drm_buddy_free_list(, , 0);
+   drm_buddy_fini();
+
+   KUNIT_EXPECT_FALSE(test, drm_buddy_init(, mm_size, ps));
+
+   /**
+* Create a new mm. Intentionally fragment the address space by creating
+* two alternating lists. Free both lists, one as dirty the other as 
clean.
+* Try to allocate double the previous size with matching 
min_page_size. The
+* allocation should never fail as it calls the force_merge. Also check 
that
+* the page is always dirty after force_merge. Free the page as dirty, 
then
+* repeat the whole thing, increment the order until we hit the 
max_order.
+*/
+
+   order = 1;
+   do {
+   size = PAGE_SIZE << order;
+   i = 0;
+   n_pages = mm_size / ps;
+   do {
+   struct list_head *list;
+   int slot = i % 2;
+
+   if (slot == 0)
+   list = 
+   else if (slot == 1)



Re: [PATCH] drm/amdgpu: add support of bios dump in devcoredump

2024-03-26 Thread Khatri, Sunil



On 3/26/2024 10:23 PM, Alex Deucher wrote:

On Tue, Mar 26, 2024 at 10:38 AM Sunil Khatri  wrote:

dump the bios binary in the devcoredump.

Signed-off-by: Sunil Khatri 
---
  .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  | 20 +++
  1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
index 44c5da8aa9ce..f33963d777eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -132,6 +132,26 @@ amdgpu_devcoredump_read(char *buffer, loff_t offset, 
size_t count,
 drm_printf(, "Faulty page starting at address: 0x%016llx\n", 
fault_info->addr);
 drm_printf(, "Protection fault status register: 0x%x\n\n", 
fault_info->status);

+   /* Dump BIOS */
+   if (coredump->adev->bios && coredump->adev->bios_size) {
+   int i = 0;
+
+   drm_printf(, "BIOS Binary dump\n");
+   drm_printf(, "Valid BIOS  Size:%d bytes type:%s\n",
+  coredump->adev->bios_size,
+  coredump->adev->is_atom_fw ?
+  "Atom bios":"Non Atom Bios");
+
+   while (i < coredump->adev->bios_size) {
+   /* Printing 15 bytes in a line */
+   if (i % 15 == 0)
+   drm_printf(, "\n");
+   drm_printf(, "0x%x \t", coredump->adev->bios[i]);
+   i++;
+   }
+   drm_printf(, "\n");
+   }

I don't think it's too useful to dump this as text.  I was hoping it
could be a binary.  I guess, we can just get this from debugfs if we
need it if a binary is not possible.



Yes, this dumps in text format only, and the binary is already available
in debugfs, so I am discarding the patch.




Alex



+
 /* Add ring buffer information */
 drm_printf(, "Ring buffer information\n");
 for (int i = 0; i < coredump->adev->num_rings; i++) {
--
2.34.1



Re: [PATCH 5/6] drm/msm/dp: Use function arguments for timing configuration

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 17:06, Bjorn Andersson  wrote:
>
> From: Bjorn Andersson 
>
> dp_catalog_panel_timing_cfg() takes 4 arguments, which are passed from
> the calling function through members of struct dp_catalog.
>
> No state is maintained other than across this call, so switch to
> function arguments to clean up the code.
>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_catalog.c | 14 ++
>  drivers/gpu/drm/msm/dp/dp_catalog.h |  7 ++-
>  drivers/gpu/drm/msm/dp/dp_panel.c   | 14 +-
>  3 files changed, 17 insertions(+), 18 deletions(-)
>

Reviewed-by: Dmitry Baryshkov 

-- 
With best wishes
Dmitry


[PATCH] drm: DRM_DEBUG_MODESET_LOCK should depend on DRM

2024-03-26 Thread Geert Uytterhoeven
There is no point in asking the user about enabling DRM debug tracing
when configuring a kernel without DRM support.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/gpu/drm/Kconfig | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 2e1b23ccf30423a9..a24c48acf235449a 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -119,9 +119,7 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
 
 config DRM_DEBUG_MODESET_LOCK
bool "Enable backtrace history for lock contention"
-   depends on STACKTRACE_SUPPORT
-   depends on DEBUG_KERNEL
-   depends on EXPERT
+   depends on DRM && STACKTRACE_SUPPORT && DEBUG_KERNEL && EXPERT
select STACKDEPOT
default y if DEBUG_WW_MUTEX_SLOWPATH
help
-- 
2.34.1



[PATCH] drm/amdgpu: add IP's FW information to devcoredump

2024-03-26 Thread Sunil Khatri
Add FW information for all the IPs to the devcoredump.

Signed-off-by: Sunil Khatri 
---
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  | 122 ++
 1 file changed, 122 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
index 44c5da8aa9ce..d598b6520ec9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -69,6 +69,124 @@ const char *hw_ip_names[MAX_HWIP] = {
[PCIE_HWIP] = "PCIE",
 };
 
+static void amdgpu_devcoredump_fw_info(struct amdgpu_device *adev,
+  struct drm_printer *p)
+{
+   uint32_t version;
+   uint32_t feature;
+   uint8_t smu_program, smu_major, smu_minor, smu_debug;
+
+   drm_printf(p, "VCE feature version: %u, fw version: 0x%08x\n",
+  adev->vce.fb_version, adev->vce.fw_version);
+   drm_printf(p, "UVD feature version: %u, fw version: 0x%08x\n", 0,
+  adev->uvd.fw_version);
+   drm_printf(p, "GMC feature version: %u, fw version: 0x%08x\n", 0,
+  adev->gmc.fw_version);
+   drm_printf(p, "ME feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.me_feature_version, adev->gfx.me_fw_version);
+   drm_printf(p, "PFP feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.pfp_feature_version, adev->gfx.pfp_fw_version);
+   drm_printf(p, "CE feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.ce_feature_version, adev->gfx.ce_fw_version);
+   drm_printf(p, "RLC feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.rlc_feature_version, adev->gfx.rlc_fw_version);
+
+   drm_printf(p, "RLC SRLC feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.rlc_srlc_feature_version,
+  adev->gfx.rlc_srlc_fw_version);
+   drm_printf(p, "RLC SRLG feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.rlc_srlg_feature_version,
+  adev->gfx.rlc_srlg_fw_version);
+   drm_printf(p, "RLC SRLS feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.rlc_srls_feature_version,
+  adev->gfx.rlc_srls_fw_version);
+   drm_printf(p, "RLCP feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.rlcp_ucode_feature_version,
+  adev->gfx.rlcp_ucode_version);
+   drm_printf(p, "RLCV feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.rlcv_ucode_feature_version,
+  adev->gfx.rlcv_ucode_version);
+   drm_printf(p, "MEC feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.mec_feature_version, adev->gfx.mec_fw_version);
+
+   if (adev->gfx.mec2_fw)
+   drm_printf(p, "MEC2 feature version: %u, fw version: 0x%08x\n",
+  adev->gfx.mec2_feature_version,
+  adev->gfx.mec2_fw_version);
+
+   drm_printf(p, "IMU feature version: %u, fw version: 0x%08x\n", 0,
+  adev->gfx.imu_fw_version);
+   drm_printf(p, "PSP SOS feature version: %u, fw version: 0x%08x\n",
+  adev->psp.sos.feature_version, adev->psp.sos.fw_version);
+   drm_printf(p, "PSP ASD feature version: %u, fw version: 0x%08x\n",
+  adev->psp.asd_context.bin_desc.feature_version,
+  adev->psp.asd_context.bin_desc.fw_version);
+
+   drm_printf(p, "TA XGMI feature version: 0x%08x, fw version: 0x%08x\n",
+  adev->psp.xgmi_context.context.bin_desc.feature_version,
+  adev->psp.xgmi_context.context.bin_desc.fw_version);
+   drm_printf(p, "TA RAS feature version: 0x%08x, fw version: 0x%08x\n",
+  adev->psp.ras_context.context.bin_desc.feature_version,
+  adev->psp.ras_context.context.bin_desc.fw_version);
+   drm_printf(p, "TA HDCP feature version: 0x%08x, fw version: 0x%08x\n",
+  adev->psp.hdcp_context.context.bin_desc.feature_version,
+  adev->psp.hdcp_context.context.bin_desc.fw_version);
+   drm_printf(p, "TA DTM feature version: 0x%08x, fw version: 0x%08x\n",
+  adev->psp.dtm_context.context.bin_desc.feature_version,
+  adev->psp.dtm_context.context.bin_desc.fw_version);
+   drm_printf(p, "TA RAP feature version: 0x%08x, fw version: 0x%08x\n",
+  adev->psp.rap_context.context.bin_desc.feature_version,
+  adev->psp.rap_context.context.bin_desc.fw_version);
+   drm_printf(
+   p,
+   "TA SECURE DISPLAY feature version: 0x%08x, fw version: 
0x%08x\n",
+   
adev->psp.securedisplay_context.context.bin_desc.feature_version,
+   adev->psp.securedisplay_context.context.bin_desc.fw_version);
+
+   /* SMC firmware */
+   version = 

[PATCH] drm: DRM_WERROR should depend on DRM

2024-03-26 Thread Geert Uytterhoeven
There is no point in asking the user about enforcing the DRM compiler
warning policy when configuring a kernel without DRM support.

Fixes: f89632a9e5fa6c47 ("drm: Add CONFIG_DRM_WERROR")
Signed-off-by: Geert Uytterhoeven 
---
 drivers/gpu/drm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f2bcf5504aa77679..2e1b23ccf30423a9 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -423,7 +423,7 @@ config DRM_PRIVACY_SCREEN
 
 config DRM_WERROR
bool "Compile the drm subsystem with warnings as errors"
-   depends on EXPERT
+   depends on DRM && EXPERT
default n
help
  A kernel build should not cause any compiler warnings, and this
-- 
2.34.1



Re: [PATCH 4/6] drm/msm/dp: Use function arguments for aux writes

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 17:06, Bjorn Andersson  wrote:
>
> From: Bjorn Andersson 
>
> The dp_aux write operations take the data to be operated on through a
> member of struct dp_catalog, rather than as an argument to the function.
>
> No state is maintained other than across the calling of the functions,
> so replace this member with a function argument.

Definitely yes, thank you!

Reviewed-by: Dmitry Baryshkov 

>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_aux.c | 9 +++--
>  drivers/gpu/drm/msm/dp/dp_catalog.c | 8 
>  drivers/gpu/drm/msm/dp/dp_catalog.h | 5 ++---
>  3 files changed, 9 insertions(+), 13 deletions(-)

-- 
With best wishes
Dmitry


Re: [PATCH 2/6] drm/msm/dp: Removed fixed nvid "support"

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 17:06, Bjorn Andersson  wrote:
>
> From: Bjorn Andersson 
>
> The "desc" member of struct dp_panel is zero-initialized during
> allocation and never assigned, resulting in dp_ctrl_use_fixed_nvid()
> never returning true. This returned boolean value is passed around but
> never acted upon.
>
> Perform constant propagation and remove the traces of "fixed nvid".
>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_catalog.c |  2 +-
>  drivers/gpu/drm/msm/dp/dp_catalog.h |  2 +-
>  drivers/gpu/drm/msm/dp/dp_ctrl.c| 17 +
>  drivers/gpu/drm/msm/dp/dp_panel.h   |  1 -
>  4 files changed, 3 insertions(+), 19 deletions(-)

Reviewed-by: Dmitry Baryshkov 

Kuogee, could you possibly comment on why this was necessary at all?

-- 
With best wishes
Dmitry


Re: [PATCH 1/6] drm/msm/dp: Drop unused dp_debug struct

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 17:06, Bjorn Andersson  wrote:
>
> From: Bjorn Andersson 
>
> The members of struct dp_debug are no longer used, so the only purpose
> of this struct is as a type of the return value of dp_debug_get(), to
> signal success/error.
>
> Drop the struct in favor of signalling the result of initialization
> using an int.
>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_debug.c   | 38 
> ++---
>  drivers/gpu/drm/msm/dp/dp_debug.h   | 38 
> +++--
>  drivers/gpu/drm/msm/dp/dp_display.c | 10 ++
>  3 files changed, 23 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_debug.c 
> b/drivers/gpu/drm/msm/dp/dp_debug.c
> index eca5a02f9003..a631cbe0e599 100644
> --- a/drivers/gpu/drm/msm/dp/dp_debug.c
> +++ b/drivers/gpu/drm/msm/dp/dp_debug.c
> @@ -21,8 +21,6 @@ struct dp_debug_private {
> struct dp_link *link;
> struct dp_panel *panel;
> struct drm_connector *connector;
> -
> -   struct dp_debug dp_debug;
>  };
>
>  static int dp_debug_show(struct seq_file *seq, void *p)
> @@ -199,11 +197,8 @@ static const struct file_operations test_active_fops = {
> .write = dp_test_active_write
>  };
>
> -static void dp_debug_init(struct dp_debug *dp_debug, struct dentry *root, 
> bool is_edp)
> +static void dp_debug_init(struct dp_debug_private *debug, struct dentry 
> *root, bool is_edp)
>  {
> -   struct dp_debug_private *debug = container_of(dp_debug,
> -   struct dp_debug_private, dp_debug);
> -
> debugfs_create_file("dp_debug", 0444, root,
> debug, _debug_fops);
>
> @@ -222,39 +217,26 @@ static void dp_debug_init(struct dp_debug *dp_debug, 
> struct dentry *root, bool i
> }
>  }
>
> -struct dp_debug *dp_debug_get(struct device *dev, struct dp_panel *panel,
> -   struct dp_link *link,
> -   struct drm_connector *connector,
> -   struct dentry *root, bool is_edp)
> +int dp_debug_get(struct device *dev, struct dp_panel *panel,
> +struct dp_link *link,
> +struct drm_connector *connector,
> +struct dentry *root, bool is_edp)
>  {
> struct dp_debug_private *debug;
> -   struct dp_debug *dp_debug;
> -   int rc;
>
> if (!dev || !panel || !link) {
> DRM_ERROR("invalid input\n");
> -   rc = -EINVAL;
> -   goto error;
> +   return -EINVAL;
> }
>
> debug = devm_kzalloc(dev, sizeof(*debug), GFP_KERNEL);
> -   if (!debug) {
> -   rc = -ENOMEM;
> -   goto error;
> -   }
> +   if (!debug)
> +   return -ENOMEM;
>
> -   debug->dp_debug.debug_en = false;
> debug->link = link;
> debug->panel = panel;
>
> -   dp_debug = >dp_debug;
> -   dp_debug->vdisplay = 0;
> -   dp_debug->hdisplay = 0;
> -   dp_debug->vrefresh = 0;
> -
> -   dp_debug_init(dp_debug, root, is_edp);
> +   dp_debug_init(debug, root, is_edp);
>
> -   return dp_debug;
> - error:
> -   return ERR_PTR(rc);
> +   return 0;

Since there is nothing more to get, could you please move the
devm_kzalloc to dp_debug_init and call it directly from dp_display.c?

>  }
> diff --git a/drivers/gpu/drm/msm/dp/dp_debug.h 
> b/drivers/gpu/drm/msm/dp/dp_debug.h
> index 9b3b2e702f65..c57200751c9f 100644
> --- a/drivers/gpu/drm/msm/dp/dp_debug.h
> +++ b/drivers/gpu/drm/msm/dp/dp_debug.h
> @@ -9,22 +9,6 @@
>  #include "dp_panel.h"
>  #include "dp_link.h"
>
> -/**
> - * struct dp_debug
> - * @debug_en: specifies whether debug mode enabled
> - * @vdisplay: used to filter out vdisplay value
> - * @hdisplay: used to filter out hdisplay value
> - * @vrefresh: used to filter out vrefresh value
> - * @tpg_state: specifies whether tpg feature is enabled
> - */
> -struct dp_debug {
> -   bool debug_en;
> -   int aspect_ratio;
> -   int vdisplay;
> -   int hdisplay;
> -   int vrefresh;
> -};
> -
>  #if defined(CONFIG_DEBUG_FS)
>
>  /**
> @@ -41,22 +25,22 @@ struct dp_debug {
>   * This function sets up the debug module and provides a way
>   * for debugfs input to be communicated with existing modules
>   */
> -struct dp_debug *dp_debug_get(struct device *dev, struct dp_panel *panel,
> -   struct dp_link *link,
> -   struct drm_connector *connector,
> -   struct dentry *root,
> -   bool is_edp);
> +int dp_debug_get(struct device *dev, struct dp_panel *panel,
> +struct dp_link *link,
> +struct drm_connector *connector,
> +struct dentry *root,
> +bool is_edp);
>
>  #else
>
>  static inline
> -struct dp_debug *dp_debug_get(struct device *dev, struct dp_panel *panel,
> -   struct dp_link *link,
> -   struct drm_connector *connector,
> -   

[PATCH] dt-bindings: display: rockchip: add missing #sound-dai-cells to dw-hdmi

2024-03-26 Thread Heiko Stuebner
The #sound-dai-cells DT property is required to describe the link between
the HDMI IP block and the SoC's audio subsystem.

Signed-off-by: Heiko Stuebner 
---
 .../devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml | 3 +++
 1 file changed, 3 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml 
b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
index af638b6c0d21..3768df80ca7a 100644
--- a/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
+++ b/Documentation/devicetree/bindings/display/rockchip/rockchip,dw-hdmi.yaml
@@ -124,6 +124,9 @@ properties:
 description:
   phandle to the GRF to mux vopl/vopb.
 
+  "#sound-dai-cells":
+const: 0
+
 required:
   - compatible
   - reg
-- 
2.39.2



Re: [PATCH 3/6] drm/msm/dp: Remove unused defines and members

2024-03-26 Thread Dmitry Baryshkov
On Tue, 26 Mar 2024 at 17:06, Bjorn Andersson  wrote:
>
> From: Bjorn Andersson 
>
> Throughout the Qualcomm DisplayPort driver a number of defines and
> struct members have become unused, but linger in the code. Remove these.
>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_audio.c   |  5 -
>  drivers/gpu/drm/msm/dp/dp_catalog.c |  1 -
>  drivers/gpu/drm/msm/dp/dp_catalog.h | 17 -
>  drivers/gpu/drm/msm/dp/dp_ctrl.h|  1 -
>  drivers/gpu/drm/msm/dp/dp_display.c |  5 -
>  drivers/gpu/drm/msm/dp/dp_display.h |  3 ---
>  drivers/gpu/drm/msm/dp/dp_drm.c |  2 --
>  drivers/gpu/drm/msm/dp/dp_link.c|  4 
>  drivers/gpu/drm/msm/dp/dp_link.h|  1 -
>  drivers/gpu/drm/msm/dp/dp_panel.h   |  2 --
>  10 files changed, 41 deletions(-)
>

I'd have preferred to have this split into somewhat logical chunks,
but I think it doesn't make sense for such cleanup.

Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry


[PULL] drm-xe-fixes

2024-03-26 Thread Lucas De Marchi

Hi Dave and Sima,

Please pull the drm-xe-fixes for this week targeting v6.9-rc2. 


drm-xe-fixes-2024-03-26:
- Fix build on mips
- Fix wrong bound checks
- Fix use of msec rather than jiffies
- Remove dead code

The following changes since commit 4cece764965020c22cff7665b18a012006359095:

  Linux 6.9-rc1 (2024-03-24 14:10:05 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-03-26

for you to fetch changes up to 0d8cf0c924732a045273c6aca6900a340ac88529:

  drm/xe: Fix END redefinition (2024-03-25 13:47:48 -0500)


- Fix build on mips
- Fix wrong bound checks
- Fix use of msec rather than jiffies
- Remove dead code


Lucas De Marchi (1):
  drm/xe: Fix END redefinition

Matthew Auld (5):
  drm/xe/guc_submit: use jiffies for job timeout
  drm/xe/queue: fix engine_class bounds check
  drm/xe/device: fix XE_MAX_GT_PER_TILE check
  drm/xe/device: fix XE_MAX_TILES_PER_DEVICE check
  drm/xe/query: fix gt_id bounds check

Nirmoy Das (1):
  drm/xe: Remove unused xe_bo->props struct

 drivers/gpu/drm/xe/xe_bo.c | 59 ++
 drivers/gpu/drm/xe/xe_bo_types.h   | 19 
 drivers/gpu/drm/xe/xe_device.h |  4 +--
 drivers/gpu/drm/xe/xe_exec_queue.c |  2 +-
 drivers/gpu/drm/xe/xe_guc_submit.c |  2 +-
 drivers/gpu/drm/xe/xe_lrc.c| 20 ++---
 drivers/gpu/drm/xe/xe_query.c  |  2 +-
 7 files changed, 23 insertions(+), 85 deletions(-)


Re: In kernel virtual HID devices (was Future handling of complex RGB devices on Linux v3)

2024-03-26 Thread Werner Sembach

Hi all,

Am 26.03.24 um 16:39 schrieb Benjamin Tissoires:

On Mar 26 2024, Werner Sembach wrote:

Hi all,

Am 25.03.24 um 19:30 schrieb Hans de Goede:

[snip]

If the kernel already handles the custom protocol into generic HID, the
work for userspace is not too hard because they can deal with a known
protocol and can be cross-platform in their implementation.

I'm mentioning that cross-platform because SDL used to rely on the
input, LEDs, and other Linux peculiarities and eventually fell back on
using hidraw only because it's way easier that way.

The other advantage of LampArray is that according to Microsoft's
document, new devices are going to support it out of the box, so they'll
be supported out of the box directly.

Most of the time my stance is "do not add new kernel API, you'll regret
it later". So in that case, given that we have a formally approved
standard, I would suggest to use it, and consider it your API.

The only new UAPI would be the use_leds_uapi switch to turn on/off the 
backwards compatibility.

I have my reservations about such a kill switch (see below).


Actually we don't even need that. Typically there is a single HID
driver handling both keys and the backlight, so userspace cannot
just unbind the HID driver since then the keys stop working.

I don't think Werner meant unbinding the HID driver, just a toggle to
enable/disable the basic HID core processing of LampArray.


But with a virtual LampArray HID device the only functionality
for an in kernel HID driver would be to export a basic keyboard
backlight control interface for simple non per key backlight control
to integrate nicely with e.g. GNOME's backlight control.

Don't forget that in the future there will be devices that natively support
LampArray in their firmware, so for them it is the same device.

Yeah, the generic LampArray support will not be able to differentiate
"emulated" devices from native ones.


Regards,

Werner


And then when OpenRGB wants to take over it can just unbind the HID
driver from the HID device using existing mechanisms for that.

Again no, it'll be too unpredictable.


Hmm, I wonder if that will not also kill hidraw support though ...
I guess getting hidraw support back might require then also manually
binding the default HID input driver.  Bentiss any input on this?

To be able to talk over hidraw you need a driver to be bound, yes. But I
had the impression that LampArray would be supported by default in
hid-input.c, thus making this hard to remove. Having a separate driver
will work, but as soon as the LampArray device will also export a
multitouch touchpad, we are screwed and will have to make a choice
between LampArray and touch...


Background info: as discussed earlier in the thread Werner would like
to have a basic driver registering a /sys/class/leds/foo::kbd_backlight/
device, since those are automatically supported by GNOME (and others)
and will give basic kbd backlight brightness control in the desktop
environment. This could be a simple HID driver for
the hid_allocate_device()-ed virtual HID device, but userspace needs
to be able to move that out of the way when it wants to take over
full control of the per key lighting.

Do we really need to entirely unregister the led class device? Can't we
snoop on the commands and get some "mean value"?
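
A minimal sketch of what reducing snooped per-lamp colours to one "mean
value" could look like, assuming the kernel sees an array of RGB values per
update. The struct lamp_rgb type and the lamps_mean_brightness() helper are
hypothetical names for illustration, not existing kernel or LampArray API,
and averaging the brightest channel is only one possible choice:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-lamp colour as it might be snooped from userspace. */
struct lamp_rgb {
	uint8_t r, g, b;
};

/*
 * Reduce per-lamp colours to one 0..255 brightness value by averaging
 * the brightest channel of every lamp. Returns 0 for an empty array.
 */
static unsigned int lamps_mean_brightness(const struct lamp_rgb *lamps,
					  size_t n)
{
	unsigned long sum = 0;

	assert(lamps || n == 0);
	if (n == 0)
		return 0;

	for (size_t i = 0; i < n; i++) {
		uint8_t m = lamps[i].r;

		if (lamps[i].g > m)
			m = lamps[i].g;
		if (lamps[i].b > m)
			m = lamps[i].b;
		sum += m;
	}
	return (unsigned int)(sum / n);
}
```

A value computed this way could back the brightness attribute of the led
class device while userspace keeps full per-key control.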


Regards,

Hans








The control flow for the whole system would look something like this:

- System boots

      - Kernel driver initializes keyboard (maybe stops rainbowpuke boot
        effects, sets brightness to a default value, or initializes a solid color)

      - systemd-backlight restores last keyboard backlight brightness

      - UPower sees sysfs leds entry and exposes it to DBus for DEs to do
        keyboard brightness handling

- If the user wants more control they (auto-)start OpenRGB

      - OpenRGB disables sysfs leds entry via use_leds_uapi to prevent double
        control of the same device by UPower

      - OpenRGB directly interacts with hidraw device via LampArray API to give
        fine-grained control of the backlight

      - When OpenRGB closes it should reenable the sysfs leds entry

That's where your plan falls short: if OpenRGB crashes, or is killed it
will not reset that bit.

Next question: is OpenRGB supposed to keep the hidraw node opened all
the time or not?

TBH I haven't looked at the OpenRGB code yet, and LampArray support there is
currently only planned. I somewhat hope that by the time the kernel driver is
ready someone else will already have picked up implementing LampArray in OpenRGB.


If it has to keep it open, we should be able to come up with a somewhat
similar hack that we have with hid-steam: when the hidraw node is
opened, we disable the kernel processing of LampArray. When the node is
closed, we re-enable it.
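
A minimal user-space sketch of the open-count idea described above, assuming
kernel-side handling is suspended while at least one raw handle is open. The
struct lamp_dev type and the lamp_raw_open()/lamp_raw_close() helpers are
hypothetical and not taken from hid-steam or any other driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state; not taken from any existing driver. */
struct lamp_dev {
	int raw_open_count;	/* number of open raw (hidraw-like) handles */
	bool kernel_handling;	/* true while the kernel drives the backlight */
};

static void lamp_dev_init(struct lamp_dev *d)
{
	assert(d);
	d->raw_open_count = 0;
	d->kernel_handling = true;
}

/* Called when userspace opens the raw node: the kernel steps aside. */
static void lamp_raw_open(struct lamp_dev *d)
{
	assert(d);
	if (d->raw_open_count++ == 0)
		d->kernel_handling = false;
}

/*
 * Called when a raw handle is closed, including the implicit close the
 * OS performs when the owning process exits or crashes.
 */
static void lamp_raw_close(struct lamp_dev *d)
{
	assert(d && d->raw_open_count > 0);
	if (--d->raw_open_count == 0)
		d->kernel_handling = true;
}
```

Because the close path also runs when a process dies, this would cover the
crash case without a separate reset step. It does not by itself tell
different hidraw users apart.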

But that also means we have to distinguish steam/SDL from OpenRGB...


My first thought here as well: what if something else is reading hidraw devices?

Especially for hidraw devices that are not just LampArray.



I just carefully read 

Re: [PATCH] drm/amdgpu: add support of bios dump in devcoredump

2024-03-26 Thread Alex Deucher
On Tue, Mar 26, 2024 at 10:38 AM Sunil Khatri  wrote:
>
> dump the bios binary in the devcoredump.
>
> Signed-off-by: Sunil Khatri 
> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  | 20 +++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
> index 44c5da8aa9ce..f33963d777eb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
> @@ -132,6 +132,26 @@ amdgpu_devcoredump_read(char *buffer, loff_t offset, 
> size_t count,
> drm_printf(, "Faulty page starting at address: 0x%016llx\n", 
> fault_info->addr);
> drm_printf(, "Protection fault status register: 0x%x\n\n", 
> fault_info->status);
>
> +   /* Dump BIOS */
> +   if (coredump->adev->bios && coredump->adev->bios_size) {
> +   int i = 0;
> +
> +   drm_printf(, "BIOS Binary dump\n");
> +   drm_printf(, "Valid BIOS  Size:%d bytes type:%s\n",
> +  coredump->adev->bios_size,
> +  coredump->adev->is_atom_fw ?
> +  "Atom bios":"Non Atom Bios");
> +
> +   while (i < coredump->adev->bios_size) {
> +   /* Printing 15 bytes in a line */
> +   if (i % 15 == 0)
> +   drm_printf(, "\n");
> +   drm_printf(, "0x%x \t", coredump->adev->bios[i]);
> +   i++;
> +   }
> +   drm_printf(, "\n");
> +   }

I don't think it's too useful to dump this as text.  I was hoping it
could be a binary.  I guess, we can just get this from debugfs if we
need it if a binary is not possible.
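
For reference, a minimal user-space sketch of a text hex dump like the one
the patch produces, written into a caller-supplied buffer. The helper name
hexdump_to_buf() and the 16-bytes-per-row layout are assumptions for
illustration; they are not part of the patch or of the amdgpu driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format len bytes from data as hex text, 16 bytes
 * per row, into out (capacity cap). Returns the number of characters
 * written (excluding the trailing NUL), or -1 if out is too small.
 */
static int hexdump_to_buf(const unsigned char *data, size_t len,
			  char *out, size_t cap)
{
	size_t pos = 0;

	assert(data && out);
	if (cap == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < len; i++) {
		/* Newline at the end of a row or of the data, space otherwise. */
		const char *sep = (i % 16 == 15 || i + 1 == len) ? "\n" : " ";
		int n = snprintf(out + pos, cap - pos, "%02x%s", data[i], sep);

		if (n < 0 || (size_t)n >= cap - pos)
			return -1;
		pos += (size_t)n;
	}
	return (int)pos;
}
```

Each byte costs three characters of text, which illustrates why a text dump
grows to roughly three times the size of the binary image.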

Alex


> +
> /* Add ring buffer information */
> drm_printf(, "Ring buffer information\n");
> for (int i = 0; i < coredump->adev->num_rings; i++) {
> --
> 2.34.1
>


[PATCH 4/4] drm/bridge: hotplug-bridge: add driver to support hot-pluggable DSI bridges

2024-03-26 Thread Luca Ceresoli
This driver implements the point of a DRM pipeline where a connector allows
removal of all the following bridges up to the panel.

The DRM subsystem currently allows hotplug of the monitor but not preceding
components. However there are embedded devices where the "tail" of the DRM
pipeline, including one or more bridges, can be physically removed:

 ..
 |   DISPLAY CONTROLLER   |
 | .-.   .--. |
 | | ENCODER |<--| CRTC | |
 | '-'   '--' |
 '--|-'
|
|   HOTPLUG
V  CONNECTOR
   .-..--..-..-. .---.
   | 0 to N  || _|   _| || 1 to N  | |   |
   | BRIDGES |--DSI-->||_   |_  |--DSI-->| BRIDGES |--LVDS-->| PANEL |
   | ||  || || | |   |
   '-''--''-''-' '---'

 [--- fixed components --]  [--- removable add-on ---]

This driver supports such devices, where the final segment of a MIPI DSI
bus, including one or more bridges, can be physically disconnected and
reconnected at runtime, possibly with a different model.

This implementation supports a MIPI DSI bus only, but it is designed to be
as generic as possible and extensible to other buses that have no
native hotplug and model ID discovery.

This driver does not provide facilities to add and remove the hot-pluggable
components from the kernel: this needs to be done by other means
(e.g. device tree overlay runtime insertion and removal). The
hotplug-bridge gets notified of hot-plugging by the DRM bridge notifier
callbacks after they get added or before they get removed.
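
The notification flow described above can be modelled roughly as follows.
This is a simplified user-space sketch, not the driver code: the event
names, struct hotplug_state and hotplug_notify() are hypothetical and do not
correspond to real DRM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types; the names are illustrative only. */
enum bridge_event {
	BRIDGE_ADDED,		/* delivered after the next bridge is added */
	BRIDGE_REMOVING,	/* delivered before the next bridge is removed */
};

struct hotplug_state {
	const void *next_bridge;	/* opaque handle, NULL when unplugged */
	bool connected;			/* status reported to pipeline users */
};

/* Track the removable part and derive the connected/disconnected status. */
static void hotplug_notify(struct hotplug_state *s, enum bridge_event ev,
			   const void *bridge)
{
	assert(s && bridge);

	switch (ev) {
	case BRIDGE_ADDED:
		/* Only attach if nothing is attached yet. */
		if (!s->next_bridge) {
			s->next_bridge = bridge;
			s->connected = true;
		}
		break;
	case BRIDGE_REMOVING:
		/* Ignore removal of bridges we are not attached to. */
		if (s->next_bridge == bridge) {
			s->connected = false;
			s->next_bridge = NULL;
		}
		break;
	}
}
```

In this model the status flips only for the bridge that is actually
attached, so events about unrelated bridges leave it untouched.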

The hotplug-bridge role is to implement the "hot-pluggable connector" in
the bridge chain. In this position, what the hotplug-bridge should ideally
do is:

 * communicate with the previous component (bridge or encoder) so that it
   believes it always has a connected bridge following it and the DRM card
   is always present
 * be notified of the addition and removal of the following bridge and
   attach/detach to/from it
 * communicate with the following bridge so that it will attach and detach
   using the normal procedure (as if the entire pipeline were being created
   or destroyed, not only the tail)
 * expose the "add-on connected/disconnected" status via the DRM connector
   connected/disconnected status, so that users of the DRM pipeline know
   when they can render output on the display

However some aspects make it a bit more complex than that. Most notably:

 * the next bridge can be probed and removed at any moment and all probing
   sequences need to be handled
 * the DSI host/device registration process, which adds to the DRM bridge
   attach process, makes the initial card registration tricky
 * the need to register and deregister the following bridges at runtime
   without tearing down the whole DRM card prevents using the functions
   that are normally recommended
 * the automatic mechanism to call the appropriate .get_modes operation
   (typically provided by the panel bridge) cannot work as the panel can
   disappear and reappear as a different model, so an ad-hoc lookup is
   needed

The code handling these and other tricky aspects is accurately documented
by comments in the code.

Co-developed-by: Paul Kocialkowski 
Signed-off-by: Paul Kocialkowski 
Signed-off-by: Luca Ceresoli 
---
 MAINTAINERS |   1 +
 drivers/gpu/drm/bridge/Kconfig  |  15 +
 drivers/gpu/drm/bridge/Makefile |   1 +
 drivers/gpu/drm/bridge/hotplug-bridge.c | 561 
 4 files changed, 578 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e1affd13e30b..b3fe36ed35a0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6720,6 +6720,7 @@ DRM DRIVER FOR HOTPLUG VIDEO CONNECTOR BRIDGE
 M: Luca Ceresoli 
 S: Maintained
 F: 
Documentation/devicetree/bindings/display/bridge/hotplug-video-connector-dsi.yaml
+F: drivers/gpu/drm/bridge/hotplug-bridge.c
 
 DRM DRIVER FOR HX8357D PANELS
 S: Orphan
diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index efd996f6c138..409d090ee94d 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -90,6 +90,21 @@ config DRM_FSL_LDB
help
  Support for i.MX8MP DPI-to-LVDS on-SoC encoder.
 
+config DRM_HOTPLUG_BRIDGE
+   tristate "Hotplug DRM bridge support"
+   depends on OF
+   select DRM_PANEL_BRIDGE
+   select DRM_MIPI_DSI
+   select DRM_KMS_HELPER
+   help
+ Driver for a DRM bridge representing a physical connector that
+ splits a DRM pipeline into a fixed part and a physically
+ removable part. The fixed part includes up to the encoder and
+ zero or more bridges. The removable part includes any following
+ bridges up to the connector and panel and can be 

[PATCH 3/4] drm/encoder: add drm_encoder_cleanup_from()

2024-03-26 Thread Luca Ceresoli
Supporting hardware whose final part of the DRM pipeline can be physically
removed requires the ability to detach all bridges from a given point to
the end of the pipeline.

Introduce a variant of drm_encoder_cleanup() for this.
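
As an illustration of the idea only, here is a self-contained user-space
sketch of pruning a circular list from a given node to the end. The struct
node and struct chain types and the chain_*() helpers are hypothetical
stand-ins and are not the kernel's list.h or DRM API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bridge in an encoder's chain. */
struct node {
	struct node *prev;
	struct node *next;
	int attached;
};

/* Circular list head, in the style of the kernel's struct list_head. */
struct chain {
	struct node head;
};

static void chain_init(struct chain *c)
{
	c->head.prev = &c->head;
	c->head.next = &c->head;
	c->head.attached = 0;
}

static void chain_append(struct chain *c, struct node *n)
{
	n->prev = c->head.prev;
	n->next = &c->head;
	c->head.prev->next = n;
	c->head.prev = n;
	n->attached = 1;
}

/* Unlink one node; the rest of the chain stays intact. */
static void node_detach(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
	n->attached = 0;
}

/*
 * Detach from and every node after it. The successor is saved before
 * each detach because detaching clears the node's own links.
 * Returns the number of nodes detached.
 */
static int chain_cleanup_from(struct chain *c, struct node *from)
{
	struct node *next;
	int count = 0;

	assert(c && from);
	for (struct node *n = from; n != &c->head; n = next) {
		next = n->next;
		node_detach(n);
		count++;
	}
	return count;
}

static int chain_length(const struct chain *c)
{
	int len = 0;

	for (const struct node *n = c->head.next; n != &c->head; n = n->next)
		len++;
	return len;
}
```

Nodes before the starting point keep their links, which matches the intent
of leaving the fixed part of the pipeline in place.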

Signed-off-by: Luca Ceresoli 
---
 drivers/gpu/drm/drm_encoder.c | 21 +
 include/drm/drm_encoder.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/drm_encoder.c b/drivers/gpu/drm/drm_encoder.c
index 8f2bc6a28482..13149447bec8 100644
--- a/drivers/gpu/drm/drm_encoder.c
+++ b/drivers/gpu/drm/drm_encoder.c
@@ -207,6 +207,27 @@ void drm_encoder_cleanup(struct drm_encoder *encoder)
 }
 EXPORT_SYMBOL(drm_encoder_cleanup);
 
+/**
+ * drm_encoder_cleanup_from - remove a given bridge and all the following
+ * @encoder: encoder whose list of bridges shall be pruned
+ * @bridge: first bridge to remove
+ *
+ * Removes from an encoder all the bridges starting with a given bridge
+ * and until the end of the chain.
+ *
+ * This should not be used in "normal" DRM pipelines. It is only useful for
+ * devices whose final part of the DRM chain can be physically removed and
+ * later reconnected (possibly with different hardware).
+ */
+void drm_encoder_cleanup_from(struct drm_encoder *encoder, struct drm_bridge 
*bridge)
+{
+   struct drm_bridge *next;
+
+   list_for_each_entry_safe_from(bridge, next, >bridge_chain, 
chain_node)
+   drm_bridge_detach(bridge);
+}
+EXPORT_SYMBOL(drm_encoder_cleanup_from);
+
 static void drmm_encoder_alloc_release(struct drm_device *dev, void *ptr)
 {
struct drm_encoder *encoder = ptr;
diff --git a/include/drm/drm_encoder.h b/include/drm/drm_encoder.h
index 977a9381c8ba..bafcabb24267 100644
--- a/include/drm/drm_encoder.h
+++ b/include/drm/drm_encoder.h
@@ -320,6 +320,7 @@ static inline struct drm_encoder *drm_encoder_find(struct 
drm_device *dev,
 }
 
 void drm_encoder_cleanup(struct drm_encoder *encoder);
+void drm_encoder_cleanup_from(struct drm_encoder *encoder, struct drm_bridge 
*bridge);
 
 /**
  * drm_for_each_encoder_mask - iterate over encoders specified by bitmask

-- 
2.34.1



[PATCH 1/4] dt-bindings: display: bridge: add the Hot-plug MIPI DSI connector

2024-03-26 Thread Luca Ceresoli
Add bindings for a physical, hot-pluggable connector allowing the far end
of a MIPI DSI bus to be connected and disconnected at runtime.

Signed-off-by: Luca Ceresoli 
---
 .../bridge/hotplug-video-connector-dsi.yaml| 87 ++
 MAINTAINERS|  5 ++
 2 files changed, 92 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/display/bridge/hotplug-video-connector-dsi.yaml
 
b/Documentation/devicetree/bindings/display/bridge/hotplug-video-connector-dsi.yaml
new file mode 100644
index ..05beb8aa9ab4
--- /dev/null
+++ 
b/Documentation/devicetree/bindings/display/bridge/hotplug-video-connector-dsi.yaml
@@ -0,0 +1,87 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: 
http://devicetree.org/schemas/display/bridge/hotplug-video-connector-dsi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Hot-pluggable connector on a MIPI DSI bus
+
+maintainers:
+  - Luca Ceresoli 
+
+description:
+  A bridge representing a physical, hot-pluggable connector on a MIPI DSI
+  video bus. The connector splits the video pipeline in a fixed part and a
+  removable part.
+
+  The fixed part of the video pipeline includes all components up to the
+  display controller and 0 or more bridges. The removable part includes one
+  or more bridges and any other components up to the panel.
+
+  The removable part of the pipeline can be physically disconnected at any
+  moment, making all of its components not usable anymore. The same or a
+  different removable part of the pipeline can be reconnected later on.
+
+  Note that the hotplug-video-connector does not describe video busses
+  having native hotplug capabilities in the hardware, such as HDMI.
+
+properties:
+  compatible:
+    const: hotplug-video-connector-dsi
+
+  ports:
+    $ref: /schemas/graph.yaml#/properties/ports
+
+    properties:
+      port@0:
+        $ref: /schemas/graph.yaml#/properties/port
+        description:
+          The end of the fixed part of the MIPI DSI bus (terminating at the
+          hotplug connector). The remote-endpoint sub-node must point to
+          the previous component of the video pipeline.
+
+      port@1:
+        $ref: /schemas/graph.yaml#/properties/port
+        description:
+          The start of the removable part of the MIPI DSI bus (starting
+          from the hotplug connector). The remote-endpoint sub-node must
+          point to the next component of the video pipeline.
+
+    required:
+      - port@0
+      - port@1
+
+required:
+  - compatible
+  - ports
+
+additionalProperties: false
+
+examples:
+  - |
+    hotplug-video-connector {
+        compatible = "hotplug-video-connector-dsi";
+
+        ports {
+            #address-cells = <1>;
+            #size-cells = <0>;
+
+            port@0 {
+                reg = <0>;
+
+                hotplug_connector_in: endpoint {
+                    remote-endpoint = <_bridge_out>;
+                };
+            };
+
+            port@1 {
+                reg = <1>;
+
+                hotplug_connector_out: endpoint {
+                    remote-endpoint = <_bridge_in>;
+                };
+            };
+        };
+    };
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index aa3b947fb080..e1affd13e30b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6716,6 +6716,11 @@ T:   git git://anongit.freedesktop.org/drm/drm-misc
 F: Documentation/devicetree/bindings/display/panel/himax,hx8394.yaml
 F: drivers/gpu/drm/panel/panel-himax-hx8394.c
 
+DRM DRIVER FOR HOTPLUG VIDEO CONNECTOR BRIDGE
+M: Luca Ceresoli 
+S: Maintained
+F: Documentation/devicetree/bindings/display/bridge/hotplug-video-connector-dsi.yaml
+
 DRM DRIVER FOR HX8357D PANELS
 S: Orphan
 T: git git://anongit.freedesktop.org/drm/drm-misc

-- 
2.34.1



[PATCH 2/4] drm/bridge: add bridge notifier to be notified of bridge addition and removal

2024-03-26 Thread Luca Ceresoli
From: Paul Kocialkowski 

In preparation for allowing bridges to be added to and removed from a DRM
card without destroying the whole card, add a DRM bridge notifier. Notified
events are addition and removal to/from the global bridge list.

Co-developed-by: Luca Ceresoli 
Signed-off-by: Luca Ceresoli 
Signed-off-by: Paul Kocialkowski 
---
 drivers/gpu/drm/drm_bridge.c | 35 +++
 include/drm/drm_bridge.h | 19 +++
 2 files changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index 521a71c61b16..245f7fa4ea22 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include <linux/notifier.h>
 
 #include 
 #include 
@@ -197,6 +198,36 @@
 
 static DEFINE_MUTEX(bridge_lock);
 static LIST_HEAD(bridge_list);
+static BLOCKING_NOTIFIER_HEAD(bridge_notifier);
+
+/**
+ * drm_bridge_notifier_register - add a DRM bridge notifier
+ * @nb: the notifier block to be registered
+ *
+ * The notifier block will be notified of events defined in
+ * &drm_bridge_notifier_event
+ */
+int drm_bridge_notifier_register(struct notifier_block *nb)
+{
+   return blocking_notifier_chain_register(&bridge_notifier, nb);
+}
+EXPORT_SYMBOL(drm_bridge_notifier_register);
+
+/**
+ * drm_bridge_notifier_unregister - remove a DRM bridge notifier
+ * @nb: the notifier block to be unregistered
+ */
+int drm_bridge_notifier_unregister(struct notifier_block *nb)
+{
+   return blocking_notifier_chain_unregister(&bridge_notifier, nb);
+}
+EXPORT_SYMBOL(drm_bridge_notifier_unregister);
+
+static void drm_bridge_notifier_notify(unsigned long event,
+                                       struct drm_bridge *bridge)
+{
+   blocking_notifier_call_chain(&bridge_notifier, event, bridge);
+}
 
 /**
  * drm_bridge_add - add the given bridge to the global bridge list
@@ -210,6 +241,8 @@ void drm_bridge_add(struct drm_bridge *bridge)
    mutex_lock(&bridge_lock);
    list_add_tail(&bridge->list, &bridge_list);
    mutex_unlock(&bridge_lock);
+
+   drm_bridge_notifier_notify(DRM_BRIDGE_NOTIFY_ADD, bridge);
 }
 EXPORT_SYMBOL(drm_bridge_add);
 
@@ -243,6 +276,8 @@ EXPORT_SYMBOL(devm_drm_bridge_add);
  */
 void drm_bridge_remove(struct drm_bridge *bridge)
 {
+   drm_bridge_notifier_notify(DRM_BRIDGE_NOTIFY_REMOVE, bridge);
+
    mutex_lock(&bridge_lock);
    list_del_init(&bridge->list);
    mutex_unlock(&bridge_lock);
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index 4baca0d9107b..ee48c1eb76ae 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -43,6 +43,22 @@ struct drm_panel;
 struct edid;
 struct i2c_adapter;
 
+/**
+ * enum drm_bridge_notifier_event - DRM bridge events
+ */
+enum drm_bridge_notifier_event {
+   /**
+* @DRM_BRIDGE_NOTIFY_ADD: A bridge has just been added to the
+* global bridge list. See drm_bridge_add().
+*/
+   DRM_BRIDGE_NOTIFY_ADD,
+   /**
+* @DRM_BRIDGE_NOTIFY_REMOVE: A bridge is about to be removed from
+* the global bridge list. See drm_bridge_remove().
+*/
+   DRM_BRIDGE_NOTIFY_REMOVE,
+};
+
 /**
  * enum drm_bridge_attach_flags - Flags for &drm_bridge_funcs.attach
  */
@@ -781,6 +797,9 @@ drm_priv_to_bridge(struct drm_private_obj *priv)
    return container_of(priv, struct drm_bridge, base);
 }
 
+int drm_bridge_notifier_register(struct notifier_block *nb);
+int drm_bridge_notifier_unregister(struct notifier_block *nb);
+
 void drm_bridge_add(struct drm_bridge *bridge);
 int devm_drm_bridge_add(struct device *dev, struct drm_bridge *bridge);
 void drm_bridge_remove(struct drm_bridge *bridge);

-- 
2.34.1
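
The ordering this patch establishes (ADD is sent after the bridge is on the global list, REMOVE is sent before it is taken off) can be sketched in userspace C. This is a hedged illustration only: struct notifier, notifier_register(), notifier_unregister(), bridge_add(), bridge_remove() and struct listener are hypothetical names modelling the kernel's blocking notifier chain and the drm_bridge_add()/drm_bridge_remove() paths, and the "global list" is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Events mirroring the two values added by the patch (names are illustrative). */
enum bridge_event { BRIDGE_NOTIFY_ADD, BRIDGE_NOTIFY_REMOVE };

/* Hypothetical minimal notifier block: a callback plus a list link. */
struct notifier {
	int (*call)(struct notifier *nb, unsigned long event, void *data);
	struct notifier *next;
};

static struct notifier *notifier_head;	/* registered listeners */
static int bridges_listed;		/* stands in for the global bridge list */

static void notifier_register(struct notifier *nb)
{
	nb->next = notifier_head;
	notifier_head = nb;
}

static void notifier_unregister(struct notifier *nb)
{
	for (struct notifier **pp = &notifier_head; *pp; pp = &(*pp)->next) {
		if (*pp == nb) {
			*pp = nb->next;
			nb->next = NULL;
			return;
		}
	}
}

static void notify(unsigned long event, void *bridge)
{
	for (struct notifier *nb = notifier_head; nb; nb = nb->next)
		nb->call(nb, event, bridge);
}

/* ADD is notified after the bridge has been put on the list. */
static void bridge_add(void *bridge)
{
	bridges_listed++;
	notify(BRIDGE_NOTIFY_ADD, bridge);
}

/* REMOVE is notified while the bridge is still on the list. */
static void bridge_remove(void *bridge)
{
	notify(BRIDGE_NOTIFY_REMOVE, bridge);
	bridges_listed--;
}

/* Example listener that records what it observed. */
struct listener {
	struct notifier nb;		/* must stay the first member */
	int adds, removes;
	int listed_at_last_event;
};

static int listener_call(struct notifier *nb, unsigned long event, void *data)
{
	struct listener *l = (struct listener *)nb;

	(void)data;			/* the bridge pointer is unused here */
	if (event == BRIDGE_NOTIFY_ADD)
		l->adds++;
	else if (event == BRIDGE_NOTIFY_REMOVE)
		l->removes++;
	l->listed_at_last_event = bridges_listed;
	return 0;
}
```

In both callbacks the listener sees the bridge still present on the list, so a consumer such as a hotplug connector driver could look the bridge up while handling either event.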



[PATCH 0/4] drm: add support for hot-pluggable bridges

2024-03-26 Thread Luca Ceresoli
Hello,

DRM natively supports pipelines whose display can be removed, but all the
components preceding it (the display controller and any bridges) are
assumed to be fixed and cannot be plugged, removed or modified at runtime.

This series adds support for DRM pipelines having a removable part after
the encoder, thus also allowing bridges to be removed and reconnected at
runtime, possibly with different components.

In the overall ongoing work, this is going to be handled via device tree
overlay insertion and removal. For many kernel driver frameworks, adding
and removing devices via device tree overlays works already (albeit with
some issues related to overlays in general), but this does not happen for
DRM, so this series aims at filling this gap.

This series only covers the DRM aspects and not the overlay ones. See
"Development roadmap" below for more details.

Use case
========

The use case we are working on is to support professional products that
have a portable "main" part running on battery, with the main SoC and able
to work autonomously with limited features, and that can be connected to an
"add-on" part that is not portable and adds more features.

The add-on can be connected and disconnected at runtime at any moment by
the end user, and add-on features need to be enabled and disabled
automatically at runtime. The features provided by the add-on include a
display and a battery charger to recharge the battery of the main part. The
display on the add-on has an LVDS input but the connector between the base
and the add-on has a MIPI DSI bus, so a DSI-to-LVDS bridge is present on
the add-on.

Targeted abstraction level
--------------------------

This series aims at supporting both the use case described above and any
similar use cases, e.g. using different video busses, up to a given level
of generalization.

This picture summarizes the DRM aspects of such devices:

 .------------------------.
 |   DISPLAY CONTROLLER   |
 | .---------.   .------. |
 | | ENCODER |<--| CRTC | |
 | '---------'   '------' |
 '------|-----------------'
        |
        |DSI            HOTPLUG
        V              CONNECTOR
   .---------.        .--.    .-.        .---------.         .-------.
   | 0 to N  |        | _|   _| |        | 1 to N  |         |       |
   | BRIDGES |--DSI-->||_   |_  |--DSI-->| BRIDGES |--LVDS-->| PANEL |
   |         |        |  |    | |        |         |         |       |
   '---------'        '--'    '-'        '---------'         '-------'

 [--- fixed components --]  [----------- removable add-on -----------]

Fixed components include:

 * all components up to the DRM encoder, usually part of the SoC
 * optionally some bridges, in the SoC and/or as external chips

Components on the removable add-on include:

 * one or more bridges
 * a fixed connector (not one natively supporting hotplug such as HDMI)
 * the panel

Overall this looks like a fairly standard embedded device, except for the
hot-pluggable connector, which allows a bridge and all the components that
follow it to be removed at runtime without prior notice to the kernel.

The video bus is MIPI DSI in the example and in the implementation provided
by this series, but the implementation is meant to allow generalization to
other video busses without native hotplug support, such as parallel video
and LVDS.

The "hotplug connector" in the picture is the mechanical connector that can be
physically removed at runtime. All the video bus signals (DSI in the
example) get connected or disconnected via that connector.

Note that the term "connector" in this context has nothing to do with the
"DRM connector" abstraction already present in the DRM subsystem (struct
drm_connector). The existing "DRM connector" has been designed to support
hotplug on a bus that physically supports both hotplug _and_ monitor
identification (e.g. HDMI), and later also used to model the connection to
a non-removable panel that is commonly found on embedded systems and
supports neither hotplug nor panel identification. For this reason, the
"DRM connector" is always physically located after all the bridges.

The "hotplug connector" here described is physically hot-pluggable but does
not support model identification, being meant for buses that do not support
identification because they do not support hot-plugging natively.

This is why at least 1 bridge is assumed to be present in the removable
add-on: if there were no such bridge, the "hotplug connector" would be
immediately followed by the "DRM connector" and the panel. In such a
situation, hot-plugging could be implemented by the "DRM connector" in a
much more straightforward way. So this work is mostly useful when there is
at least one bridge on the removable add-on.

The removable components form a single assembly whose components cannot be
separated individually: at any given moment the add-on is either connected
or disconnected -- it is never considered partially connected.

After an add-on 

RE: [PATCH v6 0/3] Disable automatic load CCS load balancing

2024-03-26 Thread Mrozek, Michal
On Wed, Mar 13, 2024 at 09:19:48PM +0100, Andi Shyti wrote:
> Hi,
> 
> this series does basically two things:
> 
> 1. Disables automatic load balancing as advised by the hardware
>workaround.
> 
> 2. Assigns all the CCS slices to one single user engine. The user
>will then be able to query only one CCS engine
> 
> From v5 I have created a new file, gt/intel_gt_ccs_mode.c where
> I added the intel_gt_apply_ccs_mode(). In the upcoming patches, this 
> file will contain the implementation for dynamic CCS mode setting.
> 
> Thanks Tvrtko, Matt, John and Joonas for your reviews!
> 
> Andi
> 
> Changelog
> =========
> v5 -> v6 (thanks Matt for the suggestions in v6)
>  - Remove the refactoring and the for_each_available_engine()
>macro and instead do not create the intel_engine_cs structure
>at all.
>  - In patch 1 just a trivial reordering of the bit definitions.
> 
> v4 -> v5
>  - Use the workaround framework to do all the CCS balancing
>settings in order to always apply the modes also when the
>engine resets. Put everything in its own specific function to
>be executed for the first CCS engine encountered. (Thanks
>Matt)
>  - Calculate the CCS ID for the CCS mode as the first available
>CCS among all the engines (Thanks Matt)
>  - create the intel_gt_ccs_mode.c function to host the CCS
>configuration. We will have it ready for the next series.
>  - Fix a selftest that was failing because could not set CCS2.
>  - Add the for_each_available_engine() macro to exclude CCS1+ and
>start using it in the hangcheck selftest.
> 
> v3 -> v4
>  - Reword correctly the comment in the workaround
>  - Fix a buffer overflow (Thanks Joonas)
>  - Handle properly the fused engines when setting the CCS mode.
> 
> v2 -> v3
>  - Simplified the algorithm for creating the list of the exported
>uabi engines. (Patch 1) (Thanks, Tvrtko)
>  - Consider the fused engines when creating the uabi engine list
>(Patch 2) (Thanks, Matt)
>  - Patch 4 now uses the refactoring from patch 1, in a cleaner
>outcome.
> 
> v1 -> v2
>  - In Patch 1 use the correct workaround number (thanks Matt).
>  - In Patch 2 do not add the extra CCS engines to the exposed
>UABI engine list and adapt the engine counting accordingly
>(thanks Tvrtko).
>  - Reword the commit of Patch 2 (thanks John).
> 
> Andi Shyti (3):
>   drm/i915/gt: Disable HW load balancing for CCS
>   drm/i915/gt: Do not generate the command streamer for all the CCS
>   drm/i915/gt: Enable only one CCS for compute workload
> 
>  drivers/gpu/drm/i915/Makefile   |  1 +
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 20 ---
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h |  6
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++--
>  6 files changed, 103 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> 
> --
> 2.43.0

Acked-by: Michal Mrozek 


