[PATCH v2 2/2] drm/kmb: Do not report 0 (success) in case of error

2021-05-26 Thread Christophe JAILLET
'ret' is known to be 0 at this point.
Returning the error code from the previous 'platform_get_irq()' call is
likely what was intended, so add the missing assignment.

Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
Signed-off-by: Christophe JAILLET 
---
v2: New patch
---
 drivers/gpu/drm/kmb/kmb_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/kmb/kmb_drv.c b/drivers/gpu/drm/kmb/kmb_drv.c
index fa28e42da460..d9e10ac9847c 100644
--- a/drivers/gpu/drm/kmb/kmb_drv.c
+++ b/drivers/gpu/drm/kmb/kmb_drv.c
@@ -138,6 +138,7 @@ static int kmb_hw_init(struct drm_device *drm, unsigned 
long flags)
irq_lcd = platform_get_irq(pdev, 0);
if (irq_lcd < 0) {
drm_err(&kmb->drm, "irq_lcd not found");
+   ret = irq_lcd;
goto setup_fail;
}
 
-- 
2.30.2



[PATCH v2 1/2] drm/kmb: Fix an error handling path

2021-05-26 Thread Christophe JAILLET
If 'platform_get_irq()' fails, it is spurious to call
'of_reserved_mem_device_release()' in the error handling path, because
'of_reserved_mem_device_init()' has not been called yet.

Moreover, a previous 'kmb_initialize_clocks()' call should be balanced
by a corresponding 'kmb_display_clk_disable()' call, as is already done
in the remove function.

It is likely that 'kmb_display_clk_disable()' is expected in the error
handling path, instead of 'of_reserved_mem_device_release()'.


Also, it is spurious to return directly if 'of_reserved_mem_device_init()'
fails. Go to the error handling path instead to free some resources.

Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
Signed-off-by: Christophe JAILLET 
---
v2: Keep label name
Fix the commit message where a wrong function name was used
---
 drivers/gpu/drm/kmb/kmb_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/kmb/kmb_drv.c b/drivers/gpu/drm/kmb/kmb_drv.c
index f64e06e1067d..fa28e42da460 100644
--- a/drivers/gpu/drm/kmb/kmb_drv.c
+++ b/drivers/gpu/drm/kmb/kmb_drv.c
@@ -144,7 +144,7 @@ static int kmb_hw_init(struct drm_device *drm, unsigned 
long flags)
/* Get the optional framebuffer memory resource */
ret = of_reserved_mem_device_init(drm->dev);
if (ret && ret != -ENODEV)
-   return ret;
+   goto setup_fail;
 
spin_lock_init(&kmb->irq_lock);
 
@@ -153,7 +153,7 @@ static int kmb_hw_init(struct drm_device *drm, unsigned 
long flags)
return 0;
 
  setup_fail:
-   of_reserved_mem_device_release(drm->dev);
+   kmb_display_clk_disable(kmb);
 
return ret;
 }
-- 
2.30.2



[v1] drm/msm/disp/dpu1: avoid perf update in frame done event

2021-05-26 Thread Krishna Manikandan
A CRTC perf update from the frame event work can result in
a wrong bandwidth and clock update from DPU if the work
is scheduled after the state swap has happened.

Avoid such issues by moving the perf update to complete
commit, once the frame is accepted by the hardware.

Fixes: a29c8c024165 ("drm/msm/disp/dpu1: fix display underruns during modeset")
Signed-off-by: Krishna Manikandan 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 18bc76b..4523d6b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -407,9 +407,6 @@ static void dpu_crtc_frame_event_work(struct kthread_work 
*work)
fevent->event);
}
 
-   if (fevent->event & DPU_ENCODER_FRAME_EVENT_DONE)
-   dpu_core_perf_crtc_update(crtc, 0, false);
-
if (fevent->event & (DPU_ENCODER_FRAME_EVENT_DONE
| DPU_ENCODER_FRAME_EVENT_ERROR))
frame_done = true;
@@ -477,6 +474,7 @@ static void dpu_crtc_frame_event_cb(void *data, u32 event)
 void dpu_crtc_complete_commit(struct drm_crtc *crtc)
 {
trace_dpu_crtc_complete_commit(DRMID(crtc));
+   dpu_core_perf_crtc_update(crtc, 0, false);
_dpu_crtc_complete_flip(crtc);
 }
 
-- 
2.7.4



Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

2021-05-26 Thread Jason Ekstrand

On May 26, 2021 13:15:08 Daniel Stone  wrote:


Hey,

On Wed, 26 May 2021 at 16:24, Jason Ekstrand  wrote:

On Wed, May 26, 2021 at 6:09 AM Daniel Stone  wrote:

Typing out the Wayland protocol isn't the hard bit. If we just need to
copy and sed syncobj to weirdsyncobj, no problem really, and it gives
us a six-month head start on painful compositor-internal surgery
whilst we work on common infrastructure to ship userspace fences
around (mappable dmabuf with the sync bracketing? FD where every
read() gives you the current value? memfd? other?).


I feel like I should elaborate more about timelines.  In my earlier
reply, my commentary about timeline syncobj was mostly focused around
helping people avoid typing.  That's not really the full story,
though, and I hope more context will help.

First, let me say that timeline syncobj was designed as a mechanism to
implement VK_KHR_timeline_semaphore without inserting future fences
into the kernel.  It's entirely designed around the needs of Vulkan
drivers, not really as a window-system primitive.  The semantics are
designed around one driver communicating to another that new fences
have been added and it's safe to kick off more rendering.  I'm not
convinced that it's the right object for window-systems and I'm also
not convinced that it's a good idea to try and make a version of it
that's a wrapper around a userspace memory fence.  (I'm going to start
typing UMF for userspace memory fence because it's long to type out.)

Why?  Well, the fundamental problem with timelines in general is
trying to figure out when it's about to be done.  But timeline syncobj
solves this for us!  It gives us this fancy super-useful ioctl!
Right?  Uh not as well as I'd like.  Let's say we make a timeline
syncobj that's a wrapper around a userspace memory fence.  What do we
do with that ioctl?  As I mentioned above, the kernel doesn't have any
clue when it will be triggered so that ioctl turns into an actual
wait.  That's no good because it creates unnecessary stalls.


Yeah, I'm assuming that UMF will be a separate primitive. No problem.
I also think that your submitted/completed thing is a non-problem: at
this stage we're just throwing up our hands and admitting that we're
letting userspace tie itself in knots, and giving it the tools to tie
a sufficiently un-streetwise compositor in knots too. We're already
crossing that Rubicon, so let's just embrace it and not try to design
it out. Us compositors can handle the scheduling, really.


Ok, good. I think we're on the same page.




There's another potential solution here:  Have each UMF be two
timelines: submitted and completed.  At the start of every batch
that's supposed to trigger a UMF, we set the "submitted" side and
then, when it completes, we set the "completed" side.  Ok, great, now
we can get at the "about to be done" with the submitted side,
implement the ioctl, and we're all good, right?  Sadly, no.  There's
no guarantee about how long a "batch" takes.  So there's no universal
timeout the kernel can apply.  Also, if it does time out, the kernel
doesn't know who to blame for the timeout and how to prevent itself
from getting in trouble again.  The compositor does, so in theory,
given the right ioctls, it could detect the -ETIME and kill that
client.  Not a great solution.

The best option I've been able to come up with for this is some sort
of client-provided signal.  Something where it says, as part of submit
or somewhere else, "I promise I'll be done soon" where that promise
comes with dire consequences if it's not.  At that point, we can turn
the UMF and a particular wait value into a one-shot fence like a
dma_fence or sync_file, or signal a syncobj on it.  If it ever times
out, we kick their context.  In Vulkan terminology, they get
VK_ERROR_DEVICE_LOST.  There are two important bits here:  First, is
that it's based on a client-provided thing.  With a fully timeline
model and wait-before-signal, we can't infer when something is about
to be done.  Only the client knows when it submitted its last node in
the dependency graph and the whole mess is unblocked.  Second, is that
the dma_fence is created within the client's driver context.  If it's
created compositor-side, the kernel doesn't know who to blame if
things go badly.  If we create it in the client, it's pretty easy to
make context death on -ETIME part of the contract.

(Before danvet jumps in here and rants about how UMF -> dma_fence
isn't possible, I haven't forgotten.  I'm pretending, for now, that
we've solved some of those problems.)


Funny how we've come full circle to the original proposal here ...

If we really want a kernel primitive for this - and I think it's a
good idea, since it can help surface 'badness' in a way which is
observable by e.g. session managers in a way analogous to cgroup stats
and controls - how about this for a counter-proposal? Client exports a
FD for its context/queue and sends it to winsys as part of setup,
compositor can ioctl() on t

Re: [PATCH 24/34] drm/amd/display/modules/hdcp/hdcp_psp: Remove unused function 'mod_hdcp_hdcp1_get_link_encryption_status()'

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c:374:22: 
> warning: no previous prototype for 
> ‘mod_hdcp_hdcp1_get_link_encryption_status’ [-Wmissing-prototypes]
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c | 13 -
>  1 file changed, 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c 
> b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c
> index 26f96c05e0ec8..06910d2fd57a0 100644
> --- a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c
> +++ b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c
> @@ -371,19 +371,6 @@ enum mod_hdcp_status 
> mod_hdcp_hdcp1_link_maintenance(struct mod_hdcp *hdcp)
> return status;
>  }
>
> -enum mod_hdcp_status mod_hdcp_hdcp1_get_link_encryption_status(struct 
> mod_hdcp *hdcp,
> -  enum 
> mod_hdcp_encryption_status *encryption_status)
> -{
> -   *encryption_status = MOD_HDCP_ENCRYPTION_STATUS_HDCP_OFF;
> -
> -   if (mod_hdcp_hdcp1_link_maintenance(hdcp) != MOD_HDCP_STATUS_SUCCESS)
> -   return MOD_HDCP_STATUS_FAILURE;
> -
> -   *encryption_status = MOD_HDCP_ENCRYPTION_STATUS_HDCP1_ON;
> -
> -   return MOD_HDCP_STATUS_SUCCESS;
> -}
> -
>  enum mod_hdcp_status mod_hdcp_hdcp2_create_session(struct mod_hdcp *hdcp)
>  {
> struct psp_context *psp = hdcp->config.psp.handle;
> --
> 2.31.1
>


Re: [PATCH 23/34] drm/amd/display/dmub/src/dmub_srv_stat: Convert function header to kernel-doc

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dmub/src/dmub_srv_stat.c:38: warning: 
> Cannot understand  
> *
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Jun Lei 
> Cc: Meenakshikumar Somasundaram 
> Cc: Rodrigo Siqueira 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  .../drm/amd/display/dmub/src/dmub_srv_stat.c  | 19 ++-
>  1 file changed, 6 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv_stat.c 
> b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv_stat.c
> index e6f3bfab33d3e..70766d534c9c8 100644
> --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv_stat.c
> +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv_stat.c
> @@ -35,20 +35,13 @@
>   */
>
>  /**
> - 
> *
> - *  Function: dmub_srv_stat_get_notification
> + * dmub_srv_stat_get_notification - Retrieves a dmub outbox notification, 
> set up dmub notification
> + *  structure with message information. Also 
> a pending bit if queue
> + *  is having more notifications
> + *  @dmub: dmub srv structure
> + *  @notify: dmub notification structure to be filled up
>   *
> - *  @brief
> - * Retrieves a dmub outbox notification, set up dmub notification
> - * structure with message information. Also a pending bit if 
> queue
> - * is having more notifications
> - *
> - *  @param [in] dmub: dmub srv structure
> - *  @param [out] pnotify: dmub notification structure to be filled up
> - *
> - *  @return
> - * dmub_status
> - 
> *
> + *  Returns: dmub_status
>   */
>  enum dmub_status dmub_srv_stat_get_notification(struct dmub_srv *dmub,
> struct dmub_notification 
> *notify)
> --
> 2.31.1
>


Re: [PATCH 22/34] drm/amd/display/dc/core/dc: Convert function headers to kernel-doc

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3324: warning: Cannot 
> understand  
> *
>  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3344: warning: Cannot 
> understand  
> *
>  drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3417: warning: Cannot 
> understand  
> *
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c | 46 ++--
>  1 file changed, 11 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index ef157b83bacd2..34c207f92df98 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -3335,18 +3335,10 @@ void dc_hardware_release(struct dc *dc)
>  #endif
>
>  /**
> - 
> *
> - *  Function: dc_enable_dmub_notifications
> + * dc_enable_dmub_notifications - Returns whether dmub notification can be 
> enabled
> + * @dc: dc structure
>   *
> - *  @brief
> - * Returns whether dmub notification can be enabled
> - *
> - *  @param
> - * [in] dc: dc structure
> - *
> - * @return
> - * True to enable dmub notifications, False otherwise
> - 
> *
> + * Returns: True to enable dmub notifications, False otherwise
>   */
>  bool dc_enable_dmub_notifications(struct dc *dc)
>  {
> @@ -3355,21 +3347,13 @@ bool dc_enable_dmub_notifications(struct dc *dc)
>  }
>
>  /**
> - 
> *
> - *  Function: dc_process_dmub_aux_transfer_async
> - *
> - *  @brief
> - * Submits aux command to dmub via inbox message
> - * Sets port index appropriately for legacy DDC
> - *
> - *  @param
> - * [in] dc: dc structure
> - * [in] link_index: link index
> - * [in] payload: aux payload
> + * dc_process_dmub_aux_transfer_async - Submits aux command to dmub via 
> inbox message
> + *  Sets port index appropriately for 
> legacy DDC
> + * @dc: dc structure
> + * @link_index: link index
> + * @payload: aux payload
>   *
> - * @return
> - * True if successful, False if failure
> - 
> *
> + * Returns: True if successful, False if failure
>   */
>  bool dc_process_dmub_aux_transfer_async(struct dc *dc,
> uint32_t link_index,
> @@ -3428,16 +3412,8 @@ bool dc_process_dmub_aux_transfer_async(struct dc *dc,
>  }
>
>  /**
> - 
> *
> - *  Function: dc_disable_accelerated_mode
> - *
> - *  @brief
> - * disable accelerated mode
> - *
> - *  @param
> - * [in] dc: dc structure
> - *
> - 
> *
> + * dc_disable_accelerated_mode - disable accelerated mode
> + * @dc: dc structure
>   */
>  void dc_disable_accelerated_mode(struct dc *dc)
>  {
> --
> 2.31.1
>


Re: [PATCH 21/34] drm/amd/display/dc/dce110/dce110_hw_sequencer: Include header containing our prototypes

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_hw_sequencer.c:929:6: 
> warning: no previous prototype for ‘dce110_edp_wait_for_T12’ 
> [-Wmissing-prototypes]
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
> b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> index 9219db79f32b6..1ef1b1b33fb09 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> @@ -64,6 +64,7 @@
>  #include "atomfirmware.h"
>
>  #include "dce110_hw_sequencer.h"
> +#include "dcn10/dcn10_hw_sequencer.h"
>
>  #define GAMMA_HW_POINTS_NUM 256
>
> --
> 2.31.1
>


Re: [PATCH 20/34] drm/amd/display/amdgpu_dm/amdgpu_dm: Fix kernel-doc formatting issue

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:608: warning: 
> Function parameter or member 'interrupt_params' not described in 
> 'dm_dcn_vertical_interrupt0_high_irq'
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index ae0a95c5f1d8c..0b4841f377e41 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -605,7 +605,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
>  /**
>   * dm_dcn_vertical_interrupt0_high_irq() - Handles OTG Vertical interrupt0 
> for
>   * DCN generation ASICs
> - * @interrupt params - interrupt parameters
> + * @interrupt_params: interrupt parameters
>   *
>   * Used to set crc window/read out crc value at vertical line 0 position
>   */
> --
> 2.31.1
>


Re: [PATCH 19/34] drm/amd/amdgpu/amdgpu_device: Make local function static

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4624:6: warning: no previous 
> prototype for ‘amdgpu_device_recheck_guilty_jobs’ [-Wmissing-prototypes]
>
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4a040f89ca5aa..f15e180762d2e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4692,7 +4692,7 @@ static int amdgpu_device_suspend_display_audio(struct 
> amdgpu_device *adev)
> return 0;
>  }
>
> -void amdgpu_device_recheck_guilty_jobs(
> +static void amdgpu_device_recheck_guilty_jobs(
> struct amdgpu_device *adev, struct list_head *device_list_handle,
> struct amdgpu_reset_context *reset_context)
>  {
> --
> 2.31.1
>


Re: [PATCH 18/34] drm/amd/display/dc/dce/dce_mem_input: Remove duplicate initialisation of GRPH_CONTROL__GRPH_NUM_BANKS_{SHIFT, MASK}

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  In file included from 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:29:
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7270:45: 
> warning: initialized field overwritten [-Woverride-init]
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS__SHIFT’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:264:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:657:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7270:45: 
> note: (near initialization for ‘mi_shifts.GRPH_NUM_BANKS’)
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS__SHIFT’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:264:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:657:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7269:43: 
> warning: initialized field overwritten [-Woverride-init]
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS_MASK’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:264:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:662:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7269:43: 
> note: (near initialization for ‘mi_masks.GRPH_NUM_BANKS’)
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS_MASK’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:264:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:662:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Mauro Rossi 
> Cc: Lee Jones 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h 
> b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h
> index 9b1c4d56275a4..08a4c8d029d9f 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h
> +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h
> @@ -206,7 +206,6 @@ struct dce_mem_input_registers {
> SFB(blk, GRPH_ENABLE, GRPH_ENABLE, mask_sh),\
> SFB(blk, GRPH_CONTROL, GRPH_DEPTH, mask_sh),\
> SFB(blk, GRPH_CONTROL, GRPH_FORMAT, mask_sh),\
> -   SFB(blk, GRPH_CONTROL, GRPH_NUM_BANKS, mask_sh),\
> SFB(blk, GRPH_X_START, GRPH_X_START, mask_sh),\
> SFB(blk, GRPH_Y_START, GRPH_Y_START, mask_sh),\
> SFB(blk, GRPH_X_END, GRPH_X_END, mask_sh),\
> --
> 2.31.1
>


Re: [PATCH 17/34] drm/amd/display/dc/dce/dce_mem_input: Remove duplicate initialisation of GRPH_CONTROL__GRPH_NUM_BANKS_{SHIFT, MASK}

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  In file included from 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:29:
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7270:45: 
> warning: initialized field overwritten [-Woverride-init]
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS__SHIFT’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:265:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:657:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7270:45: 
> note: (near initialization for ‘mi_shifts.GRPH_NUM_BANKS’)
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS__SHIFT’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:265:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:657:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7269:43: 
> warning: initialized field overwritten [-Woverride-init]
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS_MASK’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:265:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:662:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>  
> drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_sh_mask.h:7269:43: 
> note: (near initialization for ‘mi_masks.GRPH_NUM_BANKS’)
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:155:28: note: 
> in expansion of macro ‘GRPH_CONTROL__GRPH_NUM_BANKS_MASK’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:159:2: note: in 
> expansion of macro ‘SFB’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_mem_input.h:265:2: note: in 
> expansion of macro ‘MI_GFX6_TILE_MASK_SH_LIST’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:662:3: note: 
> in expansion of macro ‘MI_DCE6_MASK_SH_LIST’
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Mauro Rossi 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h 
> b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h
> index 23db5c72f07ed..9b1c4d56275a4 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h
> +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.h
> @@ -181,7 +181,6 @@ struct dce_mem_input_registers {
> SFB(blk, GRPH_ENABLE, GRPH_ENABLE, mask_sh),\
> SFB(blk, GRPH_CONTROL, GRPH_DEPTH, mask_sh),\
> SFB(blk, GRPH_CONTROL, GRPH_FORMAT, mask_sh),\
> -   SFB(blk, GRPH_CONTROL, GRPH_NUM_BANKS, mask_sh),\
> SFB(blk, GRPH_X_START, GRPH_X_START, mask_sh),\
> SFB(blk, GRPH_Y_START, GRPH_Y_START, mask_sh),\
> SFB(blk, GRPH_X_END, GRPH_X_END, mask_sh),\
> --
> 2.31.1
>


Re: [PATCH 16/34] drm/amd/display/dc/dce/dce_transform: Remove superfluous re-initialisation of DCFE_MEM_LIGHT_SLEEP_CNTL,

2021-05-26 Thread Alex Deucher
On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp_psp.c:374:22: 
> warning: no previous prototype for ‘mod_hdcp_hdcp1_get_link_encryption_status’
>  In file included from 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:28:
>  drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_d.h:568:43: 
> warning: initialized field overwritten [-Woverride-init]
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:157:14: 
> note: in expansion of macro ‘mmCRTC0_DCFE_MEM_LIGHT_SLEEP_CNTL’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_transform.h:170:2: note: in 
> expansion of macro ‘SRI’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:183:3: note: 
> in expansion of macro ‘XFM_COMMON_REG_LIST_DCE60’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:187:3: note: 
> in expansion of macro ‘transform_regs’
>  drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_d.h:568:43: note: 
> (near initialization for ‘xfm_regs[0].DCFE_MEM_LIGHT_SLEEP_CNTL’)
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:157:14: 
> note: in expansion of macro ‘mmCRTC0_DCFE_MEM_LIGHT_SLEEP_CNTL’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_transform.h:170:2: note: in 
> expansion of macro ‘SRI’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:183:3: note: 
> in expansion of macro ‘XFM_COMMON_REG_LIST_DCE60’
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce60/dce60_resource.c:187:3: note: 
> in expansion of macro ‘transform_regs’
>  drivers/gpu/drm/amd/amdgpu/../include/asic_reg/dce/dce_6_0_d.h:645:43: 
> warning: initialized field overwritten [-Woverride-init]
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Mauro Rossi 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/dce/dce_transform.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h 
> b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h
> index cbce194ec7b82..e98b5d4141739 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h
> +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h
> @@ -166,8 +166,7 @@
> SRI(SCL_F_SHARP_CONTROL, SCL, id)
>
>  #define XFM_COMMON_REG_LIST_DCE60(id) \
> -   XFM_COMMON_REG_LIST_DCE60_BASE(id), \
> -   SRI(DCFE_MEM_LIGHT_SLEEP_CNTL, CRTC, id)
> +   XFM_COMMON_REG_LIST_DCE60_BASE(id)

I believe this should be kept and it should be removed from
XFM_COMMON_REG_LIST_DCE60_BASE().

Alex

>  #endif
>
>  #define XFM_SF(reg_name, field_name, post_fix)\
> --
> 2.31.1
>


Re: [PATCH 15/34] drm/amd/display/dc/dce110/dce110_hw_sequencer: Include our own header

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_hw_sequencer.c:927:6: 
> warning: no previous prototype for ‘dce110_edp_wait_for_T12’ 
> [-Wmissing-prototypes]
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
> b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> index 5ddeee96bf235..9219db79f32b6 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> @@ -63,6 +63,8 @@
>
>  #include "atomfirmware.h"
>
> +#include "dce110_hw_sequencer.h"
> +
>  #define GAMMA_HW_POINTS_NUM 256
>
>  /*
> --
> 2.31.1
>


Re: [PATCH 14/34] drm/amd/display/dc/gpio/gpio_service: Pass around correct dce_{version, environment} types

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_service.c: In function 
> ‘dal_gpio_service_create’:
>  drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_service.c:71:4: warning: 
> implicit conversion from ‘enum dce_version’ to ‘enum dce_environment’ 
> [-Wenum-conversion]
>  drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_service.c:77:4: warning: 
> implicit conversion from ‘enum dce_version’ to ‘enum dce_environment’ 
> [-Wenum-conversion]
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c   | 12 ++--
>  .../drm/amd/display/include/gpio_service_interface.h |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c 
> b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c
> index 92280cc05e2db..dae8e489c8cf4 100644
> --- a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c
> +++ b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c
> @@ -53,8 +53,8 @@
>   */
>
>  struct gpio_service *dal_gpio_service_create(
> -   enum dce_version dce_version_major,
> -   enum dce_version dce_version_minor,
> +   enum dce_version dce_version,
> +   enum dce_environment dce_environment,
> struct dc_context *ctx)
>  {
> struct gpio_service *service;
> @@ -67,14 +67,14 @@ struct gpio_service *dal_gpio_service_create(
> return NULL;
> }
>
> -   if (!dal_hw_translate_init(&service->translate, dce_version_major,
> -   dce_version_minor)) {
> +   if (!dal_hw_translate_init(&service->translate, dce_version,
> +   dce_environment)) {
> BREAK_TO_DEBUGGER();
> goto failure_1;
> }
>
> -   if (!dal_hw_factory_init(&service->factory, dce_version_major,
> -   dce_version_minor)) {
> +   if (!dal_hw_factory_init(&service->factory, dce_version,
> +   dce_environment)) {
> BREAK_TO_DEBUGGER();
> goto failure_1;
> }
> diff --git a/drivers/gpu/drm/amd/display/include/gpio_service_interface.h 
> b/drivers/gpu/drm/amd/display/include/gpio_service_interface.h
> index 9c55d247227ea..7e3240e73c1fc 100644
> --- a/drivers/gpu/drm/amd/display/include/gpio_service_interface.h
> +++ b/drivers/gpu/drm/amd/display/include/gpio_service_interface.h
> @@ -42,8 +42,8 @@ void dal_gpio_destroy(
> struct gpio **ptr);
>
>  struct gpio_service *dal_gpio_service_create(
> -   enum dce_version dce_version_major,
> -   enum dce_version dce_version_minor,
> +   enum dce_version dce_version,
> +   enum dce_environment dce_environment,
> struct dc_context *ctx);
>
>  struct gpio *dal_gpio_service_create_irq(
> --
> 2.31.1
>


Re: [PATCH 13/34] drm/amd/display/dc/dce/dmub_outbox: Convert over to kernel-doc

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_outbox.c:30: warning: 
> Cannot understand  
> *
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Rodrigo Siqueira 
> Cc: Meenakshikumar Somasundaram 
> Cc: Jun Lei 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  .../gpu/drm/amd/display/dc/dce/dmub_outbox.c| 17 -
>  1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_outbox.c 
> b/drivers/gpu/drm/amd/display/dc/dce/dmub_outbox.c
> index 295596d1f47f2..faad8555ddbb6 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_outbox.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_outbox.c
> @@ -27,19 +27,10 @@
>  #include "dmub/inc/dmub_cmd.h"
>
>  /**
> - 
> *
> - *  Function: dmub_enable_outbox_notification
> - *
> - *  @brief
> - * Sends inbox cmd to dmub to enable outbox1 messages with 
> interrupt.
> - * Dmub sends outbox1 message and triggers outbox1 interrupt.
> - *
> - *  @param
> - * [in] dc: dc structure
> - *
> - *  @return
> - * None
> - 
> *
> + *  dmub_enable_outbox_notification - Sends inbox cmd to dmub to enable 
> outbox1
> + *messages with interrupt. Dmub sends 
> outbox1
> + *message and triggers outbox1 interrupt.
> + * @dc: dc structure
>   */
>  void dmub_enable_outbox_notification(struct dc *dc)
>  {
> --
> 2.31.1
>


Re: [PATCH 12/34] drm/amd/display/amdgpu_dm/amdgpu_dm: Functions must directly follow their headers

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:608: warning: 
> Function parameter or member 'interrupt_params' not described in 
> 'dm_dcn_vertical_interrupt0_high_irq'
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index b4e95d3ff3b88..ae0a95c5f1d8c 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -601,6 +601,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
>  }
>
>  #if defined(CONFIG_DRM_AMD_DC_DCN)
> +#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
>  /**
>   * dm_dcn_vertical_interrupt0_high_irq() - Handles OTG Vertical interrupt0 
> for
>   * DCN generation ASICs
> @@ -608,7 +609,6 @@ static void dm_crtc_high_irq(void *interrupt_params)
>   *
>   * Used to set crc window/read out crc value at vertical line 0 position
>   */
> -#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
>  static void dm_dcn_vertical_interrupt0_high_irq(void *interrupt_params)
>  {
> struct common_irq_params *irq_params = interrupt_params;
> --
> 2.31.1
>


Re: [PATCH 10/34] drm/amd/display/dc/bios/bios_parser: Fix formatting and misnaming issues

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser.c:997: warning: 
> expecting prototype for get_ss_info_from_table(). Prototype was for 
> get_ss_info_from_tbl() instead
>  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser.c:1562: warning: 
> expecting prototype for BiosParserObject(). Prototype was for 
> bios_parser_get_ss_entry_number() instead
>  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser.c:1739: warning: 
> expecting prototype for 
> get_ss_entry_number_from_internal_ss_info_table_V3_1(). Prototype was for 
> get_ss_entry_number_from_internal_ss_info_tbl_V3_1() instead
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Lee Jones 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/bios/bios_parser.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c 
> b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
> index c67d21a5ee52f..9b8ea6e9a2b96 100644
> --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
> +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
> @@ -979,7 +979,7 @@ static enum bp_result 
> get_ss_info_from_internal_ss_info_tbl_V2_1(
> struct spread_spectrum_info *info);
>
>  /**
> - * get_ss_info_from_table
> + * get_ss_info_from_tbl
>   * Get spread sprectrum information from the ASIC_InternalSS_Info Ver 2.1 or
>   * SS_Info table from the VBIOS
>   * There can not be more than 1 entry for  ASIC_InternalSS_Info Ver 2.1 or
> @@ -1548,7 +1548,7 @@ static uint32_t get_ss_entry_number_from_ss_info_tbl(
> uint32_t id);
>
>  /**
> - * BiosParserObject::GetNumberofSpreadSpectrumEntry
> + * bios_parser_get_ss_entry_number
>   * Get Number of SpreadSpectrum Entry from the ASIC_InternalSS_Info table 
> from
>   * the VBIOS that match the SSid (to be converted from signal)
>   *
> @@ -1725,7 +1725,7 @@ static uint32_t 
> get_ss_entry_number_from_internal_ss_info_tbl_v2_1(
> return 0;
>  }
>  /**
> - * get_ss_entry_number_from_internal_ss_info_table_V3_1
> + * get_ss_entry_number_from_internal_ss_info_tbl_V3_1
>   * Get Number of SpreadSpectrum Entry from the ASIC_InternalSS_Info table of
>   * the VBIOS that matches id
>   *
> --
> 2.31.1
>


Re: [PATCH 09/34] drm/amd/display/dc/bios/command_table_helper2: Fix function name 'dal_cmd_table_helper_transmitter_bp_to_atom2()'

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table_helper2.c:141: 
> warning: expecting prototype for translate_transmitter_bp_to_atom2(). 
> Prototype was for dal_cmd_table_helper_transmitter_bp_to_atom2() instead
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/bios/command_table_helper2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table_helper2.c 
> b/drivers/gpu/drm/amd/display/dc/bios/command_table_helper2.c
> index 00706b072b5f8..6d2fb112ad9f9 100644
> --- a/drivers/gpu/drm/amd/display/dc/bios/command_table_helper2.c
> +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table_helper2.c
> @@ -129,7 +129,7 @@ bool dal_cmd_table_helper_controller_id_to_atom2(
>  }
>
>  /**
> - * translate_transmitter_bp_to_atom2 - Translate the Transmitter to the
> + * dal_cmd_table_helper_transmitter_bp_to_atom2 - Translate the Transmitter 
> to the
>   * corresponding ATOM BIOS value
>   *  @t: transmitter
>   *  returns: digitalTransmitter
> --
> 2.31.1
>


Re: [PATCH 08/34] drm/amd/display/dc/bios/command_table_helper: Fix function name for 'dal_cmd_table_helper_transmitter_bp_to_atom()'

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:48 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table_helper.c:127: 
> warning: expecting prototype for translate_transmitter_bp_to_atom(). 
> Prototype was for dal_cmd_table_helper_transmitter_bp_to_atom() instead
>
> Cc: Harry Wentland 
> Cc: Leo Li 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Lee Jones 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/display/dc/bios/command_table_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table_helper.c 
> b/drivers/gpu/drm/amd/display/dc/bios/command_table_helper.c
> index 5b77251e05909..e317a36151477 100644
> --- a/drivers/gpu/drm/amd/display/dc/bios/command_table_helper.c
> +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table_helper.c
> @@ -114,7 +114,7 @@ bool dal_cmd_table_helper_controller_id_to_atom(
>  }
>
>  /**
> - * translate_transmitter_bp_to_atom - Translate the Transmitter to the
> + * dal_cmd_table_helper_transmitter_bp_to_atom - Translate the Transmitter 
> to the
>   *corresponding ATOM BIOS value
>   * @t: transmitter
>   * returns: output digitalTransmitter
> --
> 2.31.1
>


Re: [PATCH 07/34] drm/amd/pm/powerplay/hwmgr/vega20_hwmgr: Provide function name 'vega20_init_smc_table()'

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega20_hwmgr.c:781: 
> warning: expecting prototype for Initializes the SMC table and uploads it(). 
> Prototype was for vega20_init_smc_table() instead
>
> Cc: Evan Quan 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
> index d3177a534fdf0..0791309586c58 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
> @@ -772,7 +772,7 @@ static int vega20_setup_default_dpm_tables(struct 
> pp_hwmgr *hwmgr)
>  }
>
>  /**
> - * Initializes the SMC table and uploads it
> + * vega20_init_smc_table - Initializes the SMC table and uploads it
>   *
>   * @hwmgr:  the address of the powerplay hardware manager.
>   * return:  always 0
> --
> 2.31.1
>


Re: [PATCH 06/34] drm/amd/pm/powerplay/hwmgr/vega10_hwmgr: Kernel-doc headers must contain function names

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:547: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:603: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:629: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:1006: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:1155: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:1608: 
> warning: expecting prototype for Populates single SMC GFXSCLK structure using 
> the provided engine clock(). Prototype was for 
> vega10_populate_single_gfx_level() instead
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:1663: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:1713: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:1862: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:2546: 
> warning: expecting prototype for Initializes the SMC table and uploads it(). 
> Prototype was for vega10_init_smc_table() instead
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_hwmgr.c:2922: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>
> Cc: Evan Quan 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  .../drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 26 +++
>  1 file changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> index 31c61ac3bd5e1..25979106fd255 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
> @@ -544,7 +544,7 @@ static int vega10_get_socclk_for_voltage_evv(struct 
> pp_hwmgr *hwmgr,
>
>  #define ATOM_VIRTUAL_VOLTAGE_ID0 0xff01
>  /**
> - * Get Leakage VDDC based on leakage ID.
> + * vega10_get_evv_voltages - Get Leakage VDDC based on leakage ID.
>   *
>   * @hwmgr:  the address of the powerplay hardware manager.
>   * return:  always 0.
> @@ -600,7 +600,7 @@ static int vega10_get_evv_voltages(struct pp_hwmgr *hwmgr)
>  }
>
>  /**
> - * Change virtual leakage voltage to actual value.
> + * vega10_patch_with_vdd_leakage - Change virtual leakage voltage to actual 
> value.
>   *
>   * @hwmgr: the address of the powerplay hardware manager.
>   * @voltage:   pointer to changing voltage
> @@ -626,7 +626,7 @@ static void vega10_patch_with_vdd_leakage(struct pp_hwmgr 
> *hwmgr,
>  }
>
>  /**
> - * Patch voltage lookup table by EVV leakages.
> + * vega10_patch_lookup_table_with_leakage - Patch voltage lookup table by 
> EVV leakages.
>   *
>   * @hwmgr: the address of the powerplay hardware manager.
>   * @lookup_table:  pointer to voltage lookup table
> @@ -1003,7 +1003,7 @@ static int vega10_setup_asic_task(struct pp_hwmgr 
> *hwmgr)
>  }
>
>  /**
> - * Remove repeated voltage values and create table with unique values.
> + * vega10_trim_voltage_table - Remove repeated voltage values and create 
> table with unique values.
>   *
>   * @hwmgr:  the address of the powerplay hardware manager.
>   * @vol_table:  the pointer to changing voltage table
> @@ -1152,7 +1152,7 @@ static void 
> vega10_trim_voltage_table_to_fit_state_table(
>  }
>
>  /**
> - * Create Voltage Tables.
> + * vega10_construct_voltage_tables - Create Voltage Tables.
>   *
>   * @hwmgr:  the address of the powerplay hardware manager.
>   * return:  always 0
> @@ -1595,7 +1595,8 @@ static int vega10_populate_smc_link_levels(struct 
> pp_hwmgr *hwmgr)
>  }
>
>  /**
> - * Populates single SMC GFXSCLK structure using the provided engine clock
> + * vega10_populate_single_g

Re: [PATCH 05/34] drm/amd/pm/powerplay/hwmgr/vega12_hwmgr: Provide 'vega12_init_smc_table()' function name

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega12_hwmgr.c:812: 
> warning: expecting prototype for Initializes the SMC table and uploads it(). 
> Prototype was for vega12_init_smc_table() instead
>
> Cc: Evan Quan 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
> index 1a097e608808e..29e0d1d4035ad 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
> @@ -803,7 +803,7 @@ static int vega12_save_default_power_profile(struct 
> pp_hwmgr *hwmgr)
>  #endif
>
>  /**
> - * Initializes the SMC table and uploads it
> + * vega12_init_smc_table - Initializes the SMC table and uploads it
>   *
>   * @hwmgr:  the address of the powerplay hardware manager.
>   * return:  always 0
> --
> 2.31.1
>


Re: [PATCH 04/34] drm/amd/pm/powerplay/hwmgr/vega12_thermal: Provide function name

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega12_thermal.c:171: 
> warning: expecting prototype for Set the requested temperature range for high 
> and low alert signals(). Prototype was for 
> vega12_thermal_set_temperature_range() instead
>
> Cc: Evan Quan 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_thermal.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_thermal.c 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_thermal.c
> index 0dc16f25a463b..ed3dff0b52d21 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_thermal.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_thermal.c
> @@ -159,7 +159,8 @@ int vega12_thermal_get_temperature(struct pp_hwmgr *hwmgr)
>  }
>
>  /**
> - * Set the requested temperature range for high and low alert signals
> + * vega12_thermal_set_temperature_range - Set the requested temperature range
> + *for high and low alert signals
>   *
>   * @hwmgr: The address of the hardware manager.
>   * @range: Temperature range to be programmed for
> --
> 2.31.1
>


Re: [PATCH 03/34] drm/amd/pm/powerplay/hwmgr/smu7_thermal: Provide function name for 'smu7_fan_ctrl_set_default_mode()'

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_thermal.c:132: 
> warning: This comment starts with '/**', but isn't a kernel-doc comment. 
> Refer Documentation/doc-guide/kernel-doc.rst
>
> Cc: Evan Quan 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_thermal.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_thermal.c 
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_thermal.c
> index 0d38d4206848a..6cfe148ed45bb 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_thermal.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_thermal.c
> @@ -129,10 +129,10 @@ int smu7_fan_ctrl_set_static_mode(struct pp_hwmgr 
> *hwmgr, uint32_t mode)
>  }
>
>  /**
> -* Reset Fan Speed Control to default mode.
> -* @hwmgr:  the address of the powerplay hardware manager.
> -* Exception: Should always succeed.
> -*/
> + * smu7_fan_ctrl_set_default_mode - Reset Fan Speed Control to default mode.
> + * @hwmgr:  the address of the powerplay hardware manager.
> + * Exception: Should always succeed.
> + */
>  int smu7_fan_ctrl_set_default_mode(struct pp_hwmgr *hwmgr)
>  {
> if (!hwmgr->fan_ctrl_is_in_default_mode) {
> --
> 2.31.1
>


Re: [PATCH 02/34] drm/amd/pm/swsmu/smu13/aldebaran_ppt: Remove unused variable 'ret'

2021-05-26 Thread Alex Deucher
This should be checked.  Will send out a patch momentarily.

Thanks,

Alex

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c: In function 
> ‘aldebaran_is_dpm_running’:
>  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c:1260:6: 
> warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
>
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> index d6ce665baaf3b..d077e211017a9 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> @@ -1368,10 +1368,9 @@ static int aldebaran_usr_edit_dpm_table(struct 
> smu_context *smu, enum PP_OD_DPM_
>
>  static bool aldebaran_is_dpm_running(struct smu_context *smu)
>  {
> -   int ret = 0;
> uint32_t feature_mask[2];
> unsigned long feature_enabled;
> -   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
> +   smu_cmn_get_enabled_mask(smu, feature_mask, 2);
> feature_enabled = (unsigned long)((uint64_t)feature_mask[0] |
>   ((uint64_t)feature_mask[1] << 32));
> return !!(feature_enabled & SMC_DPM_FEATURE);
> --
> 2.31.1
>


Re: [PATCH 01/34] drm/amd/pm/inc/smu_v13_0: Move table into the only source file that uses it

2021-05-26 Thread Alex Deucher
Applied.  Thanks!

On Wed, May 26, 2021 at 4:47 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/amd/amdgpu/../pm/inc/smu_v13_0.h:54:43: warning: 
> ‘smu13_thermal_policy’ defined but not used [-Wunused-const-variable=]
>
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Kevin Wang 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/amd/pm/inc/smu_v13_0.h | 6 --
>  drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 6 ++
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h 
> b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
> index 1687709507b3d..6119a36b2cba0 100644
> --- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
> +++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
> @@ -51,12 +51,6 @@
>  #define CTF_OFFSET_HOTSPOT 5
>  #define CTF_OFFSET_MEM 5
>
> -static const struct smu_temperature_range smu13_thermal_policy[] =
> -{
> -   {-273150,  99000, 99000, -273150, 99000, 99000, -273150, 99000, 
> 99000},
> -   { 12, 12, 12, 12, 12, 12, 12, 12, 
> 12},
> -};
> -
>  struct smu_13_0_max_sustainable_clocks {
> uint32_t display_clock;
> uint32_t phy_clock;
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> index d62cc6bb1a305..d6ce665baaf3b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> @@ -78,6 +78,12 @@
>
>  #define smnPCIE_ESM_CTRL   0x111003D0
>
> +static const struct smu_temperature_range smu13_thermal_policy[] =
> +{
> +   {-273150,  99000, 99000, -273150, 99000, 99000, -273150, 99000, 
> 99000},
> +   { 12, 12, 12, 12, 12, 12, 12, 12, 
> 12},
> +};
> +
>  static const struct cmn2asic_msg_mapping 
> aldebaran_message_map[SMU_MSG_MAX_COUNT] = {
> MSG_MAP(TestMessage, PPSMC_MSG_TestMessage,   
>   0),
> MSG_MAP(GetSmuVersion,   PPSMC_MSG_GetSmuVersion, 
>   1),
> --
> 2.31.1
>


Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread Alistair Popple
On Thursday, 27 May 2021 5:28:32 AM AEST Peter Xu wrote:
> On Mon, May 24, 2021 at 11:27:22PM +1000, Alistair Popple wrote:
> > Some devices require exclusive write access to shared virtual
> > memory (SVM) ranges to perform atomic operations on that memory. This
> > requires CPU page tables to be updated to deny access whilst atomic
> > operations are occurring.
> > 
> > In order to do this introduce a new swap entry
> > type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
> > exclusive access by a device all page table mappings for the particular
> > range are replaced with device exclusive swap entries. This causes any
> > CPU access to the page to result in a fault.
> > 
> > Faults are resolved by replacing the faulting entry with the original
> > mapping. This results in MMU notifiers being called which a driver uses
> > to update access permissions such as revoking atomic access. After
> > notifiers have been called the device will no longer have exclusive
> > access to the region.
> > 
> > Walking of the page tables to find the target pages is handled by
> > get_user_pages() rather than a direct page table walk. A direct page
> > table walk similar to what migrate_vma_collect()/unmap() does could also
> > have been utilised. However this resulted in more code similar in
> > functionality to what get_user_pages() provides as page faulting is
> > required to make the PTEs present and to break COW.
> > 
> > Signed-off-by: Alistair Popple 
> > Reviewed-by: Christoph Hellwig 
> > 
> > ---
> > 
> > v9:
> > * Split rename of migrate_pgmap_owner into a separate patch.
> > * Added comments explaining SWP_DEVICE_EXCLUSIVE_* entries.
> > * Renamed try_to_protect{_one} to page_make_device_exclusive{_one} based
> > 
> >   somewhat on a suggestion from Peter Xu. I was never particularly happy
> >   with try_to_protect() as a name so think this is better.
> > 
> > * Removed unnecessary code and reworded some comments based on feedback
> > 
> >   from Peter Xu.
> > 
> > * Removed the VMA walk when restoring PTEs for device-exclusive entries.
> > * Simplified implementation of copy_pte_range() to fail if the page
> > 
> >   cannot be locked. This might lead to occasional fork() failures but at
> >   this stage we don't think that will be an issue.
> > 
> > v8:
> > * Remove device exclusive entries on fork rather than copy them.
> > 
> > v7:
> > * Added Christoph's Reviewed-by.
> > * Minor cosmetic cleanups suggested by Christoph.
> > * Replace mmu_notifier_range_init_migrate/exclusive with
> > 
> >   mmu_notifier_range_init_owner as suggested by Christoph.
> > 
> > * Replaced lock_page() with lock_page_retry() when handling faults.
> > * Restrict to anonymous pages for now.
> > 
> > v6:
> > * Fixed a bisectability issue due to incorrectly applying the rename of
> > 
> >   migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test.
> > 
> > v5:
> > * Renamed range->migrate_pgmap_owner to range->owner.
> > * Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which
> > 
> >   allows notifiers called as a result of make_device_exclusive_range() to
> >   be ignored.
> > 
> > * Added a check to try_to_protect_one() to detect if the pages originally
> > 
> >   returned from get_user_pages() have been unmapped or not.
> > 
> > * Removed check_device_exclusive_range() as it is no longer required with
> > 
> >   the other changes.
> > 
> > * Documentation update.
> > 
> > v4:
> > * Add function to check that mappings are still valid and exclusive.
> > * s/long/unsigned long/ in make_device_exclusive_entry().
> > ---
> > 
> >  Documentation/vm/hmm.rst |  17 
> >  include/linux/mmu_notifier.h |   6 ++
> >  include/linux/rmap.h |   4 +
> >  include/linux/swap.h |   7 +-
> >  include/linux/swapops.h  |  44 -
> >  mm/hmm.c |   5 +
> >  mm/memory.c  | 128 +++-
> >  mm/mprotect.c|   8 ++
> >  mm/page_vma_mapped.c |   9 +-
> >  mm/rmap.c| 186 +++
> >  10 files changed, 405 insertions(+), 9 deletions(-)
> > 
> > diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> > index 3df79307a797..a14c2938e7af 100644
> > --- a/Documentation/vm/hmm.rst
> > +++ b/Documentation/vm/hmm.rst
> > 
> > @@ -405,6 +405,23 @@ between device driver specific code and shared common 
code:
> > The lock can now be released.
> > 
> > +Exclusive access memory
> > +===
> > +
> > +Some devices have features such as atomic PTE bits that can be used to
> > +implement atomic access to system memory. To support atomic operations
> > +to a shared virtual memory page such a device needs access to that page
> > +which is exclusive of any userspace access from the CPU. The
> > +``make_device_exclusive_range()`` function can be used to make a memory
> > +range inaccessible from userspace.
> > +
> > +This replaces all mappings for pages in t

[pull] amdgpu, amdkfd drm-fixes-5.13

2021-05-26 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.13.

The following changes since commit a2b4785f01280a4291edb9fda69032fc2e4bfd3f:

  drm/amdgpu: stop touching sched.ready in the backend (2021-05-19 18:07:43 
-0400)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-5.13-2021-05-26

for you to fetch changes up to 20ebbfd22f8115a1e4f60d3d289f66be4d47f1ec:

  drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate (2021-05-20 
17:04:58 -0400)


amd-drm-fixes-5.13-2021-05-26:

amdgpu:
- MultiGPU fan fix
- VCN powergating fixes

amdkfd:
- Fix SDMA register offset error


Evan Quan (1):
  drm/amd/pm: correct MGpuFanBoost setting

James Zhu (7):
  drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate
  drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gate
  drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gate
  drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate
  drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gate
  drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate
  drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate

Kevin Wang (1):
  drm/amdkfd: correct sienna_cichlid SDMA RLC register offset error

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c| 12 ++--
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c  |  2 ++
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   |  6 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c   |  5 ++---
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c |  9 +
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 10 ++
 10 files changed, 42 insertions(+), 14 deletions(-)


Re: [Intel-gfx] [PATCH 1/1] drm/i915: Engine relative MMIO

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 07:43:02PM -0700, Matthew Brost wrote:
> On Wed, May 26, 2021 at 06:34:44PM -0700, Daniele Ceraolo Spurio wrote:
> > 
> > 
> > On 5/26/2021 12:11 PM, Matthew Brost wrote:
> > > With virtual engines, it is no longer possible to know which specific
> > > physical engine a given request will be executed on at the time that
> > > request is generated. This means that the request itself must be engine
> > > agnostic - any direct register writes must be relative to the engine
> > > and not absolute addresses.
> > > 
> > > The LRI command has support for engine relative addressing. However,
> > > the mechanism is not transparent to the driver. The scheme for Gen11
> > > (MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
> > > absolute engine base component in the ring and BBs. The hardware then
> > > adds on the correct engine offset at execution time. This differs
> > > slightly for LRC where the upper bits of the base component are just
> > > ignored.
> > > 
> > > Due to the non-trivial and differing schemes on different hardware, it
> > > is not possible to simply update the code that creates the LRI
> > > commands to set a remap flag and let the hardware get on with it.
> > > Instead, this patch adds function wrappers for generating the LRI
> > > command itself and then for constructing the correct address to use
> > > with the LRI.
> > > 
> > > Bspec: 45606
> > > Signed-off-by: John Harrison 
> > > Signed-off-by: Matthew Brost 
> > > CC: Rodrigo Vivi 
> > > CC: Tvrtko Ursulin 
> > > CC: Chris P Wilson 
> > > CC: Daniele Ceraolo Spurio 
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.c  |  7 ---
> > >   drivers/gpu/drm/i915/gt/intel_engine_cs.c| 22 
> > >   drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
> > >   drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  6 ++
> > >   drivers/gpu/drm/i915/gt/intel_lrc.c  |  4 +---
> > >   5 files changed, 36 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index 188dee13e017..a8a195bfcb57 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -1211,7 +1211,7 @@ static int emit_ppgtt_update(struct i915_request 
> > > *rq, void *data)
> > >   {
> > >   struct i915_address_space *vm = rq->context->vm;
> > >   struct intel_engine_cs *engine = rq->engine;
> > > - u32 base = engine->mmio_base;
> > > + u32 base = engine->lri_mmio_base;
> > >   u32 *cs;
> > >   int i;
> > > @@ -1223,7 +1223,7 @@ static int emit_ppgtt_update(struct i915_request 
> > > *rq, void *data)
> > >   if (IS_ERR(cs))
> > >   return PTR_ERR(cs);
> > > - *cs++ = MI_LOAD_REGISTER_IMM(2);
> > > + *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2);
> > 
> > This is the only place where you changed the behavior and I think it is
> > going away
> > (https://lists.freedesktop.org/archives/dri-devel/2021-May/305328.html), so
> > the new macro is potentially not needed.
> >
> 
> See my last comment, I think this is irrelevant as I think I missed some
> cases where this macro should be used.
>  

Actually this is wrong; the macro is indeed used in all the places it is needed.

Let me talk to Jason tomorrow about when he expects his series to land. I
suspect it is going to take a bit as IGTs have to be updated as well. If GuC
virtual engines land before his series we need this. Even if his series lands
first, adding this macro + hooks isn't a terrible idea.

Matt

> > >   *cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 
> > > 0));
> > >   *cs++ = upper_32_bits(pd_daddr);
> > > @@ -1245,7 +1245,8 @@ static int emit_ppgtt_update(struct i915_request 
> > > *rq, void *data)
> > >   if (IS_ERR(cs))
> > >   return PTR_ERR(cs);
> > > - *cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | 
> > > MI_LRI_FORCE_POSTED;
> > > + *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2 * GEN8_3LVL_PDPES) |
> > > + MI_LRI_FORCE_POSTED;
> > >   for (i = GEN8_3LVL_PDPES; i--; ) {
> > >   const dma_addr_t pd_daddr = 
> > > i915_page_dir_dma_addr(ppgtt, i);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > index 3f9a811eb02b..0de6bc533776 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > @@ -15,6 +15,7 @@
> > >   #include "intel_engine_pm.h"
> > >   #include "intel_engine_user.h"
> > >   #include "intel_execlists_submission.h"
> > > +#include "intel_gpu_commands.h"
> > >   #include "intel_gt.h"
> > >   #include "intel_gt_requests.h"
> > >   #include "intel_gt_pm.h"
> > > @@ -222,6 +223,25 @@ static u32 __engine_mmio_base(struct 
>

Re: [PATCH] drm/msm/dpu: remove unused variable cmd_enc

2021-05-26 Thread Austin Kim
On Fri, May 21, 2021 at 1:16 PM, Austin Kim wrote:
>
> After the call to to_dpu_encoder_phys_cmd() is made, 'cmd_enc' is not
> used; to_dpu_encoder_phys_cmd() is simply expanded by the compiler to
> container_of(x, struct dpu_encoder_phys_cmd, base).
>
> So remove the unused variable to fix the W=1 kernel build warning(s):
>
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c: In function
>  ‘dpu_encoder_phys_cmd_wait_for_commit_done’:
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c:688:31: warning:
>   variable ‘cmd_enc’ set but not used
>
> Signed-off-by: Austin Kim 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> index b2be39b9144e..088900841bf8 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> @@ -685,10 +685,6 @@ static int dpu_encoder_phys_cmd_wait_for_tx_complete(
>  static int dpu_encoder_phys_cmd_wait_for_commit_done(
> struct dpu_encoder_phys *phys_enc)
>  {
> -   struct dpu_encoder_phys_cmd *cmd_enc;
> -
> -   cmd_enc = to_dpu_encoder_phys_cmd(phys_enc);
> -
> /* only required for master controller */
> if (!dpu_encoder_phys_cmd_is_master(phys_enc))
> return 0;
> --
> 2.20.1
>

If you are available, would you please review this patch?

BR,
Austin Kim


Re: [Intel-gfx] [PATCH 1/1] drm/i915: Engine relative MMIO

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 06:34:44PM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 5/26/2021 12:11 PM, Matthew Brost wrote:
> > With virtual engines, it is no longer possible to know which specific
> > physical engine a given request will be executed on at the time that
> > request is generated. This means that the request itself must be engine
> > agnostic - any direct register writes must be relative to the engine
> > and not absolute addresses.
> > 
> > The LRI command has support for engine relative addressing. However,
> > the mechanism is not transparent to the driver. The scheme for Gen11
> > (MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
> > absolute engine base component in the ring and BBs. The hardware then
> > adds on the correct engine offset at execution time. This differs
> > slightly for LRC where the upper bits of the base component are just
> > ignored.
> > 
> > Due to the non-trivial and differing schemes on different hardware, it
> > is not possible to simply update the code that creates the LRI
> > commands to set a remap flag and let the hardware get on with it.
> > Instead, this patch adds function wrappers for generating the LRI
> > command itself and then for constructing the correct address to use
> > with the LRI.
> > 
> > Bspec: 45606
> > Signed-off-by: John Harrison 
> > Signed-off-by: Matthew Brost 
> > CC: Rodrigo Vivi 
> > CC: Tvrtko Ursulin 
> > CC: Chris P Wilson 
> > CC: Daniele Ceraolo Spurio 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c  |  7 ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c| 22 
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
> >   drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  6 ++
> >   drivers/gpu/drm/i915/gt/intel_lrc.c  |  4 +---
> >   5 files changed, 36 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 188dee13e017..a8a195bfcb57 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1211,7 +1211,7 @@ static int emit_ppgtt_update(struct i915_request *rq, 
> > void *data)
> >   {
> > struct i915_address_space *vm = rq->context->vm;
> > struct intel_engine_cs *engine = rq->engine;
> > -   u32 base = engine->mmio_base;
> > +   u32 base = engine->lri_mmio_base;
> > u32 *cs;
> > int i;
> > @@ -1223,7 +1223,7 @@ static int emit_ppgtt_update(struct i915_request *rq, 
> > void *data)
> > if (IS_ERR(cs))
> > return PTR_ERR(cs);
> > -   *cs++ = MI_LOAD_REGISTER_IMM(2);
> > +   *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2);
> 
> This is the only place where you changed the behavior and I think it is
> going away
> (https://lists.freedesktop.org/archives/dri-devel/2021-May/305328.html), so
> the new macro is potentially not needed.
>

See my last comment, I think this is irrelevant as I think I missed some
cases where this macro should be used.
 
> > *cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 0));
> > *cs++ = upper_32_bits(pd_daddr);
> > @@ -1245,7 +1245,8 @@ static int emit_ppgtt_update(struct i915_request *rq, 
> > void *data)
> > if (IS_ERR(cs))
> > return PTR_ERR(cs);
> > -   *cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | 
> > MI_LRI_FORCE_POSTED;
> > +   *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2 * GEN8_3LVL_PDPES) |
> > +   MI_LRI_FORCE_POSTED;
> > for (i = GEN8_3LVL_PDPES; i--; ) {
> > const dma_addr_t pd_daddr = 
> > i915_page_dir_dma_addr(ppgtt, i);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 3f9a811eb02b..0de6bc533776 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -15,6 +15,7 @@
> >   #include "intel_engine_pm.h"
> >   #include "intel_engine_user.h"
> >   #include "intel_execlists_submission.h"
> > +#include "intel_gpu_commands.h"
> >   #include "intel_gt.h"
> >   #include "intel_gt_requests.h"
> >   #include "intel_gt_pm.h"
> > @@ -222,6 +223,25 @@ static u32 __engine_mmio_base(struct drm_i915_private 
> > *i915,
> > return bases[i].base;
> >   }
> > +static bool i915_engine_has_relative_lri(const struct intel_engine_cs 
> > *engine)
> > +{
> > +   if (INTEL_GEN(engine->i915) < 11)
> > +   return false;
> > +
> > +   return true;
> 
> We already have intel_engine_has_relative_mmio(), can just re-use that. Note
> that I915_ENGINE_HAS_RELATIVE_MMIO is only set for gen12+ at the moment;
> this was because CI failed on ICL and since we urgently needed the change
> for gen12 we just excluded gen11 and pushed (see Mika's comment @
> https://lists.freedesktop.org/archives/intel-gfx/2019-September/211812.html).
> It should be ok to extend that to gen11 if w

Re: [PATCH v2] drm/ast: Add detect function support

2021-05-26 Thread Ainux Wang
You're welcome!

Best regards
Ainux Wang


Thomas Zimmermann wrote on Thursday, May 27, 2021 at 3:15 AM:
>
> Hi
>
> On 26.05.21 at 13:15, ainux.w...@gmail.com wrote:
> > From: Ainux 
> >
> > The existence of the connector cannot be detected,
> > so add a detect function to support this.
> >
> > Signed-off-by: Ainux 
>
> Looks good. If no one else comments, I'll merge the patch soon. Thanks a
> lot.
>
> Best regards
> Thomas
>
> > ---
> >   drivers/gpu/drm/ast/ast_mode.c | 18 +-
> >   1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> > index 36d9575aa27b..e5996ae03c49 100644
> > --- a/drivers/gpu/drm/ast/ast_mode.c
> > +++ b/drivers/gpu/drm/ast/ast_mode.c
> > @@ -1293,6 +1293,18 @@ static enum drm_mode_status ast_mode_valid(struct 
> > drm_connector *connector,
> >   return flags;
> >   }
> >
> > +static enum drm_connector_status ast_connector_detect(struct drm_connector
> > +*connector, bool force)
> > +{
> > + int r;
> > +
> > + r = ast_get_modes(connector);
> > + if (r < 0)
> > + return connector_status_disconnected;
> > +
> > + return connector_status_connected;
> > +}
> > +
> >   static void ast_connector_destroy(struct drm_connector *connector)
> >   {
> >   struct ast_connector *ast_connector = to_ast_connector(connector);
> > @@ -1307,6 +1319,7 @@ static const struct drm_connector_helper_funcs 
> > ast_connector_helper_funcs = {
> >
> >   static const struct drm_connector_funcs ast_connector_funcs = {
> >   .reset = drm_atomic_helper_connector_reset,
> > + .detect = ast_connector_detect,
> >   .fill_modes = drm_helper_probe_single_connector_modes,
> >   .destroy = ast_connector_destroy,
> >   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
> > @@ -1334,7 +1347,8 @@ static int ast_connector_init(struct drm_device *dev)
> >   connector->interlace_allowed = 0;
> >   connector->doublescan_allowed = 0;
> >
> > - connector->polled = DRM_CONNECTOR_POLL_CONNECT;
> > + connector->polled = DRM_CONNECTOR_POLL_CONNECT |
> > + DRM_CONNECTOR_POLL_DISCONNECT;
> >
> >   drm_connector_attach_encoder(connector, encoder);
> >
> > @@ -1403,6 +1417,8 @@ int ast_mode_config_init(struct ast_private *ast)
> >
> >   drm_mode_config_reset(dev);
> >
> > + drm_kms_helper_poll_init(dev);
> > +
> >   return 0;
> >   }
> >
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>


Re: [PATCH 0/4] drm: Finally retire struct drm_format_name_buf

2021-05-26 Thread Alex Deucher
Acked-by: Alex Deucher 
for the amdgpu changes.

On Wed, May 26, 2021 at 3:21 PM Thomas Zimmermann  wrote:
>
> ping for further a-bs / r-bs
>
> On 16.05.21 at 14:13, Thomas Zimmermann wrote:
> > This is a cleanup patchset to remove drm_format_name_buf et al. There
> > are two instances in drivers that need to be replaced with the %p4cc
> > printk format modifier. Patch 3 was left over from an earlier
> > patchset. [1] Patch 4 removes struct drm_format_name_buf.
> >
> > I build-tested with drm-tip. The patchset can be merged through drm-misc.
> >
> > [1] 
> > https://lore.kernel.org/dri-devel/20210216155723.17109-1-sakari.ai...@linux.intel.com/
> >
> > Sakari Ailus (1):
> >drm: Remove drm_get_format_name()
> >
> > Thomas Zimmermann (3):
> >drm/amdgpu: Use %p4cc to print 4CC format
> >drm/simpledrm: Use %p4cc to print 4CC format
> >drm/fourcc: Remove struct drm_format_buf_name
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  7 ++
> >   drivers/gpu/drm/drm_fourcc.c| 25 -
> >   drivers/gpu/drm/tiny/simpledrm.c|  6 ++---
> >   include/drm/drm_fourcc.h|  9 
> >   4 files changed, 4 insertions(+), 43 deletions(-)
> >
> > --
> > 2.31.1
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>


Re: [Intel-gfx] [PATCH v4 14/17] drm/i915/pxp: User interface for Protected buffer

2021-05-26 Thread Daniele Ceraolo Spurio




On 5/25/2021 11:36 AM, Tang, CQ wrote:



-Original Message-
From: Intel-gfx  On Behalf Of
Daniele Ceraolo Spurio
Sent: Monday, May 24, 2021 10:48 PM
To: intel-...@lists.freedesktop.org
Cc: Vetter, Daniel ; Huang Sean Z
; dri-devel@lists.freedesktop.org; Chris Wilson
; Kondapally Kalyan
; Bommu, Krishnaiah

Subject: [Intel-gfx] [PATCH v4 14/17] drm/i915/pxp: User interface for
Protected buffer

From: Bommu Krishnaiah 

This API allows user mode to create protected buffers. Only contexts marked
as protected are allowed to operate on protected buffers.

We only allow setting the flags at creation time.

All protected objects that have backing storage will be considered invalid
when the session is destroyed and they won't be usable anymore.

Then these protected objects will be hanging in the system till the user calls
gem_close() to free them?
If the objects won't be usable anymore, why don't we automatically free these 
objects when the session is destroyed?


Auto-freeing an object would require some extra reworks (i.e. plumbing 
PXP status checks in a lot of places), so to keep things simple, the 
protected objects have the same lifetime as normal ones. A user can keep 
non protected objects hanging around as long as they want, it's not like 
the protected ones are worse in that sense.



How is a session started/destroyed?  From the code, intel_pxp_init() is called
when loading the i915 driver, so I think the session lifetime is the same as
the i915 driver lifetime.
Can we start multiple sessions after loading the driver?


A session is started with a call into the PXP mei device and can be 
manually destroyed with a specific instruction submitted via a video 
engine, but the HW also invalidates the keys when certain events occurs 
(e.g. suspend/resume). The HW supports multiple sessions, but we 
currently only use one in i915; it is automatically started when a 
protected context is submitted and then kept running until an 
invalidation event occurs. See patch 12 for details.


Daniele



--CQ


Given that the PXP HW supports multiple modes (but we currently only care
about one), a flag variable has been reserved in the structure used in the
create_ext ioctl for possible future updates.

This is a rework of the original code by Bommu Krishnaiah. I've kept
authorship unchanged since significant chunks have not been modified.

v2: split context changes, fix defines and improve documentation (Chris),
 add object invalidation logic
v3: fix spinlock definition and usage, only validate objects when
 they're first added to a context lut, only remove them once (Chris),
 make protected context flag not mandatory in protected object execbuf
 to avoid abuse (Lionel)
v4: rebase to new gem_create_ext

Signed-off-by: Bommu Krishnaiah 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Telukuntla Sreedhar 
Cc: Kondapally Kalyan 
Cc: Gupta Anshuman 
Cc: Huang Sean Z 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/i915/gem/i915_gem_create.c| 26 
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 15 +++
  drivers/gpu/drm/i915/gem/i915_gem_object.c|  6 +++
  drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 ++
  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 13 ++
  drivers/gpu/drm/i915/pxp/intel_pxp.c  | 41 +++
  drivers/gpu/drm/i915/pxp/intel_pxp.h  | 13 ++
  drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  5 +++
  include/uapi/drm/i915_drm.h   | 33 ++-
  9 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 548ddf39d853..c14be3882c35 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -6,6 +6,7 @@
  #include "gem/i915_gem_ioctls.h"
  #include "gem/i915_gem_lmem.h"
  #include "gem/i915_gem_region.h"
+#include "pxp/intel_pxp.h"

  #include "i915_drv.h"
  #include "i915_trace.h"
@@ -99,7 +100,11 @@ i915_gem_setup(struct drm_i915_gem_object *obj,
u64 size)

GEM_BUG_ON(size != obj->base.size);

+   if (obj->user_flags & I915_GEM_OBJECT_PROTECTED)
+   intel_pxp_object_add(obj);
+
trace_i915_gem_object_create(obj);
+
return 0;
  }

@@ -344,8 +349,29 @@ static int ext_set_placements(struct
i915_user_extension __user *base,
return set_placements(&ext, data);
  }

+static int ext_set_protected(struct i915_user_extension __user *base,
+void *data) {
+   struct drm_i915_gem_create_ext_protected_content ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   if (ext.flags)
+   return -EINVAL;
+
+   if (!intel_pxp_is_enabled(&ext_data->i915->gt.pxp))
+   return -ENODEV;
+
+   ext_data->vanilla_object->user_flags |=
I915_GEM_OBJECT_PROTECTED;
+
+  

Re: [PATCH v4 14/17] drm/i915/pxp: User interface for Protected buffer

2021-05-26 Thread Daniele Ceraolo Spurio




On 5/25/2021 6:32 AM, Daniel Vetter wrote:

On Mon, May 24, 2021 at 10:48:00PM -0700, Daniele Ceraolo Spurio wrote:

From: Bommu Krishnaiah 

This API allows user mode to create protected buffers. Only contexts
marked as protected are allowed to operate on protected buffers.

We only allow setting the flags at creation time.

All protected objects that have backing storage will be considered
invalid when the session is destroyed and they won't be usable anymore.

Given that the PXP HW supports multiple modes (but we currently only
care about one), a flag variable has been reserved in the structure
used in the create_ext ioctl for possible future updates.

This is a rework of the original code by Bommu Krishnaiah. I've kept
authorship unchanged since significant chunks have not been modified.

v2: split context changes, fix defines and improve documentation (Chris),
 add object invalidation logic
v3: fix spinlock definition and usage, only validate objects when
 they're first added to a context lut, only remove them once (Chris),
 make protected context flag not mandatory in protected object execbuf
 to avoid abuse (Lionel)
v4: rebase to new gem_create_ext

Signed-off-by: Bommu Krishnaiah 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Telukuntla Sreedhar 
Cc: Kondapally Kalyan 
Cc: Gupta Anshuman 
Cc: Huang Sean Z 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/i915/gem/i915_gem_create.c| 26 
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 15 +++
  drivers/gpu/drm/i915/gem/i915_gem_object.c|  6 +++
  drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 ++
  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 13 ++
  drivers/gpu/drm/i915/pxp/intel_pxp.c  | 41 +++
  drivers/gpu/drm/i915/pxp/intel_pxp.h  | 13 ++
  drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  5 +++
  include/uapi/drm/i915_drm.h   | 33 ++-
  9 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 548ddf39d853..c14be3882c35 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -6,6 +6,7 @@
  #include "gem/i915_gem_ioctls.h"
  #include "gem/i915_gem_lmem.h"
  #include "gem/i915_gem_region.h"
+#include "pxp/intel_pxp.h"
  
  #include "i915_drv.h"

  #include "i915_trace.h"
@@ -99,7 +100,11 @@ i915_gem_setup(struct drm_i915_gem_object *obj, u64 size)
  
  	GEM_BUG_ON(size != obj->base.size);
  
+	if (obj->user_flags & I915_GEM_OBJECT_PROTECTED)

+   intel_pxp_object_add(obj);
+
trace_i915_gem_object_create(obj);
+
return 0;
  }
  
@@ -344,8 +349,29 @@ static int ext_set_placements(struct i915_user_extension __user *base,

return set_placements(&ext, data);
  }
  
+static int ext_set_protected(struct i915_user_extension __user *base, void *data)

+{
+   struct drm_i915_gem_create_ext_protected_content ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   if (ext.flags)
+   return -EINVAL;
+
+   if (!intel_pxp_is_enabled(&ext_data->i915->gt.pxp))
+   return -ENODEV;
+
+   ext_data->vanilla_object->user_flags |= I915_GEM_OBJECT_PROTECTED;
+
+   return 0;
+}
+
+
  static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
+   [I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
  };
  
  /**

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c08e28847064..5dd813d04a9f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -839,6 +839,21 @@ static struct i915_vma *eb_lookup_vma(struct 
i915_execbuffer *eb, u32 handle)
if (unlikely(!obj))
return ERR_PTR(-ENOENT);
  
+		/*

+* If the user has opted-in for protected-object tracking, make
+* sure the object encryption can be used.
+* We only need to do this when the object is first used with
+* this context, because the context itself will be banned when
+* the protected objects become invalid.
+*/
+   if (i915_gem_context_uses_protected_content(eb->gem_context) &&
+   i915_gem_object_is_protected(obj)) {
+   if (!intel_pxp_is_active(&vm->gt->pxp))
+   return ERR_PTR(-ENODEV);
+   if (!i915_gem_object_has_valid_protection(obj))
+   return ERR_PTR(-ENOEXEC);
+   }
+
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma)) {
 

Re: [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Peter Xu
On Thu, May 27, 2021 at 11:20:36AM +1000, Alistair Popple wrote:
> On Thursday, 27 May 2021 5:50:05 AM AEST Peter Xu wrote:
> > On Mon, May 24, 2021 at 11:27:21PM +1000, Alistair Popple wrote:
> > > Currently if copy_nonpresent_pte() returns a non-zero value it is
> > > assumed to be a swap entry which requires further processing outside the
> > > loop in copy_pte_range() after dropping locks. This prevents other
> > > values being returned to signal conditions such as failure which a
> > > subsequent change requires.
> > > 
> > > Instead make copy_nonpresent_pte() return an error code if further
> > > processing is required and read the value for the swap entry in the main
> > > loop under the ptl.
> > > 
> > > Signed-off-by: Alistair Popple 
> > > 
> > > ---
> > > 
> > > v9:
> > > 
> > > New for v9 to allow device exclusive handling to occur in
> > > copy_nonpresent_pte().
> > > ---
> > > 
> > >  mm/memory.c | 12 +++-
> > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 2fb455c365c2..e061cfa18c11 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -718,7 +718,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct
> > > mm_struct *src_mm,> 
> > >   if (likely(!non_swap_entry(entry))) {
> > >   
> > >   if (swap_duplicate(entry) < 0)
> > > 
> > > - return entry.val;
> > > + return -EAGAIN;
> > > 
> > >   /* make sure dst_mm is on swapoff's mmlist. */
> > >   if (unlikely(list_empty(&dst_mm->mmlist))) {
> > > 
> > > @@ -974,11 +974,13 @@ copy_pte_range(struct vm_area_struct *dst_vma,
> > > struct vm_area_struct *src_vma,> 
> > >   continue;
> > >   
> > >   }
> > >   if (unlikely(!pte_present(*src_pte))) {
> > > 
> > > - entry.val = copy_nonpresent_pte(dst_mm, src_mm,
> > > - dst_pte, src_pte,
> > > - src_vma, addr, rss);
> > > - if (entry.val)
> > > + ret = copy_nonpresent_pte(dst_mm, src_mm,
> > > + dst_pte, src_pte,
> > > + src_vma, addr, rss);
> > > + if (ret == -EAGAIN) {
> > > + entry = pte_to_swp_entry(*src_pte);
> > > 
> > >   break;
> > > 
> > > + }
> > > 
> > >   progress += 8;
> > >   continue;
> > >   
> > >   }
> > 
> > Note that -EAGAIN was previously used by copy_present_page() for early cow
> > use.  Here later although we check entry.val first:
> > 
> > if (entry.val) {
> > if (add_swap_count_continuation(entry, GFP_KERNEL) < 0) {
> > ret = -ENOMEM;
> > goto out;
> > }
> > entry.val = 0;
> > } else if (ret) {
> > WARN_ON_ONCE(ret != -EAGAIN);
> > prealloc = page_copy_prealloc(src_mm, src_vma, addr);
> > if (!prealloc)
> > return -ENOMEM;
> > /* We've captured and resolved the error. Reset, try again.
> > */ ret = 0;
> > }
> > 
> > We didn't reset "ret" in the entry.val case (maybe we should?). Then in the next
> > round of "goto again" if "ret" is unluckily untouched, it could reach the
> > 2nd if check, and I think it could cause an unexpected
> > page_copy_prealloc().
> 
> Thanks, I had considered that but saw "ret" was always set either by
> copy_nonpresent_pte() or copy_present_pte(). However, I missed the "unlucky"
> case at the start of the loop:
> 
>   if (progress >= 32) {
>   progress = 0;
>   if (need_resched() ||
>   spin_needbreak(src_ptl) || spin_needbreak(dst_ptl))
>   break;
> 
> Looking at this again though checking different variables to figure out what 
> to do outside the locks and reusing error codes seems error prone. I reused
> -EAGAIN for copy_nonpresent_pte() simply because that seemed the most sensible
> error code, but I don't think that aids readability and it might be better to 
> use a unique error code for each case needing extra handling.
> 
> So it might be better if I update this patch to:
> 1) Use unique error codes for each case requiring special handling outside
> the lock.
> 2) Only check "ret" to determine what to do outside locks (ie. not entry.val)
> 3) Document these.
> 4) Always reset ret after handling.
> 
> Thoughts?

Looks good to me.  Thanks,

-- 
Peter Xu



Re: [Intel-gfx] [PATCH 1/1] drm/i915: Engine relative MMIO

2021-05-26 Thread Daniele Ceraolo Spurio




On 5/26/2021 12:11 PM, Matthew Brost wrote:

With virtual engines, it is no longer possible to know which specific
physical engine a given request will be executed on at the time that
request is generated. This means that the request itself must be engine
agnostic - any direct register writes must be relative to the engine
and not absolute addresses.

The LRI command has support for engine relative addressing. However,
the mechanism is not transparent to the driver. The scheme for Gen11
(MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
absolute engine base component in the ring and BBs. The hardware then
adds on the correct engine offset at execution time. This differs
slightly for LRC where the upper bits of the base component are just
ignored.

Due to the non-trivial and differing schemes on different hardware, it
is not possible to simply update the code that creates the LRI
commands to set a remap flag and let the hardware get on with it.
Instead, this patch adds function wrappers for generating the LRI
command itself and then for constructing the correct address to use
with the LRI.

Bspec: 45606
Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
CC: Rodrigo Vivi 
CC: Tvrtko Ursulin 
CC: Chris P Wilson 
CC: Daniele Ceraolo Spurio 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c  |  7 ---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c| 22 
  drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  6 ++
  drivers/gpu/drm/i915/gt/intel_lrc.c  |  4 +---
  5 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 188dee13e017..a8a195bfcb57 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1211,7 +1211,7 @@ static int emit_ppgtt_update(struct i915_request *rq, 
void *data)
  {
struct i915_address_space *vm = rq->context->vm;
struct intel_engine_cs *engine = rq->engine;
-   u32 base = engine->mmio_base;
+   u32 base = engine->lri_mmio_base;
u32 *cs;
int i;
  
@@ -1223,7 +1223,7 @@ static int emit_ppgtt_update(struct i915_request *rq, void *data)

if (IS_ERR(cs))
return PTR_ERR(cs);
  
-		*cs++ = MI_LOAD_REGISTER_IMM(2);

+   *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2);


This is the only place where you changed the behavior and I think it is 
going away 
(https://lists.freedesktop.org/archives/dri-devel/2021-May/305328.html), 
so the new macro is potentially not needed.


  
  		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 0));

*cs++ = upper_32_bits(pd_daddr);
@@ -1245,7 +1245,8 @@ static int emit_ppgtt_update(struct i915_request *rq, 
void *data)
if (IS_ERR(cs))
return PTR_ERR(cs);
  
-		*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;

+   *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2 * GEN8_3LVL_PDPES) |
+   MI_LRI_FORCE_POSTED;
for (i = GEN8_3LVL_PDPES; i--; ) {
const dma_addr_t pd_daddr = 
i915_page_dir_dma_addr(ppgtt, i);
  
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c

index 3f9a811eb02b..0de6bc533776 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -15,6 +15,7 @@
  #include "intel_engine_pm.h"
  #include "intel_engine_user.h"
  #include "intel_execlists_submission.h"
+#include "intel_gpu_commands.h"
  #include "intel_gt.h"
  #include "intel_gt_requests.h"
  #include "intel_gt_pm.h"
@@ -222,6 +223,25 @@ static u32 __engine_mmio_base(struct drm_i915_private 
*i915,
return bases[i].base;
  }
  
+static bool i915_engine_has_relative_lri(const struct intel_engine_cs *engine)

+{
+   if (INTEL_GEN(engine->i915) < 11)
+   return false;
+
+   return true;


We already have intel_engine_has_relative_mmio(), can just re-use that. 
Note that I915_ENGINE_HAS_RELATIVE_MMIO is only set for gen12+ at the 
moment; this was because CI failed on ICL and since we urgently needed 
the change for gen12 we just excluded gen11 and pushed (see Mika's 
comment @ 
https://lists.freedesktop.org/archives/intel-gfx/2019-September/211812.html). 
It should be ok to extend that to gen11 if we get a green CI.



+}
+
+static void lri_init(struct intel_engine_cs *engine)
+{
+   if (i915_engine_has_relative_lri(engine)) {
+   engine->lri_cmd_mode = MI_LRI_LRM_CS_MMIO;
+   engine->lri_mmio_base = 0;
+   } else {
+   engine->lri_cmd_mode = 0;
+   engine->lri_mmio_base = engine->mmio_base;
+   }
+}
+
  static void __sprint_engine_name(struct intel_engine_cs *engine)
  {
/*
@@ -329,6 +349,8 @@ static int intel_engine_se

[PATCH 2/2] drm/amdgpu: stop bookkeeping of temporary GTT allocation

2021-05-26 Thread Lang Yu
To improve buffer migration performance, stop bookkeeping of
temporary GTT allocation, including allocation for BO evicted
from VRAM and bounce buffer.

Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  4 +++-
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 8860545344c7..32fedd495c7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -111,14 +111,15 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
struct amdgpu_gtt_node *node;
int r;
 
-   spin_lock(&mgr->lock);
-   if ((&tbo->mem == mem || tbo->mem.mem_type != TTM_PL_TT) &&
-   atomic64_read(&mgr->available) < mem->num_pages) {
+   if (!(mem->placement & TTM_PL_FLAG_TEMPORARY)) {
+   spin_lock(&mgr->lock);
+   if (atomic64_read(&mgr->available) < mem->num_pages) {
+   spin_unlock(&mgr->lock);
+   return -ENOSPC;
+   }
+   atomic64_sub(mem->num_pages, &mgr->available);
spin_unlock(&mgr->lock);
-   return -ENOSPC;
}
-   atomic64_sub(mem->num_pages, &mgr->available);
-   spin_unlock(&mgr->lock);
 
if (!place->lpfn) {
mem->mm_node = NULL;
@@ -178,6 +179,9 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager 
*man,
kfree(node);
}
 
+   if (mem->placement & TTM_PL_FLAG_TEMPORARY)
+   return;
+
atomic64_add(mem->num_pages, &mgr->available);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c0aef327292a..129d39392859 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -152,9 +152,11 @@ static void amdgpu_evict_flags(struct ttm_buffer_object 
*bo,
abo->placements[0].lpfn = 0;
abo->placement.busy_placement = &abo->placements[1];
abo->placement.num_busy_placement = 1;
+   abo->placements[1].flags |= TTM_PL_FLAG_TEMPORARY;
} else {
/* Move to GTT memory */
amdgpu_bo_placement_from_domain(abo, 
AMDGPU_GEM_DOMAIN_GTT);
+   abo->placements[0].flags |= TTM_PL_FLAG_TEMPORARY;
}
break;
case TTM_PL_TT:
@@ -538,7 +540,7 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
hop->fpfn = 0;
hop->lpfn = 0;
hop->mem_type = TTM_PL_TT;
-   hop->flags = 0;
+   hop->flags |= TTM_PL_FLAG_TEMPORARY;
return -EMULTIHOP;
}
 
-- 
2.25.1
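The accounting change in amdgpu_gtt_mgr_new()/amdgpu_gtt_mgr_del() above can be modeled in plain userspace C. This is an illustrative model of the logic only, not the kernel code: the spinlock and atomic64 counter are collapsed into a plain field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the patch: allocations flagged TTM_PL_FLAG_TEMPORARY bypass the
 * "available" page counter entirely, in both the alloc and free paths. */
#define TTM_PL_FLAG_TEMPORARY (1u << 2)

struct gtt_mgr {
	int64_t available; /* pages left; atomic64_t + spinlock in the kernel */
};

/* Returns false on the -ENOSPC case. */
static bool gtt_mgr_new(struct gtt_mgr *mgr, uint32_t placement,
			int64_t num_pages)
{
	if (placement & TTM_PL_FLAG_TEMPORARY)
		return true;		/* temporary: no bookkeeping */
	if (mgr->available < num_pages)
		return false;
	mgr->available -= num_pages;
	return true;
}

static void gtt_mgr_del(struct gtt_mgr *mgr, uint32_t placement,
			int64_t num_pages)
{
	if (placement & TTM_PL_FLAG_TEMPORARY)
		return;			/* was never accounted for */
	mgr->available += num_pages;
}
```

The point of the design is that evictions and bounce buffers can always make forward progress: they never fail the space check, and symmetrically they never inflate "available" on release.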



[PATCH 1/2] drm/ttm: cleanup and add TTM_PL_FLAG_TEMPORARY

2021-05-26 Thread Lang Yu
Make TTM_PL_FLAG_* start from zero and add
TTM_PL_FLAG_TEMPORARY flag for temporary
GTT allocation use.

Signed-off-by: Lang Yu 
---
 include/drm/ttm/ttm_placement.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/drm/ttm/ttm_placement.h b/include/drm/ttm/ttm_placement.h
index aa6ba4d0cf78..9f5cfc7c2d5a 100644
--- a/include/drm/ttm/ttm_placement.h
+++ b/include/drm/ttm/ttm_placement.h
@@ -47,8 +47,9 @@
  * top of the memory area, instead of the bottom.
  */
 
-#define TTM_PL_FLAG_CONTIGUOUS  (1 << 19)
-#define TTM_PL_FLAG_TOPDOWN (1 << 22)
+#define TTM_PL_FLAG_CONTIGUOUS  (1 << 0)
+#define TTM_PL_FLAG_TOPDOWN (1 << 1)
+#define TTM_PL_FLAG_TEMPORARY   (1 << 2)
 
 /**
  * struct ttm_place
-- 
2.25.1



Re: [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Alistair Popple
On Thursday, 27 May 2021 5:50:05 AM AEST Peter Xu wrote:
> On Mon, May 24, 2021 at 11:27:21PM +1000, Alistair Popple wrote:
> > Currently if copy_nonpresent_pte() returns a non-zero value it is
> > assumed to be a swap entry which requires further processing outside the
> > loop in copy_pte_range() after dropping locks. This prevents other
> > values being returned to signal conditions such as failure which a
> > subsequent change requires.
> > 
> > Instead make copy_nonpresent_pte() return an error code if further
> > processing is required and read the value for the swap entry in the main
> > loop under the ptl.
> > 
> > Signed-off-by: Alistair Popple 
> > 
> > ---
> > 
> > v9:
> > 
> > New for v9 to allow device exclusive handling to occur in
> > copy_nonpresent_pte().
> > ---
> > 
> >  mm/memory.c | 12 +++-
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 2fb455c365c2..e061cfa18c11 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -718,7 +718,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct
> > mm_struct *src_mm,> 
> >   if (likely(!non_swap_entry(entry))) {
> >   
> >   if (swap_duplicate(entry) < 0)
> > 
> > - return entry.val;
> > + return -EAGAIN;
> > 
> >   /* make sure dst_mm is on swapoff's mmlist. */
> >   if (unlikely(list_empty(&dst_mm->mmlist))) {
> > 
> > @@ -974,11 +974,13 @@ copy_pte_range(struct vm_area_struct *dst_vma,
> > struct vm_area_struct *src_vma,> 
> >   continue;
> >   
> >   }
> >   if (unlikely(!pte_present(*src_pte))) {
> > 
> > - entry.val = copy_nonpresent_pte(dst_mm, src_mm,
> > - dst_pte, src_pte,
> > - src_vma, addr, rss);
> > - if (entry.val)
> > + ret = copy_nonpresent_pte(dst_mm, src_mm,
> > + dst_pte, src_pte,
> > + src_vma, addr, rss);
> > + if (ret == -EAGAIN) {
> > + entry = pte_to_swp_entry(*src_pte);
> > 
> >   break;
> > 
> > + }
> > 
> >   progress += 8;
> >   continue;
> >   
> >   }
> 
> Note that -EAGAIN was previously used by copy_present_page() for early cow
> use.  Here later although we check entry.val first:
> 
> if (entry.val) {
> if (add_swap_count_continuation(entry, GFP_KERNEL) < 0) {
> ret = -ENOMEM;
> goto out;
> }
> entry.val = 0;
> } else if (ret) {
> WARN_ON_ONCE(ret != -EAGAIN);
> prealloc = page_copy_prealloc(src_mm, src_vma, addr);
> if (!prealloc)
> return -ENOMEM;
> /* We've captured and resolved the error. Reset, try again.
> */ ret = 0;
> }
> 
> We didn't reset "ret" in entry.val case (maybe we should?). Then in the next
> round of "goto again" if "ret" is unluckily untouched, it could reach the
> 2nd if check, and I think it could cause an unexpected
> page_copy_prealloc().

Thanks, I had considered that but saw "ret" was always set either by 
copy_nonpresent_pte() or copy_present_pte(). However missed the "unlucky" case 
at the start of the loop:

if (progress >= 32) {
progress = 0;
if (need_resched() ||
spin_needbreak(src_ptl) || spin_needbreak(dst_ptl))
break;

Looking at this again though checking different variables to figure out what 
to do outside the locks and reusing error codes seems error prone. I reused
-EAGAIN for copy_nonpresent_pte() simply because that seemed the most sensible
error code, but I don't think that aids readability and it might be better to 
use a unique error code for each case needing extra handling.

So it might be better if I update this patch to:
1) Use unique error codes for each case requiring special handling outside the 
lock.
2) Only check "ret" to determine what to do outside locks (ie. not entry.val)
3) Document these.
4) Always reset ret after handling.

Thoughts?
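A minimal userspace sketch of what points 1), 2) and 4) could look like. The COPY_MM_* codes, the counters, and the handler are invented names for illustration, not proposed kernel identifiers:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-case codes, distinct from real errnos so that a genuine
 * error can never be mistaken for "needs handling outside the lock". */
#define COPY_MM_SWAP_CONT 1	/* add_swap_count_continuation() needed */
#define COPY_MM_PREALLOC  2	/* page_copy_prealloc() needed */

static int swap_conts, preallocs;

/* Dispatch purely on ret (point 2), then reset it (point 4), so a stale
 * value can never trigger the wrong handler on the next "goto again". */
static int handle_outside_lock(int ret)
{
	switch (ret) {
	case COPY_MM_SWAP_CONT:
		swap_conts++;	/* stand-in for the swap-continuation work */
		return 0;
	case COPY_MM_PREALLOC:
		preallocs++;	/* stand-in for page_copy_prealloc() */
		return 0;
	default:
		return ret;	/* 0 or a real error such as -ENOMEM */
	}
}
```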

 - Alistair

> --
> Peter Xu






Re: [PATCH][next] nouveau/svm: Fix missing failure check on call to make_device_exclusive_range

2021-05-26 Thread Alistair Popple
On Thursday, 27 May 2021 12:04:59 AM AEST Colin King wrote:
> The call to make_device_exclusive_range can potentially fail leaving
> pointer page not initialized that leads to an uninitialized pointer
> read issue. Fix this by adding a check to see if the call failed and
> returning the error code.
> 
> Addresses-Coverity: ("Uninitialized pointer read")
> Fixes: c620bba9828c ("nouveau/svm: implement atomic SVM access")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/nouveau/nouveau_svm.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c
> b/drivers/gpu/drm/nouveau/nouveau_svm.c index 84726a89e665..b913b4907088
> 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -609,8 +609,10 @@ static int nouveau_atomic_range_fault(struct
> nouveau_svmm *svmm,
> 
> notifier_seq = mmu_interval_read_begin(&notifier->notifier);
> mmap_read_lock(mm);
> -   make_device_exclusive_range(mm, start, start + PAGE_SIZE,
> -   &page, drm->dev);
> +   ret = make_device_exclusive_range(mm, start, start + PAGE_SIZE,
> +   &page, drm->dev);
> +   if (ret < 0)
> +   goto out;

Thanks for spotting, this is fixing get_user_pages() inside 
make_device_exclusive_range() returning an error. However the check needs to 
happen after dropping mmap_lock below:

> mmap_read_unlock(mm);
> if (!page) {
> ret = -EINVAL;
> --
> 2.31.1
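The ordering being asked for can be illustrated with a small userspace model. The names below are stand-ins for the nouveau code, not the driver itself:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int lock_held;
static void mmap_read_lock(void)   { lock_held = 1; }
static void mmap_read_unlock(void) { lock_held = 0; }

/* Model of the fault path: the mmap lock is dropped first, and only then is
 * the result of make_device_exclusive_range() (here: make_ret) acted on, so
 * neither error exit ever runs with the lock still held. */
static int range_fault(int make_ret, void *page)
{
	int ret;

	mmap_read_lock();
	ret = make_ret;		/* result of make_device_exclusive_range() */
	mmap_read_unlock();	/* drop the lock before any error handling */

	if (ret < 0)
		return ret;	/* get_user_pages() failure propagated */
	if (!page)
		return -EINVAL;	/* no page was made exclusive */
	return 0;
}
```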






Re: [Intel-gfx] [PATCH] drm/i915: Disable gpu relocations

2021-05-26 Thread Dave Airlie
On Thu, 27 May 2021 at 02:37, Daniel Vetter  wrote:
>
> Media userspace was the last userspace to still use them, and they
> converted now too:
>
> https://github.com/intel/media-driver/commit/144020c37770083974bedf59902b70b8f444c799
>
> This means no reason anymore to make relocations faster than they've
> been for the first 9 years of gem. This code was added in
>
> commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb
> Author: Chris Wilson 
> Date:   Fri Jun 16 15:05:24 2017 +0100
>
> drm/i915: Async GPU relocation processing
>
> Furthermore there's pretty strong indications it's buggy, since the
> code to use it by default as the only option had to be reverted:
>
> commit ad5d95e4d538737ed3fa25493777decf264a3011
> Author: Dave Airlie 
> Date:   Tue Sep 8 15:41:17 2020 +1000
>
> Revert "drm/i915/gem: Async GPU relocations only"
>
> This code just disables gpu relocations, leaving the garbage
> collection for later patches and more importantly, much less confusing
> diff. Also given how much headaches this code has caused in the past,
> letting this soak for a bit seems justified.
>
> Cc: Jon Bloomfield 
> Signed-off-by: Daniel Vetter 
> Cc: Chris Wilson 
> Cc: Maarten Lankhorst 
> Cc: Joonas Lahtinen 
> Cc: Daniel Vetter 
> Cc: "Thomas Hellström" 
> Cc: Matthew Auld 
> Cc: Lionel Landwerlin 
> Cc: Dave Airlie 

Acked-by: Dave Airlie 

Thanks for making this happen, hope the softpin world is a happier future.

Dave.


RE: [PATCH] drm/kmb: Fix an error handling path

2021-05-26 Thread Chrisanthus, Anitha
Hi Christophe,
Thanks for the patch, good catch! Patch looks good, few minor comments.

Anitha

> -Original Message-
> From: Christophe JAILLET 
> Sent: Wednesday, May 19, 2021 1:47 PM
> To: Chrisanthus, Anitha ; Dea, Edmund J
> ; airl...@linux.ie; dan...@ffwll.ch;
> s...@ravnborg.org
> Cc: dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org; kernel-
> janit...@vger.kernel.org; Christophe JAILLET 
> Subject: [PATCH] drm/kmb: Fix an error handling path
> 
> If 'platform_get_irq()' fails, it is spurious to call
> 'of_reserved_mem_device_release()' in the error handling path, because
> 'of_reserved_mem_device_init() has not been called yet.
> 
> Moreover, a previous 'kmb_initialize_clocks()' is unbalanced by a
> corresponding 'kmb_display_clk_disable()' call, has already done in the
> remove function.
> 
> It is likely that 'kmb_display_clk_disable()' is expected in the error
> handling path, instead of 'kmb_display_clk_disable()'.
You mean instead of of_reserved_mem_device_release()
> 
> 
> Also, it is spurious to return directly if 'of_reserved_mem_device_init()'
> fails.
> Goto the error handling path instead to free some resources.
> 
> Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
> Signed-off-by: Christophe JAILLET 
> ---
>  drivers/gpu/drm/kmb/kmb_drv.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/kmb/kmb_drv.c
> b/drivers/gpu/drm/kmb/kmb_drv.c
> index f64e06e1067d..b41b8789fe57 100644
> --- a/drivers/gpu/drm/kmb/kmb_drv.c
> +++ b/drivers/gpu/drm/kmb/kmb_drv.c
> @@ -138,13 +138,13 @@ static int kmb_hw_init(struct drm_device *drm,
> unsigned long flags)
>   irq_lcd = platform_get_irq(pdev, 0);
>   if (irq_lcd < 0) {
>   drm_err(&kmb->drm, "irq_lcd not found");
> - goto setup_fail;
> + goto disable_clk_err;
Keep setup_fail label or something like err_free_clocks
>   }
> 
>   /* Get the optional framebuffer memory resource */
>   ret = of_reserved_mem_device_init(drm->dev);
>   if (ret && ret != -ENODEV)
> - return ret;
> + goto disable_clk_err;
> 
>   spin_lock_init(&kmb->irq_lock);
> 
> @@ -152,8 +152,8 @@ static int kmb_hw_init(struct drm_device *drm,
> unsigned long flags)
> 
>   return 0;
> 
> - setup_fail:
> - of_reserved_mem_device_release(drm->dev);
> + disable_clk_err:
> + kmb_display_clk_disable(kmb);
> 
>   return ret;
>  }
> --
> 2.30.2



[RFC PATCH 1/2] drm/doc/rfc: i915 GuC submission / DRM scheduler

2021-05-26 Thread Matthew Brost
Add entry for i915 GuC submission / DRM scheduler integration plan.
Follow up patch with details of new parallel submission uAPI to come.

v2:
 (Daniel Vetter)
  - Expand explanation of why bonding isn't supported for GuC
submission
  - CC some of the DRM scheduler maintainers
  - Add priority inheritance / boosting use case
  - Add reasoning for removing in order assumptions
 (Daniel Stone)
  - Add links to priority spec

Cc: Christian König 
Cc: Luben Tuikov 
Cc: Alex Deucher 
Cc: Steven Price 
Cc: Jon Bloomfield 
Cc: Jason Ekstrand 
Cc: Dave Airlie 
Cc: Daniel Vetter 
Cc: Jason Ekstrand 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Matthew Brost 
---
 Documentation/gpu/rfc/i915_scheduler.rst | 85 
 Documentation/gpu/rfc/index.rst  |  4 ++
 2 files changed, 89 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_scheduler.rst

diff --git a/Documentation/gpu/rfc/i915_scheduler.rst 
b/Documentation/gpu/rfc/i915_scheduler.rst
new file mode 100644
index ..7faa46cde088
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_scheduler.rst
@@ -0,0 +1,85 @@
+=========================================
+I915 GuC Submission/DRM Scheduler Section
+=========================================
+
+Upstream plan
+=============
+For upstream the overall plan for landing GuC submission and integrating the
+i915 with the DRM scheduler is:
+
+* Merge basic GuC submission
+   * Basic submission support for all gen11+ platforms
+   * Not enabled by default on any current platforms but can be enabled via
+ modparam enable_guc
+   * Lots of rework will need to be done to integrate with DRM scheduler so
+ no need to nit pick everything in the code, it just should be
+ functional, no major coding style / layering errors, and not regress
+ execlists
+   * Update IGTs / selftests as needed to work with GuC submission
+   * Enable CI on supported platforms for a baseline
+   * Rework / get CI healthy for GuC submission in place as needed
+* Merge new parallel submission uAPI
+   * Bonding uAPI completely incompatible with GuC submission, plus it has
+ severe design issues in general, which is why we want to retire it no
+ matter what
+   * New uAPI adds I915_CONTEXT_ENGINES_EXT_PARALLEL context setup step
+ which configures a slot with N contexts 
+   * After I915_CONTEXT_ENGINES_EXT_PARALLEL a user can submit N batches to
+ a slot in a single execbuf IOCTL and the batches run on the GPU in
+ parallel
+   * Initially only for GuC submission but execlists can be supported if
+ needed
+* Convert the i915 to use the DRM scheduler
+   * GuC submission backend fully integrated with DRM scheduler
+   * All request queues removed from backend (e.g. all backpressure
+ handled in DRM scheduler)
+   * Resets / cancels hook in DRM scheduler
+   * Watchdog hooks into DRM scheduler
+   * Lots of complexity of the GuC backend can be pulled out once
+ integrated with DRM scheduler (e.g. state machine gets
+ simpler, locking gets simpler, etc...)
+   * Execlist backend will do the minimum required to hook in the DRM
+ scheduler so it can live next to the fully integrated GuC backend
+   * Legacy interface
+   * Features like timeslicing / preemption / virtual engines would
+ be difficult to integrate with the DRM scheduler and these
+ features are not required for GuC submission as the GuC does
+ these things for us
+   * ROI low on fully integrating into DRM scheduler
+   * Fully integrating would add lots of complexity to DRM
+ scheduler
+   * Port i915 priority inheritance / boosting feature in DRM scheduler
+   * Used for i915 page flip, may be useful to other DRM drivers as
+ well
+   * Will be an optional feature in the DRM scheduler
+   * Remove in-order completion assumptions from DRM scheduler
+   * Even when using the DRM scheduler the backends will handle
+ preemption, timeslicing, etc... so it is possible for jobs to
+ finish out of order
+   * Pull out i915 priority levels and use DRM priority levels
+   * Optimize DRM scheduler as needed
+
+New uAPI for basic GuC submission
+=================================
+No major changes are required to the uAPI for basic GuC submission. The only
+change is a new scheduler attribute: I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP.
+This attribute indicates the 2k i915 user priority levels are statically mapped
+into 3 levels as follows:
+
+* -1k to -1 Low priority
+* 0 Medium priority
+* 1 to 1k High priority
+
+This is needed because the GuC only has 4 priority bands. The highest priority
+band is reserved with the 
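The static three-band mapping described above can be sketched as follows. The helper name and enum are made up for illustration; they are not the GuC firmware's or the driver's identifiers:

```c
#include <assert.h>

enum guc_prio_band { GUC_PRIO_LOW, GUC_PRIO_MEDIUM, GUC_PRIO_HIGH };

/* Collapse the 2k i915 user priority levels [-1023, 1023] into 3 bands. */
static enum guc_prio_band map_i915_user_prio(int prio)
{
	if (prio < 0)
		return GUC_PRIO_LOW;	/* -1k to -1 */
	if (prio == 0)
		return GUC_PRIO_MEDIUM;	/* 0 */
	return GUC_PRIO_HIGH;		/* 1 to 1k */
}
```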

[RFC PATCH 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-26 Thread Matthew Brost
Add entry for i915 new parallel submission uAPI plan.

v2:
 (Daniel Vetter):
  - Expand logical order explanation
  - Add dummy header
  - Only allow N BBs in execbuf IOCTL
  - Configure parallel submission per slot not per gem context
v3:
 (Marcin Ślusarz):
  - Lots of typos / bad English fixed
 (Tvrtko Ursulin):
  - Consistent pseudo code, clean up wording in descriptions

Cc: Tvrtko Ursulin 
Cc: Tony Ye 
CC: Carl Zhang 
Cc: Daniel Vetter 
Cc: Jason Ekstrand 
Signed-off-by: Matthew Brost 
---
 Documentation/gpu/rfc/i915_parallel_execbuf.h | 145 ++
 Documentation/gpu/rfc/i915_scheduler.rst  |  55 ++-
 2 files changed, 198 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h

diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
b/Documentation/gpu/rfc/i915_parallel_execbuf.h
new file mode 100644
index ..20de206e3ab4
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
@@ -0,0 +1,145 @@
+#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
i915_context_engines_parallel_submit */
+
+/*
+ * i915_context_engines_parallel_submit:
+ *
+ * Setup a slot in the context engine map to allow multiple BBs to be submitted
+ * in a single execbuf IOCTL. Those BBs will then be scheduled to run on the 
GPU
+ * in parallel. Multiple hardware contexts are created internally in the i915
+ * to run these BBs. Once a slot is configured for N BBs only N BBs can be
+ * submitted in each execbuf IOCTL and this is implicit behavior, e.g. the user
+ * doesn't tell the execbuf IOCTL there are N BBs, the execbuf IOCTL knows how
+ * many BBs there are based on the slot's configuration. The N BBs are the last
+ * N buffer objects (or the first N if I915_EXEC_BATCH_FIRST is set).
+ *
+ * There are two currently defined ways to control the placement of the
+ * hardware contexts on physical engines: default behavior (no flags) and
+ * I915_PARALLEL_IMPLICIT_BONDS (a flag). More flags may be added in the
+ * future as new hardware / use cases arise. Details of how to use this
+ * interface are above the flags field in this structure.
+ *
+ * Returns -EINVAL if hardware context placement configuration is invalid or if
+ * the placement configuration isn't supported on the platform / submission
+ * interface.
+ * Returns -ENODEV if extension isn't supported on the platform / submission
+ * interface.
+ */
+struct i915_context_engines_parallel_submit {
+   struct i915_user_extension base;
+
+   __u16 engine_index; /* slot for parallel engine */
+   __u16 width;/* number of contexts per parallel engine */
+   __u16 num_siblings; /* number of siblings per context */
+   __u16 mbz16;
+/*
+ * Default placement behavior (currently unsupported):
+ *
+ * Allow BBs to be placed on any available engine instance. In this case each
+ * context's engine mask indicates where that context can be placed. It is
+ * implied in this mode that all contexts have mutually exclusive placement,
+ * e.g. if one context is running on CSX[0] no other context can run on CSX[0].
+ *
+ * Example 1 pseudo code:
+ * CSX,Y[N] = generic engine class X or Y, logical instance N
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID)
+ * set_parallel(engine_index=0, width=2, num_siblings=2,
+ * engines=CSX[0],CSX[1],CSY[0],CSY[1])
+ *
+ * Results in the following valid placements:
+ * CSX[0], CSY[0]
+ * CSX[0], CSY[1]
+ * CSX[1], CSY[0]
+ * CSX[1], CSY[1]
+ *
+ * This can also be thought of as 2 virtual engines described by a 2-D array in
+ * the engines field:
+ * VE[0] = CSX[0], CSX[1]
+ * VE[1] = CSY[0], CSY[1]
+ *
+ * Example 2 pseudo code:
+ * CSX[N] = generic engine of same class X, logical instance N
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID)
+ * set_parallel(engine_index=0, width=2, num_siblings=3,
+ * engines=CSX[0],CSX[1],CSX[2],CSX[0],CSX[1],CSX[2])
+ *
+ * Results in the following valid placements:
+ * CSX[0], CSX[1]
+ * CSX[0], CSX[2]
+ * CSX[1], CSX[0]
+ * CSX[1], CSX[2]
+ * CSX[2], CSX[0]
+ * CSX[2], CSX[1]
+ *
+ * This can also be thought of as 2 virtual engines described by a 2-D array in
+ * the engines field:
+ * VE[0] = CSX[0], CSX[1], CSX[2]
+ * VE[1] = CSX[0], CSX[1], CSX[2]
+
+ * This enables a use case where all engines are created equally, we don't care
+ * where they are scheduled, we just want a certain number of resources, for
+ * those resources to be scheduled in parallel, and possibly across multiple
+ * engine classes.
+ */
+
+/*
+ * I915_PARALLEL_IMPLICIT_BONDS - Create implicit bonds between each context.
+ * Each context must have the same number of siblings and bonds are implicitly
+ * created between each set of siblings.
+ *
+ * Example 1 pseudo code:
+ * CSX[N] = generic engine of same class X, logical instance N
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * se

[RFC PATCH 0/2] GuC submission / DRM scheduler integration plan + new uAPI

2021-05-26 Thread Matthew Brost
Subject and patches say it all.

v2: Address comments, patches have details of changes
v3: Address comments, patches have details of changes

Signed-off-by: Matthew Brost 

Matthew Brost (2):
  drm/doc/rfc: i915 GuC submission / DRM scheduler
  drm/doc/rfc: i915 new parallel submission uAPI plan

 Documentation/gpu/rfc/i915_parallel_execbuf.h | 145 ++
 Documentation/gpu/rfc/i915_scheduler.rst  | 136 
 Documentation/gpu/rfc/index.rst   |   4 +
 3 files changed, 285 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
 create mode 100644 Documentation/gpu/rfc/i915_scheduler.rst

-- 
2.28.0



[PATCH 2/2] drm/i915/guc: Use guc_class instead of engine_class in fw interface

2021-05-26 Thread Matthew Brost
From: Daniele Ceraolo Spurio 

GuC has its own defines for the engine classes. They're currently
mapping 1:1 to the defines used by the driver, but there is no guarantee
this will continue in the future. Given that we've been caught off-guard
in the past by similar divergences, we can prepare for the changes by
introducing helper functions to convert from engine class to GuC class and
back again.
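A table-based sketch of what such conversion helpers could look like is below. The numeric enum values are purely illustrative; the real defines live in the driver and firmware interface headers, and the point of the helpers is precisely that the two numbering schemes may diverge:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, not the i915 or GuC defines. */
enum { RENDER_CLASS, COPY_ENGINE_CLASS, VIDEO_DECODE_CLASS,
       VIDEO_ENHANCEMENT_CLASS, NUM_ENGINE_CLASSES };
enum { GUC_RENDER_CLASS, GUC_VIDEO_CLASS, GUC_VIDEOENHANCE_CLASS,
       GUC_BLITTER_CLASS };

/* One authoritative table; the reverse direction is derived from it so the
 * two mappings can never get out of sync. */
static const uint8_t to_guc[NUM_ENGINE_CLASSES] = {
	[RENDER_CLASS]            = GUC_RENDER_CLASS,
	[COPY_ENGINE_CLASS]       = GUC_BLITTER_CLASS,
	[VIDEO_DECODE_CLASS]      = GUC_VIDEO_CLASS,
	[VIDEO_ENHANCEMENT_CLASS] = GUC_VIDEOENHANCE_CLASS,
};

static uint8_t engine_class_to_guc_class(uint8_t class)
{
	return to_guc[class];
}

static uint8_t guc_class_to_engine_class(uint8_t guc_class)
{
	uint8_t i;

	for (i = 0; i < NUM_ENGINE_CLASSES; i++)
		if (to_guc[i] == guc_class)
			return i;
	return 0; /* unreachable for valid input */
}
```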

Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
Reviewed-by: Matthew Brost 
Cc: John Harrison 
Cc: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   |  6 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 20 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 26 +
 3 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 3f9a811eb02b..69281b5aba51 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -265,6 +265,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
struct intel_engine_cs *engine;
+   u8 guc_class;
 
BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH));
BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
@@ -293,9 +294,10 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
engine->i915 = i915;
engine->gt = gt;
engine->uncore = gt->uncore;
-   engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
engine->hw_id = info->hw_id;
-   engine->guc_id = MAKE_GUC_ID(info->class, info->instance);
+   guc_class = engine_class_to_guc_class(info->class);
+   engine->guc_id = MAKE_GUC_ID(guc_class, info->instance);
+   engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
 
engine->irq_handler = nop_irq_handler;
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 17526717368c..efdce309b6f1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -6,6 +6,7 @@
 #include "gt/intel_gt.h"
 #include "gt/intel_lrc.h"
 #include "intel_guc_ads.h"
+#include "intel_guc_fwif.h"
 #include "intel_uc.h"
 #include "i915_drv.h"
 
@@ -104,7 +105,7 @@ static void guc_mapping_table_init(struct intel_gt *gt,
GUC_MAX_INSTANCES_PER_CLASS;
 
for_each_engine(engine, gt, id) {
-   u8 guc_class = engine->class;
+   u8 guc_class = engine_class_to_guc_class(engine->class);
 
system_info->mapping_table[guc_class][engine->instance] =
engine->instance;
@@ -124,7 +125,7 @@ static void __guc_ads_init(struct intel_guc *guc)
struct __guc_ads_blob *blob = guc->ads_blob;
const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
u32 base;
-   u8 engine_class;
+   u8 engine_class, guc_class;
 
/* GuC scheduling policies */
guc_policies_init(&blob->policies);
@@ -140,22 +141,25 @@ static void __guc_ads_init(struct intel_guc *guc)
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; 
++engine_class) {
if (engine_class == OTHER_CLASS)
continue;
+
+   guc_class = engine_class_to_guc_class(engine_class);
+
/*
 * TODO: Set context pointer to default state to allow
 * GuC to re-init guilty contexts after internal reset.
 */
-   blob->ads.golden_context_lrca[engine_class] = 0;
-   blob->ads.eng_state_size[engine_class] =
+   blob->ads.golden_context_lrca[guc_class] = 0;
+   blob->ads.eng_state_size[guc_class] =
intel_engine_context_size(guc_to_gt(guc),
  engine_class) -
skipped_size;
}
 
/* System info */
-   blob->system_info.engine_enabled_masks[RENDER_CLASS] = 1;
-   blob->system_info.engine_enabled_masks[COPY_ENGINE_CLASS] = 1;
-   blob->system_info.engine_enabled_masks[VIDEO_DECODE_CLASS] = 
VDBOX_MASK(gt);
-   blob->system_info.engine_enabled_masks[VIDEO_ENHANCEMENT_CLASS] = 
VEBOX_MASK(gt);
+   blob->system_info.engine_enabled_masks[GUC_RENDER_CLASS] = 1;
+   blob->system_info.engine_enabled_masks[GUC_BLITTER_CLASS] = 1;
+   blob->system_info.engine_enabled_masks[GUC_VIDEO_CLASS] = 
VDBOX_MASK(gt);
+   blob->system_info.engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = 
VEBOX_MASK(gt);
 

blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] =
hweight8(gt->info.sseu.slice_mask);
diff --git a/drivers/gpu/drm

[PATCH 1/2] drm/i915/guc: Early initialization of GuC send registers

2021-05-26 Thread Matthew Brost
From: Michal Wajdeczko 

Base offset and count of the GuC scratch registers, used for
sending MMIO messages to GuC, can be initialized earlier with
other GuC members that also depend on the platform.

Signed-off-by: Michal Wajdeczko 
Signed-off-by: Matthew Brost 
Reviewed-by: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index adae04c47aab..c17694c77bcf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -60,15 +60,8 @@ void intel_guc_init_send_regs(struct intel_guc *guc)
enum forcewake_domains fw_domains = 0;
unsigned int i;
 
-   if (INTEL_GEN(gt->i915) >= 11) {
-   guc->send_regs.base =
-   i915_mmio_reg_offset(GEN11_SOFT_SCRATCH(0));
-   guc->send_regs.count = GEN11_SOFT_SCRATCH_COUNT;
-   } else {
-   guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
-   guc->send_regs.count = GUC_MAX_MMIO_MSG_LEN;
-   BUILD_BUG_ON(GUC_MAX_MMIO_MSG_LEN > SOFT_SCRATCH_COUNT);
-   }
+   GEM_BUG_ON(!guc->send_regs.base);
+   GEM_BUG_ON(!guc->send_regs.count);
 
for (i = 0; i < guc->send_regs.count; i++) {
fw_domains |= intel_uncore_forcewake_for_reg(gt->uncore,
@@ -181,11 +174,18 @@ void intel_guc_init_early(struct intel_guc *guc)
guc->interrupts.reset = gen11_reset_guc_interrupts;
guc->interrupts.enable = gen11_enable_guc_interrupts;
guc->interrupts.disable = gen11_disable_guc_interrupts;
+   guc->send_regs.base =
+   i915_mmio_reg_offset(GEN11_SOFT_SCRATCH(0));
+   guc->send_regs.count = GEN11_SOFT_SCRATCH_COUNT;
+
} else {
guc->notify_reg = GUC_SEND_INTERRUPT;
guc->interrupts.reset = gen9_reset_guc_interrupts;
guc->interrupts.enable = gen9_enable_guc_interrupts;
guc->interrupts.disable = gen9_disable_guc_interrupts;
+   guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
+   guc->send_regs.count = GUC_MAX_MMIO_MSG_LEN;
+   BUILD_BUG_ON(GUC_MAX_MMIO_MSG_LEN > SOFT_SCRATCH_COUNT);
}
 }
 
-- 
2.28.0



[PATCH 0/2] A couple more prerequisite patches to GuC submission

2021-05-26 Thread Matthew Brost
As discussed in [1] we are breaking that large series into several
smaller ones. This series includes 2 patches with no other dependencies;
both are fully reviewed and discussed as part of step #4.

Assuming CI looks good these patches can be merged immediately.

[1] https://patchwork.freedesktop.org/series/89844/

Signed-off-by: Matthew Brost 

Daniele Ceraolo Spurio (1):
  drm/i915/guc: Use guc_class instead of engine_class in fw interface

Michal Wajdeczko (1):
  drm/i915/guc: Early initialization of GuC send registers

 drivers/gpu/drm/i915/gt/intel_engine_cs.c   |  6 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc.c  | 18 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 20 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 26 +
 4 files changed, 51 insertions(+), 19 deletions(-)

-- 
2.28.0



Re: [RFC PATCH 34/97] drm/i915/guc: Use guc_class instead of engine_class in fw interface

2021-05-26 Thread Matthew Brost
On Thu, May 06, 2021 at 12:13:48PM -0700, Matthew Brost wrote:
> From: Daniele Ceraolo Spurio 
> 
> GuC has its own defines for the engine classes. They're currently
> mapping 1:1 to the defines used by the driver, but there is no guarantee
> this will continue in the future. Given that we've been caught off-guard
> in the past by similar divergences, we can prepare for the changes by
> introducing helper functions to convert from engine class to GuC class and
> back again.
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Matthew Brost 
> Cc: John Harrison 
> Cc: Michal Wajdeczko 

Reviewed-by: Matthew Brost 

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   |  6 +++--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 20 +---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 26 +
>  3 files changed, 42 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index c88b792c1ab5..7866ff0c2673 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -289,6 +289,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> intel_engine_id id)
>   const struct engine_info *info = &intel_engines[id];
>   struct drm_i915_private *i915 = gt->i915;
>   struct intel_engine_cs *engine;
> + u8 guc_class;
>  
>   BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH));
>   BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
> @@ -317,9 +318,10 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> intel_engine_id id)
>   engine->i915 = i915;
>   engine->gt = gt;
>   engine->uncore = gt->uncore;
> - engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
>   engine->hw_id = info->hw_id;
> - engine->guc_id = MAKE_GUC_ID(info->class, info->instance);
> + guc_class = engine_class_to_guc_class(info->class);
> + engine->guc_id = MAKE_GUC_ID(guc_class, info->instance);
> + engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
>  
>   engine->irq_handler = nop_irq_handler;
>  
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index 775f00d706fa..ecd18531b40a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -6,6 +6,7 @@
>  #include "gt/intel_gt.h"
>  #include "gt/intel_lrc.h"
>  #include "intel_guc_ads.h"
> +#include "intel_guc_fwif.h"
>  #include "intel_uc.h"
>  #include "i915_drv.h"
>  
> @@ -78,7 +79,7 @@ static void guc_mapping_table_init(struct intel_gt *gt,
>   GUC_MAX_INSTANCES_PER_CLASS;
>  
>   for_each_engine(engine, gt, id) {
> - u8 guc_class = engine->class;
> + u8 guc_class = engine_class_to_guc_class(engine->class);
>  
>   system_info->mapping_table[guc_class][engine->instance] =
>   engine->instance;
> @@ -98,7 +99,7 @@ static void __guc_ads_init(struct intel_guc *guc)
>   struct __guc_ads_blob *blob = guc->ads_blob;
>   const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
>   u32 base;
> - u8 engine_class;
> + u8 engine_class, guc_class;
>  
>   /* GuC scheduling policies */
>   guc_policies_init(&blob->policies);
> @@ -114,22 +115,25 @@ static void __guc_ads_init(struct intel_guc *guc)
>   for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; 
> ++engine_class) {
>   if (engine_class == OTHER_CLASS)
>   continue;
> +
> + guc_class = engine_class_to_guc_class(engine_class);
> +
>   /*
>* TODO: Set context pointer to default state to allow
>* GuC to re-init guilty contexts after internal reset.
>*/
> - blob->ads.golden_context_lrca[engine_class] = 0;
> - blob->ads.eng_state_size[engine_class] =
> + blob->ads.golden_context_lrca[guc_class] = 0;
> + blob->ads.eng_state_size[guc_class] =
>   intel_engine_context_size(guc_to_gt(guc),
> engine_class) -
>   skipped_size;
>   }
>  
>   /* System info */
> - blob->system_info.engine_enabled_masks[RENDER_CLASS] = 1;
> - blob->system_info.engine_enabled_masks[COPY_ENGINE_CLASS] = 1;
> - blob->system_info.engine_enabled_masks[VIDEO_DECODE_CLASS] = 
> VDBOX_MASK(gt);
> - blob->system_info.engine_enabled_masks[VIDEO_ENHANCEMENT_CLASS] = 
> VEBOX_MASK(gt);
> + blob->system_info.engine_enabled_masks[GUC_RENDER_CLASS] = 1;
> + blob->system_info.engine_enabled_masks[GUC_BLITTER_CLASS] = 1;
> + blob->system_info.engine_enabled_masks[GUC_VIDEO_CLASS] = 
> VDBOX_MASK(gt);
> + blob->system_info.engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = 
> VEBOX_M

Re: [RFC PATCH 31/97] drm/i915/guc: Early initialization of GuC send registers

2021-05-26 Thread Matthew Brost
On Thu, May 06, 2021 at 12:13:45PM -0700, Matthew Brost wrote:
> From: Michal Wajdeczko 
> 
> Base offset and count of the GuC scratch registers, used for
> sending MMIO messages to GuC, can be initialized earlier with
> other GuC members that also depend on the platform.
> 
> Signed-off-by: Michal Wajdeczko 
> Signed-off-by: Matthew Brost 
> Cc: Daniele Ceraolo Spurio 

Reviewed-by: Matthew Brost 

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> index 454c8d886499..235c1997f32d 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> @@ -60,15 +60,8 @@ void intel_guc_init_send_regs(struct intel_guc *guc)
>   enum forcewake_domains fw_domains = 0;
>   unsigned int i;
>  
> - if (INTEL_GEN(gt->i915) >= 11) {
> - guc->send_regs.base =
> - i915_mmio_reg_offset(GEN11_SOFT_SCRATCH(0));
> - guc->send_regs.count = GEN11_SOFT_SCRATCH_COUNT;
> - } else {
> - guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
> - guc->send_regs.count = GUC_MAX_MMIO_MSG_LEN;
> - BUILD_BUG_ON(GUC_MAX_MMIO_MSG_LEN > SOFT_SCRATCH_COUNT);
> - }
> + GEM_BUG_ON(!guc->send_regs.base);
> + GEM_BUG_ON(!guc->send_regs.count);
>  
>   for (i = 0; i < guc->send_regs.count; i++) {
>   fw_domains |= intel_uncore_forcewake_for_reg(gt->uncore,
> @@ -181,11 +174,18 @@ void intel_guc_init_early(struct intel_guc *guc)
>   guc->interrupts.reset = gen11_reset_guc_interrupts;
>   guc->interrupts.enable = gen11_enable_guc_interrupts;
>   guc->interrupts.disable = gen11_disable_guc_interrupts;
> + guc->send_regs.base =
> + i915_mmio_reg_offset(GEN11_SOFT_SCRATCH(0));
> + guc->send_regs.count = GEN11_SOFT_SCRATCH_COUNT;
> +
>   } else {
>   guc->notify_reg = GUC_SEND_INTERRUPT;
>   guc->interrupts.reset = gen9_reset_guc_interrupts;
>   guc->interrupts.enable = gen9_enable_guc_interrupts;
>   guc->interrupts.disable = gen9_disable_guc_interrupts;
> + guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
> + guc->send_regs.count = GUC_MAX_MMIO_MSG_LEN;
> + BUILD_BUG_ON(GUC_MAX_MMIO_MSG_LEN > SOFT_SCRATCH_COUNT);
>   }
>  }
>  
> -- 
> 2.28.0
> 


Re: [PATCH] drm/i915: Use generic_access_phys

2021-05-26 Thread kernel test robot
Hi Daniel,

I love your patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip v5.13-rc3 next-20210526]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm-i915-Use-generic_access_phys/20210526-231425
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-r025-20210526 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
99155e913e9bad5f7f8a247f8bb3a3ff3da74af1)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/80157be9e8542ce9a835e6f159408b951590b578
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Daniel-Vetter/drm-i915-Use-generic_access_phys/20210526-231425
git checkout 80157be9e8542ce9a835e6f159408b951590b578
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_mman.c:755:2: error: member reference base 
>> type 'int (struct vm_area_struct *, unsigned long, void *, int, int)' is not 
>> a structure or union
   .open = vm_open,
   ^
   drivers/gpu/drm/i915/gem/i915_gem_mman.c:764:2: error: member reference base 
type 'int (struct vm_area_struct *, unsigned long, void *, int, int)' is not a 
structure or union
   .open = vm_open,
   ^
   2 errors generated.


vim +755 drivers/gpu/drm/i915/gem/i915_gem_mman.c

cc662126b4134e2 Abdiel Janulgue 2019-12-04  749  
cc662126b4134e2 Abdiel Janulgue 2019-12-04  750  static const struct 
vm_operations_struct vm_ops_gtt = {
cc662126b4134e2 Abdiel Janulgue 2019-12-04  751 .fault = vm_fault_gtt,
80157be9e8542ce Daniel Vetter   2021-05-26  752  #ifdef CONFIG_HAVE_IOREMAP_PROT
80157be9e8542ce Daniel Vetter   2021-05-26  753 .access = 
generic_access_phys
80157be9e8542ce Daniel Vetter   2021-05-26  754  #endif
cc662126b4134e2 Abdiel Janulgue 2019-12-04 @755 .open = vm_open,
cc662126b4134e2 Abdiel Janulgue 2019-12-04  756 .close = vm_close,
cc662126b4134e2 Abdiel Janulgue 2019-12-04  757  };
cc662126b4134e2 Abdiel Janulgue 2019-12-04  758  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH 0/4] drm: Finally retire struct drm_format_name_buf

2021-05-26 Thread Sakari Ailus
On Wed, May 26, 2021 at 09:21:10PM +0200, Thomas Zimmermann wrote:
> ping for further a-bs / r-bs

Thanks for the ping.

For the series:

Reviewed-by: Sakari Ailus 

> 
> Am 16.05.21 um 14:13 schrieb Thomas Zimmermann:
> > This is a cleanup patchset to remove drm_format_name_buf et al. There
> > are two instances in drivers that need to be replaced with the %p4cc
> > printk format modifier. Patch 3 was left over from an earlier
> > patchset. [1] Patch 4 removes struct drm_format_name_buf.
> > 
> > I build-tested with drm-tip. The patchset can be merged through drm-misc.
> > 
> > [1] 
> > https://lore.kernel.org/dri-devel/20210216155723.17109-1-sakari.ai...@linux.intel.com/
> > 
> > Sakari Ailus (1):
> >drm: Remove drm_get_format_name()
> > 
> > Thomas Zimmermann (3):
> >drm/amdgpu: Use %p4cc to print 4CC format
> >drm/simpledrm: Use %p4cc to print 4CC format
> >drm/fourcc: Remove struct drm_format_buf_name
> > 
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  7 ++
> >   drivers/gpu/drm/drm_fourcc.c| 25 -
> >   drivers/gpu/drm/tiny/simpledrm.c|  6 ++---
> >   include/drm/drm_fourcc.h|  9 
> >   4 files changed, 4 insertions(+), 43 deletions(-)
> > 
> > --
> > 2.31.1
> > 
> 

-- 
Sakari Ailus


Re: [PATCH v9 06/10] mm/memory.c: Allow different return codes for copy_nonpresent_pte()

2021-05-26 Thread Peter Xu
On Mon, May 24, 2021 at 11:27:21PM +1000, Alistair Popple wrote:
> Currently if copy_nonpresent_pte() returns a non-zero value it is
> assumed to be a swap entry which requires further processing outside the
> loop in copy_pte_range() after dropping locks. This prevents other
> values being returned to signal conditions such as failure which a
> subsequent change requires.
> 
> Instead make copy_nonpresent_pte() return an error code if further
> processing is required and read the value for the swap entry in the main
> loop under the ptl.
> 
> Signed-off-by: Alistair Popple 
> 
> ---
> 
> v9:
> 
> New for v9 to allow device exclusive handling to occur in
> copy_nonpresent_pte().
> ---
>  mm/memory.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 2fb455c365c2..e061cfa18c11 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -718,7 +718,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct 
> mm_struct *src_mm,
>  
>   if (likely(!non_swap_entry(entry))) {
>   if (swap_duplicate(entry) < 0)
> - return entry.val;
> + return -EAGAIN;
>  
>   /* make sure dst_mm is on swapoff's mmlist. */
>   if (unlikely(list_empty(&dst_mm->mmlist))) {
> @@ -974,11 +974,13 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct 
> vm_area_struct *src_vma,
>   continue;
>   }
>   if (unlikely(!pte_present(*src_pte))) {
> - entry.val = copy_nonpresent_pte(dst_mm, src_mm,
> - dst_pte, src_pte,
> - src_vma, addr, rss);
> - if (entry.val)
> + ret = copy_nonpresent_pte(dst_mm, src_mm,
> + dst_pte, src_pte,
> + src_vma, addr, rss);
> + if (ret == -EAGAIN) {
> + entry = pte_to_swp_entry(*src_pte);
>   break;
> + }
>   progress += 8;
>   continue;
>   }

Note that -EAGAIN was previously used by copy_present_page() for the early
cow case.  Later on, although we check entry.val first:

if (entry.val) {
if (add_swap_count_continuation(entry, GFP_KERNEL) < 0) {
ret = -ENOMEM;
goto out;
}
entry.val = 0;
} else if (ret) {
WARN_ON_ONCE(ret != -EAGAIN);
prealloc = page_copy_prealloc(src_mm, src_vma, addr);
if (!prealloc)
return -ENOMEM;
/* We've captured and resolved the error. Reset, try again. */
ret = 0;
}

We didn't reset "ret" in the entry.val case (maybe we should?). Then in the
next round of "goto again", if "ret" is unluckily left untouched, it could
reach the 2nd if check, and I think it could cause an unexpected
page_copy_prealloc().

-- 
Peter Xu



[PATCH 1/1] drm/i915: Introduce i915_sched_engine object

2021-05-26 Thread Matthew Brost
Introduce i915_sched_engine object which is lower level data structure
that i915_scheduler / generic code can operate on without touching
execlist specific structures. This allows additional submission backends
to be added without breaking the layering.

This is a bit of a detour on the way to integrating the i915 with the
DRM scheduler, but this object will still exist when the DRM scheduler
lands in the i915. It will, however, look a bit different: it will
encapsulate the drm_gpu_scheduler object plus the variables common to
the backends that relate to scheduling. Regardless, this is a step in
the right direction.

Cc: Daniel Vetter 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_wait.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine.h|  16 -
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  77 ++--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  10 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  42 +--
 drivers/gpu/drm/i915/gt/intel_engine_user.c   |   2 +-
 .../drm/i915/gt/intel_execlists_submission.c  | 350 +++---
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  13 +-
 drivers/gpu/drm/i915/gt/mock_engine.c |  17 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  36 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   6 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c|   6 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |   2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  75 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c |   7 +-
 drivers/gpu/drm/i915/i915_request.c   |  50 +--
 drivers/gpu/drm/i915/i915_request.h   |   2 +-
 drivers/gpu/drm/i915/i915_scheduler.c | 168 -
 drivers/gpu/drm/i915/i915_scheduler.h |  65 +++-
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  63 
 21 files changed, 575 insertions(+), 440 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c 
b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 4b9856d5ba14..af1fbf8e2a9a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -104,8 +104,8 @@ static void fence_set_priority(struct dma_fence *fence,
engine = rq->engine;
 
rcu_read_lock(); /* RCU serialisation for set-wedged protection */
-   if (engine->schedule)
-   engine->schedule(rq, attr);
+   if (engine->sched_engine->schedule)
+   engine->sched_engine->schedule(rq, attr);
rcu_read_unlock();
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 8d9184920c51..988d9688ae4d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -123,20 +123,6 @@ execlists_active(const struct intel_engine_execlists 
*execlists)
return active;
 }
 
-static inline void
-execlists_active_lock_bh(struct intel_engine_execlists *execlists)
-{
-   local_bh_disable(); /* prevent local softirq and lock recursion */
-   tasklet_lock(&execlists->tasklet);
-}
-
-static inline void
-execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
-{
-   tasklet_unlock(&execlists->tasklet);
-   local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
-}
-
 struct i915_request *
 execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
 
@@ -257,8 +243,6 @@ intel_engine_find_active_request(struct intel_engine_cs 
*engine);
 
 u32 intel_engine_context_size(struct intel_gt *gt, u8 class);
 
-void intel_engine_init_active(struct intel_engine_cs *engine,
- unsigned int subclass);
 #define ENGINE_PHYSICAL0
 #define ENGINE_MOCK1
 #define ENGINE_VIRTUAL 2
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 3f9a811eb02b..dc939c8ef288 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -8,6 +8,7 @@
 #include "gem/i915_gem_context.h"
 
 #include "i915_drv.h"
+#include "i915_scheduler.h"
 
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
@@ -326,9 +327,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
if (engine->context_size)
DRIVER_CAPS(i915)->has_logical_contexts = true;
 
-   /* Nothing to do here, execute in order of dependencies */
-   engine->schedule = NULL;
-
ewma__engine_latency_init(&engine->latency);
seqcount_init(&engine->stats.lock);
 
@@ -583,9 +581,6 @@ void intel_engine_init_execlists(struct intel_engine_cs 
*engine)
memset(execlists->pending, 0, sizeof(execlists->pending));
execlists->active =
memset(execlists->inflight, 0, sizeof(execlists->inflight));
-
-   execlists->queue_priority_hint = INT_MIN;
-   execlists->queue = RB_ROOT_CACHED;
 }
 
 static void cleanup_status_page(struc

[PATCH 0/1] Introduce i915_sched_engine object

2021-05-26 Thread Matthew Brost
As discussed in [1] we are breaking that large series into several
smaller ones. This series is a standalone patch, part of step #4, which
has no other dependencies or patches relevant to it.

Signed-off-by: Matthew Brost 

[1] https://patchwork.freedesktop.org/series/89844/

Matthew Brost (1):
  drm/i915: Introduce i915_sched_engine object

 drivers/gpu/drm/i915/gem/i915_gem_wait.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine.h|  16 -
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  77 ++--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  10 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  42 +--
 drivers/gpu/drm/i915/gt/intel_engine_user.c   |   2 +-
 .../drm/i915/gt/intel_execlists_submission.c  | 350 +++---
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  13 +-
 drivers/gpu/drm/i915/gt/mock_engine.c |  17 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  36 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   6 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c|   6 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |   2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  75 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c |   7 +-
 drivers/gpu/drm/i915/i915_request.c   |  50 +--
 drivers/gpu/drm/i915/i915_request.h   |   2 +-
 drivers/gpu/drm/i915/i915_scheduler.c | 168 -
 drivers/gpu/drm/i915/i915_scheduler.h |  65 +++-
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  63 
 21 files changed, 575 insertions(+), 440 deletions(-)

-- 
2.28.0



Re: [PATCH v5 1/3] drm/hyperv: Add DRM driver for hyperv synthetic video device

2021-05-26 Thread Thomas Zimmermann

Hi

Am 20.05.21 um 07:41 schrieb Dexuan Cui:

From: Deepak Rawat 
Sent: Wednesday, May 19, 2021 9:38 AM
...
+static int hyperv_vmbus_suspend(struct hv_device *hdev)
+{
+   struct drm_device *dev = hv_get_drvdata(hdev);
+   int ret;
+
+   ret = drm_mode_config_helper_suspend(dev);


If 'ret' is not zero, return immediately?


+
+   vmbus_close(hdev->channel);
+
+   return ret;
+}




+MODULE_DESCRIPTION("DRM driver for hyperv synthetic video device");


s/hyperv/Hyper-V ?



Maybe let's fix these points and then get the driver merged.

Best regards
Thomas

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer





Re: [PATCH v9 05/10] mm: Rename migrate_pgmap_owner

2021-05-26 Thread Peter Xu
On Mon, May 24, 2021 at 11:27:20PM +1000, Alistair Popple wrote:
> @@ -521,14 +521,14 @@ static inline void mmu_notifier_range_init(struct 
> mmu_notifier_range *range,
>   range->flags = flags;
>  }
>  
> -static inline void mmu_notifier_range_init_migrate(
> - struct mmu_notifier_range *range, unsigned int flags,
> +static inline void mmu_notifier_range_init_owner(
> + struct mmu_notifier_range *range,
> + enum mmu_notifier_event event, unsigned int flags,
>   struct vm_area_struct *vma, struct mm_struct *mm,
> - unsigned long start, unsigned long end, void *pgmap)
> + unsigned long start, unsigned long end, void *owner)
>  {
> - mmu_notifier_range_init(range, MMU_NOTIFY_MIGRATE, flags, vma, mm,
> - start, end);
> - range->migrate_pgmap_owner = pgmap;
> + mmu_notifier_range_init(range, event, flags, vma, mm, start, end);
> + range->owner = owner;
>  }

mmu_notifier_range_init_migrate() can even be kept to just call the new helper,
then existing callers are unaffected.  Not a big deal, though:

Reviewed-by: Peter Xu 

Thanks,

-- 
Peter Xu



Re: [PATCH] fbdev: matrox: use modern module_init()

2021-05-26 Thread Thomas Zimmermann



Am 14.05.21 um 23:33 schrieb Arnd Bergmann:

From: Arnd Bergmann 

This is one of the last drivers with a global init_module() function
instead of the modern module_init() annotation. Convert it over.

Signed-off-by: Arnd Bergmann 


Added to drm-misc-next. Thank you.

Best regards
Thomas


---
  drivers/video/fbdev/matrox/matroxfb_base.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/video/fbdev/matrox/matroxfb_base.c 
b/drivers/video/fbdev/matrox/matroxfb_base.c
index 4325bf7f388c..5c82611e93d9 100644
--- a/drivers/video/fbdev/matrox/matroxfb_base.c
+++ b/drivers/video/fbdev/matrox/matroxfb_base.c
@@ -2486,8 +2486,6 @@ static int __init matroxfb_init(void)
return err;
  }
  
-module_init(matroxfb_init);

-
  #else
  
  /* *** init module code  */

@@ -2572,7 +2570,7 @@ module_param_named(cmode, default_cmode, int, 0);
  MODULE_PARM_DESC(cmode, "Specify the video depth that should be used (8bit 
default)");
  #endif
  
-int __init init_module(void){

+static int __init matroxfb_init(void){
  
  	DBG(__func__)
  
@@ -2603,6 +2601,7 @@ int __init init_module(void){

  }
  #endif/* MODULE */
  
+module_init(matroxfb_init);

  module_exit(matrox_done);
  EXPORT_SYMBOL(matroxfb_register_driver);
  EXPORT_SYMBOL(matroxfb_unregister_driver);



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer





Re: [PATCH v9 07/10] mm: Device exclusive memory access

2021-05-26 Thread Peter Xu
On Mon, May 24, 2021 at 11:27:22PM +1000, Alistair Popple wrote:
> Some devices require exclusive write access to shared virtual
> memory (SVM) ranges to perform atomic operations on that memory. This
> requires CPU page tables to be updated to deny access whilst atomic
> operations are occurring.
> 
> In order to do this introduce a new swap entry
> type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
> exclusive access by a device all page table mappings for the particular
> range are replaced with device exclusive swap entries. This causes any
> CPU access to the page to result in a fault.
> 
> Faults are resolved by replacing the faulting entry with the original
> mapping. This results in MMU notifiers being called which a driver uses
> to update access permissions such as revoking atomic access. After
> notifiers have been called the device will no longer have exclusive
> access to the region.
> 
> Walking of the page tables to find the target pages is handled by
> get_user_pages() rather than a direct page table walk. A direct page
> table walk similar to what migrate_vma_collect()/unmap() does could also
> have been utilised. However this resulted in more code similar in
> functionality to what get_user_pages() provides as page faulting is
> required to make the PTEs present and to break COW.
> 
> Signed-off-by: Alistair Popple 
> Reviewed-by: Christoph Hellwig 
> 
> ---
> 
> v9:
> * Split rename of migrate_pgmap_owner into a separate patch.
> * Added comments explaining SWP_DEVICE_EXCLUSIVE_* entries.
> * Renamed try_to_protect{_one} to page_make_device_exclusive{_one} based
>   somewhat on a suggestion from Peter Xu. I was never particularly happy
>   with try_to_protect() as a name so think this is better.
> * Removed unnecessary code and reworded some comments based on feedback
>   from Peter Xu.
> * Removed the VMA walk when restoring PTEs for device-exclusive entries.
> * Simplified implementation of copy_pte_range() to fail if the page
>   cannot be locked. This might lead to occasional fork() failures but at
>   this stage we don't think that will be an issue.
> 
> v8:
> * Remove device exclusive entries on fork rather than copy them.
> 
> v7:
> * Added Christoph's Reviewed-by.
> * Minor cosmetic cleanups suggested by Christoph.
> * Replace mmu_notifier_range_init_migrate/exclusive with
>   mmu_notifier_range_init_owner as suggested by Christoph.
> * Replaced lock_page() with lock_page_retry() when handling faults.
> * Restrict to anonymous pages for now.
> 
> v6:
> * Fixed a bisectablity issue due to incorrectly applying the rename of
>   migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test.
> 
> v5:
> * Renamed range->migrate_pgmap_owner to range->owner.
> * Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which
>   allows notifiers called as a result of make_device_exclusive_range() to
>   be ignored.
> * Added a check to try_to_protect_one() to detect if the pages originally
>   returned from get_user_pages() have been unmapped or not.
> * Removed check_device_exclusive_range() as it is no longer required with
>   the other changes.
> * Documentation update.
> 
> v4:
> * Add function to check that mappings are still valid and exclusive.
> * s/long/unsigned long/ in make_device_exclusive_entry().
> ---
>  Documentation/vm/hmm.rst |  17 
>  include/linux/mmu_notifier.h |   6 ++
>  include/linux/rmap.h |   4 +
>  include/linux/swap.h |   7 +-
>  include/linux/swapops.h  |  44 -
>  mm/hmm.c |   5 +
>  mm/memory.c  | 128 +++-
>  mm/mprotect.c|   8 ++
>  mm/page_vma_mapped.c |   9 +-
>  mm/rmap.c| 186 +++
>  10 files changed, 405 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> index 3df79307a797..a14c2938e7af 100644
> --- a/Documentation/vm/hmm.rst
> +++ b/Documentation/vm/hmm.rst
> @@ -405,6 +405,23 @@ between device driver specific code and shared common 
> code:
>  
> The lock can now be released.
>  
> +Exclusive access memory
> +===
> +
> +Some devices have features such as atomic PTE bits that can be used to 
> implement
> +atomic access to system memory. To support atomic operations to a shared 
> virtual
> +memory page such a device needs access to that page which is exclusive of any
> +userspace access from the CPU. The ``make_device_exclusive_range()`` function
> +can be used to make a memory range inaccessible from userspace.
> +
> +This replaces all mappings for pages in the given range with special swap
> +entries. Any attempt to access the swap entry results in a fault which is
> +resovled by replacing the entry with the original mapping. A driver gets
> +notified that the mapping has been changed by MMU notifiers, after which 
> point
> +it will no longer have exclusive access to the page. Exclu

Re: [PATCH v2] drm/fb-helper: improve DRM fbdev emulation device names

2021-05-26 Thread Thomas Zimmermann



Am 25.05.21 um 17:13 schrieb Javier Martinez Canillas:

Framebuffer devices that are registered by DRM drivers for fbdev emulation
have a "drmfb" suffix in their name. But this makes them quite confusing
for drivers that already have "drm" in their name:

$ cat /proc/fb
0 rockchipdrmdrmfb

$ cat /proc/fb
0 simpledrmdrmfb

Also, there isn't a lot of value in adding these "drmfb" suffixes to their
names, since users shouldn't really care if the FB devices were registered
by a real fbdev driver or a DRM driver using the fbdev emulation.

What programs should be interested in is whether there's a DRM device, and
there are better ways to query that info than reading this procfs entry.

So let's just remove the suffix, which leads to much better device names:

$ cat /proc/fb
0 rockchipdrm

$ cat /proc/fb
0 simpledrm

Suggested-by: Thomas Zimmermann 
Signed-off-by: Javier Martinez Canillas 


Added to drm-misc-next. Thank you.

Best regards
Thomas


---

Changes in v2:
- Just remove the "drmfb" suffix instead of using a different one (tzimmermann).

  drivers/gpu/drm/drm_fb_helper.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index f6baa204612..d77a24507d3 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1737,7 +1737,7 @@ void drm_fb_helper_fill_info(struct fb_info *info,
   sizes->fb_width, sizes->fb_height);
  
  	info->par = fb_helper;

-   snprintf(info->fix.id, sizeof(info->fix.id), "%sdrmfb",
+   snprintf(info->fix.id, sizeof(info->fix.id), "%s",
 fb_helper->dev->driver->name);
  
  }




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH 0/4] drm: Finally retire struct drm_format_name_buf

2021-05-26 Thread Thomas Zimmermann

ping for further a-bs / r-bs

Am 16.05.21 um 14:13 schrieb Thomas Zimmermann:

This is a cleanup patchset to remove drm_format_name_buf et al. There
are two instances in drivers that need to be replaced with the %p4cc
printk format modifier. Patch 3 was left over from an earlier
patchset. [1] Patch 4 removes struct drm_format_name_buf.

I build-tested with drm-tip. The patchset can be merged through drm-misc.

[1] 
https://lore.kernel.org/dri-devel/20210216155723.17109-1-sakari.ai...@linux.intel.com/

Sakari Ailus (1):
   drm: Remove drm_get_format_name()

Thomas Zimmermann (3):
   drm/amdgpu: Use %p4cc to print 4CC format
   drm/simpledrm: Use %p4cc to print 4CC format
   drm/fourcc: Remove struct drm_format_name_buf

  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  7 ++
  drivers/gpu/drm/drm_fourcc.c| 25 -
  drivers/gpu/drm/tiny/simpledrm.c|  6 ++---
  include/drm/drm_fourcc.h|  9 
  4 files changed, 4 insertions(+), 43 deletions(-)

--
2.31.1





Re: [PATCH RESEND] drm/hisilicon/kirin: Use the correct HiSilicon copyright

2021-05-26 Thread Thomas Zimmermann



Am 22.05.21 um 12:15 schrieb Hao Fang:

s/Hisilicon/HiSilicon/.
It should use capital S, according to
https://www.hisilicon.com/en.

Signed-off-by: Hao Fang 
Acked-by: Tian Tao 


Applied to drm-misc-next. Thanks a lot.

Best regards
Thomas


---
  drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c| 2 +-
  drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h| 2 +-
  drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h | 2 +-
  drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 2 +-
  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c | 2 +-
  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h | 2 +-
  6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c 
b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
index 00e87c2..9b565a0 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
@@ -3,7 +3,7 @@
   * DesignWare MIPI DSI Host Controller v1.02 driver
   *
   * Copyright (c) 2016 Linaro Limited.
- * Copyright (c) 2014-2016 Hisilicon Limited.
+ * Copyright (c) 2014-2016 HiSilicon Limited.
   *
   * Author:
   *Xinliang Liu 
diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h 
b/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h
index 19e81ff..d79fc03 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h
@@ -1,7 +1,7 @@
  /* SPDX-License-Identifier: GPL-2.0-only */
  /*
   * Copyright (c) 2016 Linaro Limited.
- * Copyright (c) 2014-2016 Hisilicon Limited.
+ * Copyright (c) 2014-2016 HiSilicon Limited.
   */
  
  #ifndef __DW_DSI_REG_H__

diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h 
b/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h
index e2ac098..be9e789 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h
@@ -1,7 +1,7 @@
  /* SPDX-License-Identifier: GPL-2.0-only */
  /*
   * Copyright (c) 2016 Linaro Limited.
- * Copyright (c) 2014-2016 Hisilicon Limited.
+ * Copyright (c) 2014-2016 HiSilicon Limited.
   */
  
  #ifndef __KIRIN_ADE_REG_H__

diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c 
b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
index 6dcf9ec..1ab9462 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
@@ -3,7 +3,7 @@
   * Hisilicon Hi6220 SoC ADE(Advanced Display Engine)'s crtc&plane driver
   *
   * Copyright (c) 2016 Linaro Limited.
- * Copyright (c) 2014-2016 Hisilicon Limited.
+ * Copyright (c) 2014-2016 HiSilicon Limited.
   *
   * Author:
   *Xinliang Liu 
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c 
b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
index 4349da3..e590e19 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
@@ -3,7 +3,7 @@
   * Hisilicon Kirin SoCs drm master driver
   *
   * Copyright (c) 2016 Linaro Limited.
- * Copyright (c) 2014-2016 Hisilicon Limited.
+ * Copyright (c) 2014-2016 HiSilicon Limited.
   *
   * Author:
   *Xinliang Liu 
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h 
b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
index 386d137..db0dc7b 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
@@ -1,7 +1,7 @@
  /* SPDX-License-Identifier: GPL-2.0-only */
  /*
   * Copyright (c) 2016 Linaro Limited.
- * Copyright (c) 2014-2016 Hisilicon Limited.
+ * Copyright (c) 2014-2016 HiSilicon Limited.
   */
  
  #ifndef __KIRIN_DRM_DRV_H__






Re: [PATCH] drm: fix leaked dma handles after removing drm_pci_free

2021-05-26 Thread Thomas Zimmermann



Am 18.05.21 um 23:28 schrieb Joseph Kogut:

After removing drm_pci_alloc/free, some instances where drm_pci_free()
would have kfreed the dma handle were skipped.

Ensure these handles are freed properly.

Signed-off-by: Joseph Kogut 


Applied to drm-misc-next. Thanks

Best regards
Thomas


---
  drivers/gpu/drm/drm_bufs.c | 1 +
  drivers/gpu/drm/r128/ati_pcigart.c | 2 ++
  2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index ea3ca81be9dd..7eb3baed9a70 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -685,6 +685,7 @@ static void drm_cleanup_buf_error(struct drm_device *dev,
  dmah->size,
  dmah->vaddr,
  dmah->busaddr);
+   kfree(dmah);
}
}
kfree(entry->seglist);
diff --git a/drivers/gpu/drm/r128/ati_pcigart.c 
b/drivers/gpu/drm/r128/ati_pcigart.c
index fbb0cfd79758..04408f372f74 100644
--- a/drivers/gpu/drm/r128/ati_pcigart.c
+++ b/drivers/gpu/drm/r128/ati_pcigart.c
@@ -71,6 +71,8 @@ static void drm_ati_free_pcigart_table(struct drm_device *dev,
drm_dma_handle_t *dmah = gart_info->table_handle;
  
  	dma_free_coherent(dev->dev, dmah->size, dmah->vaddr, dmah->busaddr);

+   kfree(dmah);
+
gart_info->table_handle = NULL;
  }
  





Re: [PATCH v2] drm/ast: Add detect function support

2021-05-26 Thread Thomas Zimmermann

Hi

Am 26.05.21 um 13:15 schrieb ainux.w...@gmail.com:

From: Ainux 

The existence of the connector cannot be detected,
so add a detect function to support it.

Signed-off-by: Ainux 


Looks good. If no one else comments, I'll merge the patch soon. Thanks a 
lot.


Best regards
Thomas


---
  drivers/gpu/drm/ast/ast_mode.c | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 36d9575aa27b..e5996ae03c49 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1293,6 +1293,18 @@ static enum drm_mode_status ast_mode_valid(struct 
drm_connector *connector,
return flags;
  }
  
+static enum drm_connector_status ast_connector_detect(struct drm_connector

+  *connector, bool force)
+{
+   int r;
+
+   r = ast_get_modes(connector);
+   if (r < 0)
+   return connector_status_disconnected;
+
+   return connector_status_connected;
+}
+
  static void ast_connector_destroy(struct drm_connector *connector)
  {
struct ast_connector *ast_connector = to_ast_connector(connector);
@@ -1307,6 +1319,7 @@ static const struct drm_connector_helper_funcs 
ast_connector_helper_funcs = {
  
  static const struct drm_connector_funcs ast_connector_funcs = {

.reset = drm_atomic_helper_connector_reset,
+   .detect = ast_connector_detect,
.fill_modes = drm_helper_probe_single_connector_modes,
.destroy = ast_connector_destroy,
.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
@@ -1334,7 +1347,8 @@ static int ast_connector_init(struct drm_device *dev)
connector->interlace_allowed = 0;
connector->doublescan_allowed = 0;
  
-	connector->polled = DRM_CONNECTOR_POLL_CONNECT;

+   connector->polled = DRM_CONNECTOR_POLL_CONNECT |
+   DRM_CONNECTOR_POLL_DISCONNECT;
  
  	drm_connector_attach_encoder(connector, encoder);
  
@@ -1403,6 +1417,8 @@ int ast_mode_config_init(struct ast_private *ast)
  
  	drm_mode_config_reset(dev);
  
+	drm_kms_helper_poll_init(dev);

+
return 0;
  }
  





Re: [PATCH] drm/i915: Use generic_access_phys

2021-05-26 Thread kernel test robot
Hi Daniel,

I love your patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip v5.13-rc3 next-20210526]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm-i915-Use-generic_access_phys/20210526-231425
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-a001-20210526 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/80157be9e8542ce9a835e6f159408b951590b578
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Daniel-Vetter/drm-i915-Use-generic_access_phys/20210526-231425
git checkout 80157be9e8542ce9a835e6f159408b951590b578
# save the attached .config to linux build tree
make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_mman.c:755:2: error: request for member 
>> 'open' in something not a structure or union
 755 |  .open = vm_open,
 |  ^
   drivers/gpu/drm/i915/gem/i915_gem_mman.c:764:2: error: request for member 
'open' in something not a structure or union
 764 |  .open = vm_open,
 |  ^


vim +/open +755 drivers/gpu/drm/i915/gem/i915_gem_mman.c

cc662126b4134e2 Abdiel Janulgue 2019-12-04  749  
cc662126b4134e2 Abdiel Janulgue 2019-12-04  750  static const struct 
vm_operations_struct vm_ops_gtt = {
cc662126b4134e2 Abdiel Janulgue 2019-12-04  751 .fault = vm_fault_gtt,
80157be9e8542ce Daniel Vetter   2021-05-26  752  #ifdef CONFIG_HAVE_IOREMAP_PROT
80157be9e8542ce Daniel Vetter   2021-05-26  753 .access = 
generic_access_phys
80157be9e8542ce Daniel Vetter   2021-05-26  754  #endif
cc662126b4134e2 Abdiel Janulgue 2019-12-04 @755 .open = vm_open,
cc662126b4134e2 Abdiel Janulgue 2019-12-04  756 .close = vm_close,
cc662126b4134e2 Abdiel Janulgue 2019-12-04  757  };
cc662126b4134e2 Abdiel Janulgue 2019-12-04  758  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[PATCH 1/1] drm/i915: Engine relative MMIO

2021-05-26 Thread Matthew Brost
With virtual engines, it is no longer possible to know which specific
physical engine a given request will be executed on at the time that
request is generated. This means that the request itself must be engine
agnostic - any direct register writes must be relative to the engine
and not absolute addresses.

The LRI command has support for engine relative addressing. However,
the mechanism is not transparent to the driver. The scheme for Gen11
(MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
absolute engine base component in the ring and BBs. The hardware then
adds on the correct engine offset at execution time. This differs
slightly for LRC where the upper bits of the base component are just
ignored.

Due to the non-trivial and differing schemes on different hardware, it
is not possible to simply update the code that creates the LRI
commands to set a remap flag and let the hardware get on with it.
Instead, this patch adds function wrappers for generating the LRI
command itself and then for constructing the correct address to use
with the LRI.

Bspec: 45606
Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
CC: Rodrigo Vivi 
CC: Tvrtko Ursulin 
CC: Chris P Wilson 
CC: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c  |  7 ---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 22 
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  6 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  4 +---
 5 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 188dee13e017..a8a195bfcb57 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1211,7 +1211,7 @@ static int emit_ppgtt_update(struct i915_request *rq, 
void *data)
 {
struct i915_address_space *vm = rq->context->vm;
struct intel_engine_cs *engine = rq->engine;
-   u32 base = engine->mmio_base;
+   u32 base = engine->lri_mmio_base;
u32 *cs;
int i;
 
@@ -1223,7 +1223,7 @@ static int emit_ppgtt_update(struct i915_request *rq, 
void *data)
if (IS_ERR(cs))
return PTR_ERR(cs);
 
-   *cs++ = MI_LOAD_REGISTER_IMM(2);
+   *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2);
 
*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 0));
*cs++ = upper_32_bits(pd_daddr);
@@ -1245,7 +1245,8 @@ static int emit_ppgtt_update(struct i915_request *rq, 
void *data)
if (IS_ERR(cs))
return PTR_ERR(cs);
 
-   *cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | 
MI_LRI_FORCE_POSTED;
+   *cs++ = MI_LOAD_REGISTER_IMM_REL(engine, 2 * GEN8_3LVL_PDPES) |
+   MI_LRI_FORCE_POSTED;
for (i = GEN8_3LVL_PDPES; i--; ) {
const dma_addr_t pd_daddr = 
i915_page_dir_dma_addr(ppgtt, i);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 3f9a811eb02b..0de6bc533776 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -15,6 +15,7 @@
 #include "intel_engine_pm.h"
 #include "intel_engine_user.h"
 #include "intel_execlists_submission.h"
+#include "intel_gpu_commands.h"
 #include "intel_gt.h"
 #include "intel_gt_requests.h"
 #include "intel_gt_pm.h"
@@ -222,6 +223,25 @@ static u32 __engine_mmio_base(struct drm_i915_private 
*i915,
return bases[i].base;
 }
 
+static bool i915_engine_has_relative_lri(const struct intel_engine_cs *engine)
+{
+   if (INTEL_GEN(engine->i915) < 11)
+   return false;
+
+   return true;
+}
+
+static void lri_init(struct intel_engine_cs *engine)
+{
+   if (i915_engine_has_relative_lri(engine)) {
+   engine->lri_cmd_mode = MI_LRI_LRM_CS_MMIO;
+   engine->lri_mmio_base = 0;
+   } else {
+   engine->lri_cmd_mode = 0;
+   engine->lri_mmio_base = engine->mmio_base;
+   }
+}
+
 static void __sprint_engine_name(struct intel_engine_cs *engine)
 {
/*
@@ -329,6 +349,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
/* Nothing to do here, execute in order of dependencies */
engine->schedule = NULL;
 
+   lri_init(engine);
+
ewma__engine_latency_init(&engine->latency);
seqcount_init(&engine->stats.lock);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 9ef349cd5cea..e48da23c9b0f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -310,6 +310,9 @@ struct intel_engine_cs {
u32 context_size;
u32 mmio_base;
 
+   u32 lri_mmio_base;
+   u32 lri_cmd_mode;
+
 

[PATCH 0/1] Engine relative MMIO

2021-05-26 Thread Matthew Brost
As discussed in [1] we are breaking that large series into several
smaller ones. This series is a standalone patch, part of step #4, which has
no other dependencies or patches relevant to it.

Taking ownership of the patch in this series from John Harrison per his
request.

Trybot CI [2] looks good, let's try to get this reviewed and merged
quickly.

Signed-off-by: Matthew Brost 

[1] https://patchwork.freedesktop.org/series/89844/
[2] https://patchwork.freedesktop.org/series/90573/

Matthew Brost (1):
  drm/i915: Engine relative MMIO

 drivers/gpu/drm/i915/gem/i915_gem_context.c  |  7 ---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 22 
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  6 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  4 +---
 5 files changed, 36 insertions(+), 6 deletions(-)

-- 
2.28.0



Re: [Intel-gfx] [RFC PATCH 60/97] drm/i915: Track 'serial' counts for virtual engines

2021-05-26 Thread John Harrison

On 5/26/2021 01:40, Tvrtko Ursulin wrote:

On 25/05/2021 18:52, Matthew Brost wrote:

On Tue, May 25, 2021 at 11:16:12AM +0100, Tvrtko Ursulin wrote:


On 06/05/2021 20:14, Matthew Brost wrote:

From: John Harrison 

The serial number tracking of engines happens at the backend of
request submission and was expecting to only be given physical
engines. However, in GuC submission mode, the decomposition of virtual
to physical engines does not happen in i915. Instead, requests are
submitted to their virtual engine mask all the way through to the
hardware (i.e. to GuC). This would mean that the heart beat code
thinks the physical engines are idle due to the serial number not
incrementing.

This patch updates the tracking to decompose virtual engines into
their physical constituents and tracks the request against each. This
is not entirely accurate as the GuC will only be issuing the request
to one physical engine. However, it is the best that i915 can do given
that it has no knowledge of the GuC's scheduling decisions.


Commit text sounds a bit defeatist. I think instead of making up the 
serial
counts, which has downsides (could you please document in the commit 
what

they are), we should think how to design things properly.



IMO, I don't think fixing serial counts is the scope of this series. We
should focus on getting GuC submission in not cleaning up all the crap
that is in the i915. Let's make a note of this though so we can revisit
later.


I will say again - commit message implies it is introducing an 
unspecified downside by not fully fixing an also unspecified issue. It 
is completely reasonable, and customary even, to ask for both to be 
documented in the commit message.
Not sure what exactly is 'unspecified'. I thought the commit message 
described both the problem (heartbeat not running when using virtual 
engines) and the result (heartbeat running on more engines than strictly 
necessary). But in greater detail...


The serial number tracking is a hack for the heartbeat code to know 
whether an engine is busy or idle, and therefore whether it should be 
pinged for aliveness. Whenever a submission is made to an engine, the 
serial number is incremented. The heartbeat code keeps a copy of the 
value. If the value has changed, the engine is busy and needs to be pinged.


This works fine for execlist mode where virtual engine decomposition is 
done inside i915. It fails miserably for GuC mode where the 
decomposition is done by the hardware. The reason being that the 
heartbeat code only looks at physical engines but the serial count is 
only incremented on the virtual engine. Thus, the heartbeat sees 
everything as idle and does not ping.


This patch decomposes the virtual engines for the sake of incrementing 
the serial count on each sub-engine in order to keep the heartbeat code 
happy. The downside is that now the heartbeat sees all sub-engines as 
busy rather than only the one the submission actually ends up on. There 
really isn't much that can be done about that. The heartbeat code is in 
i915 not GuC, the scheduler is in GuC not i915. The only way to improve 
it is to either move the heartbeat code into GuC as well and completely 
disable the i915 side, or add some way for i915 to interrogate GuC as to 
which engines are or are not active. Technically, we do have both. GuC 
has (or at least had) an option to force a context switch on every 
execution quantum pre-emption. However, that is much, much, more heavy 
weight than the heartbeat. For the latter, we do (almost) have the 
engine usage statistics for PMU and such like. I'm not sure how much 
effort it would be to wire that up to the heartbeat code instead of 
using the serial count.


In short, the serial count is ever so slightly inefficient in that it 
causes heartbeat pings on engines which are idle. On the other hand, it 
is way more efficient and simpler than the current alternatives.


Does that answer the questions?

John.




If we are abandoning the normal review process someone please say so I 
don't waste my time reading it.


Regards,

Tvrtko


Matt


Regards,

Tvrtko


Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
---
   drivers/gpu/drm/i915/gt/intel_engine_types.h |  2 ++
   .../gpu/drm/i915/gt/intel_execlists_submission.c |  6 ++
   drivers/gpu/drm/i915/gt/intel_ring_submission.c  |  6 ++
   drivers/gpu/drm/i915/gt/mock_engine.c    |  6 ++
   .../gpu/drm/i915/gt/uc/intel_guc_submission.c    | 16 


   drivers/gpu/drm/i915/i915_request.c  |  4 +++-
   6 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h

index 86302e6d86b2..e2b5cda6dbc4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -389,6 +389,8 @@ struct intel_engine_cs {
   void    (*park)(struct intel_engine_cs *engine);
   

Re: [PATCH 5/5] PCI: Support ASpeed VGA cards behind a misbehaving bridge

2021-05-26 Thread Bjorn Helgaas
On Wed, May 26, 2021 at 01:00:33PM +1000, Dave Airlie wrote:
> > > > > I think I would see if it's possible to call
> > > > > vga_arb_select_default_device() from vga_arbiter_add_pci_device()
> > > > > instead of from vga_arb_device_init().
> > > > >
> > > > > I would also (as a separate patch) try to get rid of this loop in
> > > > > vga_arb_device_init():
> > > > >
> > > > > list_for_each_entry(vgadev, &vga_list, list) {
> > > > > struct device *dev = &vgadev->pdev->dev;
> > > > >
> > > > > if (vgadev->bridge_has_one_vga)
> > > > > vgaarb_info(dev, "bridge control possible\n");
> > > > > else
> > > > > vgaarb_info(dev, "no bridge control 
> > > > > possible\n");
> > > > > }
> > > > >
> > > > > and do the vgaarb_info() in vga_arbiter_check_bridge_sharing(), where
> > > > > the loop would not be needed.
> > > >
> > > > Any updates?
> > >
> > > Are you waiting for me to do something else?
> > >
> > > I suggested an approach above, but I don't have time to actually do
> > > the work for you.
> >
> > Yes, I am really waiting... but I am also investigating history
> > and thinking.

Well, don't wait for me because this work is not on my to-do list :)

> > If I haven't missed something (correct me if I'm wrong). For the
> > original HiSilicon problem, the first attempt is to modify
> > vga_arbiter_add_pci_device() and remove the VGA_RSRC_LEGACY_MASK
> > check. But vga_arbiter_add_pci_device() is called for each PCI device,
> > so removing that check will cause the first VGA device to be the
> > default VGA device. This breaks some x86 platforms, so after that you
> > don't touch vga_arbiter_add_pci_device(), but add
> > vga_arb_select_default_device() in vga_arb_device_init().
> >
> > If the above history is correct, then we cannot add
> > vga_arb_select_default_device() in vga_arbiter_add_pci_device()
> > directly. So it seems we can only add vga_arb_select_default_device()
> > in pci_notify(). And if we don't care about hotplug, we can simply use
> > subsys_initcall_sync() to wrap vga_arb_device_init().
> >
> > And DRM developers, please let me know what do you think about?
> 
> I'm not 100% following what is going on here.
> 
> Do you need call vga_arb_select_default_device after hotplug for some
> reason, or it this just a race with subsys_init?
> 
> I think just adding subsys_initcall_sync should be fine

Doing subsys_initcall_sync(vga_arb_device_init) is probably "OK".  I
don't think it's *great* because initcalls don't make dependencies
explicit so it won't be obvious *why* it's subsys_initcall_sync, and
it feels a little like a band-aid.

> I don't see why you'd want to care about making a hotplug VGA device
> the default at this point.

I don't think hotplug per se is relevant for this ASpeed case.

But I think the current design is slightly broken in that we set up
the machinery to call vga_arbiter_add_pci_device() for hot-added
devices, but a hot-added device can never be set as the default VGA
device.

Imagine a system with a single VGA device.  If that device is plugged
in before boot, it becomes the default VGA device.  If it is hot-added
after boot, it does not.  That inconsistency feels wrong to me.

If it were possible to set the default VGA device in
vga_arbiter_add_pci_device(), it would fix that inconsistency and
solve the ASpeed case.  But maybe that's not practical.

Bjorn


Re: [Intel-gfx] [RFC PATCH 39/97] drm/i915/guc: Increase size of CTB buffers

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 10:30:27AM +0100, Tvrtko Ursulin wrote:
> 
> On 25/05/2021 18:15, Matthew Brost wrote:
> > On Tue, May 25, 2021 at 10:24:09AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 06/05/2021 20:13, Matthew Brost wrote:
> > > > With the introduction of non-blocking CTBs more than one CTB can be in
> > > > flight at a time. Increasing the size of the CTBs should reduce how
> > > > often software hits the case where no space is available in the CTB
> > > > buffer.
> > > 
> > > I'd move this before the patch which adds the non-blocking send since that
> > > one claims congestion should be rare with properly sized buffers. So it
> > > makes sense to have them sized properly back before that one.
> > > 
> > 
> > IMO patch ordering is a bit of bikeshed. All these CTBs changes required
> > for GuC submission (34-40, 54) will get posted its own series and get
> > merged together. None of the individual patches break anything or is any
> > of this code really used until GuC submission is turned on. I can move
> > this when I post these patches by themselves but I just don't really see
> > the point either way.
> 
> As a general principle we do try to have work in the order which makes sense
> functionality wise.
> 
> That includes trying to avoid adding and then removing, or changing a lot,
> the same code within the series. And also adding functionality which is
> known to not work completely well until later in the series.
> 
> With a master switch at the end of series you can sometimes get away with
> it, but if nothing else it at least makes it much easier to read if things
> are flowing in the expected way within (the series).
> 
> In this particular example sizing the buffers appropriately before starting
> to use the facility a lot more certainly sounds like a no brainer to me,
> especially since the patch is so trivial to move conflict wise.
> 

Fair enough. I'll reorder these patches when I do a post to merge these
ones.

Matt

> Regards,
> 
> Tvrtko
> 
> > Matt
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > Cc: John Harrison 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ---
> > > >1 file changed, 8 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > index 77dfbc94dcc3..d6895d29ed2d 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct 
> > > > intel_guc_ct *ct)
> > > > *  
> > > > ++---+--+
> > > > *
> > > > * Size of each `CT Buffer`_ must be multiple of 4K.
> > > > - * As we don't expect too many messages, for now use minimum sizes.
> > > > + * We don't expect too many messages in flight at any time, unless we 
> > > > are
> > > > + * using the GuC submission. In that case each request requires a 
> > > > minimum
> > > > + * 16 bytes which gives us a maximum of 256 queued requests. Hopefully
> > > > + * this is enough space to avoid backpressure on the driver. We increase the 
> > > > size
> > > > + * of the receive buffer (relative to the send) to ensure a G2H 
> > > > response
> > > > + * CTB has a landing spot.
> > > > */
> > > >#define CTB_DESC_SIZEALIGN(sizeof(struct 
> > > > guc_ct_buffer_desc), SZ_2K)
> > > >#define CTB_H2G_BUFFER_SIZE  (SZ_4K)
> > > > -#define CTB_G2H_BUFFER_SIZE(SZ_4K)
> > > > +#define CTB_G2H_BUFFER_SIZE(4 * CTB_H2G_BUFFER_SIZE)
> > > >#define MAX_US_STALL_CTB 100
> > > > @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct 
> > > > ct_incoming_msg **msg)
> > > > /* beware of buffer wrap case */
> > > > if (unlikely(available < 0))
> > > > available += size;
> > > > -   CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
> > > > +   CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, 
> > > > tail, size);
> > > > GEM_BUG_ON(available < 0);
> > > > header = cmds[head];
> > > > 


Re: [Intel-gfx] [RFC PATCH 55/97] drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 10:21:05AM +0100, Tvrtko Ursulin wrote:
> 
> On 25/05/2021 18:07, Matthew Brost wrote:
> > On Tue, May 25, 2021 at 11:06:00AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 06/05/2021 20:14, Matthew Brost wrote:
> > > > When running the GuC the GPU can't be considered idle if the GuC still
> > > > has contexts pinned. As such, a call has been added in
> > > > intel_gt_wait_for_idle to idle the UC and in turn the GuC by waiting for
> > > > the number of unpinned contexts to go to zero.
> > > > 
> > > > Cc: John Harrison 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
> > > >drivers/gpu/drm/i915/gt/intel_gt.c| 18 
> > > >drivers/gpu/drm/i915/gt/intel_gt.h|  2 +
> > > >drivers/gpu/drm/i915/gt/intel_gt_requests.c   | 22 ++---
> > > >drivers/gpu/drm/i915/gt/intel_gt_requests.h   |  7 +-
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc.h|  4 +
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  1 +
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 +
> > > >.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 91 
> > > > ++-
> > > >drivers/gpu/drm/i915/gt/uc/intel_uc.h |  5 +
> > > >drivers/gpu/drm/i915/i915_debugfs.c   |  1 +
> > > >drivers/gpu/drm/i915/i915_gem_evict.c |  1 +
> > > >.../gpu/drm/i915/selftests/igt_live_test.c|  2 +-
> > > >.../gpu/drm/i915/selftests/mock_gem_device.c  |  3 +-
> > > >14 files changed, 137 insertions(+), 27 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
> > > > b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > > index 8598a1c78a4c..2f5295c9408d 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> > > > @@ -634,7 +634,8 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
> > > > goto insert;
> > > > /* Attempt to reap some mmap space from dead objects */
> > > > -   err = intel_gt_retire_requests_timeout(&i915->gt, 
> > > > MAX_SCHEDULE_TIMEOUT);
> > > > +   err = intel_gt_retire_requests_timeout(&i915->gt, 
> > > > MAX_SCHEDULE_TIMEOUT,
> > > > +  NULL);
> > > > if (err)
> > > > goto err;
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_gt.c
> > > > index 8d77dcbad059..1742a8561f69 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > > > @@ -574,6 +574,24 @@ static void __intel_gt_disable(struct intel_gt *gt)
> > > > GEM_BUG_ON(intel_gt_pm_is_awake(gt));
> > > >}
> > > > +int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout)
> > > > +{
> > > > +   long rtimeout;
> > > > +
> > > > +   /* If the device is asleep, we have no requests outstanding */
> > > > +   if (!intel_gt_pm_is_awake(gt))
> > > > +   return 0;
> > > > +
> > > > +   while ((timeout = intel_gt_retire_requests_timeout(gt, timeout,
> > > > +  &rtimeout)) > 0) {
> > > > +   cond_resched();
> > > > +   if (signal_pending(current))
> > > > +   return -EINTR;
> > > > +   }
> > > > +
> > > > +   return timeout ? timeout : intel_uc_wait_for_idle(&gt->uc, rtimeout);
> > > > +}
> > > > +
> > > >int intel_gt_init(struct intel_gt *gt)
> > > >{
> > > > int err;
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
> > > > b/drivers/gpu/drm/i915/gt/intel_gt.h
> > > > index 7ec395cace69..c775043334bf 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt.h
> > > > @@ -48,6 +48,8 @@ void intel_gt_driver_release(struct intel_gt *gt);
> > > >void intel_gt_driver_late_release(struct intel_gt *gt);
> > > > +int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
> > > > +
> > > >void intel_gt_check_and_clear_faults(struct intel_gt *gt);
> > > >void intel_gt_clear_error_registers(struct intel_gt *gt,
> > > > intel_engine_mask_t engine_mask);
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > > > index 647eca9d867a..c6c702f236fa 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > > > @@ -13,6 +13,7 @@
> > > >#include "intel_gt_pm.h"
> > > >#include "intel_gt_requests.h"
> > > >#include "intel_timeline.h"
> > > > +#include "uc/intel_uc.h"
> > > >static bool retire_requests(struct intel_timeline *tl)
> > > >{
> > > > @@ -130,7 +131,8 @@ void intel_engine_fini_retire(struct 
> > > > intel_engine_cs *engine)
> > > > GEM_BUG_ON(engine->retire);
> > > >   

Re: [Intel-gfx] [RFC PATCH 53/97] drm/i915/guc: Disable semaphores when using GuC scheduling

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 10:25:13AM +0100, Tvrtko Ursulin wrote:
> 
> On 25/05/2021 18:01, Matthew Brost wrote:
> > On Tue, May 25, 2021 at 10:52:01AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 06/05/2021 20:14, Matthew Brost wrote:
> > > > Disable semaphores when using GuC scheduling as semaphores are broken in
> > > > the current GuC firmware.
> > > 
> > > What is "current"? Given that the patch itself is like year and a half 
> > > old.
> > > 
> > 
> > Stale comment. Semaphores work with the firmware; we just haven't enabled
> > them in the i915 with GuC submission, as this is an optimization and not
> > required for functionality.
> 
> How will the updated commit message look in terms of remaining reasons why
> semaphores won't/can't be enabled?
> 

Semaphores are an optimization and not required for basic GuC submission
to work properly. Disable until we have time to do the implementation to
enable semaphores and tune them for performance.

> They were a nice performance win on some media workloads although granted a
> lot of tweaking was required to find a good balance on when to use them and
> when not to.
>

The same tweaking would have to be done with GuC submission. Let's
get basic submission in first, then tweak for performance.

Matt 
 
> Regards,
> 
> Tvrtko
> 
> > Matt
> > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > Cc: John Harrison 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 --
> > > >1 file changed, 4 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > index 993faa213b41..d30260ffe2a7 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > @@ -230,7 +230,8 @@ static void intel_context_set_gem(struct 
> > > > intel_context *ce,
> > > > ce->timeline = intel_timeline_get(ctx->timeline);
> > > > if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> > > > -   intel_engine_has_timeslices(ce->engine))
> > > > +   intel_engine_has_timeslices(ce->engine) &&
> > > > +   intel_engine_has_semaphores(ce->engine))
> > > > __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > > > intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> > > > @@ -1939,7 +1940,8 @@ static int __apply_priority(struct intel_context 
> > > > *ce, void *arg)
> > > > if (!intel_engine_has_timeslices(ce->engine))
> > > > return 0;
> > > > -   if (ctx->sched.priority >= I915_PRIORITY_NORMAL)
> > > > +   if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> > > > +   intel_engine_has_semaphores(ce->engine))
> > > > intel_context_set_use_semaphores(ce);
> > > > else
> > > > intel_context_clear_use_semaphores(ce);
> > > > 


Re: [Intel-gfx] [RFC PATCH 36/97] drm/i915/guc: Add non blocking CTB send function

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 09:57:10AM +0100, Tvrtko Ursulin wrote:
> 
> On 25/05/2021 18:21, Matthew Brost wrote:
> > On Tue, May 25, 2021 at 10:21:00AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 06/05/2021 20:13, Matthew Brost wrote:
> > > > Add non blocking CTB send function, intel_guc_send_nb. In order to
> > > > support a non blocking CTB send function a spin lock is needed to
> > > > protect the CTB descriptors fields. Also the non blocking call must not
> > > > update the fence value as this value is owned by the blocking call
> > > > (intel_guc_send).
> > > 
> > > Could the commit message say why the non-blocking send function is needed?
> > > 
> > 
> > Sure. Something like:
> > 
> > 'CTBs will be used in the critical path of GuC submission and there is
> > no need to wait for each CTB to complete before moving on in the i915'
> 
> A bit more, like also mentioning the critical path is with interrupts 
> disabled or so. And not just that there is no need to wait but waiting is not 
> possible because this or that. So only choice is to do this busy loop send. 
> It's a bit horrible so justification needs to be documented.
> 

Don't I basically say all this? Anyway, I'll scrub this comment.

> > > > The blocking CTB now must have a flow control mechanism to ensure the
> > > > buffer isn't overrun. A lazy spin wait is used as we believe the flow
> > > > control condition should be rare with properly sized buffer.
> > > > 
> > > > The function, intel_guc_send_nb, is exported in this patch but unused.
> > > > Several patches later in the series make use of this function.
> > > > 
> > > > Signed-off-by: John Harrison 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc.h| 12 ++-
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 96 
> > > > +--
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  7 +-
> > > >3 files changed, 105 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > index c20f3839de12..4c0a367e41d8 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > @@ -75,7 +75,15 @@ static inline struct intel_guc *log_to_guc(struct 
> > > > intel_guc_log *log)
> > > >static
> > > >inline int intel_guc_send(struct intel_guc *guc, const u32 *action, 
> > > > u32 len)
> > > >{
> > > > -   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
> > > > +   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
> > > > +}
> > > > +
> > > > +#define INTEL_GUC_SEND_NB  BIT(31)
> > > > +static
> > > > +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, 
> > > > u32 len)
> > > > +{
> > > > +   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
> > > > +INTEL_GUC_SEND_NB);
> > > >}
> > > >static inline int
> > > > @@ -83,7 +91,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, 
> > > > const u32 *action, u32 len,
> > > >u32 *response_buf, u32 response_buf_size)
> > > >{
> > > > return intel_guc_ct_send(&guc->ct, action, len,
> > > > -response_buf, response_buf_size);
> > > > +response_buf, response_buf_size, 0);
> > > >}
> > > >static inline void intel_guc_to_host_event_handler(struct intel_guc 
> > > > *guc)
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > index a76603537fa8..af7314d45a78 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > @@ -3,6 +3,11 @@
> > > > * Copyright © 2016-2019 Intel Corporation
> > > > */
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > >#include "i915_drv.h"
> > > >#include "intel_guc_ct.h"
> > > >#include "gt/intel_gt.h"
> > > > @@ -308,6 +313,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
> > > > if (unlikely(err))
> > > > goto err_deregister;
> > > > +   ct->requests.last_fence = 1;
> > > > ct->enabled = true;
> > > > return 0;
> > > > @@ -343,10 +349,22 @@ static u32 ct_get_next_fence(struct intel_guc_ct 
> > > > *ct)
> > > > return ++ct->requests.last_fence;
> > > >}
> > > > +static void write_barrier(struct intel_guc_ct *ct) {
> > > > +   struct intel_guc *guc = ct_to_guc(ct);
> > > > +   struct intel_gt *gt = guc_to_gt(guc);
> > > > +
> > > > +   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > > > +   GEM_BUG_ON(guc->send_regs.fw_domains);
> > > > +   intel_uncore_write_fw(gt->uncore, 
> > > > GEN11_SOFT_SCRATCH(0), 0);
> > > 
> > > It's safe to write to this reg? Does it need a comment to explain it?
>

Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

2021-05-26 Thread Daniel Stone
Hey,

On Wed, 26 May 2021 at 16:24, Jason Ekstrand  wrote:
> On Wed, May 26, 2021 at 6:09 AM Daniel Stone  wrote:
> > Typing out the Wayland protocol isn't the hard bit. If we just need to
> > copy and sed syncobj to weirdsyncobj, no problem really, and it gives
> > us a six-month head start on painful compositor-internal surgery
> > whilst we work on common infrastructure to ship userspace fences
> > around (mappable dmabuf with the sync bracketing? FD where every
> > read() gives you the current value? memfd? other?).
>
> I feel like I should elaborate more about timelines.  In my earlier
> reply, my commentary about timeline syncobj was mostly focused around
> helping people avoid typing.  That's not really the full story,
> though, and I hope more context will help.
>
> First, let me say that timeline syncobj was designed as a mechanism to
> implement VK_KHR_timeline_semaphore without inserting future fences
> into the kernel.  It's entirely designed around the needs of Vulkan
> drivers, not really as a window-system primitive.  The semantics are
> designed around one driver communicating to another that new fences
> have been added and it's safe to kick off more rendering.  I'm not
> convinced that it's the right object for window-systems and I'm also
> not convinced that it's a good idea to try and make a version of it
> that's a wrapper around a userspace memory fence.  (I'm going to start
> typing UMF for userspace memory fence because it's long to type out.)
>
> Why?  Well, the fundamental problem with timelines in general is
> trying to figure out when it's about to be done.  But timeline syncobj
> solves this for us!  It gives us this fancy super-useful ioctl!
> Right?  Uh not as well as I'd like.  Let's say we make a timeline
> syncobj that's a wrapper around a userspace memory fence.  What do we
> do with that ioctl?  As I mentioned above, the kernel doesn't have any
> clue when it will be triggered so that ioctl turns into an actual
> wait.  That's no good because it creates unnecessary stalls.

Yeah, I'm assuming that UMF will be a separate primitive. No problem.
I also think that your submitted/completed thing is a non-problem: at
this stage we're just throwing up our hands and admitting that we're
letting userspace tie itself in knots, and giving it the tools to tie
a sufficiently un-streetwise compositor in knots too. We're already
crossing that Rubicon, so let's just embrace it and not try to design
it out. Us compositors can handle the scheduling, really.

> There's another potential solution here:  Have each UMF be two
> timelines: submitted and completed.  At the start of every batch
> that's supposed to trigger a UMF, we set the "submitted" side and
> then, when it completes, we set the "completed" side.  Ok, great, now
> we can get at the "about to be done" with the submitted side,
> implement the ioctl, and we're all good, right?  Sadly, no.  There's
> no guarantee about how long a "batch" takes.  So there's no universal
> timeout the kernel can apply.  Also, if it does time out, the kernel
> doesn't know who to blame for the timeout and how to prevent itself
> from getting in trouble again.  The compositor does so, in theory,
> given the right ioctls, it could detect the -ETIME and kill that
> client.  Not a great solution.
>
> The best option I've been able to come up with for this is some sort
> of client-provided signal.  Something where it says, as part of submit
> or somewhere else, "I promise I'll be done soon" where that promise
> comes with dire consequences if it's not.  At that point, we can turn
> the UMF and a particular wait value into a one-shot fence like a
> dma_fence or sync_file, or signal a syncobj on it.  If it ever times
> out, we kick their context.  In Vulkan terminology, they get
> VK_ERROR_DEVICE_LOST.  There are two important bits here:  First, is
> that it's based on a client-provided thing.  With a fully timeline
> model and wait-before-signal, we can't infer when something is about
> to be done.  Only the client knows when it submitted its last node in
> the dependency graph and the whole mess is unblocked.  Second, is that
> the dma_fence is created within the client's driver context.  If it's
> created compositor-side, the kernel doesn't know who to blame if
> things go badly.  If we create it in the client, it's pretty easy to
> make context death on -ETIME part of the contract.
>
> (Before danvet jumps in here and rants about how UMF -> dma_fence
> isn't possible, I haven't forgotten.  I'm pretending, for now, that
> we've solved some of those problems.)

Funny how we've come full circle to the original proposal here ...

If we really want a kernel primitive for this - and I think it's a
good idea, since can help surface 'badness' in a way which is
observable by e.g. session managers in a way analogous to cgroup stats
and controls - how about this for a counter-proposal? Client exports a
FD for its context/queue and sends it to winsys as part o

Re: [PATCH 15/18] drm/i915/guc: Ensure H2G buffer updates visible before tail update

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 02:36:18PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 26.05.2021 08:42, Matthew Brost wrote:
> > Ensure H2G buffer updates are visible before descriptor tail updates by
> > inserting a barrier between the H2G buffer update and the tail. The
> > barrier is a simple wmb() for SMEM and a register write for LMEM. This is
> > needed if more than one H2G can be in flight at once.
> > 
> > Signed-off-by: Matthew Brost 
> > Cc: Michal Wajdeczko 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 18 ++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index fb875d257536..42063e1c355d 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -328,6 +328,18 @@ static u32 ct_get_next_fence(struct intel_guc_ct *ct)
> > return ++ct->requests.last_fence;
> >  }
> >  
> > +static void write_barrier(struct intel_guc_ct *ct) {
> > +   struct intel_guc *guc = ct_to_guc(ct);
> > +   struct intel_gt *gt = guc_to_gt(guc);
> > +
> > +   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > +   GEM_BUG_ON(guc->send_regs.fw_domains);
> > +   intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
> 
> hmm, as this is one of the GuC scratch registers used for H2G MMIO
> communication, writing 0 there might be interpreted by the GuC as new
> request with action=0 and might results in extra processing/logging on
> GuC side, and, since from here we don't protect access to this register
> by send_mutex, we can corrupt other MMIO message being prepared from
> different thread, ... can't we use other register ?
>

Hmm, this code has been internal for a long time and we haven't seen any
issues. MMIOs are always processed first on each interrupt and then CTBs
are processed next. A value of 0 in scratch0 results in no MMIOs being
processed, as an action of 0 is reserved and translates to a NOP.

Also, in the current i915, once CTBs are enabled MMIOs are never used.
That being said, I think once we transition to the new interface and
enable suspend on a VF, MMIOs might be used.

With that I propose that we merge this as is, with a comment saying that
if we ever mix CTBs and MMIOs we need to find another MMIO register. I
don't think changing this now is worth delaying upstreaming, and any
change we make now will make us lose confidence in code that has been
thoroughly tested.

Matt
 
> > +   } else {
> > +   wmb();
> > +   }
> > +}
> > +
> >  /**
> >   * DOC: CTB Host to GuC request
> >   *
> > @@ -411,6 +423,12 @@ static int ct_write(struct intel_guc_ct *ct,
> > }
> > GEM_BUG_ON(tail > size);
> >  
> > +   /*
> > +* make sure H2G buffer update and LRC tail update (if this triggering a
> > +* submission) are visible before updating the descriptor tail
> > +*/
> > +   write_barrier(ct);
> > +
> > /* now update desc tail (back in bytes) */
> > desc->tail = tail * 4;
> > return 0;
> > 


Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

2021-05-26 Thread Daniel Stone
Hey,

On Wed, 26 May 2021 at 17:53, Daniel Vetter  wrote:
> On Wed, May 26, 2021 at 5:13 PM Daniel Stone  wrote:
> > > Shared is shared, I just meant to say that we always add the shared fence.
> > > So an explicit ioctl to add more shared fences is kinda pointless.
> > >
> > > So yeah on a good driver this will run in parallel. On a not-so-good
> > > driver (which currently includes amdgpu and panfrost) this will serialize,
> > > because those drivers don't have the concept of a non-exclusive fence for
> > > such shared buffers (amdgpu does not sync internally, but will sync as
> > > soon as it's cross-drm_file).
> >
> > When you say 'we always add the shared fence', add it to ... where?
> > And which shared fence? (I'm going to use 'fence' below to refer to
> > anything from literal sync_file to timeline-syncobj to userspace
> > fence.)
>
> In the current model, every time you submit anything to the gpu, we
> create a dma_fence to track this work. This dma_fence is attached as a
> shared fence to the dma_resv obj of every object in your working set.
> Clarifications
> you = both userspace or kernel, anything really, including fun stuff
> like writing PTEs, or clearing PTEs and then flushing TLBs
> working set = depends, but can be anything from "really just the
> buffers the current gpu submission uses" to "everything bound into a
> given gpu VM"
>
> This is the fence I'm talking about here.
>
> Since you can't escape this (not unless we do direct userspace submit
> with userspace memory fences) and since there's no distinction of the
> shared fences into "relevant for implicit sync" and "not relevant for
> implicit sync" there's really not much point in adding implicit read
> fences. For now at least, we might want to change this eventually.

Yeah, I agree. My own clarification is that I'm talking about an
explicit-first world, where synchronisation is done primarily through
unknowable UMF, and falling back to implicit sync is a painful and
expensive operation that we only do when we need to. So, definitely
not on every CS (command submission aka execbuf aka vkQueueSubmit aka
glFlush).

> > I'll admit that I've typed out an argument twice for always export
> > from excl+shared, and always import to excl, results in oversync. And
> > I keep tying myself in knots trying to do it. It's arguably slightly
> > contrived, but here's my third attempt ...
> >
> > Vulkan Wayland client, full-flying-car-sync Wayland protocol,
> > Vulkan-based compositor. Part of the contract when the server exposes
> > that protocol is that it guarantees to do explicit sync in both
> > directions, so the client provides a fence at QueueSubmit time and the
> > server provides one back when releasing the image for return to ANI.
> > Neither side ever record fences into the dma_resv because they've
> > opted out by being fully explicit-aware.
> >
> > Now add media encode out on the side because you're streaming. The
> > compositor knows this is a transition between explicit and implicit
> > worlds, so it imports the client's fence into the exclusive dma_resv
> > slot, which makes sense: the media encode has to sync against the
> > client work, but is indifferent to the parallel compositor work. The
> > shared fence is exported back out so the compositor can union the
> > encode-finished fence with its composition-finished fence to send back
> > to the client with release/ANI.
> >
> > Now add a second media encode because you want a higher-quality local
> > capture to upload to YouTube later on. The compositor can do the exact
> > same import/export dance, and the two encodes can safely run in
> > parallel. Which is good.
>
> So the example which works is really clear ...
>
> > Where it starts to become complex is: what if your compositor is fully
> > explicit-aware but your clients aren't, so your compositor has more
> > import/export points to record into the resv? What if you aren't
> > actually a compositor but a full-blown media pipeline, where you have
> > a bunch of threads all launching reads in parallel, to the extent
> > where it's not practical to manage implicit/explicit transitions
> > globally, but each thread has to more pessimistically import and
> > export around each access?
>
> ... but the example where we oversync is hand-waving?
>
> :-P

Hey, I said I tied myself into knots! Maybe it's because my brain is
too deeply baked into implicit sync, maybe it's because the problem
cases aren't actually problems. Who knows.

I think what it comes down to is that we can make these modal switches
workable for (at least current-generation, before someone bakes it into
Unity) Wayland compositors, but really difficult for more complex and
variable pipeline frameworks like GStreamer or PipeWire.

> > I can make the relatively simple usecases work, but it really feels
> > like in practice we'll end up with massive oversync in some fairly
> > complex usecases, and we'll regret not having had it from

Re: [PATCH 13/18] drm/i915/guc: Relax CTB response timeout

2021-05-26 Thread Matthew Brost
On Wed, May 26, 2021 at 02:25:26PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 26.05.2021 08:42, Matthew Brost wrote:
> > From: Michal Wajdeczko 
> > 
> > In upcoming patch we will allow more CTB requests to be sent in
> > parallel to the GuC for processing, so we shouldn't assume any more
> > that GuC will always reply without 10ms.
> > 
> > Use bigger value from CONFIG_DRM_I915_GUC_CTB_TIMEOUT instead.
> > 
> > v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option
> > 
> > Signed-off-by: Michal Wajdeczko 
> > Signed-off-by: Matthew Brost 
> > Reviewed-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/Kconfig.profile  | 9 +
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 5 -
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
> > b/drivers/gpu/drm/i915/Kconfig.profile
> > index 39328567c200..68ac707755d2 100644
> > --- a/drivers/gpu/drm/i915/Kconfig.profile
> > +++ b/drivers/gpu/drm/i915/Kconfig.profile
> > @@ -38,6 +38,15 @@ config DRM_I915_USERFAULT_AUTOSUSPEND
> >   May be 0 to disable the extra delay and solely use the device level
> >   runtime pm autosuspend delay tunable.
> >  
> > +config DRM_I915_GUC_CTB_TIMEOUT
> > +   int "How long to wait for the GuC to make forward progress on CTBs (ms)"
> 
> maybe worth to provide here explicit allowed range:
> 
>   range 10 6
> 
> and then we can skip runtime adjustment for minimum 10ms timeout

Didn't know this option, done.

> 
> > +   default 1500 # milliseconds
> > +   help
> > + Configures the default timeout waiting for GuC the to make forward
> > + progress on CTBs. e.g. Waiting for a response to requeset.
> 
> typo
>

Fixed.

Matt

> > +
> > + A minimum value of 10 ms is allowed.
> > +
> >  config DRM_I915_HEARTBEAT_INTERVAL
> > int "Interval between heartbeat pulses (ms)"
> > default 2500 # milliseconds
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 916c2b80c841..5b0dece7a7cd 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -436,6 +436,7 @@ static int ct_write(struct intel_guc_ct *ct,
> >   */
> >  static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
> >  {
> > +   long timeout;
> > int err;
> >  
> > /*
> > @@ -443,10 +444,12 @@ static int wait_for_ct_request_update(struct 
> > ct_request *req, u32 *status)
> >  * up to that length of time, then switch to a slower sleep-wait loop.
> >  * No GuC command should ever take longer than 10ms.
> >  */
> > +   timeout = max(10, CONFIG_DRM_I915_GUC_CTB_TIMEOUT);
> > +
> >  #define done INTEL_GUC_MSG_IS_RESPONSE(READ_ONCE(req->status))
> > err = wait_for_us(done, 10);
> > if (err)
> > -   err = wait_for(done, 10);
> > +   err = wait_for(done, timeout);
> >  #undef done
> >  
> > if (unlikely(err))
> > 


Re: [PATCH v4 15/15] drm/i915: Use ttm mmap handling for ttm bo's.

2021-05-26 Thread Thomas Hellström



On 5/26/21 1:32 PM, Thomas Hellström wrote:

From: Maarten Lankhorst 

Use the ttm handlers for servicing page faults, and vm_access.

We do our own validation of read-only access, otherwise use the
ttm handlers as much as possible.

Because the ttm handlers expect the vma_node at vma->base, we slightly
need to massage the mmap handlers to look at vma_node->driver_private
to fetch the bo, if it's NULL, we assume i915's normal mmap_offset uapi
is used.

This is the easiest way to achieve compatibility without changing ttm's
semantics.

Signed-off-by: Maarten Lankhorst 
---
  drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  78 +++
  drivers/gpu/drm/i915/gem/i915_gem_object.h|   6 +-
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +
  drivers/gpu/drm/i915/gem/i915_gem_pages.c |   3 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 122 +-
  .../drm/i915/gem/selftests/i915_gem_mman.c|  90 +++--
  drivers/gpu/drm/i915/selftests/igt_mmap.c |  25 +++-
  drivers/gpu/drm/i915/selftests/igt_mmap.h |  12 +-
  8 files changed, 247 insertions(+), 92 deletions(-)


There are a couple of checkpatch.pl --strict warnings/checks with this 
patch.





diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index fd1c9714f8d8..af04ea593091 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -19,6 +19,7 @@
  #include "i915_gem_mman.h"
  #include "i915_trace.h"
  #include "i915_user_extensions.h"
+#include "i915_gem_ttm.h"
  #include "i915_vma.h"
  
  static inline bool

@@ -622,6 +623,8 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
struct i915_mmap_offset *mmo;
int err;
  
+	GEM_BUG_ON(obj->ops->mmap_offset || obj->ops->mmap_ops);

+
mmo = lookup_mmo(obj, mmap_type);
if (mmo)
goto out;
@@ -664,40 +667,47 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
  }
  
  static int

-__assign_mmap_offset(struct drm_file *file,
-u32 handle,
+__assign_mmap_offset(struct drm_i915_gem_object *obj,
 enum i915_mmap_type mmap_type,
-u64 *offset)
+u64 *offset, struct drm_file *file)
  {
-   struct drm_i915_gem_object *obj;
struct i915_mmap_offset *mmo;
-   int err;
  
-	obj = i915_gem_object_lookup(file, handle);

-   if (!obj)
-   return -ENOENT;
+   if (i915_gem_object_never_mmap(obj))
+   return -ENODEV;
  
-	if (i915_gem_object_never_mmap(obj)) {

-   err = -ENODEV;
-   goto out;
+   if (obj->ops->mmap_offset)  {
+   *offset = obj->ops->mmap_offset(obj);
+   return 0;
}
  
  	if (mmap_type != I915_MMAP_TYPE_GTT &&

!i915_gem_object_has_struct_page(obj) &&
-   !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)) {
-   err = -ENODEV;
-   goto out;
-   }
+   !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
+   return -ENODEV;
  
  	mmo = mmap_offset_attach(obj, mmap_type, file);

-   if (IS_ERR(mmo)) {
-   err = PTR_ERR(mmo);
-   goto out;
-   }
+   if (IS_ERR(mmo))
+   return PTR_ERR(mmo);
  
  	*offset = drm_vma_node_offset_addr(&mmo->vma_node);

-   err = 0;
-out:
+   return 0;
+}
+
+static int
+__assign_mmap_offset_handle(struct drm_file *file,
+   u32 handle,
+   enum i915_mmap_type mmap_type,
+   u64 *offset)
+{
+   struct drm_i915_gem_object *obj;
+   int err;
+
+   obj = i915_gem_object_lookup(file, handle);
+   if (!obj)
+   return -ENOENT;
+
+   err = __assign_mmap_offset(obj, mmap_type, offset, file);
i915_gem_object_put(obj);
return err;
  }
@@ -717,7 +727,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
else
mmap_type = I915_MMAP_TYPE_GTT;
  
-	return __assign_mmap_offset(file, handle, mmap_type, offset);

+   return __assign_mmap_offset_handle(file, handle, mmap_type, offset);
  }
  
  /**

@@ -785,7 +795,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void 
*data,
return -EINVAL;
}
  
-	return __assign_mmap_offset(file, args->handle, type, &args->offset);

+   return __assign_mmap_offset_handle(file, args->handle, type, 
&args->offset);
  }
  
  static void vm_open(struct vm_area_struct *vma)

@@ -889,8 +899,16 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct 
*vma)
 * destroyed and will be invalid when the vma manager lock
 * is released.
 */
-   mmo = container_of(node, struct i915_mmap_offset, vma_node);
-   obj = i915_gem_object_get_rcu(mmo->obj);
+   if (!node->driver_private) {
+   mm

Re: [PATCH v4 13/15] drm/i915: Disable mmap ioctl for gen12+

2021-05-26 Thread Intel



On 5/26/21 1:32 PM, Thomas Hellström wrote:

From: Maarten Lankhorst 

The paltform should exclusively use mmap_offset, one less path to worry

Hmm, Thought this was fixed, but s/paltform/platform/


Re: [PATCH 0/3] Clean a few backend interfaces in the i915

2021-05-26 Thread Daniel Vetter
On Tue, May 25, 2021 at 08:53:42AM -0700, Matthew Brost wrote:
> On Tue, May 25, 2021 at 03:56:56PM +0200, Daniel Vetter wrote:
> > On Fri, May 21, 2021 at 11:32:12AM -0700, Matthew Brost wrote:
> > > As discussed in [1] start merging some support patches as a precursor to
> > > GuC submission in the i915. This is step #1 mentioned in [1].
> > > 
> > > [1] https://patchwork.freedesktop.org/series/89844/
> > > 
> > > Signed-off-by: Matthew Brost 
> > 
> > Pushed to drm-intel-gt-next, thanks for patches&reviews. Btw you can also
> > ping John H or Daniele for pushing stuff for you, should be quicker than
> > waiting for me to return from a long w/e :-)
> > 
> 
> Thanks for the push. I don't think John H has push rights upstream, I
> know Daniele has rights but I don't think he is up to date with the process
> to merge patches. I can discuss this with him today and see if he can
> get reenabled on this process.

John Harrison is 1 review short from qualifying for drm-intel.git commit
rights (if I got it right, maybe double-check), so please motivate him to
fix this asap so we have more committers.
-Daniel

> 
> Matt
> 
> > Plus I _really_ don't want to get back into the business of pushing other
> > people's work ...
> > 
> > Cheers, Daniel
> > 
> > > 
> > > Chris Wilson (3):
> > >   drm/i915/gt: Move engine setup out of set_default_submission
> > >   drm/i915/gt: Move submission_method into intel_gt
> > >   drm/i915/gt: Move CS interrupt handler to the backend
> > > 
> > >  drivers/gpu/drm/i915/gt/intel_engine.h|  8 +-
> > >  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 19 +++-
> > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  | 14 +--
> > >  .../drm/i915/gt/intel_execlists_submission.c  | 95 +--
> > >  .../drm/i915/gt/intel_execlists_submission.h  |  3 -
> > >  drivers/gpu/drm/i915/gt/intel_gt_irq.c| 82 +---
> > >  drivers/gpu/drm/i915/gt/intel_gt_irq.h| 23 +
> > >  drivers/gpu/drm/i915/gt/intel_gt_types.h  |  7 ++
> > >  drivers/gpu/drm/i915/gt/intel_reset.c |  7 +-
> > >  .../gpu/drm/i915/gt/intel_ring_submission.c   | 12 ++-
> > >  drivers/gpu/drm/i915/gt/intel_rps.c   |  2 +-
> > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  |  2 +-
> > >  .../drm/i915/gt/selftest_ring_submission.c|  2 +-
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 64 ++---
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  1 -
> > >  drivers/gpu/drm/i915/i915_irq.c   | 10 +-
> > >  drivers/gpu/drm/i915/i915_perf.c  | 10 +-
> > >  17 files changed, 199 insertions(+), 162 deletions(-)
> > > 
> > > -- 
> > > 2.28.0
> > > 
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 0/3] Clean a few backend interfaces in the i915

2021-05-26 Thread Daniel Vetter
On Tue, May 25, 2021 at 08:54:38AM -0700, Matthew Brost wrote:
> On Tue, May 25, 2021 at 04:27:49PM +0100, Tvrtko Ursulin wrote:
> > 
> > On 25/05/2021 14:56, Daniel Vetter wrote:
> > > On Fri, May 21, 2021 at 11:32:12AM -0700, Matthew Brost wrote:
> > > > As discussed in [1] start merging some support patches as a precursor to
> > > > GuC submission in the i915. This is step #1 mentioned in [1].
> > > > 
> > > > [1] https://patchwork.freedesktop.org/series/89844/
> > > > 
> > > > Signed-off-by: Matthew Brost 
> > > 
> > > Pushed to drm-intel-gt-next, thanks for patches&reviews. Btw you can also
> > > ping John H or Daniele for pushing stuff for you, should be quicker than
> > > waiting for me to return from a long w/e :-)
> > > 
> > > Plus I _really_ don't want to get back into the business of pushing other
> > > people's work ...
> > 
> > To Matt - Also please take care to preserve r-b's when resurrecting patches
> > because all of these three had mine from before which is now lost in git
> > history.
> >
> 
> Will do. Still getting used to the upstream rules and wasn't sure if
> should have included your old R-Bs.

If you have an r-b but for an old version with some significant changes
compared to the current one add a (v1) or similar tag at the end of that
r-b. That way it's not lost, but also not misattributed to a newer and
potentially buggy version of the patch.
-Daniel

> 
> Matt
>  
> > Regards,
> > 
> > Tvrtko

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 7/7] RFC: dma-buf: Add an API for importing sync files (v7)

2021-05-26 Thread Daniel Vetter
On Tue, May 25, 2021 at 04:17:53PM -0500, Jason Ekstrand wrote:
> This patch is analogous to the previous sync file export patch in that
> it allows you to import a sync_file into a dma-buf.  Unlike the previous
> patch, however, this does add genuinely new functionality to dma-buf.
> Without this, the only way to attach a sync_file to a dma-buf is to
> submit a batch to your driver of choice which waits on the sync_file and
> claims to write to the dma-buf.  Even if said batch is a no-op, a submit
> is typically way more overhead than just attaching a fence.  A submit
> may also imply extra synchronization with other work because it happens
> on a hardware queue.
> 
> In the Vulkan world, this is useful for dealing with the out-fence from
> vkQueuePresent.  Current Linux window-systems (X11, Wayland, etc.) all
> rely on dma-buf implicit sync.  Since Vulkan is an explicit sync API, we
> get a set of fences (VkSemaphores) in vkQueuePresent and have to stash
> those as an exclusive (write) fence on the dma-buf.  We handle it in
> Mesa today with the above mentioned dummy submit trick.  This ioctl
> would allow us to set it directly without the dummy submit.
> 
> This may also open up possibilities for GPU drivers to move away from
> implicit sync for their kernel driver uAPI and instead provide sync
> files and rely on dma-buf import/export for communicating with other
> implicit sync clients.
> 
> We make the explicit choice here to only allow setting RW fences which
> translates to an exclusive fence on the dma_resv.  There's no use for
> read-only fences for communicating with other implicit sync userspace
> and any such attempts are likely to be racy at best.  When we got to
> insert the RW fence, the actual fence we set as the new exclusive fence
> is a combination of the sync_file provided by the user and all the other
> fences on the dma_resv.  This ensures that the newly added exclusive
> fence will never signal before the old one would have and ensures that
> we don't break any dma_resv contracts.  We require userspace to specify
> RW in the flags for symmetry with the export ioctl and in case we ever
> want to support read fences in the future.
> 
> There is one downside here that's worth documenting:  If two clients
> writing to the same dma-buf using this API race with each other, their
> actions on the dma-buf may happen in parallel or in an undefined order.
> Both with and without this API, the pattern is the same:  Collect all
> the fences on dma-buf, submit work which depends on said fences, and
> then set a new exclusive (write) fence on the dma-buf which depends on
> said work.  The difference is that, when it's all handled by the GPU
> driver's submit ioctl, the three operations happen atomically under the
> dma_resv lock.  If two userspace submits race, one will happen before
> the other.  You aren't guaranteed which but you are guaranteed that
> they're strictly ordered.  If userspace manages the fences itself, then
> these three operations happen separately and the two render operations
> may happen genuinely in parallel or get interleaved.  However, this is a
> case of userspace racing with itself.  As long as we ensure userspace
> can't back the kernel into a corner, it should be fine.
> 
> v2 (Jason Ekstrand):
>  - Use a wrapper dma_fence_array of all fences including the new one
>when importing an exclusive fence.
> 
> v3 (Jason Ekstrand):
>  - Lock around setting shared fences as well as exclusive
>  - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
>  - Initialize ret to 0 in dma_buf_wait_sync_file
> 
> v4 (Jason Ekstrand):
>  - Use the new dma_resv_get_singleton helper
> 
> v5 (Jason Ekstrand):
>  - Rename the IOCTLs to import/export rather than wait/signal
>  - Drop the WRITE flag and always get/set the exclusive fence
> 
> v6 (Jason Ekstrand):
>  - Split import and export into separate patches
>  - New commit message
> 
> v7 (Daniel Vetter):
>  - Fix the uapi header to use the right struct in the ioctl
>  - Use a separate dma_buf_import_sync_file struct
>  - Add kerneldoc for dma_buf_import_sync_file
> 
> Signed-off-by: Jason Ekstrand 
> Cc: Christian König 
> Cc: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: Maarten Lankhorst 
> ---
>  drivers/dma-buf/dma-buf.c| 36 
>  include/uapi/linux/dma-buf.h | 22 ++
>  2 files changed, 58 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index ea117de962903..098340222662b 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -422,6 +422,40 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
>   put_unused_fd(fd);
>   return ret;
>  }
> +
> +static long dma_buf_import_sync_file(struct dma_buf *dmabuf,
> +  const void __user *user_data)
> +{
> + struct dma_buf_import_sync_file arg;
> + struct dma_fence *fence, *singleton = NULL;
> + int ret = 0;
> +
> +   

Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

2021-05-26 Thread Daniel Vetter
On Wed, May 26, 2021 at 5:13 PM Daniel Stone  wrote:
> On Wed, 26 May 2021 at 14:44, Daniel Vetter  wrote:
> > On Wed, May 26, 2021 at 02:08:19PM +0100, Daniel Stone wrote:
> > > Are you saying that if a compositor imports a client-provided dmabuf
> > > as an EGLImage to use as a source texture for its rendering, and then
> > > provides it to VA-API or V4L2 to use as a media encode source (both
> > > purely read-only ops), that these will both serialise against each
> > > other? Like, my media decode job won't begin execution until the
> > > composition read has fully retired?
> > >
> > > If so, a) good lord that hurts, and b) what are shared fences actually 
> > > ... for?
> >
> > Shared is shared, I just meant to say that we always add the shared fence.
> > So an explicit ioctl to add more shared fences is kinda pointless.
> >
> > So yeah on a good driver this will run in parallel. On a not-so-good
> > driver (which currently includes amdgpu and panfrost) this will serialize,
> > because those drivers don't have the concept of a non-exclusive fence for
> > such shared buffers (amdgpu does not sync internally, but will sync as
> > soon as it's cross-drm_file).
>
> When you say 'we always add the shared fence', add it to ... where?
> And which shared fence? (I'm going to use 'fence' below to refer to
> anything from literal sync_file to timeline-syncobj to userspace
> fence.)

In the current model, every time you submit anything to the gpu, we
create a dma_fence to track this work. This dma_fence is attached as a
shared fence to the dma_resv obj of every object in your working set.
Clarifications:
- you = userspace or the kernel, anything really, including fun stuff
  like writing PTEs, or clearing PTEs and then flushing TLBs
- working set = depends, but can be anything from "really just the
  buffers the current gpu submission uses" to "everything bound into a
  given gpu VM"

This is the fence I'm talking about here.

Since you can't escape this (not unless we do direct userspace submit
with userspace memory fences) and since there's no distinction of the
shared fences into "relevant for implicit sync" and "not relevant for
implicit sync", there's really not much point in adding implicit read
fences, at least for now. We might want to change this eventually.

> I'll admit that I've typed out an argument twice for always export
> from excl+shared, and always import to excl, results in oversync. And
> I keep tying myself in knots trying to do it. It's arguably slightly
> contrived, but here's my third attempt ...
>
> Vulkan Wayland client, full-flying-car-sync Wayland protocol,
> Vulkan-based compositor. Part of the contract when the server exposes
> that protocol is that it guarantees to do explicit sync in both
> directions, so the client provides a fence at QueueSubmit time and the
> server provides one back when releasing the image for return to ANI.
> Neither side ever record fences into the dma_resv because they've
> opted out by being fully explicit-aware.
>
> Now add media encode out on the side because you're streaming. The
> compositor knows this is a transition between explicit and implicit
> worlds, so it imports the client's fence into the exclusive dma_resv
> slot, which makes sense: the media encode has to sync against the
> client work, but is indifferent to the parallel compositor work. The
> shared fence is exported back out so the compositor can union the
> encode-finished fence with its composition-finished fence to send back
> to the client with release/ANI.
>
> Now add a second media encode because you want a higher-quality local
> capture to upload to YouTube later on. The compositor can do the exact
> same import/export dance, and the two encodes can safely run in
> parallel. Which is good.

So the example which works is really clear ...

> Where it starts to become complex is: what if your compositor is fully
> explicit-aware but your clients aren't, so your compositor has more
> import/export points to record into the resv? What if you aren't
> actually a compositor but a full-blown media pipeline, where you have
> a bunch of threads all launching reads in parallel, to the extent
> where it's not practical to manage implicit/explicit transitions
> globally, but each thread has to more pessimistically import and
> export around each access?

... but the example where we oversync is hand-waving?

:-P

> I can make the relatively simple usecases work, but it really feels
> like in practice we'll end up with massive oversync in some fairly
> complex usecases, and we'll regret not having had it from the start,
> plus people will just rely on implicit sync for longer because it has
> better (more parallel) semantics in some usecases.

Things fall apart in implicit sync if you have more than one logical
writer into the same buffer. Trivial example is two images in one
buffer, but you could also do funky stuff like interleaved/tiled
rendering with _indepedent_ consumers. If the consumers are not
ind

[PATCH] drm/i915: Disable gpu relocations

2021-05-26 Thread Daniel Vetter
Media userspace was the last userspace to still use them, and they
converted now too:

https://github.com/intel/media-driver/commit/144020c37770083974bedf59902b70b8f444c799

This means no reason anymore to make relocations faster than they've
been for the first 9 years of gem. This code was added in

commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb
Author: Chris Wilson 
Date:   Fri Jun 16 15:05:24 2017 +0100

drm/i915: Async GPU relocation processing

Furthermore there are pretty strong indications it's buggy, since the
code to use it by default as the only option had to be reverted:

commit ad5d95e4d538737ed3fa25493777decf264a3011
Author: Dave Airlie 
Date:   Tue Sep 8 15:41:17 2020 +1000

Revert "drm/i915/gem: Async GPU relocations only"

This code just disables gpu relocations, leaving the garbage
collection for later patches and more importantly, much less confusing
diff. Also given how much headaches this code has caused in the past,
letting this soak for a bit seems justified.

Cc: Jon Bloomfield 
Signed-off-by: Daniel Vetter 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 43 ---
 1 file changed, 18 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 297143511f99..31e904f79d0a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1571,7 +1571,7 @@ static int __reloc_entry_gpu(struct i915_execbuffer *eb,
return true;
 }
 
-static int reloc_entry_gpu(struct i915_execbuffer *eb,
+static int __maybe_unused reloc_entry_gpu(struct i915_execbuffer *eb,
struct i915_vma *vma,
u64 offset,
u64 target_addr)
@@ -1593,32 +1593,25 @@ relocate_entry(struct i915_vma *vma,
 {
u64 target_addr = relocation_target(reloc, target);
u64 offset = reloc->offset;
-   int reloc_gpu = reloc_entry_gpu(eb, vma, offset, target_addr);
-
-   if (reloc_gpu < 0)
-   return reloc_gpu;
-
-   if (!reloc_gpu) {
-   bool wide = eb->reloc_cache.use_64bit_reloc;
-   void *vaddr;
+   bool wide = eb->reloc_cache.use_64bit_reloc;
+   void *vaddr;
 
 repeat:
-   vaddr = reloc_vaddr(vma->obj, eb,
-   offset >> PAGE_SHIFT);
-   if (IS_ERR(vaddr))
-   return PTR_ERR(vaddr);
-
-   GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32)));
-   clflush_write32(vaddr + offset_in_page(offset),
-   lower_32_bits(target_addr),
-   eb->reloc_cache.vaddr);
-
-   if (wide) {
-   offset += sizeof(u32);
-   target_addr >>= 32;
-   wide = false;
-   goto repeat;
-   }
+   vaddr = reloc_vaddr(vma->obj, eb,
+   offset >> PAGE_SHIFT);
+   if (IS_ERR(vaddr))
+   return PTR_ERR(vaddr);
+
+   GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32)));
+   clflush_write32(vaddr + offset_in_page(offset),
+   lower_32_bits(target_addr),
+   eb->reloc_cache.vaddr);
+
+   if (wide) {
+   offset += sizeof(u32);
+   target_addr >>= 32;
+   wide = false;
+   goto repeat;
}
 
return target->node.start | UPDATE;
-- 
2.31.0



Re: [PATCH] drm/i915: only disable default vga device

2021-05-26 Thread Emil Velikov
Hi Ville,

On Tue, 18 May 2021 at 12:17, Ville Syrjälä
 wrote:
>
> On Tue, May 18, 2021 at 12:09:56PM +0100, Emil Velikov wrote:
> > Hi Ville,
> >
> > On Mon, 17 May 2021 at 18:24, Ville Syrjälä
> >  wrote:
> > >
> > > On Sun, May 16, 2021 at 06:14:32PM +0100, Emil Velikov wrote:
> > > > From: Vivek Das Mohapatra 
> > > >
> > > > This patch is to do with seamless handover, eg when the sequence is
> > > > bootloader → plymouth → desktop.
> > > >
> > > > It switches the vga arbiter from the "other" GPU to the default one
> > > > (intel in this case), so the driver can issue some io().
> > >
> > > I don't understand what this commit message is trying to say.
> > >
> > Bunch of context is lost due to the patch age, so I'm not 100% sure of
> > the actual hardware setup where this occurs.
> > Does the following make sense?
> >
> > Currently on dual GPU systems, we do not get seamless handover as the
> > output flickers during the transition bootloader -> plymouth ->
> > desktop.
> > This happens as a result of switching (via the VGA arbiter) from the
> > "other" GPU back to the default i915 one and issuing io() commands.
>
> Hmm. Does this work?
>
> --- a/drivers/gpu/drm/i915/display/intel_vga.c
> +++ b/drivers/gpu/drm/i915/display/intel_vga.c
> @@ -29,6 +29,9 @@ void intel_vga_disable(struct drm_i915_private *dev_priv)
> i915_reg_t vga_reg = intel_vga_cntrl_reg(dev_priv);
> u8 sr1;
>
> +   if (intel_de_read(dev_priv, vga_reg) & VGA_DISP_DISABLE)
> +   return;
> +
> /* WaEnableVGAAccessThroughIOPort:ctg,elk,ilk,snb,ivb,vlv,hsw */
> vga_get_uninterruptible(pdev, VGA_RSRC_LEGACY_IO);
> outb(SR01, VGA_SR_INDEX);
>
Was able to replicate the issue somewhat and the above does help quite a lot.
Feel free to add my:
Reviewed-by: Emil Velikov 
Tested-by: Emil Velikov 

Also feel free to reuse as much/little of the following setup details.

To reproduce the issue:

Get a dual GPU system - Intel+Nvidia in my case. Set the other
(Nvidia) as default in UEFI and connect monitors to it.
Ensure the bootloader (and, if used, a splash manager like plymouth) are
set to display the UEFI BGRT. Personally I tested systemd-boot,
although GRUB should also work. I couldn't get plymouth to work/behave
here :shrug:

Note: Having the Nvidia drivers in the initramfs can lead to extra
flicker so leave them out. Include the i915 drivers in initramfs.

Without the patch, the existing bootsplash is wiped clean almost
instantaneously as the i915 driver calls intel_vga_disable().
With your patch the call is a no-op, and the bootsplash stays around
until the login manager (and X) is spawned.

HTH
Emil


Re: [PATCH v7 14/15] dt-bindings: of: Add restricted DMA pool

2021-05-26 Thread Will Deacon
On Wed, May 26, 2021 at 01:13:22PM +0100, Will Deacon wrote:
> On Tue, May 18, 2021 at 02:42:14PM +0800, Claire Chang wrote:
> > @@ -138,4 +160,9 @@ one for multimedia processing (named multimedia-memory@7700, 64MiB).
> > memory-region = <&multimedia_reserved>;
> > /* ... */
> > };
> > +
> > +   pcie_device: pcie_device@0,0 {
> > +   memory-region = <&restricted_dma_mem_reserved>;
> > +   /* ... */
> > +   };
> 
> I still don't understand how this works for individual PCIe devices -- how
> is dev->of_node set to point at the node you have above?
> 
> I tried adding the memory-region to the host controller instead, and then
> I see it crop up in dmesg:
> 
>   | pci-host-generic 4000.pci: assigned reserved memory node restricted_dma_mem_reserved
> 
> but none of the actual PCI devices end up with 'dma_io_tlb_mem' set, and
> so the restricted DMA area is not used. In fact, swiotlb isn't used at all.
> 
> What am I missing to make this work with PCIe devices?

Aha, looks like we're just missing the logic to inherit the DMA
configuration. The diff below gets things working for me.

Will

--->8

diff --git a/drivers/of/address.c b/drivers/of/address.c
index c562a9ff5f0b..bf499fdd6e93 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -1113,25 +1113,25 @@ bool of_dma_is_coherent(struct device_node *np)
 }
 EXPORT_SYMBOL_GPL(of_dma_is_coherent);
 
-int of_dma_set_restricted_buffer(struct device *dev)
+int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
 {
-   struct device_node *node;
int count, i;
 
-   if (!dev->of_node)
+   if (!np)
return 0;
 
-   count = of_property_count_elems_of_size(dev->of_node, "memory-region",
+   count = of_property_count_elems_of_size(np, "memory-region",
sizeof(phandle));
for (i = 0; i < count; i++) {
-   node = of_parse_phandle(dev->of_node, "memory-region", i);
+   struct device_node *node;
+
+   node = of_parse_phandle(np, "memory-region", i);
/* There might be multiple memory regions, but only one
-* restriced-dma-pool region is allowed.
+* restricted-dma-pool region is allowed.
 */
if (of_device_is_compatible(node, "restricted-dma-pool") &&
of_device_is_available(node))
-   return of_reserved_mem_device_init_by_idx(
-   dev, dev->of_node, i);
+   return of_reserved_mem_device_init_by_idx(dev, np, i);
}
 
return 0;
diff --git a/drivers/of/device.c b/drivers/of/device.c
index d8d865223e51..2defdca418ec 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -166,7 +166,7 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
if (!iommu)
-   return of_dma_set_restricted_buffer(dev);
+   return of_dma_set_restricted_buffer(dev, np);
 
return 0;
 }
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 9fc874548528..8fde97565d11 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -163,14 +163,15 @@ struct bus_dma_region;
 #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
 int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map);
-int of_dma_set_restricted_buffer(struct device *dev);
+int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
 #else
 static inline int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map)
 {
return -ENODEV;
 }
-static inline int of_dma_set_restricted_buffer(struct device *dev)
+static inline int of_dma_set_restricted_buffer(struct device *dev,
+  struct device_node *np)
 {
return -ENODEV;
 }


[PATCH] drm/i915: Use generic_access_phys

2021-05-26 Thread Daniel Vetter
Since

commit 96667f8a4382db9ed042332ca6ee165ae9b91307
Author: Daniel Vetter 
Date:   Fri Nov 27 17:41:21 2020 +0100

mm: Close race in generic_access_phys

it is race-free and can therefore be safely used for dynamic mappings
like we have too.

v2 git commit --amend

*sigh*

Cc: Jon Bloomfield 
Signed-off-by: Daniel Vetter 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Andrew Morton 
Cc: "Christian König" 
Cc: "Ville Syrjälä" 
Cc: Michel Lespinasse 
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 60 +++-
 1 file changed, 6 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index f6fe5cb01438..16a059d54bda 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -414,58 +414,6 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
return i915_error_to_vmf_fault(ret);
 }
 
-static int
-vm_access(struct vm_area_struct *area, unsigned long addr,
- void *buf, int len, int write)
-{
-   struct i915_mmap_offset *mmo = area->vm_private_data;
-   struct drm_i915_gem_object *obj = mmo->obj;
-   struct i915_gem_ww_ctx ww;
-   void *vaddr;
-   int err = 0;
-
-   if (i915_gem_object_is_readonly(obj) && write)
-   return -EACCES;
-
-   addr -= area->vm_start;
-   if (addr >= obj->base.size)
-   return -EINVAL;
-
-   i915_gem_ww_ctx_init(&ww, true);
-retry:
-   err = i915_gem_object_lock(obj, &ww);
-   if (err)
-   goto out;
-
-   /* As this is primarily for debugging, let's focus on simplicity */
-   vaddr = i915_gem_object_pin_map(obj, I915_MAP_FORCE_WC);
-   if (IS_ERR(vaddr)) {
-   err = PTR_ERR(vaddr);
-   goto out;
-   }
-
-   if (write) {
-   memcpy(vaddr + addr, buf, len);
-   __i915_gem_object_flush_map(obj, addr, len);
-   } else {
-   memcpy(buf, vaddr + addr, len);
-   }
-
-   i915_gem_object_unpin_map(obj);
-out:
-   if (err == -EDEADLK) {
-   err = i915_gem_ww_ctx_backoff(&ww);
-   if (!err)
-   goto retry;
-   }
-   i915_gem_ww_ctx_fini(&ww);
-
-   if (err)
-   return err;
-
-   return len;
-}
-
 void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
 {
struct i915_vma *vma;
@@ -801,14 +749,18 @@ static void vm_close(struct vm_area_struct *vma)
 
 static const struct vm_operations_struct vm_ops_gtt = {
.fault = vm_fault_gtt,
-   .access = vm_access,
+#ifdef CONFIG_HAVE_IOREMAP_PROT
+   .access = generic_access_phys,
+#endif
.open = vm_open,
.close = vm_close,
 };
 
 static const struct vm_operations_struct vm_ops_cpu = {
.fault = vm_fault_cpu,
-   .access = vm_access,
+#ifdef CONFIG_HAVE_IOREMAP_PROT
+   .access = generic_access_phys,
+#endif
.open = vm_open,
.close = vm_close,
 };
-- 
2.31.0



Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

2021-05-26 Thread Jason Ekstrand
On Wed, May 26, 2021 at 6:09 AM Daniel Stone  wrote:
> On Mon, 24 May 2021 at 18:11, Jason Ekstrand  wrote:
> >  3. Userspace memory fences.
> >
> > Note that timeline syncobj is NOT in that list.  IMO, all the "wait
> > for submit" stuff is an implementation detail we needed in order to
> > get the timeline semantics on top of immutable SW fences.  Under the
> > hood it's all dma_fence; this just gives us a shareable container so
> > we can implement VK_KHR_timeline_semaphore with sharing.  I really
> > don't want to make Wayland protocol around it if memory fences are the
> > final solution.
>
> Typing out the Wayland protocol isn't the hard bit. If we just need to
> copy and sed syncobj to weirdsyncobj, no problem really, and it gives
> us a six-month head start on painful compositor-internal surgery
> whilst we work on common infrastructure to ship userspace fences
> around (mappable dmabuf with the sync bracketing? FD where every
> read() gives you the current value? memfd? other?).

I feel like I should elaborate more about timelines.  In my earlier
reply, my commentary about timeline syncobj was mostly focused around
helping people avoid typing.  That's not really the full story,
though, and I hope more context will help.

First, let me say that timeline syncobj was designed as a mechanism to
implement VK_KHR_timeline_semaphore without inserting future fences
into the kernel.  It's entirely designed around the needs of Vulkan
drivers, not really as a window-system primitive.  The semantics are
designed around one driver communicating to another that new fences
have been added and it's safe to kick off more rendering.  I'm not
convinced that it's the right object for window-systems and I'm also
not convinced that it's a good idea to try and make a version of it
that's a wrapper around a userspace memory fence.  (I'm going to start
typing UMF for userspace memory fence because it's too long to type out.)

Why?  Well, the fundamental problem with timelines in general is
trying to figure out when it's about to be done.  But timeline syncobj
solves this for us!  It gives us this fancy super-useful ioctl!
Right?  Uh not as well as I'd like.  Let's say we make a timeline
syncobj that's a wrapper around a userspace memory fence.  What do we
do with that ioctl?  As I mentioned above, the kernel doesn't have any
clue when it will be triggered so that ioctl turns into an actual
wait.  That's no good because it creates unnecessary stalls.

There's another potential solution here:  Have each UMF be two
timelines: submitted and completed.  At the start of every batch
that's supposed to trigger a UMF, we set the "submitted" side and
then, when it completes, we set the "completed" side.  Ok, great, now
we can get at the "about to be done" with the submitted side,
implement the ioctl, and we're all good, right?  Sadly, no.  There's
no guarantee about how long a "batch" takes.  So there's no universal
timeout the kernel can apply.  Also, if it does time out, the kernel
doesn't know who to blame for the timeout and how to prevent itself
from getting in trouble again.  The compositor does, so in theory,
given the right ioctls, it could detect the -ETIME and kill that
client.  Not a great solution.

The best option I've been able to come up with for this is some sort
of client-provided signal.  Something where it says, as part of submit
or somewhere else, "I promise I'll be done soon" where that promise
comes with dire consequences if it's not.  At that point, we can turn
the UMF and a particular wait value into a one-shot fence like a
dma_fence or sync_file, or signal a syncobj on it.  If it ever times
out, we kick their context.  In Vulkan terminology, they get
VK_ERROR_DEVICE_LOST.  There are two important bits here:  First, is
that it's based on a client-provided thing.  With a fully timeline
model and wait-before-signal, we can't infer when something is about
to be done.  Only the client knows when it submitted its last node in
the dependency graph and the whole mess is unblocked.  Second, is that
the dma_fence is created within the client's driver context.  If it's
created compositor-side, the kernel doesn't know who to blame if
things go badly.  If we create it in the client, it's pretty easy to
make context death on -ETIME part of the contract.

(Before danvet jumps in here and rants about how UMF -> dma_fence
isn't possible, I haven't forgotten.  I'm pretending, for now, that
we've solved some of those problems.)

Another option is to just stall on the UMF until it's done.  Yeah,
kind-of terrible and high-latency, but it always works and doesn't
involve any complex logic to kill clients.  If a client never gets
around to signaling a fence, it just never repaints.  The compositor
keeps going like nothing's wrong.  Maybe, if the client submits lots
of frames without ever triggering, it'll hit some max queue depth
somewhere and kill it but that's it.  More likely, the client's
vkAcquireNextImage will start timing 

Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

2021-05-26 Thread Daniel Stone
On Wed, 26 May 2021 at 14:44, Daniel Vetter  wrote:
> On Wed, May 26, 2021 at 02:08:19PM +0100, Daniel Stone wrote:
> > Are you saying that if a compositor imports a client-provided dmabuf
> > as an EGLImage to use as a source texture for its rendering, and then
> > provides it to VA-API or V4L2 to use as a media encode source (both
> > purely read-only ops), that these will both serialise against each
> > other? Like, my media decode job won't begin execution until the
> > composition read has fully retired?
> >
> > If so, a) good lord that hurts, and b) what are shared fences actually for?
>
> Shared is shared, I just meant to say that we always add the shared fence.
> So an explicit ioctl to add more shared fences is kinda pointless.
>
> So yeah on a good driver this will run in parallel. On a not-so-good
> driver (which currently includes amdgpu and panfrost) this will serialize,
> because those drivers don't have the concept of a non-exclusive fence for
> such shared buffers (amdgpu does not sync internally, but will sync as
> soon as it's cross-drm_file).

When you say 'we always add the shared fence', add it to ... where?
And which shared fence? (I'm going to use 'fence' below to refer to
anything from literal sync_file to timeline-syncobj to userspace
fence.)

I'll admit that I've typed out an argument twice for always export
from excl+shared, and always import to excl, results in oversync. And
I keep tying myself in knots trying to do it. It's arguably slightly
contrived, but here's my third attempt ...

Vulkan Wayland client, full-flying-car-sync Wayland protocol,
Vulkan-based compositor. Part of the contract when the server exposes
that protocol is that it guarantees to do explicit sync in both
directions, so the client provides a fence at QueueSubmit time and the
server provides one back when releasing the image for return to ANI.
Neither side ever records fences into the dma_resv because they've
opted out by being fully explicit-aware.

Now add media encode out on the side because you're streaming. The
compositor knows this is a transition between explicit and implicit
worlds, so it imports the client's fence into the exclusive dma_resv
slot, which makes sense: the media encode has to sync against the
client work, but is indifferent to the parallel compositor work. The
shared fence is exported back out so the compositor can union the
encode-finished fence with its composition-finished fence to send back
to the client with release/ANI.

Now add a second media encode because you want a higher-quality local
capture to upload to YouTube later on. The compositor can do the exact
same import/export dance, and the two encodes can safely run in
parallel. Which is good.

Where it starts to become complex is: what if your compositor is fully
explicit-aware but your clients aren't, so your compositor has more
import/export points to record into the resv? What if you aren't
actually a compositor but a full-blown media pipeline, where you have
a bunch of threads all launching reads in parallel, to the extent
where it's not practical to manage implicit/explicit transitions
globally, but each thread has to more pessimistically import and
export around each access?

I can make the relatively simple usecases work, but it really feels
like in practice we'll end up with massive oversync in some fairly
complex usecases, and we'll regret not having had it from the start,
plus people will just rely on implicit sync for longer because it has
better (more parallel) semantics in some usecases.

Cheers,
Daniel


[PATCH] drm/i915: Use generic_access_phys

2021-05-26 Thread Daniel Vetter
Since

commit 96667f8a4382db9ed042332ca6ee165ae9b91307
Author: Daniel Vetter 
Date:   Fri Nov 27 17:41:21 2020 +0100

mm: Close race in generic_access_phys

it is race-free and can therefore be safely used for dynamic mappings
like we have, too.

Cc: Jon Bloomfield 
Signed-off-by: Daniel Vetter 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Andrew Morton 
Cc: "Christian König" 
Cc: "Ville Syrjälä" 
Cc: Michel Lespinasse 
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 60 +++-
 1 file changed, 6 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index f6fe5cb01438..717798293044 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -414,58 +414,6 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
return i915_error_to_vmf_fault(ret);
 }
 
-static int
-vm_access(struct vm_area_struct *area, unsigned long addr,
- void *buf, int len, int write)
-{
-   struct i915_mmap_offset *mmo = area->vm_private_data;
-   struct drm_i915_gem_object *obj = mmo->obj;
-   struct i915_gem_ww_ctx ww;
-   void *vaddr;
-   int err = 0;
-
-   if (i915_gem_object_is_readonly(obj) && write)
-   return -EACCES;
-
-   addr -= area->vm_start;
-   if (addr >= obj->base.size)
-   return -EINVAL;
-
-   i915_gem_ww_ctx_init(&ww, true);
-retry:
-   err = i915_gem_object_lock(obj, &ww);
-   if (err)
-   goto out;
-
-   /* As this is primarily for debugging, let's focus on simplicity */
-   vaddr = i915_gem_object_pin_map(obj, I915_MAP_FORCE_WC);
-   if (IS_ERR(vaddr)) {
-   err = PTR_ERR(vaddr);
-   goto out;
-   }
-
-   if (write) {
-   memcpy(vaddr + addr, buf, len);
-   __i915_gem_object_flush_map(obj, addr, len);
-   } else {
-   memcpy(buf, vaddr + addr, len);
-   }
-
-   i915_gem_object_unpin_map(obj);
-out:
-   if (err == -EDEADLK) {
-   err = i915_gem_ww_ctx_backoff(&ww);
-   if (!err)
-   goto retry;
-   }
-   i915_gem_ww_ctx_fini(&ww);
-
-   if (err)
-   return err;
-
-   return len;
-}
-
 void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj)
 {
struct i915_vma *vma;
@@ -801,14 +749,18 @@ static void vm_close(struct vm_area_struct *vma)
 
 static const struct vm_operations_struct vm_ops_gtt = {
.fault = vm_fault_gtt,
-   .access = vm_access,
+#ifdef CONFIG_HAVE_IOREMAP_PROT
+   .access = generic_access_phys
+#endif
.open = vm_open,
.close = vm_close,
 };
 
 static const struct vm_operations_struct vm_ops_cpu = {
.fault = vm_fault_cpu,
-   .access = vm_access,
+#ifdef CONFIG_HAVE_IOREMAP_PROT
+   .access = generic_access_phys
+#endif
.open = vm_open,
.close = vm_close,
 };
-- 
2.31.0



Re: [Freedreno] [RFC PATCH 00/13] drm/msm: Add Display Stream Compression Support

2021-05-26 Thread Jeffrey Hugo
On Tue, May 25, 2021 at 11:46 PM Vinod Koul  wrote:
>
> Hello Jeff,
>
> On 21-05-21, 08:09, Jeffrey Hugo wrote:
> > On Fri, May 21, 2021 at 6:50 AM Vinod Koul  wrote:
> > >
> > > Display Stream Compression (DSC) compresses the display stream in host
> > > which is later decoded by panel. This series enables this for Qualcomm
> > > msm driver.
> > > This was tested on Google Pixel3 phone which use LGE SW43408 panel.
> > >
> > > The changes include adding DT properties for DSC then hardware blocks
> > > support required in DPU1 driver and support in encoder. We also add
> > > support in DSI
> > > and introduce required topology changes.
> > >
> > > In order for panel to set the DSC parameters we add dsc in drm_panel
> > > and set it from the msm driver.
> > >
> > > Complete changes which enable this for Pixel3 along with panel driver (not
> > > part of this series) and DT changes can be found at:
> > > git.linaro.org/people/vinod.koul/kernel.git pixel/dsc_rfc
> > >
> > > Comments welcome!
> >
> > This feels backwards to me.  I've only skimmed this series, and the DT
> > changes didn't come through for me, so perhaps I have an incomplete
> > view.
>
> Not sure why, I see it on lore:
> https://lore.kernel.org/dri-devel/20210521124946.3617862-3-vk...@kernel.org/
>
> > DSC is not MSM specific.  There is a standard for it.  Yet it looks
> > like everything is implemented in a MSM specific way, and then pushed
> > to the panel.  So, every vendor needs to implement their vendor
> > specific way to get the DSC info, and then push it to the panel?
> > Seems wrong, given there is an actual standard for this feature.
>
> I have added slice and bpp info in the DT here under the host and then
> pass the generic struct drm_dsc_config to panel which allows panel to
> write the pps cmd
>
> Nothing above is MSM specific.. It can very well work with non MSM
> controllers too.

I disagree.

The DT bindings you defined (thanks for the direct link) are MSM
specific.  I'm not talking (yet) about the properties you defined, but
purely from the standpoint that you defined the binding within the
scope of the MSM dsi binding.  No other vendor can use those bindings.
Of course, if we look at the properties themselves, they are prefixed
with "qcom", which is vendor specific.

So, purely on the face of it, this is MSM specific.

Assuming we want a DT solution for DSC, I think it should be something
like Documentation/devicetree/bindings/clock/clock-bindings.txt (the
first example that comes to mind), which is a non-vendor specific
generic set of properties that each vendor/device specific binding can
inherit.  Panel has similar things.

Specific to the properties, I don't much like that you duplicate BPP,
which is already associated with the panel (although perhaps not in
the scope of DT).  What if the panel and your DSC bindings disagree?
Also, I guess I need to ask, have you read the DSC spec?  Last I
looked, there were something like 3 dozen properties that could be
configured.  You have five in your proposed binding.  To me, this is
not a generic DSC solution, this is MSM specific (and frankly I don't
think this supports all the configuration the MSM hardware can do,
either).

I'm surprised Rob Herring didn't have more to say on this.

> I didn't envision DSC to be a specific thing, most of
> the patches here are hardware enabling ones for DSC bits for MSM
> hardware.
>
> > Additionally, we define panel properties (resolution, BPP, etc) at the
> > panel, and have the display drivers pull it from the panel.  However,
> > for DSC, you do the reverse (define it in the display driver, and push
> > it to the panel).  If the argument is that DSC properties can be
> > dynamic, well, so can resolution.  Every panel for MSM MTPs supports
> > multiple resolutions, yet we define that with the panel in Linux.
>
> I don't have an answer for that right now, to start with yes the
> properties are in host but I am okay to discuss this and put wherever we
> feel is most correct thing.  I somehow don't like that we should pull
> from panel DT and program host with that. Here using struct
> drm_dsc_config allows me to configure panel based on resolution passed

I somewhat agree that pulling from the panel and programming the host
based on that is an odd solution, but we have it currently.  Have a
look at Documentation/devicetree/bindings/display/panel in particular
panel-timing.  All of that ends up informing the mdss programing
anyways (particularly the dsi and its phy).  So my problem is that we
currently have a solution that seems to just need to be extended, and
instead you have proposed a completely different solution which is
arguably contradictory.

However, I'd like to see thoughts from Rob Clark, David, and any
others that typically handle this stuff (maybe Sam Ravenborg from the
panel side?).  I consider them to be the experts, and if they think
your solution is the way to go, I'll shut up.  I consider myself to be
a novice tha

[PATCH 2/3] drm/vgem: use shmem helpers

2021-05-26 Thread Daniel Vetter
Aside from deleting lots of code the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.

v2: Review from Thomas:
- sort #include
- drop more dead code that I didn't spot somehow

v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

Cc: Thomas Zimmermann 
Acked-by: Thomas Zimmermann 
Cc: John Stultz 
Cc: Sumit Semwal 
Cc: "Christian König" 
Signed-off-by: Daniel Vetter 
Cc: Melissa Wen 
Cc: Chris Wilson 
---
 drivers/gpu/drm/Kconfig |   1 +
 drivers/gpu/drm/vgem/vgem_drv.c | 340 +---
 2 files changed, 4 insertions(+), 337 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index d3a9ca4b1cec..1c24de03547e 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -269,6 +269,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
 config DRM_VGEM
tristate "Virtual GEM provider"
depends on DRM
+   select DRM_GEM_SHMEM_HELPER
help
  Choose this option to get a virtual graphics memory manager,
  as used by Mesa's software renderer for enhanced performance.
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index a0e75f1d5d01..b1b3a5ffc542 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -38,6 +38,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,87 +51,11 @@
 #define DRIVER_MAJOR   1
 #define DRIVER_MINOR   0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
struct drm_device drm;
struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-   struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-   kvfree(vgem_obj->pages);
-   mutex_destroy(&vgem_obj->pages_lock);
-
-   if (obj->import_attach)
-   drm_prime_gem_destroy(obj, vgem_obj->table);
-
-   drm_gem_object_release(obj);
-   kfree(vgem_obj);
-}
-
-static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
-{
-   struct vm_area_struct *vma = vmf->vma;
-   struct drm_vgem_gem_object *obj = vma->vm_private_data;
-   /* We don't use vmf->pgoff since that has the fake offset */
-   unsigned long vaddr = vmf->address;
-   vm_fault_t ret = VM_FAULT_SIGBUS;
-   loff_t num_pages;
-   pgoff_t page_offset;
-   page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
-
-   num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
-
-   if (page_offset >= num_pages)
-   return VM_FAULT_SIGBUS;
-
-   mutex_lock(&obj->pages_lock);
-   if (obj->pages) {
-   get_page(obj->pages[page_offset]);
-   vmf->page = obj->pages[page_offset];
-   ret = 0;
-   }
-   mutex_unlock(&obj->pages_lock);
-   if (ret) {
-   struct page *page;
-
-   page = shmem_read_mapping_page(
-   file_inode(obj->base.filp)->i_mapping,
-   page_offset);
-   if (!IS_ERR(page)) {
-   vmf->page = page;
-   ret = 0;
-   } else switch (PTR_ERR(page)) {
-   case -ENOSPC:
-   case -ENOMEM:
-   ret = VM_FAULT_OOM;
-   break;
-   case -EBUSY:
-   ret = VM_FAULT_RETRY;
-   break;
-   case -EFAULT:
-   case -EINVAL:
-   ret = VM_FAULT_SIGBUS;
-   break;
-   default:
-   WARN_ON(PTR_ERR(page));
-   ret = VM_FAULT_SIGBUS;
-   break;
-   }
-
-   }
-   return ret;
-}
-
-static const struct vm_operations_struct vgem_gem_vm_ops = {
-   .fault = vgem_gem_fault,
-   .open = drm_gem_vm_open,
-   .close = drm_gem_vm_close,
-};
-
 static int vgem_open(struct drm_device *dev, struct drm_file *file)
 {
struct vgem_file *vfile;
@@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct 
drm_file *file)
kfree(vfile);
 }
 
-static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
-   unsigned long size)
-{
-   struct drm_vgem_gem_object *obj;
-   int ret;
-
-   obj = kzalloc(sizeof(*obj), GFP_KERNEL);
-   if (!obj)
-   return ERR_PTR(-ENOMEM);
-
-   obj->base.funcs = &vgem_gem_object_funcs;
-
-   ret = drm_gem_object_init(dev,

[PATCH 3/3] drm/shmem-helper: Align to page size in dumb_create

2021-05-26 Thread Daniel Vetter
shmem helpers seem a bit sloppy here by automatically rounding up when
actually creating the buffer, which results in under-reporting of what
we actually have. Caught by igt/vgem_basic tests.

Acked-by: Thomas Zimmermann 
Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 6d625cee7a6a..d5e6d4568f99 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -505,13 +505,13 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, 
struct drm_device *dev,
 
if (!args->pitch || !args->size) {
args->pitch = min_pitch;
-   args->size = args->pitch * args->height;
+   args->size = PAGE_ALIGN(args->pitch * args->height);
} else {
/* ensure sane minimum values */
if (args->pitch < min_pitch)
args->pitch = min_pitch;
if (args->size < args->pitch * args->height)
-   args->size = args->pitch * args->height;
+   args->size = PAGE_ALIGN(args->pitch * args->height);
}
 
shmem = drm_gem_shmem_create_with_handle(file, dev, args->size, 
&args->handle);
-- 
2.31.0



[PATCH 1/3] dma-buf: Require VM_PFNMAP vma for mmap

2021-05-26 Thread Daniel Vetter
tldr; DMA buffers aren't normal memory, expecting that you can use
them like that (like calling get_user_pages works, or that they're
accounted like any other normal memory) cannot be guaranteed.

Since some userspace only runs on integrated devices, where all
buffers are actually all resident system memory, there's a huge
temptation to assume that a struct page is always present and usable
like for any other pagecache-backed mmap. This has the potential to
result in a uapi nightmare.

To stop this gap, require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based
infrastructure for everyone. In spirit this is the uapi counterpart to
the kernel-internal CONFIG_DMABUF_DEBUG.

Motivated by a recent patch which wanted to switch the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.

v2:

Jason brought up that we also want to guarantee that all ptes have the
pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.

From auditing the various functions that insert pfn pte entries
(vm_insert_pfn_prot, remap_pfn_range and all its callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
this should be the correct flag to check for.

References: 
https://lore.kernel.org/lkml/cakmk7uhi+mg0z0humnt13qccvuturvjpcr0njrl12k-wbwz...@mail.gmail.com/
Acked-by: Christian König 
Cc: Jason Gunthorpe 
Cc: Suren Baghdasaryan 
Cc: Matthew Wilcox 
Cc: John Stultz 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
--
Resending this so I can test the next two patches for vgem/shmem in
intel-gfx-ci. Last round failed somehow, but I can't repro that at all
locally here.

No immediate plans to merge this patch here since ttm isn't addressed
yet (and there we have the hugepte issue, for which I don't think we
have a clear consensus yet).
-Daniel
---
 drivers/dma-buf/dma-buf.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index eadd1eaa2fb5..dda583fb1f03 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -127,6 +127,7 @@ static struct file_system_type dma_buf_fs_type = {
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 {
struct dma_buf *dmabuf;
+   int ret;
 
if (!is_dma_buf_file(file))
return -EINVAL;
@@ -142,7 +143,11 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
dmabuf->size >> PAGE_SHIFT)
return -EINVAL;
 
-   return dmabuf->ops->mmap(dmabuf, vma);
+   ret = dmabuf->ops->mmap(dmabuf, vma);
+
+   WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+   return ret;
 }
 
 static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -1244,6 +1249,8 @@ EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
 int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 unsigned long pgoff)
 {
+   int ret;
+
if (WARN_ON(!dmabuf || !vma))
return -EINVAL;
 
@@ -1264,7 +1271,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct 
vm_area_struct *vma,
vma_set_file(vma, dmabuf->file);
vma->vm_pgoff = pgoff;
 
-   return dmabuf->ops->mmap(dmabuf, vma);
+   ret = dmabuf->ops->mmap(dmabuf, vma);
+
+   WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(dma_buf_mmap);
 
-- 
2.31.0



Re: [PATCH v2] drm/i915/params: Align visibility of device level and global modparams

2021-05-26 Thread Jani Nikula
On Wed, 26 May 2021, Tvrtko Ursulin  wrote:
> From: Tvrtko Ursulin 
>
> We have a few modparams which get conditionally exposed based on Kconfig
> options, and in most cases this also means portions of the driver
> implementing the respective feature are also left out.
>
> Align the visibility of device level and global modparams to make them
> consistent in this respect.
>
> v2:
>  * Fix misplaced parentheses.
>
> Signed-off-by: Tvrtko Ursulin 
> Cc: Jani Nikula 
> Cc: Ville Syrjälä 

Reviewed-by: Jani Nikula 

I'd happily accept patches removing some of the module params, and just
leaving the debugfs device params in place. ;)

> ---
>  drivers/gpu/drm/i915/i915_params.h | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_params.h 
> b/drivers/gpu/drm/i915/i915_params.h
> index 14cd64cc61d0..4a114a5ad000 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -71,18 +71,18 @@ struct drm_printer;
>   param(int, fastboot, -1, 0600) \
>   param(int, enable_dpcd_backlight, -1, 0600) \
>   param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
> - param(unsigned long, fake_lmem_start, 0, 0400) \
> - param(unsigned int, request_timeout_ms, 
> CONFIG_DRM_I915_REQUEST_TIMEOUT, 0600) \
> + param(unsigned long, fake_lmem_start, 0, 
> IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM) ? 0400 : 0) \
> + param(unsigned int, request_timeout_ms, 
> CONFIG_DRM_I915_REQUEST_TIMEOUT, CONFIG_DRM_I915_REQUEST_TIMEOUT ? 0600 : 0) \
>   /* leave bools at the end to not create holes */ \
>   param(bool, enable_hangcheck, true, 0600) \
>   param(bool, load_detect_test, false, 0600) \
>   param(bool, force_reset_modeset_test, false, 0600) \
> - param(bool, error_capture, true, 0600) \
> + param(bool, error_capture, true, 
> IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) ? 0600 : 0) \
>   param(bool, disable_display, false, 0400) \
>   param(bool, verbose_state_checks, true, 0) \
>   param(bool, nuclear_pageflip, false, 0400) \
>   param(bool, enable_dp_mst, true, 0600) \
> - param(bool, enable_gvt, false, 0400)
> + param(bool, enable_gvt, false, IS_ENABLED(CONFIG_DRM_I915_GVT) ? 0400 : 
> 0)
>  
>  #define MEMBER(T, member, ...) T member;
>  struct i915_params {

-- 
Jani Nikula, Intel Open Source Graphics Center


[PATCH v2 12/12] drm/i915/gem: Manage all set-domain waits explicitly

2021-05-26 Thread Tvrtko Ursulin
From: Chris Wilson 

Only perform the domain transition under the object lock, and push the
required waits to outside the lock.

v2 (Tvrtko):
 * Rebase.

v3 (Tvrtko):
 * Restore write to gtt domain in coherency selftest. (Matt)

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld  # v1
Signed-off-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   9 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.h   |   2 -
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 163 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  12 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   8 -
 .../i915/gem/selftests/i915_gem_coherency.c   |  31 +++-
 .../drm/i915/gem/selftests/i915_gem_phys.c|   8 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c|   3 +
 drivers/gpu/drm/i915/i915_gem.c   |   4 +-
 12 files changed, 89 insertions(+), 165 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index daf9284ef1f5..e4c24558eaa8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -51,8 +51,6 @@ static struct clflush *clflush_work_create(struct 
drm_i915_gem_object *obj)
 {
struct clflush *clflush;
 
-   GEM_BUG_ON(!obj->cache_dirty);
-
clflush = kmalloc(sizeof(*clflush), GFP_KERNEL);
if (!clflush)
return NULL;
@@ -101,13 +99,10 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
 
trace_i915_gem_object_clflush(obj);
 
-   clflush = NULL;
-   if (!(flags & I915_CLFLUSH_SYNC))
-   clflush = clflush_work_create(obj);
+   clflush = clflush_work_create(obj);
if (clflush) {
i915_sw_fence_await_reservation(&clflush->base.chain,
-   obj->base.resv, NULL, true,
-   
i915_fence_timeout(to_i915(obj->base.dev)),
+   obj->base.resv, NULL, true, 0,
I915_FENCE_GFP);
dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
dma_fence_work_commit(&clflush->base);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
index e6c382973129..4cd5787d1507 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
@@ -9,12 +9,10 @@
 
 #include 
 
-struct drm_i915_private;
 struct drm_i915_gem_object;
 
 bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 unsigned int flags);
 #define I915_CLFLUSH_FORCE BIT(0)
-#define I915_CLFLUSH_SYNC BIT(1)
 
 #endif /* __I915_GEM_CLFLUSH_H__ */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ccede73c6465..0926e0895ee6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -132,7 +132,7 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, enum dma_data_dire
if (!err)
err = i915_gem_object_pin_pages(obj);
if (!err) {
-   err = i915_gem_object_set_to_cpu_domain(obj, write);
+   i915_gem_object_set_to_cpu_domain(obj, write);
i915_gem_object_unpin_pages(obj);
}
if (err == -EDEADLK) {
@@ -156,7 +156,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, 
enum dma_data_direct
if (!err)
err = i915_gem_object_pin_pages(obj);
if (!err) {
-   err = i915_gem_object_set_to_gtt_domain(obj, false);
+   i915_gem_object_set_to_gtt_domain(obj, false);
i915_gem_object_unpin_pages(obj);
}
if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 073822100da7..39fda97c49a7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -49,7 +49,7 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned 
int flush_domains)
break;
 
case I915_GEM_DOMAIN_CPU:
-   i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
+   i915_gem_clflush_object(obj, 0);
break;
 
case I915_GEM_DOMAIN_RENDER:
@@ -97,34 +97,13 @@ void i915_gem_object_flush_if_display_locked(struct 
drm_i915_gem_object *obj)
  * This function returns when the move is complete, including waiting on
  * flushes to occur.
  */
-int
+void
 i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 {
-   int ret;
-
assert_object_held(obj);
 
-   ret = i915_gem_ob

Re: [PATCH v4 09/15] drm/ttm: Document and optimize ttm_bo_pipeline_gutting()

2021-05-26 Thread Christian König

Am 26.05.21 um 13:32 schrieb Thomas Hellström:

If the bo is idle when calling ttm_bo_pipeline_gutting(), we unnecessarily
create a ghost object and push it out to delayed destroy.
Fix this by adding a path for idle, and document the function.

Also avoid having the bo end up in a bad state vulnerable to user-space
triggered kernel BUGs if the call to ttm_tt_create() fails.

Finally reuse ttm_bo_pipeline_gutting() in ttm_bo_evict().

Cc: Christian König 
Signed-off-by: Thomas Hellström 
---
v4:
- Clarify why we mark bo for clearing after ttm_bo_pipeline_gutting()
   (Reported by Matthew Auld)
---
  drivers/gpu/drm/ttm/ttm_bo.c  | 20 +--
  drivers/gpu/drm/ttm/ttm_bo_util.c | 55 ---
  drivers/gpu/drm/ttm/ttm_tt.c  |  5 +++
  include/drm/ttm/ttm_tt.h  | 10 ++
  4 files changed, 76 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 51a94fd63bd7..be0406466460 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -501,10 +501,15 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
bdev->funcs->evict_flags(bo, &placement);
  
  	if (!placement.num_placement && !placement.num_busy_placement) {

-   ttm_bo_wait(bo, false, false);
+   ret = ttm_bo_wait(bo, true, false);
+   if (ret)
+   return ret;
  
-		ttm_bo_cleanup_memtype_use(bo);

-   return ttm_tt_create(bo, false);
+   /*
+* Since we've already synced, this frees backing store
+* immediately.
+*/
+   return ttm_bo_pipeline_gutting(bo);
}
  
  	ret = ttm_bo_mem_space(bo, &placement, &evict_mem, ctx);

@@ -976,13 +981,8 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
/*
 * Remove the backing store if no placement is given.
 */
-   if (!placement->num_placement && !placement->num_busy_placement) {
-   ret = ttm_bo_pipeline_gutting(bo);
-   if (ret)
-   return ret;
-
-   return ttm_tt_create(bo, false);
-   }
+   if (!placement->num_placement && !placement->num_busy_placement)
+   return ttm_bo_pipeline_gutting(bo);
  
  	/*

 * Check whether we need to move buffer.
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index ebff603a97f4..4cca932f1c0e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -590,26 +590,73 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object 
*bo,
  }
  EXPORT_SYMBOL(ttm_bo_move_accel_cleanup);
  
+/**

+ * ttm_bo_pipeline_gutting - purge the contents of a bo
+ * @bo: The buffer object
+ *
+ * Purge the contents of a bo, async if the bo is not idle.
+ * After a successful call, the bo is left unpopulated in
+ * system placement. The function may wait uninterruptible
+ * for idle on OOM.
+ *
+ * Return: 0 if successful, negative error code on failure.
+ */
  int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
  {
static const struct ttm_place sys_mem = { .mem_type = TTM_PL_SYSTEM };
struct ttm_buffer_object *ghost;
+   struct ttm_tt *ttm;
int ret;
  
-	ret = ttm_buffer_object_transfer(bo, &ghost);

+   /* If already idle, no need for ghost object dance. */
+   ret = ttm_bo_wait(bo, false, true);
+   if (ret != -EBUSY) {
+   if (!bo->ttm) {
+   /* See comment below about clearing. */
+   ret = ttm_tt_create(bo, true);
+   if (ret)
+   return ret;
+   } else {
+   ttm_tt_unpopulate(bo->bdev, bo->ttm);
+   if (bo->type == ttm_bo_type_device)
+   ttm_tt_mark_for_clear(bo->ttm);
+   }
+   ttm_resource_free(bo, &bo->mem);
+   ttm_resource_alloc(bo, &sys_mem, &bo->mem);
+
+   return 0;
+   }
+
+   /*
+* We need an unpopulated ttm_tt after giving our current one,
+* if any, to the ghost object. And we can't afford to fail
+* creating one *after* the operation. If the bo subsequently gets
+* resurrected, make sure it's cleared (if ttm_bo_type_device)
+* to avoid leaking sensitive information to user-space.
+*/
+
+   ttm = bo->ttm;
+   bo->ttm = NULL;
+   ret = ttm_tt_create(bo, true);
+   swap(bo->ttm, ttm);
if (ret)
return ret;
  
+	ret = ttm_buffer_object_transfer(bo, &ghost);

+   if (ret) {
+   ttm_tt_destroy(bo->bdev, ttm);
+   return ret;
+   }
+
ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
/* Last resort, wait for the BO to be idle when we are OOM */
if (ret)
ttm_bo_wait(bo, false, false);
  
-	ttm_res

Re: [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly

2021-05-26 Thread Matthew Auld

On 26/05/2021 15:14, Tvrtko Ursulin wrote:

From: Chris Wilson 

Only perform the domain transition under the object lock, and push the
required waits to outside the lock.

v2 (Tvrtko):
  * Rebase.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld  # v1
Signed-off-by: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   9 +-
  drivers/gpu/drm/i915/gem/i915_gem_clflush.h   |   2 -
  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   4 +-
  drivers/gpu/drm/i915/gem/i915_gem_domain.c| 163 +-
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   4 +-
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  12 +-
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
  .../gpu/drm/i915/gem/selftests/huge_pages.c   |   8 -
  .../i915/gem/selftests/i915_gem_coherency.c   |  31 +++-
  .../drm/i915/gem/selftests/i915_gem_phys.c|   8 +-
  .../drm/i915/gem/selftests/igt_gem_utils.c|   3 +
  drivers/gpu/drm/i915/i915_gem.c   |   4 +-
  12 files changed, 89 insertions(+), 165 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index daf9284ef1f5..e4c24558eaa8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -51,8 +51,6 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj)
  {
struct clflush *clflush;
  
-	GEM_BUG_ON(!obj->cache_dirty);
-
clflush = kmalloc(sizeof(*clflush), GFP_KERNEL);
if (!clflush)
return NULL;
@@ -101,13 +99,10 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
  
  	trace_i915_gem_object_clflush(obj);
  
-	clflush = NULL;
-	if (!(flags & I915_CLFLUSH_SYNC))
-		clflush = clflush_work_create(obj);
+	clflush = clflush_work_create(obj);
if (clflush) {
i915_sw_fence_await_reservation(&clflush->base.chain,
-						obj->base.resv, NULL, true,
-						i915_fence_timeout(to_i915(obj->base.dev)),
+						obj->base.resv, NULL, true, 0,
 						I915_FENCE_GFP);
dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
dma_fence_work_commit(&clflush->base);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
index e6c382973129..4cd5787d1507 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
@@ -9,12 +9,10 @@
  
  #include 
  
-struct drm_i915_private;
-
  struct drm_i915_gem_object;
  
 bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
			      unsigned int flags);
  #define I915_CLFLUSH_FORCE BIT(0)
-#define I915_CLFLUSH_SYNC BIT(1)
  
  #endif /* __I915_GEM_CLFLUSH_H__ */

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ccede73c6465..0926e0895ee6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -132,7 +132,7 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire
if (!err)
err = i915_gem_object_pin_pages(obj);
if (!err) {
-   err = i915_gem_object_set_to_cpu_domain(obj, write);
+   i915_gem_object_set_to_cpu_domain(obj, write);
i915_gem_object_unpin_pages(obj);
}
if (err == -EDEADLK) {
@@ -156,7 +156,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct
if (!err)
err = i915_gem_object_pin_pages(obj);
if (!err) {
-   err = i915_gem_object_set_to_gtt_domain(obj, false);
+   i915_gem_object_set_to_gtt_domain(obj, false);
i915_gem_object_unpin_pages(obj);
}
if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 073822100da7..39fda97c49a7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -49,7 +49,7 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
break;
  
  	case I915_GEM_DOMAIN_CPU:

-   i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
+   i915_gem_clflush_object(obj, 0);
break;
  
  	case I915_GEM_DOMAIN_RENDER:

@@ -97,34 +97,13 @@ void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj)
   * This function returns when the move is complete, including waiting on
   * flushes to occur.
   */
-int
+void
  i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
  {
-   int ret;
-
assert_object_held(obj);
  
-	ret = i915_gem_object_wait(obj,

-
