date:20220602

XDC 2022: Registration & Call for Presentations still open!

2022-06-02 Thread Lyude Paul

Hello! This is just a reminder that the CFP for XDC in 2022 is still open!

The 2022 X.Org Developers Conference is being held in conjunction with
the 2022 Wine Developers Conference.  This is a meeting to bring
together developers working on all things open graphics (Linux kernel,
Mesa, DRM, Wayland, X11, etc.) as well as developers for the Wine
Project, a key consumer of open graphics.

Registration & Call for Proposals are now open for both XDC 2022 and
WineConf 2022, which will take place on October 4-6, 2022 in
Minneapolis, Minnesota, USA. 

https://xdc2022.x.org
 
As usual, the conference is free of charge and open to the general
public. If you plan on attending, please make sure to register as early
as possible!
 
In order to register as attendee, you will therefore need to do it via
the XDC website:
 
https://indico.freedesktop.org/event/2/registrations/2/
 
In addition to registration, the CfP is now open for talks, workshops
and demos at XDC 2022. While any serious proposal will be gratefully
considered, topics of interest to X.Org and freedesktop.org developers
are encouraged. The program focus is on new development, ongoing
challenges and anything else that will spark discussions among
attendees in the hallway track.
 
We are open to talks across all layers of the graphics stack, from the
kernel to desktop environments / graphical applications and about how
to make things better for the developers who build them. Head to the
CfP page to learn more: 
 
https://indico.freedesktop.org/event/2/abstracts/
 
The deadline for submissions is Monday July 4th, 2022.
 
Check out our Reimbursement Policy to accept speaker
expenses for X.Org events like XDC 2022:
 
https://www.x.org/wiki/XorgFoundation/Policies/Reimbursement/

If you have any questions, please send me an email to
x...@codeweavers.com, adding on CC the X.org board (board
at foundation.x.org).
 
And don't forget, you can follow us on Twitter for all the latest
updates and to stay connected:
 
https://twitter.com/XOrgDevConf

Best regards,
Lyude Paul, on behalf of X.org

Re: [PATCH] drm/amdgpu/discovery: add comments about VCN instance handling

2022-06-02 Thread Dan Carpenter

On Thu, Jun 02, 2022 at 12:45:47PM -0400, Alex Deucher wrote:
> Add comments to clarify code that is safe, but triggers and
> smatch warning.
> 
> Link: https://lists.freedesktop.org/archives/amd-gfx/2022-June/079905.html 
> Signed-off-by: Alex Deucher 
> Cc: Dan Carpenter 
> ---

Thanks!

regards,
dan carpenter

[PATCH 0/3] Drive-by MST fixes for amdgpu

2022-06-02 Thread Lyude Paul

Now that I'm finishing up my work to remove the legacy MST code from the
tree, I've come across a couple of various issues that I wrote up
patches for along the way. These are some of those patches for amdgpu.

Lyude Paul (3):
  drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in
compute_mst_dsc_configs_for_state()
  drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in
pre_compute_mst_dsc_configs_for_state()
  drm/amdgpu/dm: Drop != NULL check in dm_mst_get_pbn_divider()

 .../drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c  | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

-- 
2.35.3

[PATCH 2/3] drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in pre_compute_mst_dsc_configs_for_state()

2022-06-02 Thread Lyude Paul

This lock is only needed if you're iterating through the in-memory topology
(e.g. drm_dp_mst_branch->ports, drm_dp_mst_port->mstb, etc.). This doesn't
actually seem to be what's going on here though, so we can just drop this
lock.

Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index cb3b0e08acc4..1259f2f7a8f9 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -1112,16 +1112,12 @@ static bool
if (!is_dsc_need_re_compute(state, dc_state, stream->link))
continue;
 
-   mutex_lock(>mst_mgr.lock);
if (!compute_mst_dsc_configs_for_link(state,
  dc_state,
  stream->link,
  vars,
- _vars_start_index)) {
-   mutex_unlock(>mst_mgr.lock);
+ _vars_start_index))
return false;
-   }
-   mutex_unlock(>mst_mgr.lock);
 
for (j = 0; j < dc_state->stream_count; j++) {
if (dc_state->streams[j]->link == stream->link)
-- 
2.35.3

[PATCH 3/3] drm/amdgpu/dm: Drop != NULL check in dm_mst_get_pbn_divider()

2022-06-02 Thread Lyude Paul

A lot of code in amdgpu seems to sprinkle in

  if (foo != NULL)
…

Checks pretty much all over the place, many times in locations where it's
clear foo (whatever foo may be) should never be NULL unless we've run into
a programming error. This is definitely one of those places, as
dm_mst_get_pbn_divider() should never be getting called with a NULL dc_link
pointer.

The problem with this code pattern is that many times the places I've seen
it used in amdgpu have no real error handling. This is actually quite bad,
if we try to avoid the NULL pointer and instead simply skip any code that
was expecting a valid pointer - we're already in undefined territory.
Subsequent code we execute may have expected sideaffects from the code we
skipped that are no longer present, which leads to even more unpredictable
behavior then a simple segfault. This could be silent errors or even just
another segfault somewhere else.

If we simply segfault though, that's not good either. But unlike the former
solution, no subsequent code in the kernel thread will execute - and we
will likely even get a clear backtrace from the invalid memory access. Of
course, the preferred approach is to simply handle the possibility of both
NULL and non-NULL pointers with nice error handling code. However, that's
not always desirable or even possible, and in those cases it's likely just
better to fail predictably rather than unpredictably.

This code is a nice example of that - if link is NULL, you'll return a PBN
divisor of 0. And thus, you've simply traded in your potential segfault for
a potential divide by 0 error. This was something I actually managed to hit
while working on the legacy MST removal work.

Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 1259f2f7a8f9..35c7def8f2bd 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -537,9 +537,6 @@ void amdgpu_dm_initialize_dp_connector(struct 
amdgpu_display_manager *dm,
 
 int dm_mst_get_pbn_divider(struct dc_link *link)
 {
-   if (!link)
-   return 0;
-
return dc_link_bandwidth_kbps(link,
dc_link_get_link_cap(link)) / (8 * 1000 * 54);
 }
-- 
2.35.3

[PATCH 1/3] drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in compute_mst_dsc_configs_for_state()

2022-06-02 Thread Lyude Paul

Noticed this while trying to update amdgpu for the non-atomic MST removal
changes - for some reason we appear to grab mst_mgr->lock before computing
mst DSC configs. This is wrong though - mst_mgr->lock is only needed while
traversing the actual MST topology state - which is not typically something
that DRM drivers should be doing themselves anyway.

Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 9221b6690a4a..cb3b0e08acc4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -1056,13 +1056,10 @@ bool compute_mst_dsc_configs_for_state(struct 
drm_atomic_state *state,
if (!is_dsc_need_re_compute(state, dc_state, stream->link))
continue;
 
-   mutex_lock(>mst_mgr.lock);
if (!compute_mst_dsc_configs_for_link(state, dc_state, 
stream->link,
vars, _vars_start_index)) {
-   mutex_unlock(>mst_mgr.lock);
return false;
}
-   mutex_unlock(>mst_mgr.lock);
 
for (j = 0; j < dc_state->stream_count; j++) {
if (dc_state->streams[j]->link == stream->link)
-- 
2.35.3

Re: [PATCH 0/4] PSR-SU-RC DC support and some PSR-SU fixes

2022-06-02 Thread Harry Wentland

Patches 1, 2, and 4 are
Reviewed-by: Harry Wentland 

Harry

On 2022-06-02 14:03, sunpeng...@amd.com wrote:
> From: Leo Li 
> 
> The first two patches here add PSR SU Rate Control support to DC. Support in
> amdgpu_dm is still pending to enable this fully.
> 
> The last two patches are some fixes for PSR SU.
> 
> David Zhang (3):
>   drm/amd/display: expose AMD specific DPCD for PSR-SU-RC support
>   drm/amd/display: Add PSR-SU-RC support in DC
>   drm/amd/display: pass panel instance in dirty rect message
> 
> Robin Chen (1):
>   drm/amd/display: refactor dirty rect dmub command decision
> 
>  drivers/gpu/drm/amd/display/dc/core/dc.c  | 19 ++-
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 22 +
>  drivers/gpu/drm/amd/display/dc/dc_link.h  |  3 +++
>  drivers/gpu/drm/amd/display/dc/dc_types.h |  2 ++
>  drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 23 ++
>  drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h |  2 ++
>  .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 24 ++-
>  .../drm/amd/display/dc/inc/hw/link_encoder.h  |  8 +++
>  .../amd/display/include/ddc_service_types.h   |  4 
>  9 files changed, 100 insertions(+), 7 deletions(-)
>

Re: [PATCH 3/4] drm/amd/display: pass panel instance in dirty rect message

2022-06-02 Thread Harry Wentland




On 2022-06-02 14:03, sunpeng...@amd.com wrote:
> From: David Zhang 
> 
> [why]
> DMUB FW uses OTG instance to get eDP panel instance. But in case
> of MPO multiple pipe indexes are passed to updated the same panel.
> The other OTG instance passed would be ignored causing in DMUB not
> acknowledging the messages.
> 
> [how]
> Add panel instance to dirty rectangle data and cursor update data
> structures and pass to DMUB.
> 

I'm not entirely following why passing the panel_inst solves the problem
that is described.

> Signed-off-by: Mikita Lipski 

This says the author is David but it has only Mikita's sign-off.
We need David's sign-off as well.

Harry

> Acked-by: Leo Li 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index d4173be11903..31d83297bcb5 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -2837,10 +2837,14 @@ void dc_dmub_update_dirty_rect(struct dc *dc,
>   struct dc_context *dc_ctx = dc->ctx;
>   struct dmub_cmd_update_dirty_rect_data *update_dirty_rect;
>   unsigned int i, j;
> + unsigned int panel_inst = 0;
>  
>   if (stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1)
>   return;
>  
> + if (!dc_get_edp_link_panel_inst(dc, stream->link, _inst))
> + return;
> +
>   memset(, 0x0, sizeof(cmd));
>   cmd.update_dirty_rect.header.type = DMUB_CMD__UPDATE_DIRTY_RECT;
>   cmd.update_dirty_rect.header.sub_type = 0;
> @@ -2869,6 +2873,7 @@ void dc_dmub_update_dirty_rect(struct dc *dc,
>   if (pipe_ctx->plane_state != plane_state)
>   continue;
>  
> + update_dirty_rect->panel_inst = panel_inst;
>   update_dirty_rect->pipe_idx = j;
>   dc_dmub_srv_cmd_queue(dc_ctx->dmub_srv, );
>   dc_dmub_srv_cmd_execute(dc_ctx->dmub_srv);

[PATCH 4/4] drm/amd/display: refactor dirty rect dmub command decision

2022-06-02 Thread sunpeng.li

From: Robin Chen 

[Why]
To wrap the decision logic of sending dirty rect dmub command
for both frame update and cursor update path.

Signed-off-by: Robin Chen 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 14 ++-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 24 ++-
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 31d83297bcb5..645ec5bc3a7d 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2827,6 +2827,18 @@ static void commit_planes_do_stream_update(struct dc *dc,
}
 }
 
+static bool dc_dmub_should_send_dirty_rect_cmd(struct dc *dc, struct 
dc_stream_state *stream)
+{
+   if (stream->link->psr_settings.psr_version == DC_PSR_VERSION_SU_1)
+   return true;
+
+   if (stream->link->psr_settings.psr_version == DC_PSR_VERSION_1 &&
+   dc->debug.enable_sw_cntl_psr)
+   return true;
+
+   return false;
+}
+
 void dc_dmub_update_dirty_rect(struct dc *dc,
   int surface_count,
   struct dc_stream_state *stream,
@@ -2839,7 +2851,7 @@ void dc_dmub_update_dirty_rect(struct dc *dc,
unsigned int i, j;
unsigned int panel_inst = 0;
 
-   if (stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1)
+   if (!dc_dmub_should_send_dirty_rect_cmd(dc, stream))
return;
 
if (!dc_get_edp_link_panel_inst(dc, stream->link, _inst))
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 7fe06a2c0c04..5b5e0dd13fd0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3325,6 +3325,23 @@ static bool dcn10_can_pipe_disable_cursor(struct 
pipe_ctx *pipe_ctx)
return false;
 }
 
+static bool dcn10_dmub_should_update_cursor_data(
+   struct pipe_ctx *pipe_ctx,
+   struct dc_debug_options *debug)
+{
+   if (pipe_ctx->plane_state->address.type == 
PLN_ADDR_TYPE_VIDEO_PROGRESSIVE)
+   return false;
+
+   if (pipe_ctx->stream->link->psr_settings.psr_version == 
DC_PSR_VERSION_SU_1)
+   return true;
+
+   if (pipe_ctx->stream->link->psr_settings.psr_version == 
DC_PSR_VERSION_1 &&
+   debug->enable_sw_cntl_psr)
+   return true;
+
+   return false;
+}
+
 static void dcn10_dmub_update_cursor_data(
struct pipe_ctx *pipe_ctx,
struct hubp *hubp,
@@ -3346,13 +3363,8 @@ static void dcn10_dmub_update_cursor_data(
 
struct dc_debug_options *debug = >ctx->dc->debug;
 
-   if (!debug->enable_sw_cntl_psr && 
pipe_ctx->stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1)
+   if (!dcn10_dmub_should_update_cursor_data(pipe_ctx, debug))
return;
-
-   if (pipe_ctx->stream->link->psr_settings.psr_version == 
DC_PSR_VERSION_UNSUPPORTED ||
-   pipe_ctx->plane_state->address.type == 
PLN_ADDR_TYPE_VIDEO_PROGRESSIVE)
-   return;
-
/**
 * if cur_pos == NULL means the caller is from cursor_set_attribute
 * then driver use previous cursor position data
-- 
2.36.1

[PATCH 2/4] drm/amd/display: Add PSR-SU-RC support in DC

2022-06-02 Thread sunpeng.li

From: David Zhang 

[Why]

PSR-SU Rate Control - or PSR-SU-RC - enables PSR-SU panels to work with
variable refresh rate to allow for more power savings. Lowering the
refresh rate can increase PSR residency by expanding the eDP main link
shut down duration. It can also lower panel power consumption.

There is a complication with With PSR, since the eDP main link can be
shut down. Therefore, the timing controller (TCON) on the eDP sink nees
to be able to scan out its remote buffer independant of the main link.
To allow the eDP source to specify the sink's refresh rate while the
link is off, vendor-specific DPCD registers are used. This allows the
eDP source to then "Rate Control" the panel during PSR active.

[How]

Add DC support to communicate with PSR-SU-RC supported eDP sinks. The
sink will need to know the desired VTotal during PSR active.

This change only adds support to DC, support in amdgpu_dm is still pending to
enable this fully.

Signed-off-by: David Zhang 
Signed-off-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 22 ++
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  3 +++
 drivers/gpu/drm/amd/display/dc/dc_types.h |  2 ++
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 23 +++
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h |  2 ++
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |  8 +++
 6 files changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 31ffb961e18b..3d6dcaa6a483 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -1795,6 +1795,7 @@ static bool dc_link_construct_legacy(struct dc_link *link,
 */
program_hpd_filter(link);
 
+   link->psr_settings.psr_vtotal_control_support = false;
link->psr_settings.psr_version = DC_PSR_VERSION_UNSUPPORTED;
 
DC_LOG_DC("BIOS object table - %s finished successfully.\n", __func__);
@@ -3207,6 +3208,7 @@ bool dc_link_setup_psr(struct dc_link *link,
/* updateSinkPsrDpcdConfig*/
union dpcd_psr_configuration psr_configuration;
union dpcd_alpm_configuration alpm_configuration;
+   union dpcd_sink_active_vtotal_control_mode vtotal_control = {0};
 
psr_context->controllerId = CONTROLLER_ID_UNDEFINED;
 
@@ -3276,6 +3278,13 @@ bool dc_link_setup_psr(struct dc_link *link,
psr_config->su_y_granularity;
psr_context->line_time_in_us =
psr_config->line_time_in_us;
+
+   if (link->psr_settings.psr_vtotal_control_support) {
+   psr_context->rate_control_caps = 
psr_config->rate_control_caps;
+   vtotal_control.bits.ENABLE = true;
+   core_link_write_dpcd(link, 
DP_SINK_PSR_ACTIVE_VTOTAL_CONTROL_MODE,
+   _control.raw, 
sizeof(vtotal_control.raw));
+   }
}
 
psr_context->channel = link->ddc->ddc_pin->hw_info.ddc_channel;
@@ -3408,6 +3417,19 @@ void dc_link_get_psr_residency(const struct dc_link 
*link, uint32_t *residency)
*residency = 0;
 }
 
+bool dc_link_set_sink_vtotal_in_psr_active(const struct dc_link *link, 
uint16_t psr_vtotal_idle, uint16_t psr_vtotal_su)
+{
+   struct dc *dc = link->ctx->dc;
+   struct dmub_psr *psr = dc->res_pool->psr;
+
+   if (psr == NULL || !link->psr_settings.psr_feature_enabled || 
!link->psr_settings.psr_vtotal_control_support)
+   return false;
+
+   psr->funcs->psr_set_sink_vtotal_in_psr_active(psr, psr_vtotal_idle, 
psr_vtotal_su);
+
+   return true;
+}
+
 const struct dc_link_status *dc_link_get_status(const struct dc_link *link)
 {
return >link_status;
diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
b/drivers/gpu/drm/amd/display/dc/dc_link.h
index 0bec986a6de8..3ec189dd73da 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -100,6 +100,7 @@ struct psr_settings {
bool psr_feature_enabled;   // PSR is supported by sink
bool psr_allow_active;  // PSR is currently active
enum dc_psr_version psr_version;// Internal PSR 
version, determined based on DPCD
+   bool psr_vtotal_control_support;// Vtotal control is supported 
by sink
 
/* These parameters are calculated in Driver,
 * based on display timing and Sink capabilities.
@@ -324,6 +325,8 @@ void dc_link_get_psr_residency(const struct dc_link *link, 
uint32_t *residency);
 void dc_link_blank_all_dp_displays(struct dc *dc);
 
 void dc_link_blank_dp_stream(struct dc_link *link, bool hw_init);
+bool dc_link_set_sink_vtotal_in_psr_active(const struct dc_link *link,
+   uint16_t psr_vtotal_idle, uint16_t psr_vtotal_su);
 
 /* Request DC to detect if there is a Panel connected.
  *

[PATCH 3/4] drm/amd/display: pass panel instance in dirty rect message

2022-06-02 Thread sunpeng.li

From: David Zhang 

[why]
DMUB FW uses OTG instance to get eDP panel instance. But in case
of MPO multiple pipe indexes are passed to updated the same panel.
The other OTG instance passed would be ignored causing in DMUB not
acknowledging the messages.

[how]
Add panel instance to dirty rectangle data and cursor update data
structures and pass to DMUB.

Signed-off-by: Mikita Lipski 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index d4173be11903..31d83297bcb5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2837,10 +2837,14 @@ void dc_dmub_update_dirty_rect(struct dc *dc,
struct dc_context *dc_ctx = dc->ctx;
struct dmub_cmd_update_dirty_rect_data *update_dirty_rect;
unsigned int i, j;
+   unsigned int panel_inst = 0;
 
if (stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1)
return;
 
+   if (!dc_get_edp_link_panel_inst(dc, stream->link, _inst))
+   return;
+
memset(, 0x0, sizeof(cmd));
cmd.update_dirty_rect.header.type = DMUB_CMD__UPDATE_DIRTY_RECT;
cmd.update_dirty_rect.header.sub_type = 0;
@@ -2869,6 +2873,7 @@ void dc_dmub_update_dirty_rect(struct dc *dc,
if (pipe_ctx->plane_state != plane_state)
continue;
 
+   update_dirty_rect->panel_inst = panel_inst;
update_dirty_rect->pipe_idx = j;
dc_dmub_srv_cmd_queue(dc_ctx->dmub_srv, );
dc_dmub_srv_cmd_execute(dc_ctx->dmub_srv);
-- 
2.36.1

[PATCH 0/4] PSR-SU-RC DC support and some PSR-SU fixes

2022-06-02 Thread sunpeng.li

From: Leo Li 

The first two patches here add PSR SU Rate Control support to DC. Support in
amdgpu_dm is still pending to enable this fully.

The last two patches are some fixes for PSR SU.

David Zhang (3):
  drm/amd/display: expose AMD specific DPCD for PSR-SU-RC support
  drm/amd/display: Add PSR-SU-RC support in DC
  drm/amd/display: pass panel instance in dirty rect message

Robin Chen (1):
  drm/amd/display: refactor dirty rect dmub command decision

 drivers/gpu/drm/amd/display/dc/core/dc.c  | 19 ++-
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 22 +
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  3 +++
 drivers/gpu/drm/amd/display/dc/dc_types.h |  2 ++
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 23 ++
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h |  2 ++
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 24 ++-
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |  8 +++
 .../amd/display/include/ddc_service_types.h   |  4 
 9 files changed, 100 insertions(+), 7 deletions(-)

-- 
2.36.1

[PATCH 1/4] drm/amd/display: expose AMD specific DPCD for PSR-SU-RC support

2022-06-02 Thread sunpeng.li

From: David Zhang 

[why & how]

Expose vendor specific DPCD registers for rate controlling the eDP sink
TCON's refresh rate during PSR active. When used in combination with
PSR-SU and Freesync, it is called PSR-SU Rate Contorol, or PSR-SU-RC for
short.

v2: Add all DPCD registers required

Signed-off-by: David Zhang 
Acked-by: Leo Li 
---
 drivers/gpu/drm/amd/display/include/ddc_service_types.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/include/ddc_service_types.h 
b/drivers/gpu/drm/amd/display/include/ddc_service_types.h
index 20a3d4e23f66..05096c644a60 100644
--- a/drivers/gpu/drm/amd/display/include/ddc_service_types.h
+++ b/drivers/gpu/drm/amd/display/include/ddc_service_types.h
@@ -41,6 +41,10 @@
 #define DP_DEVICE_ID_38EC11 0x38EC11
 #define DP_FORCE_PSRSU_CAPABILITY 0x40F
 
+#define DP_SINK_PSR_ACTIVE_VTOTAL  0x373
+#define DP_SINK_PSR_ACTIVE_VTOTAL_CONTROL_MODE 0x375
+#define DP_SOURCE_PSR_ACTIVE_VTOTAL0x376
+
 enum ddc_result {
DDC_RESULT_UNKNOWN = 0,
DDC_RESULT_SUCESSFULL,
-- 
2.36.1

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Felix Kuehling


Am 2022-06-02 um 09:20 schrieb Philip Yang:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
  {
struct amdgpu_vm_update_params params;
struct amdgpu_vm_bo_base *entry;
+   bool flush_tlb_needed = false;
int r, idx;
  
  	if (list_empty(>relocated))

@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
goto error;
  
  	list_for_each_entry(entry, >relocated, vm_status) {

+   /* vm_flush_needed after updating moved PDEs */
+   flush_tlb_needed |= entry->moved;
+
r = amdgpu_vm_pde_update(, entry);
if (r)
goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
if (r)
goto error;
  
-	/* vm_flush_needed after updating PDEs */

-   atomic64_inc(>tlb_seq);
+   if (flush_tlb_needed)
+   atomic64_inc(>tlb_seq);
  
  	while (!list_empty(>relocated)) {

entry = list_first_entry(>relocated,

Re: [PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Mike Lothian

The Null pointer against drm-next:

Jun 02 19:59:50 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer
dereference, address: 00d8
Jun 02 19:59:50 axion.fireburn.co.uk kernel: #PF: supervisor read
access in kernel mode
Jun 02 19:59:50 axion.fireburn.co.uk kernel: #PF: error_code(0x) -
not-present page
Jun 02 19:59:50 axion.fireburn.co.uk kernel: PGD 118700067 P4D
118700067 PUD 11f116067 PMD 0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Oops:  [#1] PREEMPT SMP NOPTI
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CPU: 4 PID: 1029 Comm:
GravityMark.x64 Tainted: GW 5.18.0-rc5-drm+ #1070
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK
COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318
03/29/2022
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RIP:
0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d
01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3
49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5
>
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RSP:
:8881605dfc70 EFLAGS: 00010282
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RAX: 888104f85ac8
RBX: 888104f85ac8 RCX: 
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RDX: 0cc0
RSI: 8881605dfd50 RDI: 
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RBP: 888104f85140
R08: 888101566240 R09: 88814e57b880
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R10: 0063
R11: 818238a0 R12: 0cc0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R13: 8881605dfd50
R14: 8881605dfc70 R15: 00691000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: FS:
7f4623fb9740() GS:888fde50()
knlGS:
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00d8
CR3: 000102e3c000 CR4: 00150ee0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? ttm_global_swapout+0xae/0xc0
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? ttm_tt_populate+0x7d/0x130
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ?
ttm_bo_vm_fault_reserved+0x237/0x270
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? amdgpu_gem_fault+0x92/0xd0
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? do_fault+0x28e/0x4b0
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? handle_mm_fault+0x849/0xa80
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? amdgpu_drm_ioctl+0x68/0x80
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? do_user_addr_fault+0x275/0x450
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x9/0x30
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? exc_page_fault+0x5f/0x150
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x1f/0x30
Jun 02 19:59:50 axion.fireburn.co.uk kernel:  
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00d8
Jun 02 19:59:50 axion.fireburn.co.uk kernel: ---[ end trace
 ]---
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RIP:
0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d
01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3
49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5
>
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RSP:
:8881605dfc70 EFLAGS: 00010282
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RAX: 888104f85ac8
RBX: 888104f85ac8 RCX: 
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RDX: 0cc0
RSI: 8881605dfd50 RDI: 
Jun 02 19:59:50 axion.fireburn.co.uk kernel: RBP: 888104f85140
R08: 888101566240 R09: 88814e57b880
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R10: 0063
R11: 818238a0 R12: 0cc0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: R13: 8881605dfd50
R14: 8881605dfc70 R15: 00691000
Jun 02 19:59:50 axion.fireburn.co.uk kernel: FS:
7f4623fb9740() GS:888fde50()
knlGS:
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00d8
CR3: 000102e3c000 CR4: 00150ee0
Jun 02 19:59:50 axion.fireburn.co.uk kernel: note:
GravityMark.x64[1029] exited with preempt_count 1

On Thu, 2 Jun 2022 at 19:55, Christian König
 wrote:
>
> That's because drm-misc-next is currently broken and needs a backmerge.
>
> Please try this patch on top of drm-next.
>
> Regards,
> Christian.
>
> Am 02.06.22 um 20:08 schrieb Mike Lothian:
> > Hi
> >
> > I'm still seeing Null pointers against Linus's tree and drm-misc with this 
> > patch
> >
> > Jun 02 19:04:05

Re: [PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Christian König


That's because drm-misc-next is currently broken and needs a backmerge.

Please try this patch on top of drm-next.

Regards,
Christian.

Am 02.06.22 um 20:08 schrieb Mike Lothian:

Hi

I'm still seeing Null pointers against Linus's tree and drm-misc with this patch

Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer
dereference, address: 0008
Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write
access in kernel mode
Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) -
not-present page
Jun 02 19:04:05 axion.fireburn.co.uk kernel: PGD 11ee04067 P4D
11ee04067 PUD 15eccb067 PMD 0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CPU: 0 PID: 1021 Comm:
GravityMark.x64 Tainted: GW 5.18.0-tip+ #3177
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK
COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318
03/29/2022
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP:
0010:ttm_resource_init+0x108/0x210
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48
39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d
56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP:
0018:888112e73918 EFLAGS: 00010202
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8
RBX: 888206b715a0 RCX: 
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8
RSI: 888206b71cc0 RDI: 888110605b00
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08
R08: 88812235c790 R09: 8881306a4bd8
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 
R11: 81851320 R12: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0
R14: 88816c848c58 R15: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS:
7f4c257c1740() GS:888fde40()
knlGS:
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008
CR3: 0001183fc000 CR4: 00350ef0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_vram_mgr_new+0xbb/0x4b0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_mem_space+0x89/0x1e0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_validate+0x80/0x1a0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_bo_validate+0xe9/0x2b0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_syncobj_lookup_and_add_to_sync+0xa0/0xa0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_vm_validate_pt_bos+0xce/0x1c0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_parser_bos+0x522/0x6e0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_ioctl+0x7fe/0xd00
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_cs_report_moved_bytes+0x60/0x60
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl_kernel+0xcb/0x130
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl+0x2f5/0x400
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_cs_report_moved_bytes+0x60/0x60
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_drm_ioctl+0x42/0x80
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? __x64_sys_ioctl+0x5e/0xa0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? do_syscall_64+0x6a/0x90
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
exit_to_user_mode_prepare+0x19/0x90
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008
Jun 02 19:04:05 axion.fireburn.co.uk kernel: ---[ end trace
 ]---
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP:
0010:ttm_resource_init+0x108/0x210
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48
39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d
56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP:
0018:888112e73918 EFLAGS: 00010202
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8
RBX: 888206b715a0 RCX: 
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8
RSI: 888206b71cc0 RDI: 888110605b00
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08
R08: 88812235c790 R09: 8881306a4bd8
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 
R11: 81851320 R12: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0
R14: 88816c848c58 R15: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS:
7f4c257c1740()

Re: Explicit VM updates

2022-06-02 Thread Christian König


Am 02.06.22 um 16:21 schrieb Felix Kuehling:

[SNIP]


In other words the free is not waiting for the unmap to complete, 
but causes command submissions through the kernel to depend on 
the unmap.


I guess I don't understand that dependency. The next command 
submission obviously cannot use the memory that was unmapped. But 
why does it need to synchronize with the unmap operation?


Because of the necessary TLB flush, only after that one is executed 
we can be sure that nobody has access to the memory any more and 
actually free it.


So freeing the memory has to wait for the TLB flush. Why does the 
next command submission need to wait?


Because that's the one triggering the TLB flush. The issue is that 
flushing the TLB while the VMID is in use is really unreliable on 
most hardware generations.


It's been working well enough with ROCm. With user mode command 
submission there is no way to block GPU work while a TLB flush is in 
progress.


Yeah, but at least on Navi 1x that's so horrible broken that the SDMA 
could write anywhere when we would try this.






User mode submissions are completely unrelated to that.


I mention user mode command submission because there is no way to 
enforce the synchronization you describe here on a user mode 
queue. So this approach is not very future proof.


With user mode queues you need to wait for the work on the queue to 
finish anyway or otherwise you run into VM faults if you just unmap 
or free the memory.


If the next command submission doesn't use the unmapped/freed 
memory, why does it need to wait for the TLB flush?


Because it could potentially use it. When userspace lies to the 
kernel and still accesses the mapping we would allow access to freed 
up memory and create a major security problem.


I'm aware of the potential security problem. That's why I'm 
recommending you don't actually free the memory until the TLB flush is 
done. So a bogus access will either harmlessly access memory that's 
not freed yet, or it will VM fault. It will never access memory that's 
already freed and potentially allocated by someone else.


Yes, that's the idea. The question is just when we can do the TLB flush.

If it is using the unmapped/freed memory, that's a user mode bug. 
But waiting for the TLB flush won't fix that. It will only turn a 
likely VM fault into a certain VM fault.


Yeah, exactly that's the intention here.



The guarantee you need to give is, that the memory is not freed and 
reused by anyone else until the TLB flush is done. This dependency 
requires synchronization of the "free" operation with the TLB flush. 
It does not require synchronization with any future command 
submissions in the context that freed the memory.


See above, the future command submission is what triggers the TLB 
flush because only then we can easily execute it without to much hassle.


That seems to be a limitation of your current command submission 
model. User mode command submission will not be able to trigger a TLB 
flush. Unmapping or freeing memory should be the trigger in that case.


That's how it works with KFD. That said, our TLB flushes aren't as 
well pipelined (which could probably be improved), and your strategy 
can probably batch more TLB flushes, so I see where you're coming from.


Well the mapping/unmapping IOCTL should certainly trigger the TLB 
flushes for the user mode queues, but as I said this is completely 
independent to this here.


The limitation is on the kernel CS IOCTL, not the VM IOCTL. So that is 
completely unrelated to this.




For Vega and Navi 2x we could use async TLB flushes and on gfx6, gfx7 
and gfx8 we could use double TLB flushes with grace time, but Navi 1x 
is so horrible broken regarding this that I don't see how else we 
could do that.


We're using heavy-weight TLB flushes on SOC15 GPUs. On Vega20 with 
XGMI we need double flushes to be safe.


I'm raising my concerns because I don't think making user mode wait is 
a good strategy long-term. And I believe this explicit sync and 
explicit VM update should be designed with an eye for future user-mode 
command submission models.


Yeah, but as already discussed with Daniel and Jason that will never 
ever work correctly. IOCTLs can't depend on user mode queues in any way. 
So user space can only block or rather call the map, unmap, free 
functions at the right time.


If you need short-term workarounds for broken hardware, that's another 
issue. But it would be good if that could be kept out of the API.


Well as I said that is completely unrelated to user mode queues. The 
restriction is on the CS API, not the VM API.


Regards,
Christian.




Regards,
  Felix




Regards,
Christian.



Regards,
  Felix




The signal that TLB flush is completed comes from the MES in this 
case.


Regards,
Christian.



Regards,
  Felix




Regards,
Christian.



Regards,
  Felix




3. All VM operations requested by userspace will still be 
executed in order, e.g. we can't run unmap + map

Re: [PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Mike Lothian

Here's the output from drm-misc too:

Jun 02 19:05:32 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer
dereference, address: 00d8
Jun 02 19:05:32 axion.fireburn.co.uk kernel: #PF: supervisor read
access in kernel mode
Jun 02 19:05:32 axion.fireburn.co.uk kernel: #PF: error_code(0x) -
not-present page
Jun 02 19:05:32 axion.fireburn.co.uk kernel: PGD 11dfbb067 P4D
11dfbb067 PUD 170cd7067 PMD 0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Oops:  [#1] PREEMPT SMP NOPTI
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CPU: 1 PID: 1040 Comm:
GravityMark.x64 Tainted: GW 5.18.0-rc5-misc+ #10
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK
COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318
03/29/2022
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RIP:
0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d
01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3
49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5
>
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RSP:
:888125a17c70 EFLAGS: 00010286
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RAX: 888104f45aa0
RBX: 888104f45aa0 RCX: 
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RDX: 0cc0
RSI: 888125a17d50 RDI: 
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RBP: 888104f45118
R08: 88812278da68 R09: 888102eaa680
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R10: 0063
R11: 818212a0 R12: 0cc0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R13: 888125a17d50
R14: 888125a17c70 R15: 00691000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: FS:
7fd9e0671740() GS:888fde44()
knlGS:
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00d8
CR3: 000125c5a000 CR4: 00150ee0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? ttm_global_swapout+0xae/0xc0
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? ttm_tt_populate+0x7d/0x130
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ?
ttm_bo_vm_fault_reserved+0x237/0x270
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? amdgpu_gem_fault+0x92/0xd0
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? do_fault+0x28e/0x4b0
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? handle_mm_fault+0x849/0xa80
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? do_user_addr_fault+0x275/0x450
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x9/0x30
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? exc_page_fault+0x5f/0x150
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  ? asm_exc_page_fault+0x1f/0x30
Jun 02 19:05:32 axion.fireburn.co.uk kernel:  
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00d8
Jun 02 19:05:32 axion.fireburn.co.uk kernel: ---[ end trace
 ]---
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RIP:
0010:ttm_device_swapout+0x6a/0x3d0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d
01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3
49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5
>
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RSP:
:888125a17c70 EFLAGS: 00010286
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RAX: 888104f45aa0
RBX: 888104f45aa0 RCX: 
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RDX: 0cc0
RSI: 888125a17d50 RDI: 
Jun 02 19:05:32 axion.fireburn.co.uk kernel: RBP: 888104f45118
R08: 88812278da68 R09: 888102eaa680
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R10: 0063
R11: 818212a0 R12: 0cc0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: R13: 888125a17d50
R14: 888125a17c70 R15: 00691000
Jun 02 19:05:32 axion.fireburn.co.uk kernel: FS:
7fd9e0671740() GS:888fde44()
knlGS:
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00d8
CR3: 000125c5a000 CR4: 00150ee0
Jun 02 19:05:32 axion.fireburn.co.uk kernel: note:
GravityMark.x64[1040] exited with preempt_count 1

On Thu, 2 Jun 2022 at 19:08, Mike Lothian  wrote:
>
> Hi
>
> I'm still seeing Null pointers against Linus's tree and drm-misc with this 
> patch
>
> Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer
> dereference, address: 0008
> Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write
> access in kernel mode
> Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) -
> not-present page
> Jun 02 19:04:05

Re: [PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Mike Lothian

Hi

I'm still seeing Null pointers against Linus's tree and drm-misc with this patch

Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer
dereference, address: 0008
Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write
access in kernel mode
Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) -
not-present page
Jun 02 19:04:05 axion.fireburn.co.uk kernel: PGD 11ee04067 P4D
11ee04067 PUD 15eccb067 PMD 0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CPU: 0 PID: 1021 Comm:
GravityMark.x64 Tainted: GW 5.18.0-tip+ #3177
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK
COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318
03/29/2022
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP:
0010:ttm_resource_init+0x108/0x210
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48
39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d
56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89
>
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP:
0018:888112e73918 EFLAGS: 00010202
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8
RBX: 888206b715a0 RCX: 
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8
RSI: 888206b71cc0 RDI: 888110605b00
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08
R08: 88812235c790 R09: 8881306a4bd8
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 
R11: 81851320 R12: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0
R14: 88816c848c58 R15: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS:
7f4c257c1740() GS:888fde40()
knlGS:
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008
CR3: 0001183fc000 CR4: 00350ef0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Call Trace:
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_vram_mgr_new+0xbb/0x4b0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_mem_space+0x89/0x1e0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? ttm_bo_validate+0x80/0x1a0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_bo_validate+0xe9/0x2b0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_syncobj_lookup_and_add_to_sync+0xa0/0xa0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_vm_validate_pt_bos+0xce/0x1c0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_parser_bos+0x522/0x6e0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_cs_ioctl+0x7fe/0xd00
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_cs_report_moved_bytes+0x60/0x60
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl_kernel+0xcb/0x130
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? drm_ioctl+0x2f5/0x400
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
amdgpu_cs_report_moved_bytes+0x60/0x60
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? amdgpu_drm_ioctl+0x42/0x80
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? __x64_sys_ioctl+0x5e/0xa0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ? do_syscall_64+0x6a/0x90
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
exit_to_user_mode_prepare+0x19/0x90
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  ?
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Jun 02 19:04:05 axion.fireburn.co.uk kernel:  
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Modules linked in:
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008
Jun 02 19:04:05 axion.fireburn.co.uk kernel: ---[ end trace
 ]---
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP:
0010:ttm_resource_init+0x108/0x210
Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48
39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d
56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89
>
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP:
0018:888112e73918 EFLAGS: 00010202
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8
RBX: 888206b715a0 RCX: 
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8
RSI: 888206b71cc0 RDI: 888110605b00
Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08
R08: 88812235c790 R09: 8881306a4bd8
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: 
R11: 81851320 R12: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0
R14: 88816c848c58 R15: 888110605ad0
Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS:
7f4c257c1740() GS:888fde40()
knlGS:
Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS:  0010 DS:  ES:
 CR0: 80050033
Jun 02 19:04:05

Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)

2022-06-02 Thread Christian König

I totally agree on the reasoning, but I have the strong feeling that 
this will blow up in our face once more.


I've tried to raise this limit twice already and had to revert it both 
times. And the reasons why I had to revert it haven't changed since them.


Christian.

Am 02.06.22 um 18:40 schrieb Alex Deucher:

@Christian Koenig
Any objections to this?  I realize that fixing the OOM killer is
ultimately the right approach, but I don't really see how this makes
things worse.  The current scheme is biased towards dGPUs as they have
lots of on board memory so on dGPUs we can end up setting gtt size to
3/4 of system memory already in a lot of cases since there is often as
much vram as system memory.  Due to the limits in ttm, we can't use
more than half at the moment anway, so this shouldn't make things
worse on dGPUs and would help a lot of APUs.  Once could make the
argument that with more vram there is less need for gtt so less chance
for OOM, but I think it is more of a scale issue.  E.g., on dGPUs
you'll generally be running higher resolutions and texture quality,
etc. so the overall memory footprint is just scaled up.

Alex

On Fri, May 20, 2022 at 11:09 AM Alex Deucher  wrote:

Certain GL unit tests for large textures can cause problems
with the OOM killer since there is no way to link this memory
to a process.  This was originally mitigated (but not necessarily
eliminated) by limiting the GTT size.  The problem is this limit
is often too low for many modern games so just make the limit 1/2
of system memory. The OOM accounting needs to be addressed, but
we shouldn't prevent common 3D applications from being usable
just to potentially mitigate that corner case.

Set default GTT size to max(3G, 1/2 of system ram) by default.

v2: drop previous logic and default to 3/4 of ram
v3: default to half of ram to align with ttm

Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++--
  1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d2b5cccb45c3..7195ed77c85a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));

-   /* Compute GTT size, either bsaed on 3/4th the size of RAM size
+   /* Compute GTT size, either bsaed on 1/2 the size of RAM size
  * or whatever the user passed on module init */
 if (amdgpu_gtt_size == -1) {
 struct sysinfo si;

 si_meminfo();
-   gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-  adev->gmc.mc_vram_size),
-  ((uint64_t)si.totalram * si.mem_unit * 3/4));
-   }
-   else
+   /* Certain GL unit tests for large textures can cause problems
+* with the OOM killer since there is no way to link this memory
+* to a process.  This was originally mitigated (but not 
necessarily
+* eliminated) by limiting the GTT size.  The problem is this 
limit
+* is often too low for many modern games so just make the 
limit 1/2
+* of system memory which aligns with TTM. The OOM accounting 
needs
+* to be addressed, but we shouldn't prevent common 3D 
applications
+* from being usable just to potentially mitigate that corner 
case.
+*/
+   gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+  (u64)si.totalram * si.mem_unit / 2);
+   } else {
 gtt_size = (uint64_t)amdgpu_gtt_size << 20;
+   }

 /* Initialize GTT memory pool */
 r = amdgpu_gtt_mgr_init(adev, gtt_size);
--
2.35.3

Re: [PATCH 1/3] drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations

2022-06-02 Thread Alex Deucher

On Thu, Jun 2, 2022 at 12:12 PM Luben Tuikov  wrote:
>
> From: Christian König 
>
> Technically all of those can use GTT as well, no need to force things
> into VRAM.
>
> Signed-off-by: Christian König 
> Acked-by: Luben Tuikov 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  7 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 20 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c   |  9 ++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   | 14 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/psp_v10_0.c|  3 ++-
>  .../amd/pm/powerplay/smumgr/smu10_smumgr.c| 10 --

We need to audit the new files which have been added since the time
this patch set was written.  E.g., gfx_v10_.c and gfx_v11_0.c, and
psp_v11_0.c, swsmu, etc. have been added in the meantime.

Alex

>  7 files changed, 41 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 16699158e00d8c..d799815a0f288f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -338,8 +338,11 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
>  * KIQ MQD no matter SRIOV or Bare-metal
>  */
> r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> -   AMDGPU_GEM_DOMAIN_VRAM, 
> >mqd_obj,
> -   >mqd_gpu_addr, 
> >mqd_ptr);
> +   AMDGPU_GEM_DOMAIN_VRAM |
> +   AMDGPU_GEM_DOMAIN_GTT,
> +   >mqd_obj,
> +   >mqd_gpu_addr,
> +   >mqd_ptr);
> if (r) {
> dev_warn(adev->dev, "failed to create ring mqd ob 
> (%d)", r);
> return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index e9411c28d88ba8..116f7fa25aa636 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -748,9 +748,12 @@ static int psp_tmr_init(struct psp_context *psp)
> }
>
> pptr = amdgpu_sriov_vf(psp->adev) ? _buf : NULL;
> -   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, 
> PSP_TMR_SIZE(psp->adev),
> - AMDGPU_GEM_DOMAIN_VRAM,
> - >tmr_bo, >tmr_mc_addr, pptr);
> +   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size,
> + PSP_TMR_SIZE(psp->adev),
> + AMDGPU_GEM_DOMAIN_VRAM |
> + AMDGPU_GEM_DOMAIN_GTT,
> + >tmr_bo, >tmr_mc_addr,
> + pptr);
>
> return ret;
>  }
> @@ -1039,7 +1042,8 @@ int psp_ta_init_shared_buf(struct psp_context *psp,
> * physical) for ta to host memory
> */
> return amdgpu_bo_create_kernel(psp->adev, mem_ctx->shared_mem_size,
> - PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
> + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM |
> + AMDGPU_GEM_DOMAIN_GTT,
>   _ctx->shared_bo,
>   _ctx->shared_mc_addr,
>   _ctx->shared_buf);
> @@ -3397,10 +3401,10 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct 
> device *dev,
>
> /* LFB address which is aligned to 1MB boundary per PSP request */
> ret = amdgpu_bo_create_kernel(adev, usbc_pd_fw->size, 0x10,
> -   AMDGPU_GEM_DOMAIN_VRAM,
> -   _buf_bo,
> -   _pri_mc_addr,
> -   _pri_cpu_addr);
> + AMDGPU_GEM_DOMAIN_VRAM |
> + AMDGPU_GEM_DOMAIN_GTT,
> + _buf_bo, _pri_mc_addr,
> + _pri_cpu_addr);
> if (ret)
> goto rel_buf;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c
> index 6373bfb47d55d7..c591ed6553fcc8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c
> @@ -93,7 +93,8 @@ int amdgpu_gfx_rlc_init_sr(struct amdgpu_device *adev, u32 
> dws)
>
> /* allocate save restore block */
> r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
> - AMDGPU_GEM_DOMAIN_VRAM,
> +

Re: [PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Alex Deucher

On Thu, Jun 2, 2022 at 11:47 AM Christian König
 wrote:
>
> The resource must be on the LRU before ttm_lru_bulk_move_add() is called.
>
> Signed-off-by: Christian König 

This should at least fix the null pointer in these bugs:

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1992
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2034

Alex

> ---
>  drivers/gpu/drm/ttm/ttm_resource.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
> b/drivers/gpu/drm/ttm/ttm_resource.c
> index 65889b3caf50..928b9140f3c5 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> res->bus.is_iomem = false;
> res->bus.caching = ttm_cached;
> res->bo = bo;
> -   INIT_LIST_HEAD(>lru);
>
> man = ttm_manager_type(bo->bdev, place->mem_type);
> spin_lock(>bdev->lru_lock);
> man->usage += res->num_pages << PAGE_SHIFT;
> -   if (bo->bulk_move)
> +   if (bo->bulk_move) {
> +   list_add_tail(>lru, >lru[bo->priority]);
> ttm_lru_bulk_move_add(bo->bulk_move, res);
> -   else
> +   } else {
> +   INIT_LIST_HEAD(>lru);
> ttm_resource_move_to_lru_tail(res);
> +   }
> spin_unlock(>bdev->lru_lock);
>  }
>  EXPORT_SYMBOL(ttm_resource_init);
> --
> 2.25.1
>

[PATCH] drm/amdgpu/discovery: add comments about VCN instance handling

2022-06-02 Thread Alex Deucher

Add comments to clarify code that is safe, but triggers and
smatch warning.

Link: https://lists.freedesktop.org/archives/amd-gfx/2022-June/079905.html
Signed-off-by: Alex Deucher 
Cc: Dan Carpenter 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 91f21b725a43..b0811287f017 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1435,6 +1435,11 @@ static int amdgpu_discovery_get_vcn_info(struct 
amdgpu_device *adev)
return -EINVAL;
}
 
+   /* num_vcn_inst is currently limited to AMDGPU_MAX_VCN_INSTANCES
+* which is smaller than VCN_INFO_TABLE_MAX_NUM_INSTANCES
+* but that may change in the future with new GPUs so keep this
+* check for defensive purposes.
+*/
if (adev->vcn.num_vcn_inst > VCN_INFO_TABLE_MAX_NUM_INSTANCES) {
dev_err(adev->dev, "invalid vcn instances\n");
return -EINVAL;
@@ -1450,6 +1455,9 @@ static int amdgpu_discovery_get_vcn_info(struct 
amdgpu_device *adev)
 
switch (le16_to_cpu(vcn_info->v1.header.version_major)) {
case 1:
+   /* num_vcn_inst is currently limited to AMDGPU_MAX_VCN_INSTANCES
+* so this won't overflow.
+*/
for (v = 0; v < adev->vcn.num_vcn_inst; v++) {
adev->vcn.vcn_codec_disable_mask[v] =

le32_to_cpu(vcn_info->v1.instance_info[v].fuse_data.all_bits);
-- 
2.35.3

Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)

2022-06-02 Thread Alex Deucher

@Christian Koenig
Any objections to this?  I realize that fixing the OOM killer is
ultimately the right approach, but I don't really see how this makes
things worse.  The current scheme is biased towards dGPUs as they have
lots of on board memory so on dGPUs we can end up setting gtt size to
3/4 of system memory already in a lot of cases since there is often as
much vram as system memory.  Due to the limits in ttm, we can't use
more than half at the moment anway, so this shouldn't make things
worse on dGPUs and would help a lot of APUs.  Once could make the
argument that with more vram there is less need for gtt so less chance
for OOM, but I think it is more of a scale issue.  E.g., on dGPUs
you'll generally be running higher resolutions and texture quality,
etc. so the overall memory footprint is just scaled up.

Alex

On Fri, May 20, 2022 at 11:09 AM Alex Deucher  wrote:
>
> Certain GL unit tests for large textures can cause problems
> with the OOM killer since there is no way to link this memory
> to a process.  This was originally mitigated (but not necessarily
> eliminated) by limiting the GTT size.  The problem is this limit
> is often too low for many modern games so just make the limit 1/2
> of system memory. The OOM accounting needs to be addressed, but
> we shouldn't prevent common 3D applications from being usable
> just to potentially mitigate that corner case.
>
> Set default GTT size to max(3G, 1/2 of system ram) by default.
>
> v2: drop previous logic and default to 3/4 of ram
> v3: default to half of ram to align with ttm
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d2b5cccb45c3..7195ed77c85a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>
> -   /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> +   /* Compute GTT size, either bsaed on 1/2 the size of RAM size
>  * or whatever the user passed on module init */
> if (amdgpu_gtt_size == -1) {
> struct sysinfo si;
>
> si_meminfo();
> -   gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> -  adev->gmc.mc_vram_size),
> -  ((uint64_t)si.totalram * si.mem_unit * 3/4));
> -   }
> -   else
> +   /* Certain GL unit tests for large textures can cause problems
> +* with the OOM killer since there is no way to link this 
> memory
> +* to a process.  This was originally mitigated (but not 
> necessarily
> +* eliminated) by limiting the GTT size.  The problem is this 
> limit
> +* is often too low for many modern games so just make the 
> limit 1/2
> +* of system memory which aligns with TTM. The OOM accounting 
> needs
> +* to be addressed, but we shouldn't prevent common 3D 
> applications
> +* from being usable just to potentially mitigate that corner 
> case.
> +*/
> +   gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +  (u64)si.totalram * si.mem_unit / 2);
> +   } else {
> gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +   }
>
> /* Initialize GTT memory pool */
> r = amdgpu_gtt_mgr_init(adev, gtt_size);
> --
> 2.35.3
>

Re: [PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Luben Tuikov

Acked-by: Luben Tuikov 

Regards,
Luben

On 2022-06-02 11:47, Christian König wrote:
> The resource must be on the LRU before ttm_lru_bulk_move_add() is called.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/ttm/ttm_resource.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
> b/drivers/gpu/drm/ttm/ttm_resource.c
> index 65889b3caf50..928b9140f3c5 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>   res->bus.is_iomem = false;
>   res->bus.caching = ttm_cached;
>   res->bo = bo;
> - INIT_LIST_HEAD(>lru);
>  
>   man = ttm_manager_type(bo->bdev, place->mem_type);
>   spin_lock(>bdev->lru_lock);
>   man->usage += res->num_pages << PAGE_SHIFT;
> - if (bo->bulk_move)
> + if (bo->bulk_move) {
> + list_add_tail(>lru, >lru[bo->priority]);
>   ttm_lru_bulk_move_add(bo->bulk_move, res);
> - else
> + } else {
> + INIT_LIST_HEAD(>lru);
>   ttm_resource_move_to_lru_tail(res);
> + }
>   spin_unlock(>bdev->lru_lock);
>  }
>  EXPORT_SYMBOL(ttm_resource_init);

Regards,
-- 
Luben

[PATCH 2/3] drm/amdgpu: rename vram_scratch into mem_scratch

2022-06-02 Thread Luben Tuikov

From: Christian König 

Rename vram_scratch into mem_scratch and allow allocating it into GTT as
well.

The only problem with that is that we won't have a default page for the
system aperture any more.

Signed-off-by: Christian König 
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 +++---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v3_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_1.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_2.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c|  2 +-
 17 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 96d058c4cd4b7c..38bf6fb08773b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -603,7 +603,7 @@ int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, 
void *data,
struct drm_file *filp);
 
 /* VRAM scratch page for HDP bug, default vram page */
-struct amdgpu_vram_scratch {
+struct amdgpu_mem_scratch {
struct amdgpu_bo*robj;
volatile uint32_t   *ptr;
u64 gpu_addr;
@@ -842,7 +842,7 @@ struct amdgpu_device {
 
/* memory management */
struct amdgpu_mman  mman;
-   struct amdgpu_vram_scratch  vram_scratch;
+   struct amdgpu_mem_scratch   mem_scratch;
struct amdgpu_wbwb;
atomic64_t  num_bytes_moved;
atomic64_t  num_evictions;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b5ee0eb984ee65..2681a615054fe3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -920,32 +920,33 @@ static int amdgpu_device_asic_init(struct amdgpu_device 
*adev)
 }
 
 /**
- * amdgpu_device_vram_scratch_init - allocate the VRAM scratch page
+ * amdgpu_device_mem_scratch_init - allocate the VRAM scratch page
  *
  * @adev: amdgpu_device pointer
  *
  * Allocates a scratch page of VRAM for use by various things in the
  * driver.
  */
-static int amdgpu_device_vram_scratch_init(struct amdgpu_device *adev)
+static int amdgpu_device_mem_scratch_init(struct amdgpu_device *adev)
 {
-   return amdgpu_bo_create_kernel(adev, AMDGPU_GPU_PAGE_SIZE,
-  PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
-  >vram_scratch.robj,
-  >vram_scratch.gpu_addr,
-  (void **)>vram_scratch.ptr);
+   return amdgpu_bo_create_kernel(adev, AMDGPU_GPU_PAGE_SIZE, PAGE_SIZE,
+  AMDGPU_GEM_DOMAIN_VRAM |
+  AMDGPU_GEM_DOMAIN_GTT,
+  >mem_scratch.robj,
+  >mem_scratch.gpu_addr,
+  (void **)>mem_scratch.ptr);
 }
 
 /**
- * amdgpu_device_vram_scratch_fini - Free the VRAM scratch page
+ * amdgpu_device_mem_scratch_fini - Free the VRAM scratch page
  *
  * @adev: amdgpu_device pointer
  *
  * Frees the VRAM scratch page.
  */
-static void amdgpu_device_vram_scratch_fini(struct amdgpu_device *adev)
+static void amdgpu_device_mem_scratch_fini(struct amdgpu_device *adev)
 {
-   amdgpu_bo_free_kernel(>vram_scratch.robj, NULL, NULL);
+   amdgpu_bo_free_kernel(>mem_scratch.robj, NULL, NULL);
 }
 
 /**
@@ -2367,9 +2368,9 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
if (amdgpu_sriov_vf(adev))
amdgpu_virt_exchange_data(adev);
 
-   r = amdgpu_device_vram_scratch_init(adev);
+   r = amdgpu_device_mem_scratch_init(adev);
if (r) {
-   DRM_ERROR("amdgpu_vram_scratch_init failed 
%d\n", r);
+   DRM_ERROR("amdgpu_mem_scratch_init failed 
%d\n", r);
goto init_failed;
}
r = adev->ip_blocks[i].version->funcs->hw_init((void 
*)adev);
@@ -2839,7 +2840,7 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)

[PATCH 3/3] drm/amdgpu: cleanup visible vram size handling

2022-06-02 Thread Luben Tuikov

From: Christian König 

Centralize the limit handling and validation in one place instead
of spreading that around in different hw generations.

Signed-off-by: Christian König 
Acked-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 ---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 ---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   | 3 ---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   | 3 ---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 3 ---
 6 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index aebc384531ac8f..689978dad1d58f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -201,6 +201,7 @@ uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo)
 void amdgpu_gmc_vram_location(struct amdgpu_device *adev, struct amdgpu_gmc 
*mc,
  u64 base)
 {
+   uint64_t vis_limit = (uint64_t)amdgpu_vis_vram_limit << 20;
uint64_t limit = (uint64_t)amdgpu_vram_limit << 20;
 
mc->vram_start = base;
@@ -208,6 +209,12 @@ void amdgpu_gmc_vram_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc,
if (limit && limit < mc->real_vram_size)
mc->real_vram_size = limit;
 
+   if (vis_limit && vis_limit < mc->visible_vram_size)
+   mc->visible_vram_size = vis_limit;
+
+   if (mc->real_vram_size < mc->visible_vram_size)
+   mc->visible_vram_size = mc->real_vram_size;
+
if (mc->xgmi.num_physical_nodes == 0) {
mc->fb_start = mc->vram_start;
mc->fb_end = mc->vram_end;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index ba3221a25e7536..61006f9f9ed388 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1707,7 +1707,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 {
uint64_t gtt_size;
int r;
-   u64 vis_vram_limit;
 
mutex_init(>mman.gtt_window_lock);
 
@@ -1730,12 +1729,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
return r;
}
 
-   /* Reduce size of CPU-visible VRAM if requested */
-   vis_vram_limit = (u64)amdgpu_vis_vram_limit * 1024 * 1024;
-   if (amdgpu_vis_vram_limit > 0 &&
-   vis_vram_limit <= adev->gmc.visible_vram_size)
-   adev->gmc.visible_vram_size = vis_vram_limit;
-
/* Change the size here instead of the init above so only lpfn is 
affected */
amdgpu_ttm_set_buffer_funcs_status(adev, false);
 #ifdef CONFIG_64BIT
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 9077dfccaf3cf9..851f415f2dba61 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -827,10 +827,7 @@ static int gmc_v10_0_mc_init(struct amdgpu_device *adev)
}
 #endif
 
-   /* In case the PCI BAR is larger than the actual amount of vram */
adev->gmc.visible_vram_size = adev->gmc.aper_size;
-   if (adev->gmc.visible_vram_size > adev->gmc.real_vram_size)
-   adev->gmc.visible_vram_size = adev->gmc.real_vram_size;
 
/* set the gart size */
if (amdgpu_gart_size == -1)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index b309f3ab2917c3..0b95afececdc2d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -389,10 +389,7 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
}
 #endif
 
-   /* In case the PCI BAR is larger than the actual amount of vram */
adev->gmc.visible_vram_size = adev->gmc.aper_size;
-   if (adev->gmc.visible_vram_size > adev->gmc.real_vram_size)
-   adev->gmc.visible_vram_size = adev->gmc.real_vram_size;
 
/* set the gart size */
if (amdgpu_gart_size == -1) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 24a256cfd7ceb9..8256795f66461a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -587,10 +587,7 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
}
 #endif
 
-   /* In case the PCI BAR is larger than the actual amount of vram */
adev->gmc.visible_vram_size = adev->gmc.aper_size;
-   if (adev->gmc.visible_vram_size > adev->gmc.real_vram_size)
-   adev->gmc.visible_vram_size = adev->gmc.real_vram_size;
 
/* set the gart size */
if (amdgpu_gart_size == -1) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 22761a3bb8181e..e246c999b44acd 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1467,10 +1467,7 @@ static int gmc_v9_0_mc_init(struct

[PATCH 1/3] drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations

2022-06-02 Thread Luben Tuikov

From: Christian König 

Technically all of those can use GTT as well, no need to force things
into VRAM.

Signed-off-by: Christian König 
Acked-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  7 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 20 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c   |  9 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   | 14 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/psp_v10_0.c|  3 ++-
 .../amd/pm/powerplay/smumgr/smu10_smumgr.c| 10 --
 7 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 16699158e00d8c..d799815a0f288f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -338,8 +338,11 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
 * KIQ MQD no matter SRIOV or Bare-metal
 */
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
-   AMDGPU_GEM_DOMAIN_VRAM, 
>mqd_obj,
-   >mqd_gpu_addr, 
>mqd_ptr);
+   AMDGPU_GEM_DOMAIN_VRAM |
+   AMDGPU_GEM_DOMAIN_GTT,
+   >mqd_obj,
+   >mqd_gpu_addr,
+   >mqd_ptr);
if (r) {
dev_warn(adev->dev, "failed to create ring mqd ob 
(%d)", r);
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index e9411c28d88ba8..116f7fa25aa636 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -748,9 +748,12 @@ static int psp_tmr_init(struct psp_context *psp)
}
 
pptr = amdgpu_sriov_vf(psp->adev) ? _buf : NULL;
-   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, 
PSP_TMR_SIZE(psp->adev),
- AMDGPU_GEM_DOMAIN_VRAM,
- >tmr_bo, >tmr_mc_addr, pptr);
+   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size,
+ PSP_TMR_SIZE(psp->adev),
+ AMDGPU_GEM_DOMAIN_VRAM |
+ AMDGPU_GEM_DOMAIN_GTT,
+ >tmr_bo, >tmr_mc_addr,
+ pptr);
 
return ret;
 }
@@ -1039,7 +1042,8 @@ int psp_ta_init_shared_buf(struct psp_context *psp,
* physical) for ta to host memory
*/
return amdgpu_bo_create_kernel(psp->adev, mem_ctx->shared_mem_size,
- PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
+ PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM |
+ AMDGPU_GEM_DOMAIN_GTT,
  _ctx->shared_bo,
  _ctx->shared_mc_addr,
  _ctx->shared_buf);
@@ -3397,10 +3401,10 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device 
*dev,
 
/* LFB address which is aligned to 1MB boundary per PSP request */
ret = amdgpu_bo_create_kernel(adev, usbc_pd_fw->size, 0x10,
-   AMDGPU_GEM_DOMAIN_VRAM,
-   _buf_bo,
-   _pri_mc_addr,
-   _pri_cpu_addr);
+ AMDGPU_GEM_DOMAIN_VRAM |
+ AMDGPU_GEM_DOMAIN_GTT,
+ _buf_bo, _pri_mc_addr,
+ _pri_cpu_addr);
if (ret)
goto rel_buf;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c
index 6373bfb47d55d7..c591ed6553fcc8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c
@@ -93,7 +93,8 @@ int amdgpu_gfx_rlc_init_sr(struct amdgpu_device *adev, u32 
dws)
 
/* allocate save restore block */
r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
- AMDGPU_GEM_DOMAIN_VRAM,
+ AMDGPU_GEM_DOMAIN_VRAM |
+ AMDGPU_GEM_DOMAIN_GTT,
  >gfx.rlc.save_restore_obj,
  >gfx.rlc.save_restore_gpu_addr,
  (void **)>gfx.rlc.sr_ptr);
@@ -130,7 +131,8 @@ int amdgpu_gfx_rlc_init_csb(struct amdgpu_device *adev)
/* allocate clear state block */
adev->gfx.rlc.clear_state_size

[PATCH 0/3] Some VRAM cleanups

2022-06-02 Thread Luben Tuikov

Some VRAM cleanups from Christian. Second patch needed some work to be able
to compile with most recent amd-staging-drm-next.

Tested on a couple of differing systems.

Christian König (3):
  drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations
  drm/amdgpu: rename vram_scratch into mem_scratch
  drm/amdgpu: cleanup visible vram size handling

 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 27 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  7 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   |  7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 20 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c   |  9 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  7 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   | 14 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  3 ---
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c |  5 +---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c |  5 +---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  3 ---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v3_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_1.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_2.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/psp_v10_0.c|  3 ++-
 .../amd/pm/powerplay/smumgr/smu10_smumgr.c| 10 +++
 28 files changed, 79 insertions(+), 74 deletions(-)


base-commit: 6cc31d9a3e0944ca02dda6f53f9c9320b8bee928
-- 
2.36.1.74.g277cf0bc36

[PATCH] drm/ttm: fix bulk move handling during resource init

2022-06-02 Thread Christian König

The resource must be on the LRU before ttm_lru_bulk_move_add() is called.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_resource.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
b/drivers/gpu/drm/ttm/ttm_resource.c
index 65889b3caf50..928b9140f3c5 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
res->bus.is_iomem = false;
res->bus.caching = ttm_cached;
res->bo = bo;
-   INIT_LIST_HEAD(>lru);
 
man = ttm_manager_type(bo->bdev, place->mem_type);
spin_lock(>bdev->lru_lock);
man->usage += res->num_pages << PAGE_SHIFT;
-   if (bo->bulk_move)
+   if (bo->bulk_move) {
+   list_add_tail(>lru, >lru[bo->priority]);
ttm_lru_bulk_move_add(bo->bulk_move, res);
-   else
+   } else {
+   INIT_LIST_HEAD(>lru);
ttm_resource_move_to_lru_tail(res);
+   }
spin_unlock(>bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_resource_init);
-- 
2.25.1

Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3

2022-06-02 Thread Alex Deucher

On Thu, Jun 2, 2022 at 11:33 AM Dan Carpenter  wrote:
>
> On Thu, Jun 02, 2022 at 10:24:58AM -0400, Alex Deucher wrote:
> > On Thu, Jun 2, 2022 at 7:51 AM Dan Carpenter  
> > wrote:
> > >
> > > On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote:
> > > > Dan: I also ran Smatch which resulted in the following discussion:
> > > >
> > > > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html
> > >
> > > Since the bounds check is dead code which does not make sense and is not
> > > required, another idea would be to just delete it.
> >
> > It wouldn't be dead code if AMDGPU_MAX_VCN_INSTANCES ever increased.
>
> Or we could add a comment to the code I suppose.
>
> /* Impossible in 2022 but this check might sense in the future */

Good idea.  I'll send out a patch.

Thanks,

Alex

>
> regards,
> dan carpenter
>

Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3

2022-06-02 Thread Dan Carpenter

On Thu, Jun 02, 2022 at 10:24:58AM -0400, Alex Deucher wrote:
> On Thu, Jun 2, 2022 at 7:51 AM Dan Carpenter  wrote:
> >
> > On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote:
> > > Dan: I also ran Smatch which resulted in the following discussion:
> > >
> > > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html 
> >
> > Since the bounds check is dead code which does not make sense and is not
> > required, another idea would be to just delete it.
> 
> It wouldn't be dead code if AMDGPU_MAX_VCN_INSTANCES ever increased.

Or we could add a comment to the code I suppose.

/* Impossible in 2022 but this check might sense in the future */

regards,
dan carpenter

Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3

2022-06-02 Thread Alex Deucher

On Thu, Jun 2, 2022 at 7:51 AM Dan Carpenter  wrote:
>
> On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote:
> > Dan: I also ran Smatch which resulted in the following discussion:
> >
> > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html
>
> Since the bounds check is dead code which does not make sense and is not
> required, another idea would be to just delete it.

It wouldn't be dead code if AMDGPU_MAX_VCN_INSTANCES ever increased.

Alex

>
> regards,
> dan carpenter
>

Re: (REGRESSION bisected) Re: amdgpu errors (VM fault / GPU fault detected) with 5.19 merge window snapshots

2022-06-02 Thread Michal Kubecek

On Thu, Jun 02, 2022 at 09:58:22AM -0400, Alex Deucher wrote:
> On Fri, May 27, 2022 at 8:58 AM Michal Kubecek  wrote:
> > On Fri, May 27, 2022 at 11:00:39AM +0200, Michal Kubecek wrote:
> > > Hello,
> > >
> > > while testing 5.19 merge window snapshots (commits babf0bb978e3 and
> > > 7e284070abe5), I keep getting errors like below. I have not seen them
> > > with 5.18 final or older.
> > >
> > > 
> > > [  247.150333] gmc_v8_0_process_interrupt: 46 callbacks suppressed
> > > [  247.150336] amdgpu :0c:00.0: amdgpu: GPU fault detected: 147 
> > > 0x00020802 for process firefox pid 6101 thread firefox:cs0 pid 6116
> > > [  247.150339] amdgpu :0c:00.0: amdgpu:   
> > > VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00107800
> > > [  247.150340] amdgpu :0c:00.0: amdgpu:   
> > > VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D008002
> > > [  247.150341] amdgpu :0c:00.0: amdgpu: VM fault (0x02, vmid 6, pasid 
> > > 32780) at page 1079296, write from 'TC2' (0x54433200) (8)
> > [...]
> > > [  249.925909] amdgpu :0c:00.0: amdgpu: IH ring buffer overflow 
> > > (0x000844C0, 0x4A00, 0x44D0)
> > > [  250.434986] [drm] Fence fallback timer expired on ring sdma0
> > > [  466.621568] gmc_v8_0_process_interrupt: 122 callbacks suppressed
> > [...]
> > > 
> > >
> > > There does not seem to be any apparent immediate problem with graphics
> > > but when running commit babf0bb978e3, there seemed to be a noticeable
> > > lag in some operations, e.g. when moving a window or repainting large
> > > part of the terminal window in konsole (no idea if it's related).
> > >
> > > My GPU is Radeon Pro WX 2100 (1002:6995). What other information should
> > > I collect to help debugging the issue?
> >
> > Bisected to commit 5255e146c99a ("drm/amdgpu: rework TLB flushing").
> > There seem to be later commits depending on it so I did not test
> > a revert on top of current mainline.
> >
> > I should also mention that most commits tested as "bad" during the
> > bisect did behave much worse than current mainline (errors starting as
> > early as with sddm, visibly damaged screen content, sometimes even
> > crashes). But all of them issued messages similar to those above into
> > kernel log.
> 
> Can you verify that the kernel you tested has this patch:
> https://cgit.freedesktop.org/drm/drm/commit/?id=5be323562c6a699d38430bc068a3fd192be8ed0d

Yes, both of them:

mike@lion:~/work/git/kernel-upstream> git merge-base --is-ancestor 5be323562c6a 
babf0bb978e3 && echo yes
yes

(7e284070abe5 is a later mainline snapshot so it also contains
5be323562c6a)

But it's likely that commit 5be323562c6a fixed most of the problem and
only some corner case was left as most bisect steps had many more error
messages and some even crashed before I was able to even log into KDE.
Compared to that, the mainline snapshots show much fewer errors, no
distorted picture and no crash; on the other hand, applications like
firefox or stellarium seem to trigger the errors quite consistently.

Michal


signature.asc
Description: PGP signature

Re: Explicit VM updates

2022-06-02 Thread Felix Kuehling




Am 2022-06-02 um 02:47 schrieb Christian König:



Am 01.06.22 um 19:39 schrieb Felix Kuehling:


Am 2022-06-01 um 13:22 schrieb Christian König:

Am 01.06.22 um 19:07 schrieb Felix Kuehling:

Am 2022-06-01 um 12:29 schrieb Christian König:

Am 01.06.22 um 17:05 schrieb Felix Kuehling:


Am 2022-06-01 um 08:40 schrieb Christian König:

Hey guys,

so today Bas came up with a new requirement regarding the 
explicit synchronization to VM updates and a bunch of prototype 
patches.


I've been thinking about that stuff for quite some time before, 
but to be honest it's one of the most trickiest parts of the 
driver.


So my current thinking is that we could potentially handle those 
requirements like this:


1. We add some new EXPLICIT flag to context (or CS?) and VM 
IOCTL. This way we either get the new behavior for the whole 
CS+VM or the old one, but never both mixed.


2. When memory is unmapped we keep around the last unmap 
operation inside the bo_va.


3. When memory is freed we add all the CS fences which could 
access that memory + the last unmap operation as BOOKKEEP fences 
to the BO and as mandatory sync fence to the VM.


Memory freed either because of an eviction or because of 
userspace closing the handle will be seen as a combination of 
unmap+free.



The result is the following semantic for userspace to avoid 
implicit synchronization as much as possible:


1. When you allocate and map memory it is mandatory to either 
wait for the mapping operation to complete or to add it as 
dependency for your CS.
    If this isn't followed the application will run into CS 
faults (that's what we pretty much already implemented).


This makes sense.




2. When memory is freed you must unmap that memory first and 
then wait for this unmap operation to complete before freeing 
the memory.
    If this isn't followed the kernel will add a forcefully wait 
to the next CS to block until the unmap is completed.


This would work for now, but it won't work for user mode 
submission in the future. I find it weird that user mode needs to 
wait for the unmap. For user mode, unmap and free should always 
be asynchronous. I can't think of any good reasons to make user 
mode wait for the driver to clean up its stuff.


Could the waiting be done in kernel mode instead? TTM already 
does delayed freeing if there are fences outstanding on a BO 
being freed. This should make it easy to delay freeing until the 
unmap is done without blocking the user mode thread.


This is not about blocking, but synchronization dependencies.


Then I must have misunderstood this sentence: "When memory is freed 
you must unmap that memory first and then wait for this unmap 
operation to complete before freeing the memory." If the pronoun 
"you" is the user mode driver, it means user mode must wait for 
kernel mode to finish unmapping memory before freeing it. Was that 
not what you meant?


Ah, yes. The UMD must wait for the kernel to finish unmapping all 
the maps from the BO before it drops the handle of the BO and with 
that frees it.




In other words the free is not waiting for the unmap to complete, 
but causes command submissions through the kernel to depend on the 
unmap.


I guess I don't understand that dependency. The next command 
submission obviously cannot use the memory that was unmapped. But 
why does it need to synchronize with the unmap operation?


Because of the necessary TLB flush, only after that one is executed 
we can be sure that nobody has access to the memory any more and 
actually free it.


So freeing the memory has to wait for the TLB flush. Why does the 
next command submission need to wait?


Because that's the one triggering the TLB flush. The issue is that 
flushing the TLB while the VMID is in use is really unreliable on most 
hardware generations.


It's been working well enough with ROCm. With user mode command 
submission there is no way to block GPU work while a TLB flush is in 
progress.







User mode submissions are completely unrelated to that.


I mention user mode command submission because there is no way to 
enforce the synchronization you describe here on a user mode queue. 
So this approach is not very future proof.


With user mode queues you need to wait for the work on the queue to 
finish anyway or otherwise you run into VM faults if you just unmap 
or free the memory.


If the next command submission doesn't use the unmapped/freed memory, 
why does it need to wait for the TLB flush?


Because it could potentially use it. When userspace lies to the kernel 
and still accesses the mapping we would allow access to freed up 
memory and create a major security problem.


I'm aware of the potential security problem. That's why I'm recommending 
you don't actually free the memory until the TLB flush is done. So a 
bogus access will either harmlessly access memory that's not freed yet, 
or it will VM fault. It will never access memory that's already freed 
and potentially allocated by

Re: (REGRESSION bisected) Re: amdgpu errors (VM fault / GPU fault detected) with 5.19 merge window snapshots

2022-06-02 Thread Alex Deucher

On Fri, May 27, 2022 at 8:58 AM Michal Kubecek  wrote:
>
> On Fri, May 27, 2022 at 11:00:39AM +0200, Michal Kubecek wrote:
> > Hello,
> >
> > while testing 5.19 merge window snapshots (commits babf0bb978e3 and
> > 7e284070abe5), I keep getting errors like below. I have not seen them
> > with 5.18 final or older.
> >
> > 
> > [  247.150333] gmc_v8_0_process_interrupt: 46 callbacks suppressed
> > [  247.150336] amdgpu :0c:00.0: amdgpu: GPU fault detected: 147 
> > 0x00020802 for process firefox pid 6101 thread firefox:cs0 pid 6116
> > [  247.150339] amdgpu :0c:00.0: amdgpu:   
> > VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00107800
> > [  247.150340] amdgpu :0c:00.0: amdgpu:   
> > VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D008002
> > [  247.150341] amdgpu :0c:00.0: amdgpu: VM fault (0x02, vmid 6, pasid 
> > 32780) at page 1079296, write from 'TC2' (0x54433200) (8)
> [...]
> > [  249.925909] amdgpu :0c:00.0: amdgpu: IH ring buffer overflow 
> > (0x000844C0, 0x4A00, 0x44D0)
> > [  250.434986] [drm] Fence fallback timer expired on ring sdma0
> > [  466.621568] gmc_v8_0_process_interrupt: 122 callbacks suppressed
> [...]
> > 
> >
> > There does not seem to be any apparent immediate problem with graphics
> > but when running commit babf0bb978e3, there seemed to be a noticeable
> > lag in some operations, e.g. when moving a window or repainting large
> > part of the terminal window in konsole (no idea if it's related).
> >
> > My GPU is Radeon Pro WX 2100 (1002:6995). What other information should
> > I collect to help debugging the issue?
>
> Bisected to commit 5255e146c99a ("drm/amdgpu: rework TLB flushing").
> There seem to be later commits depending on it so I did not test
> a revert on top of current mainline.
>
> I should also mention that most commits tested as "bad" during the
> bisect did behave much worse than current mainline (errors starting as
> early as with sddm, visibly damaged screen content, sometimes even
> crashes). But all of them issued messages similar to those above into
> kernel log.

Can you verify that the kernel you tested has this patch:
https://cgit.freedesktop.org/drm/drm/commit/?id=5be323562c6a699d38430bc068a3fd192be8ed0d

Alex

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Christian König


Am 02.06.22 um 15:44 schrieb Lazar, Lijo:



On 6/2/2022 7:06 PM, Christian König wrote:

Am 02.06.22 um 15:26 schrieb Lazar, Lijo:



On 6/2/2022 6:54 PM, Lazar, Lijo wrote:



On 6/2/2022 6:50 PM, Philip Yang wrote:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  {
  struct amdgpu_vm_update_params params;
  struct amdgpu_vm_bo_base *entry;
+    bool flush_tlb_needed = false;
  int r, idx;
  if (list_empty(>relocated))
@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  goto error;
  list_for_each_entry(entry, >relocated, vm_status) {
+    /* vm_flush_needed after updating moved PDEs */
+    flush_tlb_needed |= entry->moved;


That is a strange thing to do for a bool variable. Why not just 
assign it?




Hmm.. In a loop, perhaps you meant logical OR?


Well IIRC C doesn't have a logical or assignment operator "||=", so 
"|=" is used instead which also gets the job done.




var = var || value;  also will work.

BTW, v1 of this patch was incrementing vm->tlb_seq for every entry 
moved. This one increments only once. So is this vm->tlb_seq required 
only to be a bool?


No, it is required to only increment once. It's a sequence number and 
changing it signals that a TLB flush is necessary.


Regards,
Christian.



Thanks,
Lijo


Christian.



Thanks,
Lijo


Thanks,
Lijo


+
  r = amdgpu_vm_pde_update(, entry);
  if (r)
  goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  if (r)
  goto error;
-    /* vm_flush_needed after updating PDEs */
-    atomic64_inc(>tlb_seq);
+    if (flush_tlb_needed)
+    atomic64_inc(>tlb_seq);
  while (!list_empty(>relocated)) {
  entry = list_first_entry(>relocated,

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Lazar, Lijo





On 6/2/2022 7:06 PM, Christian König wrote:

Am 02.06.22 um 15:26 schrieb Lazar, Lijo:



On 6/2/2022 6:54 PM, Lazar, Lijo wrote:



On 6/2/2022 6:50 PM, Philip Yang wrote:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  {
  struct amdgpu_vm_update_params params;
  struct amdgpu_vm_bo_base *entry;
+    bool flush_tlb_needed = false;
  int r, idx;
  if (list_empty(>relocated))
@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  goto error;
  list_for_each_entry(entry, >relocated, vm_status) {
+    /* vm_flush_needed after updating moved PDEs */
+    flush_tlb_needed |= entry->moved;


That is a strange thing to do for a bool variable. Why not just 
assign it?




Hmm.. In a loop, perhaps you meant logical OR?


Well IIRC C doesn't have a logical or assignment operator "||=", so "|=" 
is used instead which also gets the job done.




var = var || value;  also will work.

BTW, v1 of this patch was incrementing vm->tlb_seq for every entry 
moved. This one increments only once. So is this vm->tlb_seq required 
only to be a bool?


Thanks,
Lijo


Christian.



Thanks,
Lijo


Thanks,
Lijo


+
  r = amdgpu_vm_pde_update(, entry);
  if (r)
  goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  if (r)
  goto error;
-    /* vm_flush_needed after updating PDEs */
-    atomic64_inc(>tlb_seq);
+    if (flush_tlb_needed)
+    atomic64_inc(>tlb_seq);
  while (!list_empty(>relocated)) {
  entry = list_first_entry(>relocated,

Re: [PATCH] drm/amdgpu/display: Protect some functions with CONFIG_DRM_AMD_DC_DCN

2022-06-02 Thread Harry Wentland

On 2022-06-01 22:31, Alex Deucher wrote:
> Protect remove_hpo_dp_link_enc_from_ctx() and release_hpo_dp_link_enc()
> with CONFIG_DRM_AMD_DC_DCN as the functions are only called from code
> that is protected by CONFIG_DRM_AMD_DC_DCN.  Fixes build fail with
> -Werror=unused-function.
> 
> Fixes: 9b0e0d433f74 ("drm/amd/display: Add dependant changes for DCN32/321")
> Reported-by: Stephen Rothwell 
> Signed-off-by: Alex Deucher 
> Cc: Aurabindo Pillai 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> index b087452e4590..a098fd0cb240 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> @@ -1801,6 +1801,7 @@ static inline void retain_hpo_dp_link_enc(
>   res_ctx->hpo_dp_link_enc_ref_cnts[enc_index]++;
>  }
>  
> +#if defined(CONFIG_DRM_AMD_DC_DCN)
>  static inline void release_hpo_dp_link_enc(
>   struct resource_context *res_ctx,
>   int enc_index)
> @@ -1808,6 +1809,7 @@ static inline void release_hpo_dp_link_enc(
>   ASSERT(res_ctx->hpo_dp_link_enc_ref_cnts[enc_index] > 0);
>   res_ctx->hpo_dp_link_enc_ref_cnts[enc_index]--;
>  }
> +#endif
>  
>  static bool add_hpo_dp_link_enc_to_ctx(struct resource_context *res_ctx,
>   const struct resource_pool *pool,
> @@ -1832,6 +1834,7 @@ static bool add_hpo_dp_link_enc_to_ctx(struct 
> resource_context *res_ctx,
>   return pipe_ctx->link_res.hpo_dp_link_enc != NULL;
>  }
>  
> +#if defined(CONFIG_DRM_AMD_DC_DCN)
>  static void remove_hpo_dp_link_enc_from_ctx(struct resource_context *res_ctx,
>   struct pipe_ctx *pipe_ctx,
>   struct dc_stream_state *stream)
> @@ -1845,6 +1848,7 @@ static void remove_hpo_dp_link_enc_from_ctx(struct 
> resource_context *res_ctx,
>   pipe_ctx->link_res.hpo_dp_link_enc = NULL;
>   }
>  }
> +#endif
>  
>  /* TODO: release audio object */
>  void update_audio_usage(

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Christian König


Am 02.06.22 um 15:26 schrieb Lazar, Lijo:



On 6/2/2022 6:54 PM, Lazar, Lijo wrote:



On 6/2/2022 6:50 PM, Philip Yang wrote:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  {
  struct amdgpu_vm_update_params params;
  struct amdgpu_vm_bo_base *entry;
+    bool flush_tlb_needed = false;
  int r, idx;
  if (list_empty(>relocated))
@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  goto error;
  list_for_each_entry(entry, >relocated, vm_status) {
+    /* vm_flush_needed after updating moved PDEs */
+    flush_tlb_needed |= entry->moved;


That is a strange thing to do for a bool variable. Why not just 
assign it?




Hmm.. In a loop, perhaps you meant logical OR?


Well IIRC C doesn't have a logical or assignment operator "||=", so "|=" 
is used instead which also gets the job done.


Christian.



Thanks,
Lijo


Thanks,
Lijo


+
  r = amdgpu_vm_pde_update(, entry);
  if (r)
  goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  if (r)
  goto error;
-    /* vm_flush_needed after updating PDEs */
-    atomic64_inc(>tlb_seq);
+    if (flush_tlb_needed)
+    atomic64_inc(>tlb_seq);
  while (!list_empty(>relocated)) {
  entry = list_first_entry(>relocated,

Re: [PATCH] dc: Add ACP_DATA register

2022-06-02 Thread Harry Wentland

On 2022-06-02 08:02, Alan Liu wrote:
> Define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA
> Define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK/SHIFT
> 
> Signed-off-by: Alan Liu 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h   | 1 +
>  drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h 
> b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h
> index 6df651a94b0a..581ba639b4ea 100644
> --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h
> +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h
> @@ -6981,6 +6981,7 @@
>  #define ixAZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE 
>0x23
>  #define ixAZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL 
>0x24
>  #define ixAZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER
>0x25
> +#define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA   
>0x27
>  #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR0  
>0x28
>  #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR1  
>0x29
>  #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR2  
>0x2a
> diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h 
> b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h
> index fa1f4374fafe..fd387c7363b6 100644
> --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h
> +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h
> @@ -13639,6 +13639,8 @@
>  #define 
> AZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE__IMPEDANCE_SENSE__SHIFT 0x0
>  #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE_MASK 0x40
>  #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE__SHIFT 0x6
> +#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK 0x40
> +#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI__SHIFT 0x6
>  #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION_MASK 
> 0x7f
>  #define 
> AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION__SHIFT 0x0
>  #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__CHANNEL_ALLOCATION_MASK 
> 0xff00

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Lazar, Lijo





On 6/2/2022 6:54 PM, Lazar, Lijo wrote:



On 6/2/2022 6:50 PM, Philip Yang wrote:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
  {
  struct amdgpu_vm_update_params params;
  struct amdgpu_vm_bo_base *entry;
+    bool flush_tlb_needed = false;
  int r, idx;
  if (list_empty(>relocated))
@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
  goto error;
  list_for_each_entry(entry, >relocated, vm_status) {
+    /* vm_flush_needed after updating moved PDEs */
+    flush_tlb_needed |= entry->moved;


That is a strange thing to do for a bool variable. Why not just assign it?



Hmm.. In a loop, perhaps you meant logical OR?

Thanks,
Lijo


Thanks,
Lijo


+
  r = amdgpu_vm_pde_update(, entry);
  if (r)
  goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
  if (r)
  goto error;
-    /* vm_flush_needed after updating PDEs */
-    atomic64_inc(>tlb_seq);
+    if (flush_tlb_needed)
+    atomic64_inc(>tlb_seq);
  while (!list_empty(>relocated)) {
  entry = list_first_entry(>relocated,

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Lazar, Lijo





On 6/2/2022 6:50 PM, Philip Yang wrote:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
  {
struct amdgpu_vm_update_params params;
struct amdgpu_vm_bo_base *entry;
+   bool flush_tlb_needed = false;
int r, idx;
  
  	if (list_empty(>relocated))

@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
goto error;
  
  	list_for_each_entry(entry, >relocated, vm_status) {

+   /* vm_flush_needed after updating moved PDEs */
+   flush_tlb_needed |= entry->moved;


That is a strange thing to do for a bool variable. Why not just assign it?

Thanks,
Lijo


+
r = amdgpu_vm_pde_update(, entry);
if (r)
goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
if (r)
goto error;
  
-	/* vm_flush_needed after updating PDEs */

-   atomic64_inc(>tlb_seq);
+   if (flush_tlb_needed)
+   atomic64_inc(>tlb_seq);
  
  	while (!list_empty(>relocated)) {

entry = list_first_entry(>relocated,

Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Christian König


Am 02.06.22 um 15:20 schrieb Philip Yang:

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
  {
struct amdgpu_vm_update_params params;
struct amdgpu_vm_bo_base *entry;
+   bool flush_tlb_needed = false;
int r, idx;
  
  	if (list_empty(>relocated))

@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
goto error;
  
  	list_for_each_entry(entry, >relocated, vm_status) {

+   /* vm_flush_needed after updating moved PDEs */
+   flush_tlb_needed |= entry->moved;
+
r = amdgpu_vm_pde_update(, entry);
if (r)
goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
if (r)
goto error;
  
-	/* vm_flush_needed after updating PDEs */

-   atomic64_inc(>tlb_seq);
+   if (flush_tlb_needed)
+   atomic64_inc(>tlb_seq);
  
  	while (!list_empty(>relocated)) {

entry = list_first_entry(>relocated,

[PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

2022-06-02 Thread Philip Yang

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9596c22fded6..1ea204218903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
 {
struct amdgpu_vm_update_params params;
struct amdgpu_vm_bo_base *entry;
+   bool flush_tlb_needed = false;
int r, idx;
 
if (list_empty(>relocated))
@@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
goto error;
 
list_for_each_entry(entry, >relocated, vm_status) {
+   /* vm_flush_needed after updating moved PDEs */
+   flush_tlb_needed |= entry->moved;
+
r = amdgpu_vm_pde_update(, entry);
if (r)
goto error;
@@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
if (r)
goto error;
 
-   /* vm_flush_needed after updating PDEs */
-   atomic64_inc(>tlb_seq);
+   if (flush_tlb_needed)
+   atomic64_inc(>tlb_seq);
 
while (!list_empty(>relocated)) {
entry = list_first_entry(>relocated,
-- 
2.35.1

[PATCH] dc: Add ACP_DATA register

2022-06-02 Thread Alan Liu

Define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA
Define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK/SHIFT

Signed-off-by: Alan Liu 
---
 drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h   | 1 +
 drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h 
b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h
index 6df651a94b0a..581ba639b4ea 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h
@@ -6981,6 +6981,7 @@
 #define ixAZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE   
 0x23
 #define ixAZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL   
 0x24
 #define ixAZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER  
 0x25
+#define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA 
 0x27
 #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR0
 0x28
 #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR1
 0x29
 #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR2
 0x2a
diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h 
b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h
index fa1f4374fafe..fd387c7363b6 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h
@@ -13639,6 +13639,8 @@
 #define AZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE__IMPEDANCE_SENSE__SHIFT 
0x0
 #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE_MASK 0x40
 #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE__SHIFT 0x6
+#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK 0x40
+#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI__SHIFT 0x6
 #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION_MASK 
0x7f
 #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION__SHIFT 
0x0
 #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__CHANNEL_ALLOCATION_MASK 
0xff00
-- 
2.25.1

Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3

2022-06-02 Thread Dan Carpenter

On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote:
> Dan: I also ran Smatch which resulted in the following discussion:
> 
> https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html 

Since the bounds check is dead code which does not make sense and is not
required, another idea would be to just delete it.

regards,
dan carpenter

[PATCH v3 2/2] drm/amdgpu: adding device coredump support

2022-06-02 Thread Somalapuram Amaranath

Added device coredump information:
- Kernel version
- Module
- Time
- VRAM status
- Guilty process name and PID
- GPU register dumps
v1 -> v2: Variable name change
v1 -> v2: NULL check
v1 -> v2: Code alignment
v1 -> v2: Adding dummy amdgpu_devcoredump_free
v1 -> v2: memset reset_task_info to zero
v2 -> v3: add CONFIG_DEV_COREDUMP for variables
v2 -> v3: remove NULL check on amdgpu_devcoredump_read

Signed-off-by: Somalapuram Amaranath 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 64 ++
 2 files changed, 69 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index c79d9992b113..1bfbaf65d414 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1044,6 +1044,11 @@ struct amdgpu_device {
uint32_t*reset_dump_reg_list;
uint32_t*reset_dump_reg_value;
int num_regs;
+#ifdef CONFIG_DEV_COREDUMP
+   struct amdgpu_task_info reset_task_info;
+   boolreset_vram_lost;
+   struct timespec64   reset_time;
+#endif
 
struct amdgpu_reset_domain  *reset_domain;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 89c6db03e84b..f1def74aaad0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -32,6 +32,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -4734,6 +4736,59 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device 
*adev)
return 0;
 }
 
+#ifdef CONFIG_DEV_COREDUMP
+static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
+   size_t count, void *data, size_t datalen)
+{
+   struct drm_printer p;
+   struct amdgpu_device *adev = data;
+   struct drm_print_iterator iter;
+   int i;
+
+   iter.data = buffer;
+   iter.offset = 0;
+   iter.start = offset;
+   iter.remain = count;
+
+   p = drm_coredump_printer();
+
+   drm_printf(, " AMDGPU Device Coredump \n");
+   drm_printf(, "kernel: " UTS_RELEASE "\n");
+   drm_printf(, "module: " KBUILD_MODNAME "\n");
+   drm_printf(, "time: %lld.%09ld\n", adev->reset_time.tv_sec, 
adev->reset_time.tv_nsec);
+   if (adev->reset_task_info.pid)
+   drm_printf(, "process_name: %s PID: %d\n",
+  adev->reset_task_info.process_name,
+  adev->reset_task_info.pid);
+
+   if (adev->reset_vram_lost)
+   drm_printf(, "VRAM is lost due to GPU reset!\n");
+   if (adev->num_regs) {
+   drm_printf(, "AMDGPU register dumps:\nOffset: Value:\n");
+
+   for (i = 0; i < adev->num_regs; i++)
+   drm_printf(, "0x%08x: 0x%08x\n",
+  adev->reset_dump_reg_list[i],
+  adev->reset_dump_reg_value[i]);
+   }
+
+   return count - iter.remain;
+}
+
+static void amdgpu_devcoredump_free(void *data)
+{
+}
+
+static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev)
+{
+   struct drm_device *dev = adev_to_drm(adev);
+
+   ktime_get_ts64(>reset_time);
+   dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_KERNEL,
+ amdgpu_devcoredump_read, amdgpu_devcoredump_free);
+}
+#endif
+
 int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 struct amdgpu_reset_context *reset_context)
 {
@@ -4818,6 +4873,15 @@ int amdgpu_do_asic_reset(struct list_head 
*device_list_handle,
goto out;
 
vram_lost = 
amdgpu_device_check_vram_lost(tmp_adev);
+#ifdef CONFIG_DEV_COREDUMP
+   tmp_adev->reset_vram_lost = vram_lost;
+   memset(_adev->reset_task_info, 0,
+   
sizeof(tmp_adev->reset_task_info));
+   if (reset_context->job && 
reset_context->job->vm)
+   tmp_adev->reset_task_info =
+   
reset_context->job->vm->task_info;
+   amdgpu_reset_capture_coredumpm(tmp_adev);
+#endif
if (vram_lost) {
DRM_INFO("VRAM is lost due to GPU 
reset!\n");
amdgpu_inc_vram_lost(tmp_adev);
-- 
2.32.0

[PATCH v3 1/2] drm/amdgpu: save the reset dump register value for devcoredump

2022-06-02 Thread Somalapuram Amaranath

Allocate memory for register value and use the same values for devcoredump.
v1 -> v2: Change krealloc_array() to kmalloc_array()
v2 -> v3: Fix alignment

Signed-off-by: Somalapuram Amaranath 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 +++---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 76df583663c7..c79d9992b113 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1042,6 +1042,7 @@ struct amdgpu_device {
 
/* reset dump register */
uint32_t*reset_dump_reg_list;
+   uint32_t*reset_dump_reg_value;
int num_regs;
 
struct amdgpu_reset_domain  *reset_domain;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index eedb12f6b8a3..f3ac7912c29c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1709,17 +1709,24 @@ static ssize_t 
amdgpu_reset_dump_register_list_write(struct file *f,
i++;
} while (len < size);
 
+   new = kmalloc_array(i, sizeof(uint32_t), GFP_KERNEL);
+   if (!new) {
+   ret = -ENOMEM;
+   goto error_free;
+   }
ret = down_write_killable(>reset_domain->sem);
if (ret)
goto error_free;
 
swap(adev->reset_dump_reg_list, tmp);
+   swap(adev->reset_dump_reg_value, new);
adev->num_regs = i;
up_write(>reset_domain->sem);
ret = size;
 
 error_free:
kfree(tmp);
+   kfree(new);
return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4daa0e893965..89c6db03e84b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4720,15 +4720,15 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device 
*adev,
 
 static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev)
 {
-   uint32_t reg_value;
int i;
 
lockdep_assert_held(>reset_domain->sem);
dump_stack();
 
for (i = 0; i < adev->num_regs; i++) {
-   reg_value = RREG32(adev->reset_dump_reg_list[i]);
-   trace_amdgpu_reset_reg_dumps(adev->reset_dump_reg_list[i], 
reg_value);
+   adev->reset_dump_reg_value[i] = 
RREG32(adev->reset_dump_reg_list[i]);
+   trace_amdgpu_reset_reg_dumps(adev->reset_dump_reg_list[i],
+adev->reset_dump_reg_value[i]);
}
 
return 0;
-- 
2.32.0

RE: [PATCH] drm/amdgpu: fix up comment in amdgpu_device_asic_has_dc_support()

2022-06-02 Thread Quan, Evan

[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Wednesday, May 25, 2022 10:09 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu: fix up comment in
> amdgpu_device_asic_has_dc_support()
> 
> LVDS support was implemented in DC a while ago.  Just DAC support is left to
> do.
> 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index dab0c59bfb1b..fa26e462e750 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3286,7 +3286,7 @@ bool amdgpu_device_asic_has_dc_support(enum
> amd_asic_type asic_type)
>   case CHIP_MULLINS:
>   /*
>* We have systems in the wild with these ASICs that require
> -  * LVDS and VGA support which is not supported with DC.
> +  * VGA support which is not supported with DC.
>*
>* Fallback to the non-DC driver here by default so as not to
>* cause regressions.
> --
> 2.35.3

RE: [PATCH] drm/amdgpu/soc21: add mode2 asic reset for SMU IP v13.0.4

2022-06-02 Thread Quan, Evan

[AMD Official Use Only - General]

Acked-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Friday, May 27, 2022 1:58 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Huang, Tim
> 
> Subject: [PATCH] drm/amdgpu/soc21: add mode2 asic reset for SMU IP
> v13.0.4
> 
> Set the default reset method to mode2 for SMU IP v13.0.4
> 
> Signed-off-by: Tim Huang 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/soc21.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c
> b/drivers/gpu/drm/amd/amdgpu/soc21.c
> index 9e18a2b22607..a400f5273343 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> @@ -310,6 +310,7 @@ static enum amd_reset_method
> soc21_asic_reset_method(struct amdgpu_device *adev)  {
>   if (amdgpu_reset_method == AMD_RESET_METHOD_MODE1 ||
> + amdgpu_reset_method == AMD_RESET_METHOD_MODE2 ||
>   amdgpu_reset_method == AMD_RESET_METHOD_BACO)
>   return amdgpu_reset_method;
> 
> @@ -320,6 +321,8 @@ soc21_asic_reset_method(struct amdgpu_device
> *adev)
>   switch (adev->ip_versions[MP1_HWIP][0]) {
>   case IP_VERSION(13, 0, 0):
>   return AMD_RESET_METHOD_MODE1;
> + case IP_VERSION(13, 0, 4):
> + return AMD_RESET_METHOD_MODE2;
>   default:
>   if (amdgpu_dpm_is_baco_supported(adev))
>   return AMD_RESET_METHOD_BACO;
> @@ -341,6 +344,10 @@ static int soc21_asic_reset(struct amdgpu_device
> *adev)
>   dev_info(adev->dev, "BACO reset\n");
>   ret = amdgpu_dpm_baco_reset(adev);
>   break;
> + case AMD_RESET_METHOD_MODE2:
> + dev_info(adev->dev, "MODE2 reset\n");
> + ret = amdgpu_dpm_mode2_reset(adev);
> + break;
>   default:
>   dev_info(adev->dev, "MODE1 reset\n");
>   ret = amdgpu_device_mode1_reset(adev);
> --
> 2.35.3

Re: [PATCH 1/1] drm/amdgpu: Update PDEs flush TLB if PDEs is moved

2022-06-02 Thread Christian König


Am 02.06.22 um 01:19 schrieb Felix Kuehling:

Am 2022-06-01 um 19:12 schrieb Philip Yang:

Update PDEs, PTEs don't need flush TLB after updating mapping, this will
remove the unnecessary TLB flush to reduce map to GPUs time.


This description is unclear. I think what this change does is, flush 
TLBs when existing PDEs are updated (because a PTB or PDB moved). But 
it avoids unnecessary TLB flushes when new PDBs or PTBs are added to 
the page table, which commonly happens when memory is mapped for the 
first time.


Yes, agree. Additional to that (I know reviewing my own suggestion) 
setting a local variable and then incrementing the atomic only once 
might be better because atomics are barriers on x86.


Regards,
Christian.



Regards,
  Felix




Suggested-by: Christian König 
Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 9596c22fded6..8cdfd09fd70d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -755,6 +755,10 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  goto error;
    list_for_each_entry(entry, >relocated, vm_status) {
+    /* vm_flush_needed after updating moved PDEs */
+    if (entry->moved)
+    atomic64_inc(>tlb_seq);
+
  r = amdgpu_vm_pde_update(, entry);
  if (r)
  goto error;
@@ -764,9 +768,6 @@ int amdgpu_vm_update_pdes(struct amdgpu_device 
*adev,

  if (r)
  goto error;
  -    /* vm_flush_needed after updating PDEs */
-    atomic64_inc(>tlb_seq);
-
  while (!list_empty(>relocated)) {
  entry = list_first_entry(>relocated,
   struct amdgpu_vm_bo_base,

Re: Explicit VM updates

2022-06-02 Thread Christian König


Am 01.06.22 um 19:47 schrieb Bas Nieuwenhuizen:

On Wed, Jun 1, 2022 at 2:40 PM Christian König  wrote:

Hey guys,

so today Bas came up with a new requirement regarding the explicit
synchronization to VM updates and a bunch of prototype patches.

I've been thinking about that stuff for quite some time before, but to
be honest it's one of the most trickiest parts of the driver.

So my current thinking is that we could potentially handle those
requirements like this:

1. We add some new EXPLICIT flag to context (or CS?) and VM IOCTL. This
way we either get the new behavior for the whole CS+VM or the old one,
but never both mixed.

Any VM wide flag would be a concern because libdrm_amdgpu merges
things into a single drm fd for potentially multiple driver (i.e. radv
+ amdvlk + radeonsi). I'm also not sure what you'd need a VM flag for?


What I meant was a flag for the VM IOCTL. E.g. you specify on the IOCTL 
if you want IMPLICIT or EXPLICIT synchronization.


Maybe just specifying a drm_syncobj could be used as indicator that we 
want explicit synchronization for a VM IOCTL operation as well. That's 
what I used in the prototype, but I'm not 100% sure if that's sufficient.


Regards,
Christian.




2. When memory is unmapped we keep around the last unmap operation
inside the bo_va.

3. When memory is freed we add all the CS fences which could access that
memory + the last unmap operation as BOOKKEEP fences to the BO and as
mandatory sync fence to the VM.

Memory freed either because of an eviction or because of userspace
closing the handle will be seen as a combination of unmap+free.


The result is the following semantic for userspace to avoid implicit
synchronization as much as possible:

1. When you allocate and map memory it is mandatory to either wait for
the mapping operation to complete or to add it as dependency for your CS.
  If this isn't followed the application will run into CS faults
(that's what we pretty much already implemented).

2. When memory is freed you must unmap that memory first and then wait
for this unmap operation to complete before freeing the memory.
  If this isn't followed the kernel will add a forcefully wait to the
next CS to block until the unmap is completed.

3. All VM operations requested by userspace will still be executed in
order, e.g. we can't run unmap + map in parallel or something like this.

Is that something you guys can live with? As far as I can see it should
give you the maximum freedom possible, but is still doable.

Regards,
Christian.

Re: Explicit VM updates

2022-06-02 Thread Christian König





Am 01.06.22 um 19:39 schrieb Felix Kuehling:


Am 2022-06-01 um 13:22 schrieb Christian König:

Am 01.06.22 um 19:07 schrieb Felix Kuehling:

Am 2022-06-01 um 12:29 schrieb Christian König:

Am 01.06.22 um 17:05 schrieb Felix Kuehling:


Am 2022-06-01 um 08:40 schrieb Christian König:

Hey guys,

so today Bas came up with a new requirement regarding the 
explicit synchronization to VM updates and a bunch of prototype 
patches.


I've been thinking about that stuff for quite some time before, 
but to be honest it's one of the most trickiest parts of the driver.


So my current thinking is that we could potentially handle those 
requirements like this:


1. We add some new EXPLICIT flag to context (or CS?) and VM 
IOCTL. This way we either get the new behavior for the whole 
CS+VM or the old one, but never both mixed.


2. When memory is unmapped we keep around the last unmap 
operation inside the bo_va.


3. When memory is freed we add all the CS fences which could 
access that memory + the last unmap operation as BOOKKEEP fences 
to the BO and as mandatory sync fence to the VM.


Memory freed either because of an eviction or because of 
userspace closing the handle will be seen as a combination of 
unmap+free.



The result is the following semantic for userspace to avoid 
implicit synchronization as much as possible:


1. When you allocate and map memory it is mandatory to either 
wait for the mapping operation to complete or to add it as 
dependency for your CS.
    If this isn't followed the application will run into CS 
faults (that's what we pretty much already implemented).


This makes sense.




2. When memory is freed you must unmap that memory first and then 
wait for this unmap operation to complete before freeing the memory.
    If this isn't followed the kernel will add a forcefully wait 
to the next CS to block until the unmap is completed.


This would work for now, but it won't work for user mode 
submission in the future. I find it weird that user mode needs to 
wait for the unmap. For user mode, unmap and free should always be 
asynchronous. I can't think of any good reasons to make user mode 
wait for the driver to clean up its stuff.


Could the waiting be done in kernel mode instead? TTM already does 
delayed freeing if there are fences outstanding on a BO being 
freed. This should make it easy to delay freeing until the unmap 
is done without blocking the user mode thread.


This is not about blocking, but synchronization dependencies.


Then I must have misunderstood this sentence: "When memory is freed 
you must unmap that memory first and then wait for this unmap 
operation to complete before freeing the memory." If the pronoun 
"you" is the user mode driver, it means user mode must wait for 
kernel mode to finish unmapping memory before freeing it. Was that 
not what you meant?


Ah, yes. The UMD must wait for the kernel to finish unmapping all the 
maps from the BO before it drops the handle of the BO and with that 
frees it.




In other words the free is not waiting for the unmap to complete, 
but causes command submissions through the kernel to depend on the 
unmap.


I guess I don't understand that dependency. The next command 
submission obviously cannot use the memory that was unmapped. But 
why does it need to synchronize with the unmap operation?


Because of the necessary TLB flush, only after that one is executed 
we can be sure that nobody has access to the memory any more and 
actually free it.


So freeing the memory has to wait for the TLB flush. Why does the next 
command submission need to wait?


Because that's the one triggering the TLB flush. The issue is that 
flushing the TLB while the VMID is in use is really unreliable on most 
hardware generations.




User mode submissions are completely unrelated to that.


I mention user mode command submission because there is no way to 
enforce the synchronization you describe here on a user mode queue. 
So this approach is not very future proof.


With user mode queues you need to wait for the work on the queue to 
finish anyway or otherwise you run into VM faults if you just unmap 
or free the memory.


If the next command submission doesn't use the unmapped/freed memory, 
why does it need to wait for the TLB flush?


Because it could potentially use it. When userspace lies to the kernel 
and still accesses the mapping we would allow access to freed up memory 
and create a major security problem.




If it is using the unmapped/freed memory, that's a user mode bug. But 
waiting for the TLB flush won't fix that. It will only turn a likely 
VM fault into a certain VM fault.


Yeah, exactly that's the intention here.



The guarantee you need to give is, that the memory is not freed and 
reused by anyone else until the TLB flush is done. This dependency 
requires synchronization of the "free" operation with the TLB flush. 
It does not require synchronization with any future command 
submissions in the context that

RE: [PATCH] drm/amdgpu: suppress compile warnings

2022-06-02 Thread Quan, Evan

[AMD Official Use Only - General]



> -Original Message-
> From: Christian König 
> Sent: Monday, May 30, 2022 3:12 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: Re: [PATCH] drm/amdgpu: suppress compile warnings
> 
> Am 30.05.22 um 08:50 schrieb Evan Quan:
> > Suppress the compile warnings below:
> >
> >
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_7_ppt.c:1
> 407:12: warning:
> > stack frame size (1040) exceeds limit (1024) in
> > 'smu_v13_0_7_get_power_profile_mode' [-Wframe-larger-than]
> > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1986:6: warning:
> > no previous prototype for function 'gfx_v11_0_rlc_stop'
> > [-Wmissing-prototypes]
> >
> > Signed-off-by: Evan Quan 
> > Change-Id: I679436c91cb98afb9fcbef8942fd90a17e5234b5
> > ---
> >   drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c   |  2 +-
> >   drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 12
> ++--
> >   2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> > index 8c0a3fc7aaa6..cb581cfc7464 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> > @@ -1983,7 +1983,7 @@ static int gfx_v11_0_init_csb(struct
> amdgpu_device *adev)
> > return 0;
> >   }
> >
> > -void gfx_v11_0_rlc_stop(struct amdgpu_device *adev)
> > +static void gfx_v11_0_rlc_stop(struct amdgpu_device *adev)
> >   {
> > u32 tmp = RREG32_SOC15(GC, 0, regRLC_CNTL);
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > index 4e1861fb2c6a..8fd7652ae883 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > @@ -1406,7 +1406,7 @@ static int smu_v13_0_7_get_power_limit(struct
> > smu_context *smu,
> >
> >   static int smu_v13_0_7_get_power_profile_mode(struct smu_context
> *smu, char *buf)
> >   {
> > -   DpmActivityMonitorCoeffIntExternal_t
> activity_monitor_external[PP_SMC_POWER_PROFILE_COUNT];
> > +   DpmActivityMonitorCoeffIntExternal_t *activity_monitor_external;
> > uint32_t i, j, size = 0;
> > int16_t workload_type = 0;
> > int result = 0;
> > @@ -1414,6 +1414,11 @@ static int
> smu_v13_0_7_get_power_profile_mode(struct smu_context *smu, char
> *buf
> > if (!buf)
> > return -EINVAL;
> >
> > +   activity_monitor_external =
> kzalloc(sizeof(DpmActivityMonitorCoeffIntExternal_t) *
> PP_SMC_POWER_PROFILE_COUNT,
> > +   GFP_KERNEL);
> 
> The static checkers will warn about that as well. Please use 
> kcalloc(sizeof(...),
> count, GFP_KERNEL);
[Quan, Evan] I actually do not know that.. The warning is about the stack frame 
size (1040).
The space allocated by kzalloc/kmalloc is not on the stack. Per my knowledge, 
it should not have such limitation.
Did I miss something?

BR
Evan
> 
> Apart from that looks good to me.
> 
> Regards,
> Christian.
> 
> > +   if (!activity_monitor_external)
> > +   return -ENOMEM;
> > +
> > size += sysfs_emit_at(buf, size, "  ");
> > for (i = 0; i <= PP_SMC_POWER_PROFILE_WINDOW3D; i++)
> > size += sysfs_emit_at(buf, size, "%-14s%s",
> > amdgpu_pp_profile_name[i], @@ -1426,14 +1431,17 @@ static int
> smu_v13_0_7_get_power_profile_mode(struct smu_context *smu, char
> *buf
> > workload_type = smu_cmn_to_asic_specific_index(smu,
> >
> CMN2ASIC_MAPPING_WORKLOAD,
> >i);
> > -   if (workload_type < 0)
> > +   if (workload_type < 0) {
> > +   kfree(activity_monitor_external);
> > return -EINVAL;
> > +   }
> >
> > result = smu_cmn_update_table(smu,
> >
> SMU_TABLE_ACTIVITY_MONITOR_COEFF, workload_type,
> >   (void
> *)(_monitor_external[i]), false);
> > if (result) {
> > dev_err(smu->adev->dev, "[%s] Failed to get activity
> monitor!",
> > __func__);
> > +   kfree(activity_monitor_external);
> > return result;
> > }
> > }

Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3

2022-06-02 Thread Ernst Sjöstrand

Dan: I also ran Smatch which resulted in the following discussion:

https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html

Regards

Den ons 1 juni 2022 kl 20:44 skrev Alex Deucher :

> On Fri, May 27, 2022 at 3:46 AM Dan Carpenter 
> wrote:
> >
> > [ kbuild bot sent this warning on May 4 but I never heard back and it's
> >   May 27 now so sending a duplicate warning is probably for the best.
> -dan]
> >
> > tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git  master
> > head:   7e284070abe53d448517b80493863595af4ab5f0
> > commit: 622469c87fc3e6c90a980be3e2287d82bd55c977 drm/amdgpu/discovery:
> add a function to parse the vcn info table
> > config: arc-randconfig-m031-20220524 (
> https://download.01.org/0day-ci/archive/20220527/202205271546.ov14n2r8-...@intel.com/config
> )
> > compiler: arceb-elf-gcc (GCC) 11.3.0
> >
> > If you fix the issue, kindly add following tag where applicable
> > Reported-by: kernel test robot 
> > Reported-by: Dan Carpenter 
> >
> > smatch warnings:
> > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433
> amdgpu_discovery_get_vcn_info() error: buffer overflow
> 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
> >
> > vim +1433 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> >
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1403  int
> amdgpu_discovery_get_vcn_info(struct amdgpu_device *adev)
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1404  {
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1405struct binary_header
> *bhdr;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1406union vcn_info *vcn_info;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1407u16 offset;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1408int v;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1409
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1410if
> (!adev->mman.discovery_bin) {
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1411DRM_ERROR("ip
> discovery uninitialized\n");
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1412return -EINVAL;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1413}
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1414
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1415if
> (adev->vcn.num_vcn_inst > VCN_INFO_TABLE_MAX_NUM_INSTANCES) {
> >
> > Capped to 4
> >
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1416
> dev_err(adev->dev, "invalid vcn instances\n");
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1417return -EINVAL;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1418}
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1419
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1420bhdr = (struct
> binary_header *)adev->mman.discovery_bin;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1421offset =
> le16_to_cpu(bhdr->table_list[VCN_INFO].offset);
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1422
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1423if (!offset) {
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1424
> dev_err(adev->dev, "invalid vcn table offset\n");
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1425return -EINVAL;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1426}
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1427
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1428vcn_info = (union
> vcn_info *)(adev->mman.discovery_bin + offset);
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1429
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1430switch
> (le16_to_cpu(vcn_info->v1.header.version_major)) {
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1431case 1:
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1432for (v = 0; v <
> adev->vcn.num_vcn_inst; v++) {
> > 622469c87fc3e6 Alex Deucher 2022-03-30 @1433
> adev->vcn.vcn_codec_disable_mask[v] =
> >
> > But this array doesn't have 4 elements
>
> Correct, but num_vcn_inst can't be larger than
> AMDGPU_MAX_VCN_INSTANCES (2) at the moment thanks to:
> https://patchwork.freedesktop.org/patch/486289/
>
> Alex
>
> >
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1434
> le32_to_cpu(vcn_info->v1.instance_info[v].fuse_data.all_bits);
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1435}
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1436break;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1437default:
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1438
> dev_err(adev->dev,
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1439
> "Unhandled VCN info table %d.%d\n",
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1440
> le16_to_cpu(vcn_info->v1.header.version_major),
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1441
> le16_to_cpu(vcn_info->v1.header.version_minor));
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1442return -EINVAL;
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1443}
> > 622469c87fc3e6 Alex Deucher 2022-03-30  1444return 0;
> > f39f5bb1c9d68d Xiaojie Yuan 2019-06-20  1445  }
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://01.org/lkp
> > ___
> > kbuild mailing list -- kbu...@lists.01.org
> > To

56 matches

Mail list logo