XDC 2022: Registration & Call for Presentations still open!
Hello! This is just a reminder that the CFP for XDC in 2022 is still open! The 2022 X.Org Developers Conference is being held in conjunction with the 2022 Wine Developers Conference. This is a meeting to bring together developers working on all things open graphics (Linux kernel, Mesa, DRM, Wayland, X11, etc.) as well as developers for the Wine Project, a key consumer of open graphics. Registration & Call for Proposals are now open for both XDC 2022 and WineConf 2022, which will take place on October 4-6, 2022 in Minneapolis, Minnesota, USA. https://xdc2022.x.org As usual, the conference is free of charge and open to the general public. If you plan on attending, please make sure to register as early as possible! In order to register as attendee, you will therefore need to do it via the XDC website: https://indico.freedesktop.org/event/2/registrations/2/ In addition to registration, the CfP is now open for talks, workshops and demos at XDC 2022. While any serious proposal will be gratefully considered, topics of interest to X.Org and freedesktop.org developers are encouraged. The program focus is on new development, ongoing challenges and anything else that will spark discussions among attendees in the hallway track. We are open to talks across all layers of the graphics stack, from the kernel to desktop environments / graphical applications and about how to make things better for the developers who build them. Head to the CfP page to learn more: https://indico.freedesktop.org/event/2/abstracts/ The deadline for submissions is Monday July 4th, 2022. Check out our Reimbursement Policy to accept speaker expenses for X.Org events like XDC 2022: https://www.x.org/wiki/XorgFoundation/Policies/Reimbursement/ If you have any questions, please send me an email to x...@codeweavers.com, adding on CC the X.org board (board at foundation.x.org). And don't forget, you can follow us on Twitter for all the latest updates and to stay connected: https://twitter.com/XOrgDevConf Best regards, Lyude Paul, on behalf of X.org
Re: [PATCH] drm/amdgpu/discovery: add comments about VCN instance handling
On Thu, Jun 02, 2022 at 12:45:47PM -0400, Alex Deucher wrote: > Add comments to clarify code that is safe, but triggers and > smatch warning. > > Link: https://lists.freedesktop.org/archives/amd-gfx/2022-June/079905.html > Signed-off-by: Alex Deucher > Cc: Dan Carpenter > --- Thanks! regards, dan carpenter
[PATCH 0/3] Drive-by MST fixes for amdgpu
Now that I'm finishing up my work to remove the legacy MST code from the tree, I've come across a couple of various issues that I wrote up patches for along the way. These are some of those patches for amdgpu. Lyude Paul (3): drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in compute_mst_dsc_configs_for_state() drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in pre_compute_mst_dsc_configs_for_state() drm/amdgpu/dm: Drop != NULL check in dm_mst_get_pbn_divider() .../drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 12 +--- 1 file changed, 1 insertion(+), 11 deletions(-) -- 2.35.3
[PATCH 2/3] drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in pre_compute_mst_dsc_configs_for_state()
This lock is only needed if you're iterating through the in-memory topology (e.g. drm_dp_mst_branch->ports, drm_dp_mst_port->mstb, etc.). This doesn't actually seem to be what's going on here though, so we can just drop this lock. Signed-off-by: Lyude Paul --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index cb3b0e08acc4..1259f2f7a8f9 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -1112,16 +1112,12 @@ static bool if (!is_dsc_need_re_compute(state, dc_state, stream->link)) continue; - mutex_lock(>mst_mgr.lock); if (!compute_mst_dsc_configs_for_link(state, dc_state, stream->link, vars, - _vars_start_index)) { - mutex_unlock(>mst_mgr.lock); + _vars_start_index)) return false; - } - mutex_unlock(>mst_mgr.lock); for (j = 0; j < dc_state->stream_count; j++) { if (dc_state->streams[j]->link == stream->link) -- 2.35.3
[PATCH 3/3] drm/amdgpu/dm: Drop != NULL check in dm_mst_get_pbn_divider()
A lot of code in amdgpu seems to sprinkle in if (foo != NULL) … Checks pretty much all over the place, many times in locations where it's clear foo (whatever foo may be) should never be NULL unless we've run into a programming error. This is definitely one of those places, as dm_mst_get_pbn_divider() should never be getting called with a NULL dc_link pointer. The problem with this code pattern is that many times the places I've seen it used in amdgpu have no real error handling. This is actually quite bad, if we try to avoid the NULL pointer and instead simply skip any code that was expecting a valid pointer - we're already in undefined territory. Subsequent code we execute may have expected sideaffects from the code we skipped that are no longer present, which leads to even more unpredictable behavior then a simple segfault. This could be silent errors or even just another segfault somewhere else. If we simply segfault though, that's not good either. But unlike the former solution, no subsequent code in the kernel thread will execute - and we will likely even get a clear backtrace from the invalid memory access. Of course, the preferred approach is to simply handle the possibility of both NULL and non-NULL pointers with nice error handling code. However, that's not always desirable or even possible, and in those cases it's likely just better to fail predictably rather than unpredictably. This code is a nice example of that - if link is NULL, you'll return a PBN divisor of 0. And thus, you've simply traded in your potential segfault for a potential divide by 0 error. This was something I actually managed to hit while working on the legacy MST removal work. Signed-off-by: Lyude Paul --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index 1259f2f7a8f9..35c7def8f2bd 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -537,9 +537,6 @@ void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm, int dm_mst_get_pbn_divider(struct dc_link *link) { - if (!link) - return 0; - return dc_link_bandwidth_kbps(link, dc_link_get_link_cap(link)) / (8 * 1000 * 54); } -- 2.35.3
[PATCH 1/3] drm/amdgpu/dm/mst: Stop grabbing mst_mgr->lock in compute_mst_dsc_configs_for_state()
Noticed this while trying to update amdgpu for the non-atomic MST removal changes - for some reason we appear to grab mst_mgr->lock before computing mst DSC configs. This is wrong though - mst_mgr->lock is only needed while traversing the actual MST topology state - which is not typically something that DRM drivers should be doing themselves anyway. Signed-off-by: Lyude Paul --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index 9221b6690a4a..cb3b0e08acc4 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -1056,13 +1056,10 @@ bool compute_mst_dsc_configs_for_state(struct drm_atomic_state *state, if (!is_dsc_need_re_compute(state, dc_state, stream->link)) continue; - mutex_lock(>mst_mgr.lock); if (!compute_mst_dsc_configs_for_link(state, dc_state, stream->link, vars, _vars_start_index)) { - mutex_unlock(>mst_mgr.lock); return false; } - mutex_unlock(>mst_mgr.lock); for (j = 0; j < dc_state->stream_count; j++) { if (dc_state->streams[j]->link == stream->link) -- 2.35.3
Re: [PATCH 0/4] PSR-SU-RC DC support and some PSR-SU fixes
Patches 1, 2, and 4 are Reviewed-by: Harry Wentland Harry On 2022-06-02 14:03, sunpeng...@amd.com wrote: > From: Leo Li > > The first two patches here add PSR SU Rate Control support to DC. Support in > amdgpu_dm is still pending to enable this fully. > > The last two patches are some fixes for PSR SU. > > David Zhang (3): > drm/amd/display: expose AMD specific DPCD for PSR-SU-RC support > drm/amd/display: Add PSR-SU-RC support in DC > drm/amd/display: pass panel instance in dirty rect message > > Robin Chen (1): > drm/amd/display: refactor dirty rect dmub command decision > > drivers/gpu/drm/amd/display/dc/core/dc.c | 19 ++- > drivers/gpu/drm/amd/display/dc/core/dc_link.c | 22 + > drivers/gpu/drm/amd/display/dc/dc_link.h | 3 +++ > drivers/gpu/drm/amd/display/dc/dc_types.h | 2 ++ > drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 23 ++ > drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h | 2 ++ > .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 24 ++- > .../drm/amd/display/dc/inc/hw/link_encoder.h | 8 +++ > .../amd/display/include/ddc_service_types.h | 4 > 9 files changed, 100 insertions(+), 7 deletions(-) >
Re: [PATCH 3/4] drm/amd/display: pass panel instance in dirty rect message
On 2022-06-02 14:03, sunpeng...@amd.com wrote: > From: David Zhang > > [why] > DMUB FW uses OTG instance to get eDP panel instance. But in case > of MPO multiple pipe indexes are passed to updated the same panel. > The other OTG instance passed would be ignored causing in DMUB not > acknowledging the messages. > > [how] > Add panel instance to dirty rectangle data and cursor update data > structures and pass to DMUB. > I'm not entirely following why passing the panel_inst solves the problem that is described. > Signed-off-by: Mikita Lipski This says the author is David but it has only Mikita's sign-off. We need David's sign-off as well. Harry > Acked-by: Leo Li > --- > drivers/gpu/drm/amd/display/dc/core/dc.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c > b/drivers/gpu/drm/amd/display/dc/core/dc.c > index d4173be11903..31d83297bcb5 100644 > --- a/drivers/gpu/drm/amd/display/dc/core/dc.c > +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c > @@ -2837,10 +2837,14 @@ void dc_dmub_update_dirty_rect(struct dc *dc, > struct dc_context *dc_ctx = dc->ctx; > struct dmub_cmd_update_dirty_rect_data *update_dirty_rect; > unsigned int i, j; > + unsigned int panel_inst = 0; > > if (stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1) > return; > > + if (!dc_get_edp_link_panel_inst(dc, stream->link, _inst)) > + return; > + > memset(, 0x0, sizeof(cmd)); > cmd.update_dirty_rect.header.type = DMUB_CMD__UPDATE_DIRTY_RECT; > cmd.update_dirty_rect.header.sub_type = 0; > @@ -2869,6 +2873,7 @@ void dc_dmub_update_dirty_rect(struct dc *dc, > if (pipe_ctx->plane_state != plane_state) > continue; > > + update_dirty_rect->panel_inst = panel_inst; > update_dirty_rect->pipe_idx = j; > dc_dmub_srv_cmd_queue(dc_ctx->dmub_srv, ); > dc_dmub_srv_cmd_execute(dc_ctx->dmub_srv);
[PATCH 4/4] drm/amd/display: refactor dirty rect dmub command decision
From: Robin Chen [Why] To wrap the decision logic of sending dirty rect dmub command for both frame update and cursor update path. Signed-off-by: Robin Chen Acked-by: Leo Li --- drivers/gpu/drm/amd/display/dc/core/dc.c | 14 ++- .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 24 ++- 2 files changed, 31 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 31d83297bcb5..645ec5bc3a7d 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -2827,6 +2827,18 @@ static void commit_planes_do_stream_update(struct dc *dc, } } +static bool dc_dmub_should_send_dirty_rect_cmd(struct dc *dc, struct dc_stream_state *stream) +{ + if (stream->link->psr_settings.psr_version == DC_PSR_VERSION_SU_1) + return true; + + if (stream->link->psr_settings.psr_version == DC_PSR_VERSION_1 && + dc->debug.enable_sw_cntl_psr) + return true; + + return false; +} + void dc_dmub_update_dirty_rect(struct dc *dc, int surface_count, struct dc_stream_state *stream, @@ -2839,7 +2851,7 @@ void dc_dmub_update_dirty_rect(struct dc *dc, unsigned int i, j; unsigned int panel_inst = 0; - if (stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1) + if (!dc_dmub_should_send_dirty_rect_cmd(dc, stream)) return; if (!dc_get_edp_link_panel_inst(dc, stream->link, _inst)) diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c index 7fe06a2c0c04..5b5e0dd13fd0 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -3325,6 +3325,23 @@ static bool dcn10_can_pipe_disable_cursor(struct pipe_ctx *pipe_ctx) return false; } +static bool dcn10_dmub_should_update_cursor_data( + struct pipe_ctx *pipe_ctx, + struct dc_debug_options *debug) +{ + if (pipe_ctx->plane_state->address.type == PLN_ADDR_TYPE_VIDEO_PROGRESSIVE) + return false; + + if (pipe_ctx->stream->link->psr_settings.psr_version == DC_PSR_VERSION_SU_1) + return true; + + if (pipe_ctx->stream->link->psr_settings.psr_version == DC_PSR_VERSION_1 && + debug->enable_sw_cntl_psr) + return true; + + return false; +} + static void dcn10_dmub_update_cursor_data( struct pipe_ctx *pipe_ctx, struct hubp *hubp, @@ -3346,13 +3363,8 @@ static void dcn10_dmub_update_cursor_data( struct dc_debug_options *debug = >ctx->dc->debug; - if (!debug->enable_sw_cntl_psr && pipe_ctx->stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1) + if (!dcn10_dmub_should_update_cursor_data(pipe_ctx, debug)) return; - - if (pipe_ctx->stream->link->psr_settings.psr_version == DC_PSR_VERSION_UNSUPPORTED || - pipe_ctx->plane_state->address.type == PLN_ADDR_TYPE_VIDEO_PROGRESSIVE) - return; - /** * if cur_pos == NULL means the caller is from cursor_set_attribute * then driver use previous cursor position data -- 2.36.1
[PATCH 2/4] drm/amd/display: Add PSR-SU-RC support in DC
From: David Zhang [Why] PSR-SU Rate Control - or PSR-SU-RC - enables PSR-SU panels to work with variable refresh rate to allow for more power savings. Lowering the refresh rate can increase PSR residency by expanding the eDP main link shut down duration. It can also lower panel power consumption. There is a complication with With PSR, since the eDP main link can be shut down. Therefore, the timing controller (TCON) on the eDP sink nees to be able to scan out its remote buffer independant of the main link. To allow the eDP source to specify the sink's refresh rate while the link is off, vendor-specific DPCD registers are used. This allows the eDP source to then "Rate Control" the panel during PSR active. [How] Add DC support to communicate with PSR-SU-RC supported eDP sinks. The sink will need to know the desired VTotal during PSR active. This change only adds support to DC, support in amdgpu_dm is still pending to enable this fully. Signed-off-by: David Zhang Signed-off-by: Leo Li --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 22 ++ drivers/gpu/drm/amd/display/dc/dc_link.h | 3 +++ drivers/gpu/drm/amd/display/dc/dc_types.h | 2 ++ drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 23 +++ drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h | 2 ++ .../drm/amd/display/dc/inc/hw/link_encoder.h | 8 +++ 6 files changed, 60 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index 31ffb961e18b..3d6dcaa6a483 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -1795,6 +1795,7 @@ static bool dc_link_construct_legacy(struct dc_link *link, */ program_hpd_filter(link); + link->psr_settings.psr_vtotal_control_support = false; link->psr_settings.psr_version = DC_PSR_VERSION_UNSUPPORTED; DC_LOG_DC("BIOS object table - %s finished successfully.\n", __func__); @@ -3207,6 +3208,7 @@ bool dc_link_setup_psr(struct dc_link *link, /* updateSinkPsrDpcdConfig*/ union dpcd_psr_configuration psr_configuration; union dpcd_alpm_configuration alpm_configuration; + union dpcd_sink_active_vtotal_control_mode vtotal_control = {0}; psr_context->controllerId = CONTROLLER_ID_UNDEFINED; @@ -3276,6 +3278,13 @@ bool dc_link_setup_psr(struct dc_link *link, psr_config->su_y_granularity; psr_context->line_time_in_us = psr_config->line_time_in_us; + + if (link->psr_settings.psr_vtotal_control_support) { + psr_context->rate_control_caps = psr_config->rate_control_caps; + vtotal_control.bits.ENABLE = true; + core_link_write_dpcd(link, DP_SINK_PSR_ACTIVE_VTOTAL_CONTROL_MODE, + _control.raw, sizeof(vtotal_control.raw)); + } } psr_context->channel = link->ddc->ddc_pin->hw_info.ddc_channel; @@ -3408,6 +3417,19 @@ void dc_link_get_psr_residency(const struct dc_link *link, uint32_t *residency) *residency = 0; } +bool dc_link_set_sink_vtotal_in_psr_active(const struct dc_link *link, uint16_t psr_vtotal_idle, uint16_t psr_vtotal_su) +{ + struct dc *dc = link->ctx->dc; + struct dmub_psr *psr = dc->res_pool->psr; + + if (psr == NULL || !link->psr_settings.psr_feature_enabled || !link->psr_settings.psr_vtotal_control_support) + return false; + + psr->funcs->psr_set_sink_vtotal_in_psr_active(psr, psr_vtotal_idle, psr_vtotal_su); + + return true; +} + const struct dc_link_status *dc_link_get_status(const struct dc_link *link) { return >link_status; diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h b/drivers/gpu/drm/amd/display/dc/dc_link.h index 0bec986a6de8..3ec189dd73da 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_link.h +++ b/drivers/gpu/drm/amd/display/dc/dc_link.h @@ -100,6 +100,7 @@ struct psr_settings { bool psr_feature_enabled; // PSR is supported by sink bool psr_allow_active; // PSR is currently active enum dc_psr_version psr_version;// Internal PSR version, determined based on DPCD + bool psr_vtotal_control_support;// Vtotal control is supported by sink /* These parameters are calculated in Driver, * based on display timing and Sink capabilities. @@ -324,6 +325,8 @@ void dc_link_get_psr_residency(const struct dc_link *link, uint32_t *residency); void dc_link_blank_all_dp_displays(struct dc *dc); void dc_link_blank_dp_stream(struct dc_link *link, bool hw_init); +bool dc_link_set_sink_vtotal_in_psr_active(const struct dc_link *link, + uint16_t psr_vtotal_idle, uint16_t psr_vtotal_su); /* Request DC to detect if there is a Panel connected. *
[PATCH 3/4] drm/amd/display: pass panel instance in dirty rect message
From: David Zhang [why] DMUB FW uses OTG instance to get eDP panel instance. But in case of MPO multiple pipe indexes are passed to updated the same panel. The other OTG instance passed would be ignored causing in DMUB not acknowledging the messages. [how] Add panel instance to dirty rectangle data and cursor update data structures and pass to DMUB. Signed-off-by: Mikita Lipski Acked-by: Leo Li --- drivers/gpu/drm/amd/display/dc/core/dc.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index d4173be11903..31d83297bcb5 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -2837,10 +2837,14 @@ void dc_dmub_update_dirty_rect(struct dc *dc, struct dc_context *dc_ctx = dc->ctx; struct dmub_cmd_update_dirty_rect_data *update_dirty_rect; unsigned int i, j; + unsigned int panel_inst = 0; if (stream->link->psr_settings.psr_version != DC_PSR_VERSION_SU_1) return; + if (!dc_get_edp_link_panel_inst(dc, stream->link, _inst)) + return; + memset(, 0x0, sizeof(cmd)); cmd.update_dirty_rect.header.type = DMUB_CMD__UPDATE_DIRTY_RECT; cmd.update_dirty_rect.header.sub_type = 0; @@ -2869,6 +2873,7 @@ void dc_dmub_update_dirty_rect(struct dc *dc, if (pipe_ctx->plane_state != plane_state) continue; + update_dirty_rect->panel_inst = panel_inst; update_dirty_rect->pipe_idx = j; dc_dmub_srv_cmd_queue(dc_ctx->dmub_srv, ); dc_dmub_srv_cmd_execute(dc_ctx->dmub_srv); -- 2.36.1
[PATCH 0/4] PSR-SU-RC DC support and some PSR-SU fixes
From: Leo Li The first two patches here add PSR SU Rate Control support to DC. Support in amdgpu_dm is still pending to enable this fully. The last two patches are some fixes for PSR SU. David Zhang (3): drm/amd/display: expose AMD specific DPCD for PSR-SU-RC support drm/amd/display: Add PSR-SU-RC support in DC drm/amd/display: pass panel instance in dirty rect message Robin Chen (1): drm/amd/display: refactor dirty rect dmub command decision drivers/gpu/drm/amd/display/dc/core/dc.c | 19 ++- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 22 + drivers/gpu/drm/amd/display/dc/dc_link.h | 3 +++ drivers/gpu/drm/amd/display/dc/dc_types.h | 2 ++ drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 23 ++ drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h | 2 ++ .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 24 ++- .../drm/amd/display/dc/inc/hw/link_encoder.h | 8 +++ .../amd/display/include/ddc_service_types.h | 4 9 files changed, 100 insertions(+), 7 deletions(-) -- 2.36.1
[PATCH 1/4] drm/amd/display: expose AMD specific DPCD for PSR-SU-RC support
From: David Zhang [why & how] Expose vendor specific DPCD registers for rate controlling the eDP sink TCON's refresh rate during PSR active. When used in combination with PSR-SU and Freesync, it is called PSR-SU Rate Contorol, or PSR-SU-RC for short. v2: Add all DPCD registers required Signed-off-by: David Zhang Acked-by: Leo Li --- drivers/gpu/drm/amd/display/include/ddc_service_types.h | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/display/include/ddc_service_types.h b/drivers/gpu/drm/amd/display/include/ddc_service_types.h index 20a3d4e23f66..05096c644a60 100644 --- a/drivers/gpu/drm/amd/display/include/ddc_service_types.h +++ b/drivers/gpu/drm/amd/display/include/ddc_service_types.h @@ -41,6 +41,10 @@ #define DP_DEVICE_ID_38EC11 0x38EC11 #define DP_FORCE_PSRSU_CAPABILITY 0x40F +#define DP_SINK_PSR_ACTIVE_VTOTAL 0x373 +#define DP_SINK_PSR_ACTIVE_VTOTAL_CONTROL_MODE 0x375 +#define DP_SOURCE_PSR_ACTIVE_VTOTAL0x376 + enum ddc_result { DDC_RESULT_UNKNOWN = 0, DDC_RESULT_SUCESSFULL, -- 2.36.1
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
Am 2022-06-02 um 09:20 schrieb Philip Yang: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
Re: [PATCH] drm/ttm: fix bulk move handling during resource init
The Null pointer against drm-next: Jun 02 19:59:50 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 00d8 Jun 02 19:59:50 axion.fireburn.co.uk kernel: #PF: supervisor read access in kernel mode Jun 02 19:59:50 axion.fireburn.co.uk kernel: #PF: error_code(0x) - not-present page Jun 02 19:59:50 axion.fireburn.co.uk kernel: PGD 118700067 P4D 118700067 PUD 11f116067 PMD 0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: Oops: [#1] PREEMPT SMP NOPTI Jun 02 19:59:50 axion.fireburn.co.uk kernel: CPU: 4 PID: 1029 Comm: GravityMark.x64 Tainted: GW 5.18.0-rc5-drm+ #1070 Jun 02 19:59:50 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022 Jun 02 19:59:50 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5 > Jun 02 19:59:50 axion.fireburn.co.uk kernel: RSP: :8881605dfc70 EFLAGS: 00010282 Jun 02 19:59:50 axion.fireburn.co.uk kernel: RAX: 888104f85ac8 RBX: 888104f85ac8 RCX: Jun 02 19:59:50 axion.fireburn.co.uk kernel: RDX: 0cc0 RSI: 8881605dfd50 RDI: Jun 02 19:59:50 axion.fireburn.co.uk kernel: RBP: 888104f85140 R08: 888101566240 R09: 88814e57b880 Jun 02 19:59:50 axion.fireburn.co.uk kernel: R10: 0063 R11: 818238a0 R12: 0cc0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: R13: 8881605dfd50 R14: 8881605dfc70 R15: 00691000 Jun 02 19:59:50 axion.fireburn.co.uk kernel: FS: 7f4623fb9740() GS:888fde50() knlGS: Jun 02 19:59:50 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00d8 CR3: 000102e3c000 CR4: 00150ee0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: Call Trace: Jun 02 19:59:50 axion.fireburn.co.uk kernel: Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? ttm_global_swapout+0xae/0xc0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? ttm_tt_populate+0x7d/0x130 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? ttm_bo_vm_fault_reserved+0x237/0x270 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? amdgpu_gem_fault+0x92/0xd0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? do_fault+0x28e/0x4b0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? handle_mm_fault+0x849/0xa80 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? amdgpu_drm_ioctl+0x68/0x80 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? do_user_addr_fault+0x275/0x450 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? asm_exc_page_fault+0x9/0x30 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? exc_page_fault+0x5f/0x150 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ? asm_exc_page_fault+0x1f/0x30 Jun 02 19:59:50 axion.fireburn.co.uk kernel: Jun 02 19:59:50 axion.fireburn.co.uk kernel: Modules linked in: Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00d8 Jun 02 19:59:50 axion.fireburn.co.uk kernel: ---[ end trace ]--- Jun 02 19:59:50 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5 > Jun 02 19:59:50 axion.fireburn.co.uk kernel: RSP: :8881605dfc70 EFLAGS: 00010282 Jun 02 19:59:50 axion.fireburn.co.uk kernel: RAX: 888104f85ac8 RBX: 888104f85ac8 RCX: Jun 02 19:59:50 axion.fireburn.co.uk kernel: RDX: 0cc0 RSI: 8881605dfd50 RDI: Jun 02 19:59:50 axion.fireburn.co.uk kernel: RBP: 888104f85140 R08: 888101566240 R09: 88814e57b880 Jun 02 19:59:50 axion.fireburn.co.uk kernel: R10: 0063 R11: 818238a0 R12: 0cc0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: R13: 8881605dfd50 R14: 8881605dfc70 R15: 00691000 Jun 02 19:59:50 axion.fireburn.co.uk kernel: FS: 7f4623fb9740() GS:888fde50() knlGS: Jun 02 19:59:50 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:59:50 axion.fireburn.co.uk kernel: CR2: 00d8 CR3: 000102e3c000 CR4: 00150ee0 Jun 02 19:59:50 axion.fireburn.co.uk kernel: note: GravityMark.x64[1029] exited with preempt_count 1 On Thu, 2 Jun 2022 at 19:55, Christian König wrote: > > That's because drm-misc-next is currently broken and needs a backmerge. > > Please try this patch on top of drm-next. > > Regards, > Christian. > > Am 02.06.22 um 20:08 schrieb Mike Lothian: > > Hi > > > > I'm still seeing Null pointers against Linus's tree and drm-misc with this > > patch > > > > Jun 02 19:04:05
Re: [PATCH] drm/ttm: fix bulk move handling during resource init
That's because drm-misc-next is currently broken and needs a backmerge. Please try this patch on top of drm-next. Regards, Christian. Am 02.06.22 um 20:08 schrieb Mike Lothian: Hi I'm still seeing Null pointers against Linus's tree and drm-misc with this patch Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 0008 Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write access in kernel mode Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) - not-present page Jun 02 19:04:05 axion.fireburn.co.uk kernel: PGD 11ee04067 P4D 11ee04067 PUD 15eccb067 PMD 0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI Jun 02 19:04:05 axion.fireburn.co.uk kernel: CPU: 0 PID: 1021 Comm: GravityMark.x64 Tainted: GW 5.18.0-tip+ #3177 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP: 0010:ttm_resource_init+0x108/0x210 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP: 0018:888112e73918 EFLAGS: 00010202 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8 RBX: 888206b715a0 RCX: Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8 RSI: 888206b71cc0 RDI: 888110605b00 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08 R08: 88812235c790 R09: 8881306a4bd8 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: R11: 81851320 R12: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0 R14: 88816c848c58 R15: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS: 7f4c257c1740() GS:888fde40() knlGS: Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008 CR3: 0001183fc000 CR4: 00350ef0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Call Trace: Jun 02 19:04:05 axion.fireburn.co.uk kernel: Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_vram_mgr_new+0xbb/0x4b0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? ttm_bo_mem_space+0x89/0x1e0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? ttm_bo_validate+0x80/0x1a0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_bo_validate+0xe9/0x2b0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_syncobj_lookup_and_add_to_sync+0xa0/0xa0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_vm_validate_pt_bos+0xce/0x1c0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_parser_bos+0x522/0x6e0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_ioctl+0x7fe/0xd00 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_report_moved_bytes+0x60/0x60 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? drm_ioctl_kernel+0xcb/0x130 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? drm_ioctl+0x2f5/0x400 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_report_moved_bytes+0x60/0x60 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_drm_ioctl+0x42/0x80 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? __x64_sys_ioctl+0x5e/0xa0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? do_syscall_64+0x6a/0x90 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? exit_to_user_mode_prepare+0x19/0x90 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? entry_SYSCALL_64_after_hwframe+0x46/0xb0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Jun 02 19:04:05 axion.fireburn.co.uk kernel: Modules linked in: Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ---[ end trace ]--- Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP: 0010:ttm_resource_init+0x108/0x210 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP: 0018:888112e73918 EFLAGS: 00010202 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8 RBX: 888206b715a0 RCX: Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8 RSI: 888206b71cc0 RDI: 888110605b00 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08 R08: 88812235c790 R09: 8881306a4bd8 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: R11: 81851320 R12: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0 R14: 88816c848c58 R15: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS: 7f4c257c1740()
Re: Explicit VM updates
Am 02.06.22 um 16:21 schrieb Felix Kuehling: [SNIP] In other words the free is not waiting for the unmap to complete, but causes command submissions through the kernel to depend on the unmap. I guess I don't understand that dependency. The next command submission obviously cannot use the memory that was unmapped. But why does it need to synchronize with the unmap operation? Because of the necessary TLB flush, only after that one is executed we can be sure that nobody has access to the memory any more and actually free it. So freeing the memory has to wait for the TLB flush. Why does the next command submission need to wait? Because that's the one triggering the TLB flush. The issue is that flushing the TLB while the VMID is in use is really unreliable on most hardware generations. It's been working well enough with ROCm. With user mode command submission there is no way to block GPU work while a TLB flush is in progress. Yeah, but at least on Navi 1x that's so horrible broken that the SDMA could write anywhere when we would try this. User mode submissions are completely unrelated to that. I mention user mode command submission because there is no way to enforce the synchronization you describe here on a user mode queue. So this approach is not very future proof. With user mode queues you need to wait for the work on the queue to finish anyway or otherwise you run into VM faults if you just unmap or free the memory. If the next command submission doesn't use the unmapped/freed memory, why does it need to wait for the TLB flush? Because it could potentially use it. When userspace lies to the kernel and still accesses the mapping we would allow access to freed up memory and create a major security problem. I'm aware of the potential security problem. That's why I'm recommending you don't actually free the memory until the TLB flush is done. So a bogus access will either harmlessly access memory that's not freed yet, or it will VM fault. It will never access memory that's already freed and potentially allocated by someone else. Yes, that's the idea. The question is just when we can do the TLB flush. If it is using the unmapped/freed memory, that's a user mode bug. But waiting for the TLB flush won't fix that. It will only turn a likely VM fault into a certain VM fault. Yeah, exactly that's the intention here. The guarantee you need to give is, that the memory is not freed and reused by anyone else until the TLB flush is done. This dependency requires synchronization of the "free" operation with the TLB flush. It does not require synchronization with any future command submissions in the context that freed the memory. See above, the future command submission is what triggers the TLB flush because only then we can easily execute it without to much hassle. That seems to be a limitation of your current command submission model. User mode command submission will not be able to trigger a TLB flush. Unmapping or freeing memory should be the trigger in that case. That's how it works with KFD. That said, our TLB flushes aren't as well pipelined (which could probably be improved), and your strategy can probably batch more TLB flushes, so I see where you're coming from. Well the mapping/unmapping IOCTL should certainly trigger the TLB flushes for the user mode queues, but as I said this is completely independent to this here. The limitation is on the kernel CS IOCTL, not the VM IOCTL. So that is completely unrelated to this. For Vega and Navi 2x we could use async TLB flushes and on gfx6, gfx7 and gfx8 we could use double TLB flushes with grace time, but Navi 1x is so horrible broken regarding this that I don't see how else we could do that. We're using heavy-weight TLB flushes on SOC15 GPUs. On Vega20 with XGMI we need double flushes to be safe. I'm raising my concerns because I don't think making user mode wait is a good strategy long-term. And I believe this explicit sync and explicit VM update should be designed with an eye for future user-mode command submission models. Yeah, but as already discussed with Daniel and Jason that will never ever work correctly. IOCTLs can't depend on user mode queues in any way. So user space can only block or rather call the map, unmap, free functions at the right time. If you need short-term workarounds for broken hardware, that's another issue. But it would be good if that could be kept out of the API. Well as I said that is completely unrelated to user mode queues. The restriction is on the CS API, not the VM API. Regards, Christian. Regards, Felix Regards, Christian. Regards, Felix The signal that TLB flush is completed comes from the MES in this case. Regards, Christian. Regards, Felix Regards, Christian. Regards, Felix 3. All VM operations requested by userspace will still be executed in order, e.g. we can't run unmap + map
Re: [PATCH] drm/ttm: fix bulk move handling during resource init
Here's the output from drm-misc too: Jun 02 19:05:32 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 00d8 Jun 02 19:05:32 axion.fireburn.co.uk kernel: #PF: supervisor read access in kernel mode Jun 02 19:05:32 axion.fireburn.co.uk kernel: #PF: error_code(0x) - not-present page Jun 02 19:05:32 axion.fireburn.co.uk kernel: PGD 11dfbb067 P4D 11dfbb067 PUD 170cd7067 PMD 0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: Oops: [#1] PREEMPT SMP NOPTI Jun 02 19:05:32 axion.fireburn.co.uk kernel: CPU: 1 PID: 1040 Comm: GravityMark.x64 Tainted: GW 5.18.0-rc5-misc+ #10 Jun 02 19:05:32 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022 Jun 02 19:05:32 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5 > Jun 02 19:05:32 axion.fireburn.co.uk kernel: RSP: :888125a17c70 EFLAGS: 00010286 Jun 02 19:05:32 axion.fireburn.co.uk kernel: RAX: 888104f45aa0 RBX: 888104f45aa0 RCX: Jun 02 19:05:32 axion.fireburn.co.uk kernel: RDX: 0cc0 RSI: 888125a17d50 RDI: Jun 02 19:05:32 axion.fireburn.co.uk kernel: RBP: 888104f45118 R08: 88812278da68 R09: 888102eaa680 Jun 02 19:05:32 axion.fireburn.co.uk kernel: R10: 0063 R11: 818212a0 R12: 0cc0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: R13: 888125a17d50 R14: 888125a17c70 R15: 00691000 Jun 02 19:05:32 axion.fireburn.co.uk kernel: FS: 7fd9e0671740() GS:888fde44() knlGS: Jun 02 19:05:32 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00d8 CR3: 000125c5a000 CR4: 00150ee0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: Call Trace: Jun 02 19:05:32 axion.fireburn.co.uk kernel: Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? ttm_global_swapout+0xae/0xc0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? ttm_tt_populate+0x7d/0x130 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? ttm_bo_vm_fault_reserved+0x237/0x270 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? amdgpu_gem_fault+0x92/0xd0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? do_fault+0x28e/0x4b0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? handle_mm_fault+0x849/0xa80 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? do_user_addr_fault+0x275/0x450 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? asm_exc_page_fault+0x9/0x30 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? exc_page_fault+0x5f/0x150 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ? asm_exc_page_fault+0x1f/0x30 Jun 02 19:05:32 axion.fireburn.co.uk kernel: Jun 02 19:05:32 axion.fireburn.co.uk kernel: Modules linked in: Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00d8 Jun 02 19:05:32 axion.fireburn.co.uk kernel: ---[ end trace ]--- Jun 02 19:05:32 axion.fireburn.co.uk kernel: RIP: 0010:ttm_device_swapout+0x6a/0x3d0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: Code: 85 ed 74 51 80 7d 01 00 74 4b 48 89 e6 48 89 ef e8 7b dd ff ff 48 85 c0 74 3b 48 89 c3 49 89 e6 48 8b 7b 30 4c 89 ee 44 89 e2 <4c> 8b bf d8 00 00 00 e8 fa a5 > Jun 02 19:05:32 axion.fireburn.co.uk kernel: RSP: :888125a17c70 EFLAGS: 00010286 Jun 02 19:05:32 axion.fireburn.co.uk kernel: RAX: 888104f45aa0 RBX: 888104f45aa0 RCX: Jun 02 19:05:32 axion.fireburn.co.uk kernel: RDX: 0cc0 RSI: 888125a17d50 RDI: Jun 02 19:05:32 axion.fireburn.co.uk kernel: RBP: 888104f45118 R08: 88812278da68 R09: 888102eaa680 Jun 02 19:05:32 axion.fireburn.co.uk kernel: R10: 0063 R11: 818212a0 R12: 0cc0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: R13: 888125a17d50 R14: 888125a17c70 R15: 00691000 Jun 02 19:05:32 axion.fireburn.co.uk kernel: FS: 7fd9e0671740() GS:888fde44() knlGS: Jun 02 19:05:32 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:05:32 axion.fireburn.co.uk kernel: CR2: 00d8 CR3: 000125c5a000 CR4: 00150ee0 Jun 02 19:05:32 axion.fireburn.co.uk kernel: note: GravityMark.x64[1040] exited with preempt_count 1 On Thu, 2 Jun 2022 at 19:08, Mike Lothian wrote: > > Hi > > I'm still seeing Null pointers against Linus's tree and drm-misc with this > patch > > Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer > dereference, address: 0008 > Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write > access in kernel mode > Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) - > not-present page > Jun 02 19:04:05
Re: [PATCH] drm/ttm: fix bulk move handling during resource init
Hi I'm still seeing Null pointers against Linus's tree and drm-misc with this patch Jun 02 19:04:05 axion.fireburn.co.uk kernel: BUG: kernel NULL pointer dereference, address: 0008 Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: supervisor write access in kernel mode Jun 02 19:04:05 axion.fireburn.co.uk kernel: #PF: error_code(0x0002) - not-present page Jun 02 19:04:05 axion.fireburn.co.uk kernel: PGD 11ee04067 P4D 11ee04067 PUD 15eccb067 PMD 0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI Jun 02 19:04:05 axion.fireburn.co.uk kernel: CPU: 0 PID: 1021 Comm: GravityMark.x64 Tainted: GW 5.18.0-tip+ #3177 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP: 0010:ttm_resource_init+0x108/0x210 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89 > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP: 0018:888112e73918 EFLAGS: 00010202 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8 RBX: 888206b715a0 RCX: Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8 RSI: 888206b71cc0 RDI: 888110605b00 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08 R08: 88812235c790 R09: 8881306a4bd8 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: R11: 81851320 R12: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0 R14: 88816c848c58 R15: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS: 7f4c257c1740() GS:888fde40() knlGS: Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008 CR3: 0001183fc000 CR4: 00350ef0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Call Trace: Jun 02 19:04:05 axion.fireburn.co.uk kernel: Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_vram_mgr_new+0xbb/0x4b0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? ttm_bo_mem_space+0x89/0x1e0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? ttm_bo_validate+0x80/0x1a0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_bo_validate+0xe9/0x2b0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_syncobj_lookup_and_add_to_sync+0xa0/0xa0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_vm_validate_pt_bos+0xce/0x1c0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_parser_bos+0x522/0x6e0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_ioctl+0x7fe/0xd00 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_report_moved_bytes+0x60/0x60 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? drm_ioctl_kernel+0xcb/0x130 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? drm_ioctl+0x2f5/0x400 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_cs_report_moved_bytes+0x60/0x60 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? amdgpu_drm_ioctl+0x42/0x80 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? __x64_sys_ioctl+0x5e/0xa0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? do_syscall_64+0x6a/0x90 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? exit_to_user_mode_prepare+0x19/0x90 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ? entry_SYSCALL_64_after_hwframe+0x46/0xb0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Jun 02 19:04:05 axion.fireburn.co.uk kernel: Modules linked in: Jun 02 19:04:05 axion.fireburn.co.uk kernel: CR2: 0008 Jun 02 19:04:05 axion.fireburn.co.uk kernel: ---[ end trace ]--- Jun 02 19:04:05 axion.fireburn.co.uk kernel: RIP: 0010:ttm_resource_init+0x108/0x210 Jun 02 19:04:05 axion.fireburn.co.uk kernel: Code: 48 8b 74 0a 08 48 39 de 0f 84 82 00 00 00 48 8b 7b 38 4c 8b 4b 40 4c 8d 44 0a 08 48 8d 56 38 4c 89 4f 08 49 89 39 48 8b 4e 38 <48> 89 41 08 48 89 4b 38 48 89 > Jun 02 19:04:05 axion.fireburn.co.uk kernel: RSP: 0018:888112e73918 EFLAGS: 00010202 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RAX: 888206b715d8 RBX: 888206b715a0 RCX: Jun 02 19:04:05 axion.fireburn.co.uk kernel: RDX: 888206b71cf8 RSI: 888206b71cc0 RDI: 888110605b00 Jun 02 19:04:05 axion.fireburn.co.uk kernel: RBP: 88816c848c08 R08: 88812235c790 R09: 8881306a4bd8 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R10: R11: 81851320 R12: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: R13: 888206b715a0 R14: 88816c848c58 R15: 888110605ad0 Jun 02 19:04:05 axion.fireburn.co.uk kernel: FS: 7f4c257c1740() GS:888fde40() knlGS: Jun 02 19:04:05 axion.fireburn.co.uk kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 02 19:04:05
Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
I totally agree on the reasoning, but I have the strong feeling that this will blow up in our face once more. I've tried to raise this limit twice already and had to revert it both times. And the reasons why I had to revert it haven't changed since them. Christian. Am 02.06.22 um 18:40 schrieb Alex Deucher: @Christian Koenig Any objections to this? I realize that fixing the OOM killer is ultimately the right approach, but I don't really see how this makes things worse. The current scheme is biased towards dGPUs as they have lots of on board memory so on dGPUs we can end up setting gtt size to 3/4 of system memory already in a lot of cases since there is often as much vram as system memory. Due to the limits in ttm, we can't use more than half at the moment anway, so this shouldn't make things worse on dGPUs and would help a lot of APUs. Once could make the argument that with more vram there is less need for gtt so less chance for OOM, but I think it is more of a scale issue. E.g., on dGPUs you'll generally be running higher resolutions and texture quality, etc. so the overall memory footprint is just scaled up. Alex On Fri, May 20, 2022 at 11:09 AM Alex Deucher wrote: Certain GL unit tests for large textures can cause problems with the OOM killer since there is no way to link this memory to a process. This was originally mitigated (but not necessarily eliminated) by limiting the GTT size. The problem is this limit is often too low for many modern games so just make the limit 1/2 of system memory. The OOM accounting needs to be addressed, but we shouldn't prevent common 3D applications from being usable just to potentially mitigate that corner case. Set default GTT size to max(3G, 1/2 of system ram) by default. v2: drop previous logic and default to 3/4 of ram v3: default to half of ram to align with ttm Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index d2b5cccb45c3..7195ed77c85a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev) DRM_INFO("amdgpu: %uM of VRAM memory ready\n", (unsigned) (adev->gmc.real_vram_size / (1024 * 1024))); - /* Compute GTT size, either bsaed on 3/4th the size of RAM size + /* Compute GTT size, either bsaed on 1/2 the size of RAM size * or whatever the user passed on module init */ if (amdgpu_gtt_size == -1) { struct sysinfo si; si_meminfo(); - gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20), - adev->gmc.mc_vram_size), - ((uint64_t)si.totalram * si.mem_unit * 3/4)); - } - else + /* Certain GL unit tests for large textures can cause problems +* with the OOM killer since there is no way to link this memory +* to a process. This was originally mitigated (but not necessarily +* eliminated) by limiting the GTT size. The problem is this limit +* is often too low for many modern games so just make the limit 1/2 +* of system memory which aligns with TTM. The OOM accounting needs +* to be addressed, but we shouldn't prevent common 3D applications +* from being usable just to potentially mitigate that corner case. +*/ + gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20), + (u64)si.totalram * si.mem_unit / 2); + } else { gtt_size = (uint64_t)amdgpu_gtt_size << 20; + } /* Initialize GTT memory pool */ r = amdgpu_gtt_mgr_init(adev, gtt_size); -- 2.35.3
Re: [PATCH 1/3] drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations
On Thu, Jun 2, 2022 at 12:12 PM Luben Tuikov wrote: > > From: Christian König > > Technically all of those can use GTT as well, no need to force things > into VRAM. > > Signed-off-by: Christian König > Acked-by: Luben Tuikov > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 +-- > drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 20 +++ > drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c | 9 ++--- > drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 + > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++- > drivers/gpu/drm/amd/amdgpu/psp_v10_0.c| 3 ++- > .../amd/pm/powerplay/smumgr/smu10_smumgr.c| 10 -- We need to audit the new files which have been added since the time this patch set was written. E.g., gfx_v10_.c and gfx_v11_0.c, and psp_v11_0.c, swsmu, etc. have been added in the meantime. Alex > 7 files changed, 41 insertions(+), 25 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > index 16699158e00d8c..d799815a0f288f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > @@ -338,8 +338,11 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, > * KIQ MQD no matter SRIOV or Bare-metal > */ > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > - AMDGPU_GEM_DOMAIN_VRAM, > >mqd_obj, > - >mqd_gpu_addr, > >mqd_ptr); > + AMDGPU_GEM_DOMAIN_VRAM | > + AMDGPU_GEM_DOMAIN_GTT, > + >mqd_obj, > + >mqd_gpu_addr, > + >mqd_ptr); > if (r) { > dev_warn(adev->dev, "failed to create ring mqd ob > (%d)", r); > return r; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > index e9411c28d88ba8..116f7fa25aa636 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > @@ -748,9 +748,12 @@ static int psp_tmr_init(struct psp_context *psp) > } > > pptr = amdgpu_sriov_vf(psp->adev) ? _buf : NULL; > - ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, > PSP_TMR_SIZE(psp->adev), > - AMDGPU_GEM_DOMAIN_VRAM, > - >tmr_bo, >tmr_mc_addr, pptr); > + ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, > + PSP_TMR_SIZE(psp->adev), > + AMDGPU_GEM_DOMAIN_VRAM | > + AMDGPU_GEM_DOMAIN_GTT, > + >tmr_bo, >tmr_mc_addr, > + pptr); > > return ret; > } > @@ -1039,7 +1042,8 @@ int psp_ta_init_shared_buf(struct psp_context *psp, > * physical) for ta to host memory > */ > return amdgpu_bo_create_kernel(psp->adev, mem_ctx->shared_mem_size, > - PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM, > + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM | > + AMDGPU_GEM_DOMAIN_GTT, > _ctx->shared_bo, > _ctx->shared_mc_addr, > _ctx->shared_buf); > @@ -3397,10 +3401,10 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct > device *dev, > > /* LFB address which is aligned to 1MB boundary per PSP request */ > ret = amdgpu_bo_create_kernel(adev, usbc_pd_fw->size, 0x10, > - AMDGPU_GEM_DOMAIN_VRAM, > - _buf_bo, > - _pri_mc_addr, > - _pri_cpu_addr); > + AMDGPU_GEM_DOMAIN_VRAM | > + AMDGPU_GEM_DOMAIN_GTT, > + _buf_bo, _pri_mc_addr, > + _pri_cpu_addr); > if (ret) > goto rel_buf; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c > index 6373bfb47d55d7..c591ed6553fcc8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c > @@ -93,7 +93,8 @@ int amdgpu_gfx_rlc_init_sr(struct amdgpu_device *adev, u32 > dws) > > /* allocate save restore block */ > r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE, > - AMDGPU_GEM_DOMAIN_VRAM, > +
Re: [PATCH] drm/ttm: fix bulk move handling during resource init
On Thu, Jun 2, 2022 at 11:47 AM Christian König wrote: > > The resource must be on the LRU before ttm_lru_bulk_move_add() is called. > > Signed-off-by: Christian König This should at least fix the null pointer in these bugs: Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1992 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2034 Alex > --- > drivers/gpu/drm/ttm/ttm_resource.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c > b/drivers/gpu/drm/ttm/ttm_resource.c > index 65889b3caf50..928b9140f3c5 100644 > --- a/drivers/gpu/drm/ttm/ttm_resource.c > +++ b/drivers/gpu/drm/ttm/ttm_resource.c > @@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo, > res->bus.is_iomem = false; > res->bus.caching = ttm_cached; > res->bo = bo; > - INIT_LIST_HEAD(>lru); > > man = ttm_manager_type(bo->bdev, place->mem_type); > spin_lock(>bdev->lru_lock); > man->usage += res->num_pages << PAGE_SHIFT; > - if (bo->bulk_move) > + if (bo->bulk_move) { > + list_add_tail(>lru, >lru[bo->priority]); > ttm_lru_bulk_move_add(bo->bulk_move, res); > - else > + } else { > + INIT_LIST_HEAD(>lru); > ttm_resource_move_to_lru_tail(res); > + } > spin_unlock(>bdev->lru_lock); > } > EXPORT_SYMBOL(ttm_resource_init); > -- > 2.25.1 >
[PATCH] drm/amdgpu/discovery: add comments about VCN instance handling
Add comments to clarify code that is safe, but triggers and smatch warning. Link: https://lists.freedesktop.org/archives/amd-gfx/2022-June/079905.html Signed-off-by: Alex Deucher Cc: Dan Carpenter --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 91f21b725a43..b0811287f017 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -1435,6 +1435,11 @@ static int amdgpu_discovery_get_vcn_info(struct amdgpu_device *adev) return -EINVAL; } + /* num_vcn_inst is currently limited to AMDGPU_MAX_VCN_INSTANCES +* which is smaller than VCN_INFO_TABLE_MAX_NUM_INSTANCES +* but that may change in the future with new GPUs so keep this +* check for defensive purposes. +*/ if (adev->vcn.num_vcn_inst > VCN_INFO_TABLE_MAX_NUM_INSTANCES) { dev_err(adev->dev, "invalid vcn instances\n"); return -EINVAL; @@ -1450,6 +1455,9 @@ static int amdgpu_discovery_get_vcn_info(struct amdgpu_device *adev) switch (le16_to_cpu(vcn_info->v1.header.version_major)) { case 1: + /* num_vcn_inst is currently limited to AMDGPU_MAX_VCN_INSTANCES +* so this won't overflow. +*/ for (v = 0; v < adev->vcn.num_vcn_inst; v++) { adev->vcn.vcn_codec_disable_mask[v] = le32_to_cpu(vcn_info->v1.instance_info[v].fuse_data.all_bits); -- 2.35.3
Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
@Christian Koenig Any objections to this? I realize that fixing the OOM killer is ultimately the right approach, but I don't really see how this makes things worse. The current scheme is biased towards dGPUs as they have lots of on board memory so on dGPUs we can end up setting gtt size to 3/4 of system memory already in a lot of cases since there is often as much vram as system memory. Due to the limits in ttm, we can't use more than half at the moment anway, so this shouldn't make things worse on dGPUs and would help a lot of APUs. Once could make the argument that with more vram there is less need for gtt so less chance for OOM, but I think it is more of a scale issue. E.g., on dGPUs you'll generally be running higher resolutions and texture quality, etc. so the overall memory footprint is just scaled up. Alex On Fri, May 20, 2022 at 11:09 AM Alex Deucher wrote: > > Certain GL unit tests for large textures can cause problems > with the OOM killer since there is no way to link this memory > to a process. This was originally mitigated (but not necessarily > eliminated) by limiting the GTT size. The problem is this limit > is often too low for many modern games so just make the limit 1/2 > of system memory. The OOM accounting needs to be addressed, but > we shouldn't prevent common 3D applications from being usable > just to potentially mitigate that corner case. > > Set default GTT size to max(3G, 1/2 of system ram) by default. > > v2: drop previous logic and default to 3/4 of ram > v3: default to half of ram to align with ttm > > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++-- > 1 file changed, 14 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index d2b5cccb45c3..7195ed77c85a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev) > DRM_INFO("amdgpu: %uM of VRAM memory ready\n", > (unsigned) (adev->gmc.real_vram_size / (1024 * 1024))); > > - /* Compute GTT size, either bsaed on 3/4th the size of RAM size > + /* Compute GTT size, either bsaed on 1/2 the size of RAM size > * or whatever the user passed on module init */ > if (amdgpu_gtt_size == -1) { > struct sysinfo si; > > si_meminfo(); > - gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20), > - adev->gmc.mc_vram_size), > - ((uint64_t)si.totalram * si.mem_unit * 3/4)); > - } > - else > + /* Certain GL unit tests for large textures can cause problems > +* with the OOM killer since there is no way to link this > memory > +* to a process. This was originally mitigated (but not > necessarily > +* eliminated) by limiting the GTT size. The problem is this > limit > +* is often too low for many modern games so just make the > limit 1/2 > +* of system memory which aligns with TTM. The OOM accounting > needs > +* to be addressed, but we shouldn't prevent common 3D > applications > +* from being usable just to potentially mitigate that corner > case. > +*/ > + gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20), > + (u64)si.totalram * si.mem_unit / 2); > + } else { > gtt_size = (uint64_t)amdgpu_gtt_size << 20; > + } > > /* Initialize GTT memory pool */ > r = amdgpu_gtt_mgr_init(adev, gtt_size); > -- > 2.35.3 >
Re: [PATCH] drm/ttm: fix bulk move handling during resource init
Acked-by: Luben Tuikov Regards, Luben On 2022-06-02 11:47, Christian König wrote: > The resource must be on the LRU before ttm_lru_bulk_move_add() is called. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/ttm/ttm_resource.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c > b/drivers/gpu/drm/ttm/ttm_resource.c > index 65889b3caf50..928b9140f3c5 100644 > --- a/drivers/gpu/drm/ttm/ttm_resource.c > +++ b/drivers/gpu/drm/ttm/ttm_resource.c > @@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo, > res->bus.is_iomem = false; > res->bus.caching = ttm_cached; > res->bo = bo; > - INIT_LIST_HEAD(>lru); > > man = ttm_manager_type(bo->bdev, place->mem_type); > spin_lock(>bdev->lru_lock); > man->usage += res->num_pages << PAGE_SHIFT; > - if (bo->bulk_move) > + if (bo->bulk_move) { > + list_add_tail(>lru, >lru[bo->priority]); > ttm_lru_bulk_move_add(bo->bulk_move, res); > - else > + } else { > + INIT_LIST_HEAD(>lru); > ttm_resource_move_to_lru_tail(res); > + } > spin_unlock(>bdev->lru_lock); > } > EXPORT_SYMBOL(ttm_resource_init); Regards, -- Luben
[PATCH 2/3] drm/amdgpu: rename vram_scratch into mem_scratch
From: Christian König Rename vram_scratch into mem_scratch and allow allocating it into GTT as well. The only problem with that is that we won't have a default page for the system aperture any more. Signed-off-by: Christian König Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 +++--- drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c| 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c| 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v3_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_1.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_2.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c| 2 +- 17 files changed, 31 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 96d058c4cd4b7c..38bf6fb08773b1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -603,7 +603,7 @@ int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); /* VRAM scratch page for HDP bug, default vram page */ -struct amdgpu_vram_scratch { +struct amdgpu_mem_scratch { struct amdgpu_bo*robj; volatile uint32_t *ptr; u64 gpu_addr; @@ -842,7 +842,7 @@ struct amdgpu_device { /* memory management */ struct amdgpu_mman mman; - struct amdgpu_vram_scratch vram_scratch; + struct amdgpu_mem_scratch mem_scratch; struct amdgpu_wbwb; atomic64_t num_bytes_moved; atomic64_t num_evictions; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index b5ee0eb984ee65..2681a615054fe3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -920,32 +920,33 @@ static int amdgpu_device_asic_init(struct amdgpu_device *adev) } /** - * amdgpu_device_vram_scratch_init - allocate the VRAM scratch page + * amdgpu_device_mem_scratch_init - allocate the VRAM scratch page * * @adev: amdgpu_device pointer * * Allocates a scratch page of VRAM for use by various things in the * driver. */ -static int amdgpu_device_vram_scratch_init(struct amdgpu_device *adev) +static int amdgpu_device_mem_scratch_init(struct amdgpu_device *adev) { - return amdgpu_bo_create_kernel(adev, AMDGPU_GPU_PAGE_SIZE, - PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM, - >vram_scratch.robj, - >vram_scratch.gpu_addr, - (void **)>vram_scratch.ptr); + return amdgpu_bo_create_kernel(adev, AMDGPU_GPU_PAGE_SIZE, PAGE_SIZE, + AMDGPU_GEM_DOMAIN_VRAM | + AMDGPU_GEM_DOMAIN_GTT, + >mem_scratch.robj, + >mem_scratch.gpu_addr, + (void **)>mem_scratch.ptr); } /** - * amdgpu_device_vram_scratch_fini - Free the VRAM scratch page + * amdgpu_device_mem_scratch_fini - Free the VRAM scratch page * * @adev: amdgpu_device pointer * * Frees the VRAM scratch page. */ -static void amdgpu_device_vram_scratch_fini(struct amdgpu_device *adev) +static void amdgpu_device_mem_scratch_fini(struct amdgpu_device *adev) { - amdgpu_bo_free_kernel(>vram_scratch.robj, NULL, NULL); + amdgpu_bo_free_kernel(>mem_scratch.robj, NULL, NULL); } /** @@ -2367,9 +2368,9 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) if (amdgpu_sriov_vf(adev)) amdgpu_virt_exchange_data(adev); - r = amdgpu_device_vram_scratch_init(adev); + r = amdgpu_device_mem_scratch_init(adev); if (r) { - DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r); + DRM_ERROR("amdgpu_mem_scratch_init failed %d\n", r); goto init_failed; } r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev); @@ -2839,7 +2840,7 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
[PATCH 3/3] drm/amdgpu: cleanup visible vram size handling
From: Christian König Centralize the limit handling and validation in one place instead of spreading that around in different hw generations. Signed-off-by: Christian König Acked-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 --- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 3 --- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 3 --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 --- 6 files changed, 7 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index aebc384531ac8f..689978dad1d58f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -201,6 +201,7 @@ uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo) void amdgpu_gmc_vram_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc, u64 base) { + uint64_t vis_limit = (uint64_t)amdgpu_vis_vram_limit << 20; uint64_t limit = (uint64_t)amdgpu_vram_limit << 20; mc->vram_start = base; @@ -208,6 +209,12 @@ void amdgpu_gmc_vram_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc, if (limit && limit < mc->real_vram_size) mc->real_vram_size = limit; + if (vis_limit && vis_limit < mc->visible_vram_size) + mc->visible_vram_size = vis_limit; + + if (mc->real_vram_size < mc->visible_vram_size) + mc->visible_vram_size = mc->real_vram_size; + if (mc->xgmi.num_physical_nodes == 0) { mc->fb_start = mc->vram_start; mc->fb_end = mc->vram_end; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index ba3221a25e7536..61006f9f9ed388 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1707,7 +1707,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev) { uint64_t gtt_size; int r; - u64 vis_vram_limit; mutex_init(>mman.gtt_window_lock); @@ -1730,12 +1729,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev) return r; } - /* Reduce size of CPU-visible VRAM if requested */ - vis_vram_limit = (u64)amdgpu_vis_vram_limit * 1024 * 1024; - if (amdgpu_vis_vram_limit > 0 && - vis_vram_limit <= adev->gmc.visible_vram_size) - adev->gmc.visible_vram_size = vis_vram_limit; - /* Change the size here instead of the init above so only lpfn is affected */ amdgpu_ttm_set_buffer_funcs_status(adev, false); #ifdef CONFIG_64BIT diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index 9077dfccaf3cf9..851f415f2dba61 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -827,10 +827,7 @@ static int gmc_v10_0_mc_init(struct amdgpu_device *adev) } #endif - /* In case the PCI BAR is larger than the actual amount of vram */ adev->gmc.visible_vram_size = adev->gmc.aper_size; - if (adev->gmc.visible_vram_size > adev->gmc.real_vram_size) - adev->gmc.visible_vram_size = adev->gmc.real_vram_size; /* set the gart size */ if (amdgpu_gart_size == -1) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c index b309f3ab2917c3..0b95afececdc2d 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c @@ -389,10 +389,7 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev) } #endif - /* In case the PCI BAR is larger than the actual amount of vram */ adev->gmc.visible_vram_size = adev->gmc.aper_size; - if (adev->gmc.visible_vram_size > adev->gmc.real_vram_size) - adev->gmc.visible_vram_size = adev->gmc.real_vram_size; /* set the gart size */ if (amdgpu_gart_size == -1) { diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c index 24a256cfd7ceb9..8256795f66461a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c @@ -587,10 +587,7 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev) } #endif - /* In case the PCI BAR is larger than the actual amount of vram */ adev->gmc.visible_vram_size = adev->gmc.aper_size; - if (adev->gmc.visible_vram_size > adev->gmc.real_vram_size) - adev->gmc.visible_vram_size = adev->gmc.real_vram_size; /* set the gart size */ if (amdgpu_gart_size == -1) { diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 22761a3bb8181e..e246c999b44acd 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1467,10 +1467,7 @@ static int gmc_v9_0_mc_init(struct
[PATCH 1/3] drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations
From: Christian König Technically all of those can use GTT as well, no need to force things into VRAM. Signed-off-by: Christian König Acked-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 20 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c | 9 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 + drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++- drivers/gpu/drm/amd/amdgpu/psp_v10_0.c| 3 ++- .../amd/pm/powerplay/smumgr/smu10_smumgr.c| 10 -- 7 files changed, 41 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 16699158e00d8c..d799815a0f288f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -338,8 +338,11 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, * KIQ MQD no matter SRIOV or Bare-metal */ r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_VRAM, >mqd_obj, - >mqd_gpu_addr, >mqd_ptr); + AMDGPU_GEM_DOMAIN_VRAM | + AMDGPU_GEM_DOMAIN_GTT, + >mqd_obj, + >mqd_gpu_addr, + >mqd_ptr); if (r) { dev_warn(adev->dev, "failed to create ring mqd ob (%d)", r); return r; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index e9411c28d88ba8..116f7fa25aa636 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -748,9 +748,12 @@ static int psp_tmr_init(struct psp_context *psp) } pptr = amdgpu_sriov_vf(psp->adev) ? _buf : NULL; - ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_SIZE(psp->adev), - AMDGPU_GEM_DOMAIN_VRAM, - >tmr_bo, >tmr_mc_addr, pptr); + ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, + PSP_TMR_SIZE(psp->adev), + AMDGPU_GEM_DOMAIN_VRAM | + AMDGPU_GEM_DOMAIN_GTT, + >tmr_bo, >tmr_mc_addr, + pptr); return ret; } @@ -1039,7 +1042,8 @@ int psp_ta_init_shared_buf(struct psp_context *psp, * physical) for ta to host memory */ return amdgpu_bo_create_kernel(psp->adev, mem_ctx->shared_mem_size, - PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM, + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM | + AMDGPU_GEM_DOMAIN_GTT, _ctx->shared_bo, _ctx->shared_mc_addr, _ctx->shared_buf); @@ -3397,10 +3401,10 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev, /* LFB address which is aligned to 1MB boundary per PSP request */ ret = amdgpu_bo_create_kernel(adev, usbc_pd_fw->size, 0x10, - AMDGPU_GEM_DOMAIN_VRAM, - _buf_bo, - _pri_mc_addr, - _pri_cpu_addr); + AMDGPU_GEM_DOMAIN_VRAM | + AMDGPU_GEM_DOMAIN_GTT, + _buf_bo, _pri_mc_addr, + _pri_cpu_addr); if (ret) goto rel_buf; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c index 6373bfb47d55d7..c591ed6553fcc8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c @@ -93,7 +93,8 @@ int amdgpu_gfx_rlc_init_sr(struct amdgpu_device *adev, u32 dws) /* allocate save restore block */ r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_VRAM, + AMDGPU_GEM_DOMAIN_VRAM | + AMDGPU_GEM_DOMAIN_GTT, >gfx.rlc.save_restore_obj, >gfx.rlc.save_restore_gpu_addr, (void **)>gfx.rlc.sr_ptr); @@ -130,7 +131,8 @@ int amdgpu_gfx_rlc_init_csb(struct amdgpu_device *adev) /* allocate clear state block */ adev->gfx.rlc.clear_state_size
[PATCH 0/3] Some VRAM cleanups
Some VRAM cleanups from Christian. Second patch needed some work to be able to compile with most recent amd-staging-drm-next. Tested on a couple of differing systems. Christian König (3): drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations drm/amdgpu: rename vram_scratch into mem_scratch drm/amdgpu: cleanup visible vram size handling drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 27 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 20 -- drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c | 9 --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 - drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++- drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c| 3 --- drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 5 +--- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 5 +--- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_1.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_2.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 2 +- drivers/gpu/drm/amd/amdgpu/psp_v10_0.c| 3 ++- .../amd/pm/powerplay/smumgr/smu10_smumgr.c| 10 +++ 28 files changed, 79 insertions(+), 74 deletions(-) base-commit: 6cc31d9a3e0944ca02dda6f53f9c9320b8bee928 -- 2.36.1.74.g277cf0bc36
[PATCH] drm/ttm: fix bulk move handling during resource init
The resource must be on the LRU before ttm_lru_bulk_move_add() is called. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_resource.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c index 65889b3caf50..928b9140f3c5 100644 --- a/drivers/gpu/drm/ttm/ttm_resource.c +++ b/drivers/gpu/drm/ttm/ttm_resource.c @@ -169,15 +169,17 @@ void ttm_resource_init(struct ttm_buffer_object *bo, res->bus.is_iomem = false; res->bus.caching = ttm_cached; res->bo = bo; - INIT_LIST_HEAD(>lru); man = ttm_manager_type(bo->bdev, place->mem_type); spin_lock(>bdev->lru_lock); man->usage += res->num_pages << PAGE_SHIFT; - if (bo->bulk_move) + if (bo->bulk_move) { + list_add_tail(>lru, >lru[bo->priority]); ttm_lru_bulk_move_add(bo->bulk_move, res); - else + } else { + INIT_LIST_HEAD(>lru); ttm_resource_move_to_lru_tail(res); + } spin_unlock(>bdev->lru_lock); } EXPORT_SYMBOL(ttm_resource_init); -- 2.25.1
Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
On Thu, Jun 2, 2022 at 11:33 AM Dan Carpenter wrote: > > On Thu, Jun 02, 2022 at 10:24:58AM -0400, Alex Deucher wrote: > > On Thu, Jun 2, 2022 at 7:51 AM Dan Carpenter > > wrote: > > > > > > On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote: > > > > Dan: I also ran Smatch which resulted in the following discussion: > > > > > > > > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html > > > > > > Since the bounds check is dead code which does not make sense and is not > > > required, another idea would be to just delete it. > > > > It wouldn't be dead code if AMDGPU_MAX_VCN_INSTANCES ever increased. > > Or we could add a comment to the code I suppose. > > /* Impossible in 2022 but this check might sense in the future */ Good idea. I'll send out a patch. Thanks, Alex > > regards, > dan carpenter >
Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
On Thu, Jun 02, 2022 at 10:24:58AM -0400, Alex Deucher wrote: > On Thu, Jun 2, 2022 at 7:51 AM Dan Carpenter wrote: > > > > On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote: > > > Dan: I also ran Smatch which resulted in the following discussion: > > > > > > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html > > > > Since the bounds check is dead code which does not make sense and is not > > required, another idea would be to just delete it. > > It wouldn't be dead code if AMDGPU_MAX_VCN_INSTANCES ever increased. Or we could add a comment to the code I suppose. /* Impossible in 2022 but this check might sense in the future */ regards, dan carpenter
Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
On Thu, Jun 2, 2022 at 7:51 AM Dan Carpenter wrote: > > On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote: > > Dan: I also ran Smatch which resulted in the following discussion: > > > > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html > > Since the bounds check is dead code which does not make sense and is not > required, another idea would be to just delete it. It wouldn't be dead code if AMDGPU_MAX_VCN_INSTANCES ever increased. Alex > > regards, > dan carpenter >
Re: (REGRESSION bisected) Re: amdgpu errors (VM fault / GPU fault detected) with 5.19 merge window snapshots
On Thu, Jun 02, 2022 at 09:58:22AM -0400, Alex Deucher wrote: > On Fri, May 27, 2022 at 8:58 AM Michal Kubecek wrote: > > On Fri, May 27, 2022 at 11:00:39AM +0200, Michal Kubecek wrote: > > > Hello, > > > > > > while testing 5.19 merge window snapshots (commits babf0bb978e3 and > > > 7e284070abe5), I keep getting errors like below. I have not seen them > > > with 5.18 final or older. > > > > > > > > > [ 247.150333] gmc_v8_0_process_interrupt: 46 callbacks suppressed > > > [ 247.150336] amdgpu :0c:00.0: amdgpu: GPU fault detected: 147 > > > 0x00020802 for process firefox pid 6101 thread firefox:cs0 pid 6116 > > > [ 247.150339] amdgpu :0c:00.0: amdgpu: > > > VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00107800 > > > [ 247.150340] amdgpu :0c:00.0: amdgpu: > > > VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D008002 > > > [ 247.150341] amdgpu :0c:00.0: amdgpu: VM fault (0x02, vmid 6, pasid > > > 32780) at page 1079296, write from 'TC2' (0x54433200) (8) > > [...] > > > [ 249.925909] amdgpu :0c:00.0: amdgpu: IH ring buffer overflow > > > (0x000844C0, 0x4A00, 0x44D0) > > > [ 250.434986] [drm] Fence fallback timer expired on ring sdma0 > > > [ 466.621568] gmc_v8_0_process_interrupt: 122 callbacks suppressed > > [...] > > > > > > > > > There does not seem to be any apparent immediate problem with graphics > > > but when running commit babf0bb978e3, there seemed to be a noticeable > > > lag in some operations, e.g. when moving a window or repainting large > > > part of the terminal window in konsole (no idea if it's related). > > > > > > My GPU is Radeon Pro WX 2100 (1002:6995). What other information should > > > I collect to help debugging the issue? > > > > Bisected to commit 5255e146c99a ("drm/amdgpu: rework TLB flushing"). > > There seem to be later commits depending on it so I did not test > > a revert on top of current mainline. > > > > I should also mention that most commits tested as "bad" during the > > bisect did behave much worse than current mainline (errors starting as > > early as with sddm, visibly damaged screen content, sometimes even > > crashes). But all of them issued messages similar to those above into > > kernel log. > > Can you verify that the kernel you tested has this patch: > https://cgit.freedesktop.org/drm/drm/commit/?id=5be323562c6a699d38430bc068a3fd192be8ed0d Yes, both of them: mike@lion:~/work/git/kernel-upstream> git merge-base --is-ancestor 5be323562c6a babf0bb978e3 && echo yes yes (7e284070abe5 is a later mainline snapshot so it also contains 5be323562c6a) But it's likely that commit 5be323562c6a fixed most of the problem and only some corner case was left as most bisect steps had many more error messages and some even crashed before I was able to even log into KDE. Compared to that, the mainline snapshots show much fewer errors, no distorted picture and no crash; on the other hand, applications like firefox or stellarium seem to trigger the errors quite consistently. Michal signature.asc Description: PGP signature
Re: Explicit VM updates
Am 2022-06-02 um 02:47 schrieb Christian König: Am 01.06.22 um 19:39 schrieb Felix Kuehling: Am 2022-06-01 um 13:22 schrieb Christian König: Am 01.06.22 um 19:07 schrieb Felix Kuehling: Am 2022-06-01 um 12:29 schrieb Christian König: Am 01.06.22 um 17:05 schrieb Felix Kuehling: Am 2022-06-01 um 08:40 schrieb Christian König: Hey guys, so today Bas came up with a new requirement regarding the explicit synchronization to VM updates and a bunch of prototype patches. I've been thinking about that stuff for quite some time before, but to be honest it's one of the most trickiest parts of the driver. So my current thinking is that we could potentially handle those requirements like this: 1. We add some new EXPLICIT flag to context (or CS?) and VM IOCTL. This way we either get the new behavior for the whole CS+VM or the old one, but never both mixed. 2. When memory is unmapped we keep around the last unmap operation inside the bo_va. 3. When memory is freed we add all the CS fences which could access that memory + the last unmap operation as BOOKKEEP fences to the BO and as mandatory sync fence to the VM. Memory freed either because of an eviction or because of userspace closing the handle will be seen as a combination of unmap+free. The result is the following semantic for userspace to avoid implicit synchronization as much as possible: 1. When you allocate and map memory it is mandatory to either wait for the mapping operation to complete or to add it as dependency for your CS. If this isn't followed the application will run into CS faults (that's what we pretty much already implemented). This makes sense. 2. When memory is freed you must unmap that memory first and then wait for this unmap operation to complete before freeing the memory. If this isn't followed the kernel will add a forcefully wait to the next CS to block until the unmap is completed. This would work for now, but it won't work for user mode submission in the future. I find it weird that user mode needs to wait for the unmap. For user mode, unmap and free should always be asynchronous. I can't think of any good reasons to make user mode wait for the driver to clean up its stuff. Could the waiting be done in kernel mode instead? TTM already does delayed freeing if there are fences outstanding on a BO being freed. This should make it easy to delay freeing until the unmap is done without blocking the user mode thread. This is not about blocking, but synchronization dependencies. Then I must have misunderstood this sentence: "When memory is freed you must unmap that memory first and then wait for this unmap operation to complete before freeing the memory." If the pronoun "you" is the user mode driver, it means user mode must wait for kernel mode to finish unmapping memory before freeing it. Was that not what you meant? Ah, yes. The UMD must wait for the kernel to finish unmapping all the maps from the BO before it drops the handle of the BO and with that frees it. In other words the free is not waiting for the unmap to complete, but causes command submissions through the kernel to depend on the unmap. I guess I don't understand that dependency. The next command submission obviously cannot use the memory that was unmapped. But why does it need to synchronize with the unmap operation? Because of the necessary TLB flush, only after that one is executed we can be sure that nobody has access to the memory any more and actually free it. So freeing the memory has to wait for the TLB flush. Why does the next command submission need to wait? Because that's the one triggering the TLB flush. The issue is that flushing the TLB while the VMID is in use is really unreliable on most hardware generations. It's been working well enough with ROCm. With user mode command submission there is no way to block GPU work while a TLB flush is in progress. User mode submissions are completely unrelated to that. I mention user mode command submission because there is no way to enforce the synchronization you describe here on a user mode queue. So this approach is not very future proof. With user mode queues you need to wait for the work on the queue to finish anyway or otherwise you run into VM faults if you just unmap or free the memory. If the next command submission doesn't use the unmapped/freed memory, why does it need to wait for the TLB flush? Because it could potentially use it. When userspace lies to the kernel and still accesses the mapping we would allow access to freed up memory and create a major security problem. I'm aware of the potential security problem. That's why I'm recommending you don't actually free the memory until the TLB flush is done. So a bogus access will either harmlessly access memory that's not freed yet, or it will VM fault. It will never access memory that's already freed and potentially allocated by
Re: (REGRESSION bisected) Re: amdgpu errors (VM fault / GPU fault detected) with 5.19 merge window snapshots
On Fri, May 27, 2022 at 8:58 AM Michal Kubecek wrote: > > On Fri, May 27, 2022 at 11:00:39AM +0200, Michal Kubecek wrote: > > Hello, > > > > while testing 5.19 merge window snapshots (commits babf0bb978e3 and > > 7e284070abe5), I keep getting errors like below. I have not seen them > > with 5.18 final or older. > > > > > > [ 247.150333] gmc_v8_0_process_interrupt: 46 callbacks suppressed > > [ 247.150336] amdgpu :0c:00.0: amdgpu: GPU fault detected: 147 > > 0x00020802 for process firefox pid 6101 thread firefox:cs0 pid 6116 > > [ 247.150339] amdgpu :0c:00.0: amdgpu: > > VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00107800 > > [ 247.150340] amdgpu :0c:00.0: amdgpu: > > VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D008002 > > [ 247.150341] amdgpu :0c:00.0: amdgpu: VM fault (0x02, vmid 6, pasid > > 32780) at page 1079296, write from 'TC2' (0x54433200) (8) > [...] > > [ 249.925909] amdgpu :0c:00.0: amdgpu: IH ring buffer overflow > > (0x000844C0, 0x4A00, 0x44D0) > > [ 250.434986] [drm] Fence fallback timer expired on ring sdma0 > > [ 466.621568] gmc_v8_0_process_interrupt: 122 callbacks suppressed > [...] > > > > > > There does not seem to be any apparent immediate problem with graphics > > but when running commit babf0bb978e3, there seemed to be a noticeable > > lag in some operations, e.g. when moving a window or repainting large > > part of the terminal window in konsole (no idea if it's related). > > > > My GPU is Radeon Pro WX 2100 (1002:6995). What other information should > > I collect to help debugging the issue? > > Bisected to commit 5255e146c99a ("drm/amdgpu: rework TLB flushing"). > There seem to be later commits depending on it so I did not test > a revert on top of current mainline. > > I should also mention that most commits tested as "bad" during the > bisect did behave much worse than current mainline (errors starting as > early as with sddm, visibly damaged screen content, sometimes even > crashes). But all of them issued messages similar to those above into > kernel log. Can you verify that the kernel you tested has this patch: https://cgit.freedesktop.org/drm/drm/commit/?id=5be323562c6a699d38430bc068a3fd192be8ed0d Alex
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
Am 02.06.22 um 15:44 schrieb Lazar, Lijo: On 6/2/2022 7:06 PM, Christian König wrote: Am 02.06.22 um 15:26 schrieb Lazar, Lijo: On 6/2/2022 6:54 PM, Lazar, Lijo wrote: On 6/2/2022 6:50 PM, Philip Yang wrote: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; That is a strange thing to do for a bool variable. Why not just assign it? Hmm.. In a loop, perhaps you meant logical OR? Well IIRC C doesn't have a logical or assignment operator "||=", so "|=" is used instead which also gets the job done. var = var || value; also will work. BTW, v1 of this patch was incrementing vm->tlb_seq for every entry moved. This one increments only once. So is this vm->tlb_seq required only to be a bool? No, it is required to only increment once. It's a sequence number and changing it signals that a TLB flush is necessary. Regards, Christian. Thanks, Lijo Christian. Thanks, Lijo Thanks, Lijo + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
On 6/2/2022 7:06 PM, Christian König wrote: Am 02.06.22 um 15:26 schrieb Lazar, Lijo: On 6/2/2022 6:54 PM, Lazar, Lijo wrote: On 6/2/2022 6:50 PM, Philip Yang wrote: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; That is a strange thing to do for a bool variable. Why not just assign it? Hmm.. In a loop, perhaps you meant logical OR? Well IIRC C doesn't have a logical or assignment operator "||=", so "|=" is used instead which also gets the job done. var = var || value; also will work. BTW, v1 of this patch was incrementing vm->tlb_seq for every entry moved. This one increments only once. So is this vm->tlb_seq required only to be a bool? Thanks, Lijo Christian. Thanks, Lijo Thanks, Lijo + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
Re: [PATCH] drm/amdgpu/display: Protect some functions with CONFIG_DRM_AMD_DC_DCN
On 2022-06-01 22:31, Alex Deucher wrote: > Protect remove_hpo_dp_link_enc_from_ctx() and release_hpo_dp_link_enc() > with CONFIG_DRM_AMD_DC_DCN as the functions are only called from code > that is protected by CONFIG_DRM_AMD_DC_DCN. Fixes build fail with > -Werror=unused-function. > > Fixes: 9b0e0d433f74 ("drm/amd/display: Add dependant changes for DCN32/321") > Reported-by: Stephen Rothwell > Signed-off-by: Alex Deucher > Cc: Aurabindo Pillai Reviewed-by: Harry Wentland Harry > --- > drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c > b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c > index b087452e4590..a098fd0cb240 100644 > --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c > @@ -1801,6 +1801,7 @@ static inline void retain_hpo_dp_link_enc( > res_ctx->hpo_dp_link_enc_ref_cnts[enc_index]++; > } > > +#if defined(CONFIG_DRM_AMD_DC_DCN) > static inline void release_hpo_dp_link_enc( > struct resource_context *res_ctx, > int enc_index) > @@ -1808,6 +1809,7 @@ static inline void release_hpo_dp_link_enc( > ASSERT(res_ctx->hpo_dp_link_enc_ref_cnts[enc_index] > 0); > res_ctx->hpo_dp_link_enc_ref_cnts[enc_index]--; > } > +#endif > > static bool add_hpo_dp_link_enc_to_ctx(struct resource_context *res_ctx, > const struct resource_pool *pool, > @@ -1832,6 +1834,7 @@ static bool add_hpo_dp_link_enc_to_ctx(struct > resource_context *res_ctx, > return pipe_ctx->link_res.hpo_dp_link_enc != NULL; > } > > +#if defined(CONFIG_DRM_AMD_DC_DCN) > static void remove_hpo_dp_link_enc_from_ctx(struct resource_context *res_ctx, > struct pipe_ctx *pipe_ctx, > struct dc_stream_state *stream) > @@ -1845,6 +1848,7 @@ static void remove_hpo_dp_link_enc_from_ctx(struct > resource_context *res_ctx, > pipe_ctx->link_res.hpo_dp_link_enc = NULL; > } > } > +#endif > > /* TODO: release audio object */ > void update_audio_usage(
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
Am 02.06.22 um 15:26 schrieb Lazar, Lijo: On 6/2/2022 6:54 PM, Lazar, Lijo wrote: On 6/2/2022 6:50 PM, Philip Yang wrote: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; That is a strange thing to do for a bool variable. Why not just assign it? Hmm.. In a loop, perhaps you meant logical OR? Well IIRC C doesn't have a logical or assignment operator "||=", so "|=" is used instead which also gets the job done. Christian. Thanks, Lijo Thanks, Lijo + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
Re: [PATCH] dc: Add ACP_DATA register
On 2022-06-02 08:02, Alan Liu wrote: > Define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA > Define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK/SHIFT > > Signed-off-by: Alan Liu Reviewed-by: Harry Wentland Harry > --- > drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h | 1 + > drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h | 2 ++ > 2 files changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h > b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h > index 6df651a94b0a..581ba639b4ea 100644 > --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h > +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h > @@ -6981,6 +6981,7 @@ > #define ixAZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE >0x23 > #define ixAZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL >0x24 > #define ixAZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER >0x25 > +#define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA >0x27 > #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR0 >0x28 > #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR1 >0x29 > #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR2 >0x2a > diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h > b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h > index fa1f4374fafe..fd387c7363b6 100644 > --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h > +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h > @@ -13639,6 +13639,8 @@ > #define > AZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE__IMPEDANCE_SENSE__SHIFT 0x0 > #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE_MASK 0x40 > #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE__SHIFT 0x6 > +#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK 0x40 > +#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI__SHIFT 0x6 > #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION_MASK > 0x7f > #define > AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION__SHIFT 0x0 > #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__CHANNEL_ALLOCATION_MASK > 0xff00
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
On 6/2/2022 6:54 PM, Lazar, Lijo wrote: On 6/2/2022 6:50 PM, Philip Yang wrote: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; That is a strange thing to do for a bool variable. Why not just assign it? Hmm.. In a loop, perhaps you meant logical OR? Thanks, Lijo Thanks, Lijo + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
On 6/2/2022 6:50 PM, Philip Yang wrote: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; That is a strange thing to do for a bool variable. Why not just assign it? Thanks, Lijo + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
Re: [PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
Am 02.06.22 um 15:20 schrieb Philip Yang: Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated,
[PATCH v2 1/1] drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..1ea204218903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -737,6 +737,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, { struct amdgpu_vm_update_params params; struct amdgpu_vm_bo_base *entry; + bool flush_tlb_needed = false; int r, idx; if (list_empty(>relocated)) @@ -755,6 +756,9 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + flush_tlb_needed |= entry->moved; + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,8 +768,8 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); + if (flush_tlb_needed) + atomic64_inc(>tlb_seq); while (!list_empty(>relocated)) { entry = list_first_entry(>relocated, -- 2.35.1
[PATCH] dc: Add ACP_DATA register
Define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA Define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK/SHIFT Signed-off-by: Alan Liu --- drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h | 1 + drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h index 6df651a94b0a..581ba639b4ea 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_d.h @@ -6981,6 +6981,7 @@ #define ixAZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE 0x23 #define ixAZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL 0x24 #define ixAZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER 0x25 +#define ixAZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA 0x27 #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR0 0x28 #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR1 0x29 #define ixAZALIA_F0_CODEC_PIN_CONTROL_AUDIO_DESCRIPTOR2 0x2a diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h index fa1f4374fafe..fd387c7363b6 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_11_0_sh_mask.h @@ -13639,6 +13639,8 @@ #define AZALIA_F0_CODEC_PIN_CONTROL_RESPONSE_PIN_SENSE__IMPEDANCE_SENSE__SHIFT 0x0 #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE_MASK 0x40 #define AZALIA_F0_CODEC_PIN_CONTROL_WIDGET_CONTROL__OUT_ENABLE__SHIFT 0x6 +#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI_MASK 0x40 +#define AZALIA_F0_CODEC_PIN_CONTROL_ACP_DATA__SUPPORTS_AI__SHIFT 0x6 #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION_MASK 0x7f #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__SPEAKER_ALLOCATION__SHIFT 0x0 #define AZALIA_F0_CODEC_PIN_CONTROL_CHANNEL_SPEAKER__CHANNEL_ALLOCATION_MASK 0xff00 -- 2.25.1
Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
On Thu, Jun 02, 2022 at 08:26:03AM +0200, Ernst Sjöstrand wrote: > Dan: I also ran Smatch which resulted in the following discussion: > > https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html Since the bounds check is dead code which does not make sense and is not required, another idea would be to just delete it. regards, dan carpenter
[PATCH v3 2/2] drm/amdgpu: adding device coredump support
Added device coredump information: - Kernel version - Module - Time - VRAM status - Guilty process name and PID - GPU register dumps v1 -> v2: Variable name change v1 -> v2: NULL check v1 -> v2: Code alignment v1 -> v2: Adding dummy amdgpu_devcoredump_free v1 -> v2: memset reset_task_info to zero v2 -> v3: add CONFIG_DEV_COREDUMP for variables v2 -> v3: remove NULL check on amdgpu_devcoredump_read Signed-off-by: Somalapuram Amaranath --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 5 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 64 ++ 2 files changed, 69 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index c79d9992b113..1bfbaf65d414 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1044,6 +1044,11 @@ struct amdgpu_device { uint32_t*reset_dump_reg_list; uint32_t*reset_dump_reg_value; int num_regs; +#ifdef CONFIG_DEV_COREDUMP + struct amdgpu_task_info reset_task_info; + boolreset_vram_lost; + struct timespec64 reset_time; +#endif struct amdgpu_reset_domain *reset_domain; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 89c6db03e84b..f1def74aaad0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -32,6 +32,8 @@ #include #include #include +#include +#include #include #include @@ -4734,6 +4736,59 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev) return 0; } +#ifdef CONFIG_DEV_COREDUMP +static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, + size_t count, void *data, size_t datalen) +{ + struct drm_printer p; + struct amdgpu_device *adev = data; + struct drm_print_iterator iter; + int i; + + iter.data = buffer; + iter.offset = 0; + iter.start = offset; + iter.remain = count; + + p = drm_coredump_printer(); + + drm_printf(, " AMDGPU Device Coredump \n"); + drm_printf(, "kernel: " UTS_RELEASE "\n"); + drm_printf(, "module: " KBUILD_MODNAME "\n"); + drm_printf(, "time: %lld.%09ld\n", adev->reset_time.tv_sec, adev->reset_time.tv_nsec); + if (adev->reset_task_info.pid) + drm_printf(, "process_name: %s PID: %d\n", + adev->reset_task_info.process_name, + adev->reset_task_info.pid); + + if (adev->reset_vram_lost) + drm_printf(, "VRAM is lost due to GPU reset!\n"); + if (adev->num_regs) { + drm_printf(, "AMDGPU register dumps:\nOffset: Value:\n"); + + for (i = 0; i < adev->num_regs; i++) + drm_printf(, "0x%08x: 0x%08x\n", + adev->reset_dump_reg_list[i], + adev->reset_dump_reg_value[i]); + } + + return count - iter.remain; +} + +static void amdgpu_devcoredump_free(void *data) +{ +} + +static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev) +{ + struct drm_device *dev = adev_to_drm(adev); + + ktime_get_ts64(>reset_time); + dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_KERNEL, + amdgpu_devcoredump_read, amdgpu_devcoredump_free); +} +#endif + int amdgpu_do_asic_reset(struct list_head *device_list_handle, struct amdgpu_reset_context *reset_context) { @@ -4818,6 +4873,15 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle, goto out; vram_lost = amdgpu_device_check_vram_lost(tmp_adev); +#ifdef CONFIG_DEV_COREDUMP + tmp_adev->reset_vram_lost = vram_lost; + memset(_adev->reset_task_info, 0, + sizeof(tmp_adev->reset_task_info)); + if (reset_context->job && reset_context->job->vm) + tmp_adev->reset_task_info = + reset_context->job->vm->task_info; + amdgpu_reset_capture_coredumpm(tmp_adev); +#endif if (vram_lost) { DRM_INFO("VRAM is lost due to GPU reset!\n"); amdgpu_inc_vram_lost(tmp_adev); -- 2.32.0
[PATCH v3 1/2] drm/amdgpu: save the reset dump register value for devcoredump
Allocate memory for register value and use the same values for devcoredump. v1 -> v2: Change krealloc_array() to kmalloc_array() v2 -> v3: Fix alignment Signed-off-by: Somalapuram Amaranath --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++--- 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 76df583663c7..c79d9992b113 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1042,6 +1042,7 @@ struct amdgpu_device { /* reset dump register */ uint32_t*reset_dump_reg_list; + uint32_t*reset_dump_reg_value; int num_regs; struct amdgpu_reset_domain *reset_domain; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index eedb12f6b8a3..f3ac7912c29c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1709,17 +1709,24 @@ static ssize_t amdgpu_reset_dump_register_list_write(struct file *f, i++; } while (len < size); + new = kmalloc_array(i, sizeof(uint32_t), GFP_KERNEL); + if (!new) { + ret = -ENOMEM; + goto error_free; + } ret = down_write_killable(>reset_domain->sem); if (ret) goto error_free; swap(adev->reset_dump_reg_list, tmp); + swap(adev->reset_dump_reg_value, new); adev->num_regs = i; up_write(>reset_domain->sem); ret = size; error_free: kfree(tmp); + kfree(new); return ret; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 4daa0e893965..89c6db03e84b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4720,15 +4720,15 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev, static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev) { - uint32_t reg_value; int i; lockdep_assert_held(>reset_domain->sem); dump_stack(); for (i = 0; i < adev->num_regs; i++) { - reg_value = RREG32(adev->reset_dump_reg_list[i]); - trace_amdgpu_reset_reg_dumps(adev->reset_dump_reg_list[i], reg_value); + adev->reset_dump_reg_value[i] = RREG32(adev->reset_dump_reg_list[i]); + trace_amdgpu_reset_reg_dumps(adev->reset_dump_reg_list[i], +adev->reset_dump_reg_value[i]); } return 0; -- 2.32.0
RE: [PATCH] drm/amdgpu: fix up comment in amdgpu_device_asic_has_dc_support()
[AMD Official Use Only - General] Reviewed-by: Evan Quan > -Original Message- > From: amd-gfx On Behalf Of Alex > Deucher > Sent: Wednesday, May 25, 2022 10:09 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: [PATCH] drm/amdgpu: fix up comment in > amdgpu_device_asic_has_dc_support() > > LVDS support was implemented in DC a while ago. Just DAC support is left to > do. > > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index dab0c59bfb1b..fa26e462e750 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3286,7 +3286,7 @@ bool amdgpu_device_asic_has_dc_support(enum > amd_asic_type asic_type) > case CHIP_MULLINS: > /* >* We have systems in the wild with these ASICs that require > - * LVDS and VGA support which is not supported with DC. > + * VGA support which is not supported with DC. >* >* Fallback to the non-DC driver here by default so as not to >* cause regressions. > -- > 2.35.3
RE: [PATCH] drm/amdgpu/soc21: add mode2 asic reset for SMU IP v13.0.4
[AMD Official Use Only - General] Acked-by: Evan Quan > -Original Message- > From: amd-gfx On Behalf Of Alex > Deucher > Sent: Friday, May 27, 2022 1:58 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Huang, Tim > > Subject: [PATCH] drm/amdgpu/soc21: add mode2 asic reset for SMU IP > v13.0.4 > > Set the default reset method to mode2 for SMU IP v13.0.4 > > Signed-off-by: Tim Huang > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/soc21.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c > b/drivers/gpu/drm/amd/amdgpu/soc21.c > index 9e18a2b22607..a400f5273343 100644 > --- a/drivers/gpu/drm/amd/amdgpu/soc21.c > +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c > @@ -310,6 +310,7 @@ static enum amd_reset_method > soc21_asic_reset_method(struct amdgpu_device *adev) { > if (amdgpu_reset_method == AMD_RESET_METHOD_MODE1 || > + amdgpu_reset_method == AMD_RESET_METHOD_MODE2 || > amdgpu_reset_method == AMD_RESET_METHOD_BACO) > return amdgpu_reset_method; > > @@ -320,6 +321,8 @@ soc21_asic_reset_method(struct amdgpu_device > *adev) > switch (adev->ip_versions[MP1_HWIP][0]) { > case IP_VERSION(13, 0, 0): > return AMD_RESET_METHOD_MODE1; > + case IP_VERSION(13, 0, 4): > + return AMD_RESET_METHOD_MODE2; > default: > if (amdgpu_dpm_is_baco_supported(adev)) > return AMD_RESET_METHOD_BACO; > @@ -341,6 +344,10 @@ static int soc21_asic_reset(struct amdgpu_device > *adev) > dev_info(adev->dev, "BACO reset\n"); > ret = amdgpu_dpm_baco_reset(adev); > break; > + case AMD_RESET_METHOD_MODE2: > + dev_info(adev->dev, "MODE2 reset\n"); > + ret = amdgpu_dpm_mode2_reset(adev); > + break; > default: > dev_info(adev->dev, "MODE1 reset\n"); > ret = amdgpu_device_mode1_reset(adev); > -- > 2.35.3
Re: [PATCH 1/1] drm/amdgpu: Update PDEs flush TLB if PDEs is moved
Am 02.06.22 um 01:19 schrieb Felix Kuehling: Am 2022-06-01 um 19:12 schrieb Philip Yang: Update PDEs, PTEs don't need flush TLB after updating mapping, this will remove the unnecessary TLB flush to reduce map to GPUs time. This description is unclear. I think what this change does is, flush TLBs when existing PDEs are updated (because a PTB or PDB moved). But it avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Yes, agree. Additional to that (I know reviewing my own suggestion) setting a local variable and then incrementing the atomic only once might be better because atomics are barriers on x86. Regards, Christian. Regards, Felix Suggested-by: Christian König Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9596c22fded6..8cdfd09fd70d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -755,6 +755,10 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, goto error; list_for_each_entry(entry, >relocated, vm_status) { + /* vm_flush_needed after updating moved PDEs */ + if (entry->moved) + atomic64_inc(>tlb_seq); + r = amdgpu_vm_pde_update(, entry); if (r) goto error; @@ -764,9 +768,6 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, if (r) goto error; - /* vm_flush_needed after updating PDEs */ - atomic64_inc(>tlb_seq); - while (!list_empty(>relocated)) { entry = list_first_entry(>relocated, struct amdgpu_vm_bo_base,
Re: Explicit VM updates
Am 01.06.22 um 19:47 schrieb Bas Nieuwenhuizen: On Wed, Jun 1, 2022 at 2:40 PM Christian König wrote: Hey guys, so today Bas came up with a new requirement regarding the explicit synchronization to VM updates and a bunch of prototype patches. I've been thinking about that stuff for quite some time before, but to be honest it's one of the most trickiest parts of the driver. So my current thinking is that we could potentially handle those requirements like this: 1. We add some new EXPLICIT flag to context (or CS?) and VM IOCTL. This way we either get the new behavior for the whole CS+VM or the old one, but never both mixed. Any VM wide flag would be a concern because libdrm_amdgpu merges things into a single drm fd for potentially multiple driver (i.e. radv + amdvlk + radeonsi). I'm also not sure what you'd need a VM flag for? What I meant was a flag for the VM IOCTL. E.g. you specify on the IOCTL if you want IMPLICIT or EXPLICIT synchronization. Maybe just specifying a drm_syncobj could be used as indicator that we want explicit synchronization for a VM IOCTL operation as well. That's what I used in the prototype, but I'm not 100% sure if that's sufficient. Regards, Christian. 2. When memory is unmapped we keep around the last unmap operation inside the bo_va. 3. When memory is freed we add all the CS fences which could access that memory + the last unmap operation as BOOKKEEP fences to the BO and as mandatory sync fence to the VM. Memory freed either because of an eviction or because of userspace closing the handle will be seen as a combination of unmap+free. The result is the following semantic for userspace to avoid implicit synchronization as much as possible: 1. When you allocate and map memory it is mandatory to either wait for the mapping operation to complete or to add it as dependency for your CS. If this isn't followed the application will run into CS faults (that's what we pretty much already implemented). 2. When memory is freed you must unmap that memory first and then wait for this unmap operation to complete before freeing the memory. If this isn't followed the kernel will add a forcefully wait to the next CS to block until the unmap is completed. 3. All VM operations requested by userspace will still be executed in order, e.g. we can't run unmap + map in parallel or something like this. Is that something you guys can live with? As far as I can see it should give you the maximum freedom possible, but is still doable. Regards, Christian.
Re: Explicit VM updates
Am 01.06.22 um 19:39 schrieb Felix Kuehling: Am 2022-06-01 um 13:22 schrieb Christian König: Am 01.06.22 um 19:07 schrieb Felix Kuehling: Am 2022-06-01 um 12:29 schrieb Christian König: Am 01.06.22 um 17:05 schrieb Felix Kuehling: Am 2022-06-01 um 08:40 schrieb Christian König: Hey guys, so today Bas came up with a new requirement regarding the explicit synchronization to VM updates and a bunch of prototype patches. I've been thinking about that stuff for quite some time before, but to be honest it's one of the most trickiest parts of the driver. So my current thinking is that we could potentially handle those requirements like this: 1. We add some new EXPLICIT flag to context (or CS?) and VM IOCTL. This way we either get the new behavior for the whole CS+VM or the old one, but never both mixed. 2. When memory is unmapped we keep around the last unmap operation inside the bo_va. 3. When memory is freed we add all the CS fences which could access that memory + the last unmap operation as BOOKKEEP fences to the BO and as mandatory sync fence to the VM. Memory freed either because of an eviction or because of userspace closing the handle will be seen as a combination of unmap+free. The result is the following semantic for userspace to avoid implicit synchronization as much as possible: 1. When you allocate and map memory it is mandatory to either wait for the mapping operation to complete or to add it as dependency for your CS. If this isn't followed the application will run into CS faults (that's what we pretty much already implemented). This makes sense. 2. When memory is freed you must unmap that memory first and then wait for this unmap operation to complete before freeing the memory. If this isn't followed the kernel will add a forcefully wait to the next CS to block until the unmap is completed. This would work for now, but it won't work for user mode submission in the future. I find it weird that user mode needs to wait for the unmap. For user mode, unmap and free should always be asynchronous. I can't think of any good reasons to make user mode wait for the driver to clean up its stuff. Could the waiting be done in kernel mode instead? TTM already does delayed freeing if there are fences outstanding on a BO being freed. This should make it easy to delay freeing until the unmap is done without blocking the user mode thread. This is not about blocking, but synchronization dependencies. Then I must have misunderstood this sentence: "When memory is freed you must unmap that memory first and then wait for this unmap operation to complete before freeing the memory." If the pronoun "you" is the user mode driver, it means user mode must wait for kernel mode to finish unmapping memory before freeing it. Was that not what you meant? Ah, yes. The UMD must wait for the kernel to finish unmapping all the maps from the BO before it drops the handle of the BO and with that frees it. In other words the free is not waiting for the unmap to complete, but causes command submissions through the kernel to depend on the unmap. I guess I don't understand that dependency. The next command submission obviously cannot use the memory that was unmapped. But why does it need to synchronize with the unmap operation? Because of the necessary TLB flush, only after that one is executed we can be sure that nobody has access to the memory any more and actually free it. So freeing the memory has to wait for the TLB flush. Why does the next command submission need to wait? Because that's the one triggering the TLB flush. The issue is that flushing the TLB while the VMID is in use is really unreliable on most hardware generations. User mode submissions are completely unrelated to that. I mention user mode command submission because there is no way to enforce the synchronization you describe here on a user mode queue. So this approach is not very future proof. With user mode queues you need to wait for the work on the queue to finish anyway or otherwise you run into VM faults if you just unmap or free the memory. If the next command submission doesn't use the unmapped/freed memory, why does it need to wait for the TLB flush? Because it could potentially use it. When userspace lies to the kernel and still accesses the mapping we would allow access to freed up memory and create a major security problem. If it is using the unmapped/freed memory, that's a user mode bug. But waiting for the TLB flush won't fix that. It will only turn a likely VM fault into a certain VM fault. Yeah, exactly that's the intention here. The guarantee you need to give is, that the memory is not freed and reused by anyone else until the TLB flush is done. This dependency requires synchronization of the "free" operation with the TLB flush. It does not require synchronization with any future command submissions in the context that
RE: [PATCH] drm/amdgpu: suppress compile warnings
[AMD Official Use Only - General] > -Original Message- > From: Christian König > Sent: Monday, May 30, 2022 3:12 PM > To: Quan, Evan ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH] drm/amdgpu: suppress compile warnings > > Am 30.05.22 um 08:50 schrieb Evan Quan: > > Suppress the compile warnings below: > > > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_7_ppt.c:1 > 407:12: warning: > > stack frame size (1040) exceeds limit (1024) in > > 'smu_v13_0_7_get_power_profile_mode' [-Wframe-larger-than] > > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1986:6: warning: > > no previous prototype for function 'gfx_v11_0_rlc_stop' > > [-Wmissing-prototypes] > > > > Signed-off-by: Evan Quan > > Change-Id: I679436c91cb98afb9fcbef8942fd90a17e5234b5 > > --- > > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +- > > drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 12 > ++-- > > 2 files changed, 11 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > > index 8c0a3fc7aaa6..cb581cfc7464 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c > > @@ -1983,7 +1983,7 @@ static int gfx_v11_0_init_csb(struct > amdgpu_device *adev) > > return 0; > > } > > > > -void gfx_v11_0_rlc_stop(struct amdgpu_device *adev) > > +static void gfx_v11_0_rlc_stop(struct amdgpu_device *adev) > > { > > u32 tmp = RREG32_SOC15(GC, 0, regRLC_CNTL); > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c > > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c > > index 4e1861fb2c6a..8fd7652ae883 100644 > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c > > @@ -1406,7 +1406,7 @@ static int smu_v13_0_7_get_power_limit(struct > > smu_context *smu, > > > > static int smu_v13_0_7_get_power_profile_mode(struct smu_context > *smu, char *buf) > > { > > - DpmActivityMonitorCoeffIntExternal_t > activity_monitor_external[PP_SMC_POWER_PROFILE_COUNT]; > > + DpmActivityMonitorCoeffIntExternal_t *activity_monitor_external; > > uint32_t i, j, size = 0; > > int16_t workload_type = 0; > > int result = 0; > > @@ -1414,6 +1414,11 @@ static int > smu_v13_0_7_get_power_profile_mode(struct smu_context *smu, char > *buf > > if (!buf) > > return -EINVAL; > > > > + activity_monitor_external = > kzalloc(sizeof(DpmActivityMonitorCoeffIntExternal_t) * > PP_SMC_POWER_PROFILE_COUNT, > > + GFP_KERNEL); > > The static checkers will warn about that as well. Please use > kcalloc(sizeof(...), > count, GFP_KERNEL); [Quan, Evan] I actually do not know that.. The warning is about the stack frame size (1040). The space allocated by kzalloc/kmalloc is not on the stack. Per my knowledge, it should not have such limitation. Did I miss something? BR Evan > > Apart from that looks good to me. > > Regards, > Christian. > > > + if (!activity_monitor_external) > > + return -ENOMEM; > > + > > size += sysfs_emit_at(buf, size, " "); > > for (i = 0; i <= PP_SMC_POWER_PROFILE_WINDOW3D; i++) > > size += sysfs_emit_at(buf, size, "%-14s%s", > > amdgpu_pp_profile_name[i], @@ -1426,14 +1431,17 @@ static int > smu_v13_0_7_get_power_profile_mode(struct smu_context *smu, char > *buf > > workload_type = smu_cmn_to_asic_specific_index(smu, > > > CMN2ASIC_MAPPING_WORKLOAD, > >i); > > - if (workload_type < 0) > > + if (workload_type < 0) { > > + kfree(activity_monitor_external); > > return -EINVAL; > > + } > > > > result = smu_cmn_update_table(smu, > > > SMU_TABLE_ACTIVITY_MONITOR_COEFF, workload_type, > > (void > *)(_monitor_external[i]), false); > > if (result) { > > dev_err(smu->adev->dev, "[%s] Failed to get activity > monitor!", > > __func__); > > + kfree(activity_monitor_external); > > return result; > > } > > }
Re: [kbuild] drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 amdgpu_discovery_get_vcn_info() error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
Dan: I also ran Smatch which resulted in the following discussion: https://lists.freedesktop.org/archives/amd-gfx/2022-May/079228.html Regards Den ons 1 juni 2022 kl 20:44 skrev Alex Deucher : > On Fri, May 27, 2022 at 3:46 AM Dan Carpenter > wrote: > > > > [ kbuild bot sent this warning on May 4 but I never heard back and it's > > May 27 now so sending a duplicate warning is probably for the best. > -dan] > > > > tree: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master > > head: 7e284070abe53d448517b80493863595af4ab5f0 > > commit: 622469c87fc3e6c90a980be3e2287d82bd55c977 drm/amdgpu/discovery: > add a function to parse the vcn info table > > config: arc-randconfig-m031-20220524 ( > https://download.01.org/0day-ci/archive/20220527/202205271546.ov14n2r8-...@intel.com/config > ) > > compiler: arceb-elf-gcc (GCC) 11.3.0 > > > > If you fix the issue, kindly add following tag where applicable > > Reported-by: kernel test robot > > Reported-by: Dan Carpenter > > > > smatch warnings: > > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1433 > amdgpu_discovery_get_vcn_info() error: buffer overflow > 'adev->vcn.vcn_codec_disable_mask' 2 <= 3 > > > > vim +1433 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > > > 622469c87fc3e6 Alex Deucher 2022-03-30 1403 int > amdgpu_discovery_get_vcn_info(struct amdgpu_device *adev) > > 622469c87fc3e6 Alex Deucher 2022-03-30 1404 { > > 622469c87fc3e6 Alex Deucher 2022-03-30 1405struct binary_header > *bhdr; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1406union vcn_info *vcn_info; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1407u16 offset; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1408int v; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1409 > > 622469c87fc3e6 Alex Deucher 2022-03-30 1410if > (!adev->mman.discovery_bin) { > > 622469c87fc3e6 Alex Deucher 2022-03-30 1411DRM_ERROR("ip > discovery uninitialized\n"); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1412return -EINVAL; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1413} > > 622469c87fc3e6 Alex Deucher 2022-03-30 1414 > > 622469c87fc3e6 Alex Deucher 2022-03-30 1415if > (adev->vcn.num_vcn_inst > VCN_INFO_TABLE_MAX_NUM_INSTANCES) { > > > > Capped to 4 > > > > 622469c87fc3e6 Alex Deucher 2022-03-30 1416 > dev_err(adev->dev, "invalid vcn instances\n"); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1417return -EINVAL; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1418} > > 622469c87fc3e6 Alex Deucher 2022-03-30 1419 > > 622469c87fc3e6 Alex Deucher 2022-03-30 1420bhdr = (struct > binary_header *)adev->mman.discovery_bin; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1421offset = > le16_to_cpu(bhdr->table_list[VCN_INFO].offset); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1422 > > 622469c87fc3e6 Alex Deucher 2022-03-30 1423if (!offset) { > > 622469c87fc3e6 Alex Deucher 2022-03-30 1424 > dev_err(adev->dev, "invalid vcn table offset\n"); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1425return -EINVAL; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1426} > > 622469c87fc3e6 Alex Deucher 2022-03-30 1427 > > 622469c87fc3e6 Alex Deucher 2022-03-30 1428vcn_info = (union > vcn_info *)(adev->mman.discovery_bin + offset); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1429 > > 622469c87fc3e6 Alex Deucher 2022-03-30 1430switch > (le16_to_cpu(vcn_info->v1.header.version_major)) { > > 622469c87fc3e6 Alex Deucher 2022-03-30 1431case 1: > > 622469c87fc3e6 Alex Deucher 2022-03-30 1432for (v = 0; v < > adev->vcn.num_vcn_inst; v++) { > > 622469c87fc3e6 Alex Deucher 2022-03-30 @1433 > adev->vcn.vcn_codec_disable_mask[v] = > > > > But this array doesn't have 4 elements > > Correct, but num_vcn_inst can't be larger than > AMDGPU_MAX_VCN_INSTANCES (2) at the moment thanks to: > https://patchwork.freedesktop.org/patch/486289/ > > Alex > > > > > 622469c87fc3e6 Alex Deucher 2022-03-30 1434 > le32_to_cpu(vcn_info->v1.instance_info[v].fuse_data.all_bits); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1435} > > 622469c87fc3e6 Alex Deucher 2022-03-30 1436break; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1437default: > > 622469c87fc3e6 Alex Deucher 2022-03-30 1438 > dev_err(adev->dev, > > 622469c87fc3e6 Alex Deucher 2022-03-30 1439 > "Unhandled VCN info table %d.%d\n", > > 622469c87fc3e6 Alex Deucher 2022-03-30 1440 > le16_to_cpu(vcn_info->v1.header.version_major), > > 622469c87fc3e6 Alex Deucher 2022-03-30 1441 > le16_to_cpu(vcn_info->v1.header.version_minor)); > > 622469c87fc3e6 Alex Deucher 2022-03-30 1442return -EINVAL; > > 622469c87fc3e6 Alex Deucher 2022-03-30 1443} > > 622469c87fc3e6 Alex Deucher 2022-03-30 1444return 0; > > f39f5bb1c9d68d Xiaojie Yuan 2019-06-20 1445 } > > > > -- > > 0-DAY CI Kernel Test Service > > https://01.org/lkp > > ___ > > kbuild mailing list -- kbu...@lists.01.org > > To