Re: [RFC 0/2] Add generic FPU api similar to x86
While you already CCed a bunch of people, stuff like that needs to go to the appropriate mailing list and not just amd-gfx. Especially LKML, so that other core devs can take a look as well. Regards, Christian. On 19.07.21 at 21:52, Anson Jacob wrote: This is an attempt to have generic FPU enable/disable calls similar to x86, so that we can simplify gpu/drm/amd/display/dc/os_types.h. It also adds the FPU correctness logic seen on x86. Anson Jacob (2): ppc/fpu: Add generic FPU api similar to x86 drm/amd/display: Use PPC FPU functions arch/powerpc/include/asm/switch_to.h | 29 ++--- arch/powerpc/kernel/process.c | 130 ++ drivers/gpu/drm/amd/display/dc/os_types.h | 28 + 3 files changed, 139 insertions(+), 48 deletions(-) ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH v3 3/4] drm/amd/display: Add control mechanism for FPU utilization
DC invokes DC_FPU_START/END in multiple parts of the code; this can create a situation where we invoke this FPU operation in a nested way or exit too early. For avoiding this situation, this commit adds a mechanism where dc_fpu_begin/end manages the access to kernel_fpu_begin/end. Change since V2: - Christian: Do not use this_cpu_* between get/put_cpu_ptr(). Change since V1: - Use a better variable names - Use get_cpu_ptr and put_cpu_ptr to better balance preemption enable and disable Cc: Harry Wentland Cc: Anson Jacob Cc: Christian König Cc: Hersen Wu Cc: Aric Cyr Signed-off-by: Rodrigo Siqueira --- .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 13 --- .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 36 --- drivers/gpu/drm/amd/display/dc/dc_trace.h | 4 +-- 3 files changed, 42 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h index 230bb12c405e..fdcaea22b456 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h @@ -638,23 +638,26 @@ TRACE_EVENT(amdgpu_refresh_rate_track, ); TRACE_EVENT(dcn_fpu, - TP_PROTO(bool begin, const char *function, const int line), - TP_ARGS(begin, function, line), + TP_PROTO(bool begin, const char *function, const int line, const int recursion_depth), + TP_ARGS(begin, function, line, recursion_depth), TP_STRUCT__entry( __field(bool, begin) __field(const char *, function) __field(int, line) +__field(int, recursion_depth) ), TP_fast_assign( __entry->begin = begin; __entry->function = function; __entry->line = line; + __entry->recursion_depth = recursion_depth; ), - TP_printk("%s()+%d: %s", + TP_printk("%s: recursion_depth: %d: %s()+%d:", + __entry->begin ? "begin" : "end", + __entry->recursion_depth, __entry->function, - __entry->line, - __entry->begin ? 
"begin" : "end" + __entry->line ) ); diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c index d5d156a4517e..d0d3e8a34db5 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c @@ -28,6 +28,19 @@ #include +/** + * DOC: DC FPU manipulation overview + * + * DC core uses FPU operations in multiple parts of the code, which requires a + * more specialized way to manage these areas' entrance. To fulfill this + * requirement, we created some wrapper functions that encapsulate + * kernel_fpu_begin/end to better fit our need in the display component. In + * summary, in this file, you can find functions related to FPU operation + * management. + */ + +static DEFINE_PER_CPU(int, fpu_recursion_depth); + /** * dc_fpu_begin - Enables FPU protection * @function_name: A string containing the function name for debug purposes @@ -43,8 +56,16 @@ */ void dc_fpu_begin(const char *function_name, const int line) { - TRACE_DCN_FPU(true, function_name, line); - kernel_fpu_begin(); + int *pcpu; + + pcpu = get_cpu_ptr(&fpu_recursion_depth); + *pcpu += 1; + + if (*pcpu == 1) + kernel_fpu_begin(); + + TRACE_DCN_FPU(true, function_name, line, *pcpu); + put_cpu_ptr(&fpu_recursion_depth); } /** @@ -59,6 +80,13 @@ void dc_fpu_begin(const char *function_name, const int line) */ void dc_fpu_end(const char *function_name, const int line) { - TRACE_DCN_FPU(false, function_name, line); - kernel_fpu_end(); + int *pcpu; + + pcpu = get_cpu_ptr(&fpu_recursion_depth); + *pcpu -= 1; + if (*pcpu <= 0) + kernel_fpu_end(); + + TRACE_DCN_FPU(false, function_name, line, *pcpu); + put_cpu_ptr(&fpu_recursion_depth); } diff --git a/drivers/gpu/drm/amd/display/dc/dc_trace.h b/drivers/gpu/drm/amd/display/dc/dc_trace.h index d598ba697e45..c711797e5c9e 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_trace.h +++ b/drivers/gpu/drm/amd/display/dc/dc_trace.h @@ -38,5 +38,5 @@ #define 
TRACE_DCN_CLOCK_STATE(dcn_clocks) \ trace_amdgpu_dm_dc_clocks_state(dcn_clocks) -#define TRACE_DCN_FPU(begin, function, line) \ - trace_dcn_fpu(begin, function, line) +#define TRACE_DCN_FPU(begin, function, line, ref_count) \ + trace_dcn_fpu(begin, function, line, ref_count) -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
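The nesting rule that dc_fpu_begin/end enforce in the patch above can be modelled outside the kernel as a plain reference counter. The sketch below is a hypothetical single-threaded userspace model, not the driver code: fake_kernel_fpu_begin/end and a plain int stand in for kernel_fpu_begin/end and the per-CPU fpu_recursion_depth counter (which the real code accesses through get_cpu_ptr/put_cpu_ptr to keep preemption disabled).

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel_fpu_begin()/kernel_fpu_end(). */
static int hw_fpu_enabled;

static void fake_kernel_fpu_begin(void) { hw_fpu_enabled = 1; }
static void fake_kernel_fpu_end(void)   { hw_fpu_enabled = 0; }

/* Models the per-CPU fpu_recursion_depth; single-threaded here. */
static int fpu_recursion_depth;

void dc_fpu_begin_model(void)
{
	fpu_recursion_depth += 1;
	if (fpu_recursion_depth == 1)
		fake_kernel_fpu_begin(); /* only the outermost begin touches HW */
}

void dc_fpu_end_model(void)
{
	fpu_recursion_depth -= 1;
	if (fpu_recursion_depth <= 0)
		fake_kernel_fpu_end();   /* only the outermost end releases HW */
}
```

Nested DC_FP_START/END pairs therefore collapse into a single kernel_fpu_begin/end pair, which is exactly what avoids the "enter FPU mode multiple times or exit too early" situation the commit message describes.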
[PATCH v3 1/4] drm/amd/display: Introduce FPU directory inside DC
The display core files rely on FPU, which requires them to be compiled with special flags. Ideally, we don't want these FPU operations spread around the DC code; nevertheless, it happens in the current source. This commit introduces a new directory named fpu_operations that intends to centralize all files that require the FPU compilation flag. As part of this new component, this patch also moves one of the functions (dcn20_populate_dml_writeback_from_context) that requires FPU access to a single shared file. Notice that this is the first part of the work, and it does not fix the FPU issue yet; we still need other patches to achieve complete FPU isolation. Changes since V2: - Christian: Remove unnecessary wrapper. - lkp: Add missing prototype. - Only compile the FPU operations if the DCN option is enabled. Changes since V1: - Update documentation and rebase. Cc: Harry Wentland Cc: Anson Jacob Cc: Christian König Cc: Hersen Wu Cc: Aric Cyr Reported-by: kernel test robot Signed-off-by: Rodrigo Siqueira --- drivers/gpu/drm/amd/display/dc/Makefile | 1 + .../drm/amd/display/dc/dcn20/dcn20_resource.c | 39 + .../drm/amd/display/dc/dcn20/dcn20_resource.h | 2 - .../drm/amd/display/dc/dcn21/dcn21_resource.c | 2 + .../amd/display/dc/fpu_operations/Makefile| 58 + .../drm/amd/display/dc/fpu_operations/dcn2x.c | 84 +++ .../drm/amd/display/dc/fpu_operations/dcn2x.h | 33 7 files changed, 180 insertions(+), 39 deletions(-) create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/Makefile create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.h diff --git a/drivers/gpu/drm/amd/display/dc/Makefile b/drivers/gpu/drm/amd/display/dc/Makefile index 943fcb164876..0de4baefa91e 100644 --- a/drivers/gpu/drm/amd/display/dc/Makefile +++ b/drivers/gpu/drm/amd/display/dc/Makefile @@ -35,6 +35,7 @@ DC_LIBS += dcn301 DC_LIBS += dcn302 DC_LIBS += dcn303 DC_LIBS += dcn31 +DC_LIBS += fpu_operations endif
DC_LIBS += dce120 diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index 1b05a37b674d..f99b09643a52 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -35,6 +35,8 @@ #include "include/irq_service_interface.h" #include "dcn20/dcn20_resource.h" +#include "fpu_operations/dcn2x.h" + #include "dcn10/dcn10_hubp.h" #include "dcn10/dcn10_ipp.h" #include "dcn20_hubbub.h" @@ -1974,43 +1976,6 @@ void dcn20_split_stream_for_mpc( ASSERT(primary_pipe->plane_state); } -void dcn20_populate_dml_writeback_from_context( - struct dc *dc, struct resource_context *res_ctx, display_e2e_pipe_params_st *pipes) -{ - int pipe_cnt, i; - - for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) { - struct dc_writeback_info *wb_info = &res_ctx->pipe_ctx[i].stream->writeback_info[0]; - - if (!res_ctx->pipe_ctx[i].stream) - continue; - - /* Set writeback information */ - pipes[pipe_cnt].dout.wb_enable = (wb_info->wb_enabled == true) ? 
1 : 0; - pipes[pipe_cnt].dout.num_active_wb++; - pipes[pipe_cnt].dout.wb.wb_src_height = wb_info->dwb_params.cnv_params.crop_height; - pipes[pipe_cnt].dout.wb.wb_src_width = wb_info->dwb_params.cnv_params.crop_width; - pipes[pipe_cnt].dout.wb.wb_dst_width = wb_info->dwb_params.dest_width; - pipes[pipe_cnt].dout.wb.wb_dst_height = wb_info->dwb_params.dest_height; - pipes[pipe_cnt].dout.wb.wb_htaps_luma = 1; - pipes[pipe_cnt].dout.wb.wb_vtaps_luma = 1; - pipes[pipe_cnt].dout.wb.wb_htaps_chroma = wb_info->dwb_params.scaler_taps.h_taps_c; - pipes[pipe_cnt].dout.wb.wb_vtaps_chroma = wb_info->dwb_params.scaler_taps.v_taps_c; - pipes[pipe_cnt].dout.wb.wb_hratio = 1.0; - pipes[pipe_cnt].dout.wb.wb_vratio = 1.0; - if (wb_info->dwb_params.out_format == dwb_scaler_mode_yuv420) { - if (wb_info->dwb_params.output_depth == DWB_OUTPUT_PIXEL_DEPTH_8BPC) - pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_420_8; - else - pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_420_10; - } else - pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_444_32; - - pipe_cnt++; - } - -} - int dcn20_populate_dml_pipes_from_context( struct dc *dc, struct dc_state *context, diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h index c8f3127bbcdf..6ec8ff45f0f7 100644 --- a/drivers/gpu/drm/amd/display/dc/
[PATCH v3 2/4] drm/amd/display: Add FPU event trace
We don't have any mechanism for tracing FPU operations inside the display core, making the debug work a little bit tricky. This commit introduces a trace mechanism inside our DC_FP_START/END macros for trying to alleviate this problem. Changes since V2: - Make sure that we compile FPU operation only in the context that DCN is enabled. Cc: Harry Wentland Cc: Anson Jacob Cc: Christian König Cc: Hersen Wu Cc: Aric Cyr Signed-off-by: Rodrigo Siqueira --- .../gpu/drm/amd/display/amdgpu_dm/Makefile| 4 ++ .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 21 ++ .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 64 +++ .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h| 33 ++ drivers/gpu/drm/amd/display/dc/dc_trace.h | 3 + drivers/gpu/drm/amd/display/dc/os_types.h | 6 +- 6 files changed, 128 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile index 91fb72c96545..718e123a3230 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile @@ -27,6 +27,10 @@ AMDGPUDM = amdgpu_dm.o amdgpu_dm_irq.o amdgpu_dm_mst_types.o amdgpu_dm_color.o +ifdef CONFIG_DRM_AMD_DC_DCN +AMDGPUDM += dc_fpu.o +endif + ifneq ($(CONFIG_DRM_AMD_DC),) AMDGPUDM += amdgpu_dm_services.o amdgpu_dm_helpers.o amdgpu_dm_pp_smu.o amdgpu_dm_psr.o endif diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h index 46a33f64cf8e..230bb12c405e 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h @@ -637,6 +637,27 @@ TRACE_EVENT(amdgpu_refresh_rate_track, __entry->refresh_rate_ns) ); +TRACE_EVENT(dcn_fpu, + TP_PROTO(bool begin, const char *function, const int line), + TP_ARGS(begin, function, line), + + TP_STRUCT__entry( 
+__field(bool, begin) +__field(const char *, function) +__field(int, line) + ), + TP_fast_assign( + __entry->begin = begin; + __entry->function = function; + __entry->line = line; + ), + TP_printk("%s()+%d: %s", + __entry->function, + __entry->line, + __entry->begin ? "begin" : "end" + ) +); + #endif /* _AMDGPU_DM_TRACE_H_ */ #undef TRACE_INCLUDE_PATH diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c new file mode 100644 index ..d5d156a4517e --- /dev/null +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Authors: AMD + * + */ + +#include "dc_trace.h" + +#include + +/** + * dc_fpu_begin - Enables FPU protection + * @function_name: A string containing the function name for debug purposes + * (usually __func__) + * + * @line: A line number where DC_FP_START was invoked for debug purpose + * (usually __LINE__) + * + * This function is responsible for managing the use of kernel_fpu_begin() with + * the advantage of providing an event trace for debugging. + * + * Note: Do not call this function directly; always use DC_FP_START(). + */ +void dc_fpu_begin(const char *function_name, const int line) +{ + TRACE_DCN_FPU(true, function_name, line); + kernel_fpu_begin(); +} + +/** + * dc_fpu_end - Disable FPU protection + * @function_name: A string containing the function name for debug pur
[PATCH v3 4/4] drm/amd/display: Add DC_FP helper to check FPU state
To fully isolate FPU operations in a single place, we must avoid situations where compilers spill FP values to registers due to FP enable in a specific C file. Note that even if we isolate all FPU functions in a single file and call its interface from other files, the compiler might enable the use of FPU before we call DC_FP_START. Nevertheless, it is the programmer's responsibility to invoke DC_FP_START/END in the correct place. To highlight situations where developers forgot to use the FP protection before calling the DC FPU interface functions, we introduce a helper that checks if the function is invoked under FP protection. If not, it will trigger a kernel warning. Changes since V2 (Christian): - Do not use this_cpu_* between get/put_cpu_ptr(). - In the kernel documentation, better describe restrictions. - Make dc_assert_fp_enabled trigger the ASSERT message. Changes since V1: - Remove fp_enable variables - Rename dc_is_fp_enabled to dc_assert_fp_enabled - Replace wrong variable type Cc: Harry Wentland Cc: Anson Jacob Cc: Christian König Cc: Hersen Wu Cc: Aric Cyr Signed-off-by: Rodrigo Siqueira --- .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 19 +++ .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h| 1 + .../drm/amd/display/dc/dcn20/dcn20_resource.c | 2 ++ .../drm/amd/display/dc/fpu_operations/dcn2x.c | 18 ++ 4 files changed, 40 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c index d0d3e8a34db5..107e1c50576e 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c @@ -41,6 +41,25 @@ static DEFINE_PER_CPU(int, fpu_recursion_depth); +/** + * dc_assert_fp_enabled - Check if FPU protection is enabled + * + * This function tells if the code is already under FPU protection or not. A + * function that works as an API for a set of FPU operations can use this + * function for checking if the caller invoked it after DC_FP_START().
For + * example, take a look at dcn2x.c file. + */ +inline void dc_assert_fp_enabled(void) +{ + int *pcpu, depth = 0; + + pcpu = get_cpu_ptr(&fpu_recursion_depth); + depth = *pcpu; + put_cpu_ptr(&fpu_recursion_depth); + + ASSERT(depth > 1); +} + /** * dc_fpu_begin - Enables FPU protection * @function_name: A string containing the function name for debug purposes diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h index fb54983c5c60..b8275b397920 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h @@ -27,6 +27,7 @@ #ifndef __DC_FPU_H__ #define __DC_FPU_H__ +void dc_assert_fp_enabled(void); void dc_fpu_begin(const char *function_name, const int line); void dc_fpu_end(const char *function_name, const int line); diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index f99b09643a52..d0b34c7f99dc 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -2355,7 +2355,9 @@ int dcn20_populate_dml_pipes_from_context( } /* populate writeback information */ + DC_FP_START(); dc->res_pool->funcs->populate_dml_writeback_from_context(dc, res_ctx, pipes); + DC_FP_END(); return pipe_cnt; } diff --git a/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c b/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c index 95a0b89302a9..3ea90090264c 100644 --- a/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c +++ b/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c @@ -43,6 +43,22 @@ *that deals with FP register is contained within this call. * 3. All function that needs to be accessed outside this file requires a *public interface that not uses any FPU reference. + * 4. 
Developers **must not** use DC_FP_START/END in this file, but they need + *to ensure that the caller invokes it before access any function available + *in this file. For this reason, public functions in this file must invoke + *dc_assert_fp_enabled(); + * + * Let's expand a little bit more the idea in the code pattern. To fully + * isolate FPU operations in a single place, we must avoid situations where + * compilers spill FP values to registers due to FP enable in a specific C + * file. Note that even if we isolate all FPU functions in a single file and + * call its interface from other files, the compiler might enable the use of + * FPU before we call DC_FP_START. Nevertheless, it is the programmer's + * responsibility to invoke DC_FP_START/END in the correct place. To highlight + * situations where developers forgot to use the FP protection before calling + * the DC FPU interface functions, we introduce a helper that checks if the + * function is invoked under FP protecti
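The guard rule described above can be sketched as a tiny userspace model. This is a hypothetical illustration, not the driver code: fp_begin/fp_end stand in for DC_FP_START()/DC_FP_END(), and where the kernel helper reads the per-CPU depth via get_cpu_ptr() and reports a violation through ASSERT(), this sketch simply returns the verdict.

```c
#include <assert.h>

/* Single-threaded stand-in for the per-CPU fpu_recursion_depth. */
static int fpu_recursion_depth;

void fp_begin(void) { fpu_recursion_depth++; }
void fp_end(void)   { fpu_recursion_depth--; }

/* Model of dc_assert_fp_enabled(): a public function that performs FPU
 * work can call this to verify that its caller wrapped the call in a
 * DC_FP_START()/DC_FP_END() pair instead of assuming protection. */
int fp_protection_active(void)
{
	return fpu_recursion_depth > 0;
}
```

A function living in the fpu_operations directory would call this check at its entry point, so a forgotten DC_FP_START() in a caller becomes a loud warning instead of silent FPU-state corruption.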
[PATCH v3 0/4] Base changes for isolating FPU operation in a single place
Hi, In the display core, we utilize float and double units for calculating modesetting parameters. One side effect of our approach of using double precision is the fact that we spread multiple FPU accesses across our driver, which means that we can accidentally clobber user space FPU state. # Challenges 1. Keep in mind that this FPU code is ingrained in our display driver and performs several crucial tasks. Additionally, we already have multiple architectures available in the kernel and a large set of users; in other words, we prefer to avoid a radical approach that might break our users' systems. 2. We share our display code with other OSes; thus, we need to maintain interoperability between these systems. 3. We need a mechanism for identifying which functions use FPU registers; fortunately, Peter Zijlstra wrote a series a couple of months ago where he introduced an FPU check for objtool. I used the following command to identify potential FPU usage: ./tools/objtool/objtool check -Ffa "drivers/gpu/drm/amd/display/dc/ANY_FILE.o" 4. Since our code heavily relies on FPU and the fact that we spread kernel_fpu_begin/end across multiple functions, we can have some complex scenarios that will require code refactoring. However, we want to avoid complicated changes since this is a formula to introduce regressions; we want something that allows us to fix it in small, safe, and reliable steps. 5. Unfortunately, for legacy reasons, we have some problems in how we program our FPU access, which in some weird scenarios can generate situations where we try to enter fpu mode multiple times or exit too early. # Our approach To try to solve this problem, we came up with the following strategy: 1. Keep in mind that we are using kernel_fpu_begin/end spread in various areas and sometimes across multiple functions.
If we try to move some of the functions to an isolated place, we can generate a situation where we call the FPU protection more than once, causing multiple warnings. We can deal with this problem by adding a thin management layer around the kernel_fpu_begin/end used inside the display. 2. We will need a trace mechanism for this FPU management inside our display code. 3. After we get the thin layer that manages FPU, we can start to move each function that uses FPU to a centralized place. Our DQE runs multiple tests on different ASICs every week; we can take advantage of this to ensure that our FPU patches do not introduce any regression. The idea is to work on a specific part of the code every week (e.g., week 1: DCN2, week 2: DCN2.1, etc.). 4. Finally, after we isolate the FPU operations in a single place, we can altogether remove the FPU flags from other files and eliminate unnecessary code introduced to deal with this problem. We can also remove the thin layer added in step 3. # This series To maintain the interoperability between multiple OSes, we already have a define named DC_FP_START/END, which is a straightforward wrapper around kernel_fpu_begin/end on the Linux side. In this series, I decided to expand the scope of this DC_FP_* wrapper to trace FPU entrance and exit in the display code, but I also add a mechanism for managing the entrance and exit of kernel_fpu_begin/end. You can see the details of how I did that in the last two patches. I also isolate a simple function that requires FPU access to demonstrate my strategy for isolating this FPU access in a single place. If this series gets accepted, the following steps consist of moving all FPU functions weekly until we isolate everything in the fpu_operations folder. Changes since V2: - Make sure to compile FPU operations only when DCN is enabled (officially, we only enable it for x86). - Check cross-compilation with ARM and x86_32. Everything looks fine.
- Avoid calling this_cpu_* operations between get/put_cpu_ptr(). - Fix GCC warnings. - Update documentation. - Update our assert mechanism. - Remove unnecessary wrappers. Changes since V1: - Use better names for variables. - Update documentation. - Avoid preemption. * See update details per commit message Best Regards Rodrigo Siqueira Rodrigo Siqueira (4): drm/amd/display: Introduce FPU directory inside DC drm/amd/display: Add FPU event trace drm/amd/display: Add control mechanism for FPU utilization drm/amd/display: Add DC_FP helper to check FPU state .../gpu/drm/amd/display/amdgpu_dm/Makefile| 4 + .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 24 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 111 ++ .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h| 34 ++ drivers/gpu/drm/amd/display/dc/Makefile | 1 + drivers/gpu/drm/amd/display/dc/dc_trace.h | 3 + .../drm/amd/display/dc/dcn20/dcn20_resource.c | 41 +-- .../drm/amd/display/dc/dcn20/dcn20_resource.h | 2 - .../drm/amd/display/dc/dcn21/dcn21_resource.c | 2 + .../amd/display/dc/fpu_operations/Makefile| 58 + .../drm/amd/dis
[PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v5)
This fixes a bug where, if we probe a non-existing I2C device and the SMU returns 0xFF, we can never again communicate with the SMU, because the code before this patch read and interpreted 0xFF as a terminal error, and thus we never wrote 0 into register 90 to clear the status (and subsequently send a new command to the SMU). It is not an error that the SMU returns status 0xFF. This means that the SMU executed the last command successfully (execution status), but the command result is an error of some sort (execution result), depending on what the command was. When doing a status check of the SMU, before we send a new command, the only statuses which preclude us from sending a new command are 0 (the SMU hasn't finished executing a previous command) and 0xFC (the SMU is busy). This bug was seen as the following line in the kernel log, amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state! when subsequent SMU commands, not necessarily related to I2C, were sent to the SMU. This patch fixes this bug. v2: Add a comment to the description of __smu_cmn_poll_stat() to explain why we're NOT defining the SMU FW return codes as macros, but are instead hard-coding them. Such a change can be followed up by a subsequent patch. v3: The changes are, a) Add comments to break labels in __smu_cmn_reg2errno(). b) When an unknown/unspecified/undefined result is returned from the SMU, map that to -EREMOTEIO, to distinguish failure at the SMU FW. c) Add kernel-doc to smu_cmn_send_msg_without_waiting(), smu_cmn_wait_for_response(), smu_cmn_send_smc_msg_with_param(). d) In smu_cmn_send_smc_msg_with_param(), since we wait for completion of the command, if the result of the completion is undefined/unknown/unspecified, we print that to the kernel log. v4: a) Add macros as requested, though redundant, to be removed when the SMU consolidates for all ASICs (see comment in code). b) Get out if the SMU code is unknown. v5: Rename the macros.
Cc: Alex Deucher Cc: Evan Quan Cc: Lijo Lazar Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 288 + drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 3 +- 2 files changed, 244 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c index c902fdf322c1be..baf26d5bfb8ff4 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c @@ -55,7 +55,7 @@ #undef __SMU_DUMMY_MAP #define __SMU_DUMMY_MAP(type) #type -static const char* __smu_message_names[] = { +static const char * const __smu_message_names[] = { SMU_MESSAGE_TYPES }; @@ -76,55 +76,258 @@ static void smu_cmn_read_arg(struct smu_context *smu, *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82); } -int smu_cmn_wait_for_response(struct smu_context *smu) +/* Redefine the SMU error codes here. + * + * Note that these definitions are redundant and should be removed + * when the SMU has exported a unified header file containing these + * macros, which header file we can just include and use the SMU's + * macros. At the moment, these error codes are defined by the SMU + * per-ASIC unfortunately, yet we're a one driver for all ASICs. 
+ */ +#define SMU_RESP_NONE 0 +#define SMU_RESP_OK 1 +#define SMU_RESP_CMD_FAIL 0xFF +#define SMU_RESP_CMD_UNKNOWN0xFE +#define SMU_RESP_CMD_BAD_PREREQ 0xFD +#define SMU_RESP_BUSY_OTHER 0xFC +#define SMU_RESP_DEBUG_END 0xFB + +/** + * __smu_cmn_poll_stat -- poll for a status from the SMU + * smu: a pointer to SMU context + * + * Returns the status of the SMU, which could be, + *0, the SMU is busy with your previous command; + *1, execution status: success, execution result: success; + * 0xFF, execution status: success, execution result: failure; + * 0xFE, unknown command; + * 0xFD, valid command, but bad (command) prerequisites; + * 0xFC, the command was rejected as the SMU is busy; + * 0xFB, "SMC_Result_DebugDataDumpEnd". + * + * The values here are not defined by macros, because I'd rather we + * include a single header file which defines them, which is + * maintained by the SMU FW team, so that we're impervious to firmware + * changes. At the moment those values are defined in various header + * files, one for each ASIC, yet here we're a single ASIC-agnostic + * interface. Such a change can be followed-up by a subsequent patch. + */ +static u32 __smu_cmn_poll_stat(struct smu_context *smu) { struct amdgpu_device *adev = smu->adev; - uint32_t cur_value, i, timeout = adev->usec_timeout * 20; + int timeout = adev->usec_timeout * 20; + u32 reg; - for (i = 0; i < timeout; i++) { - cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90); - if ((cur_value & MP1_C
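The central idea of the fix, that only "still busy" states block a new command while 0xFF is a completed-but-failed command, can be sketched as a small classifier. This is a hypothetical condensed illustration, not the patch itself: the errno mapping below is an assumption for demonstration (the commit only states that unknown results map to -EREMOTEIO), and the helper names smu_resp_to_errno/smu_can_send are invented here.

```c
#include <assert.h>
#include <errno.h>

#define SMU_RESP_NONE           0x00 /* previous command still executing */
#define SMU_RESP_OK             0x01 /* executed, result: success */
#define SMU_RESP_CMD_FAIL       0xFF /* executed, result: failure */
#define SMU_RESP_CMD_UNKNOWN    0xFE /* unknown command */
#define SMU_RESP_CMD_BAD_PREREQ 0xFD /* valid command, bad prerequisites */
#define SMU_RESP_BUSY_OTHER     0xFC /* SMU busy, command rejected */
#define SMU_RESP_DEBUG_END      0xFB /* SMC_Result_DebugDataDumpEnd */

/* Hypothetical mapping of the response register to a negative errno;
 * the exact mapping in the patch may differ. */
int smu_resp_to_errno(unsigned int reg)
{
	switch (reg) {
	case SMU_RESP_OK:
		return 0;
	case SMU_RESP_NONE:
	case SMU_RESP_BUSY_OTHER:
		return -EBUSY;
	case SMU_RESP_CMD_FAIL:
		return -EIO;
	case SMU_RESP_CMD_UNKNOWN:
		return -EOPNOTSUPP;
	case SMU_RESP_CMD_BAD_PREREQ:
		return -EINVAL;
	default: /* unknown/unspecified result: failure at the SMU FW */
		return -EREMOTEIO;
	}
}

/* The key point of the fix: only the two busy states preclude sending
 * a new command; 0xFF must NOT be treated as a terminal error. */
int smu_can_send(unsigned int reg)
{
	return reg != SMU_RESP_NONE && reg != SMU_RESP_BUSY_OTHER;
}
```

With this classification, a 0xFF left over from probing a non-existing I2C device no longer wedges the driver: the pre-check passes, register 90 is cleared, and the next command goes out.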
[PATCH] drm/amdkfd: avoid conflicting address mappings
[Why] Avoid conflicts between address ranges already mapped by the SVM mechanism and ranges that the same process tries to allocate again through ioctl_alloc, and vice versa. [How] For ioctl_alloc_memory_of_gpu allocations, check that the address range passed into the ioctl does not already exist in the kfd_process svms->objects interval tree. For SVM allocations, look up the address range in the VA interval tree of the VM inside each pdd used in the kfd_process. Signed-off-by: Alex Sierra --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 13 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 79 +++- 2 files changed, 75 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 67541c30327a..f39baaa22a62 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1251,6 +1251,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, struct kfd_process_device *pdd; void *mem; struct kfd_dev *dev; + struct svm_range_list *svms = &p->svms; int idr_handle; long err; uint64_t offset = args->mmap_offset; @@ -1259,6 +1260,18 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, if (args->size == 0) return -EINVAL; +#if IS_ENABLED(CONFIG_HSA_AMD_SVM) + mutex_lock(&svms->lock); + if (interval_tree_iter_first(&svms->objects, +args->va_addr >> PAGE_SHIFT, +(args->va_addr + args->size - 1) >> PAGE_SHIFT)) { + pr_err("Address: 0x%llx already allocated by SVM\n", + args->va_addr); + mutex_unlock(&svms->lock); + return -EADDRINUSE; + } + mutex_unlock(&svms->lock); +#endif dev = kfd_device_by_id(args->gpu_id); if (!dev) return -EINVAL; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 31f3f24cef6a..043ee0467916 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2581,9 +2581,54 @@ int svm_range_list_init(struct kfd_process *p) return 0; } +/** + * svm_range_is_vm_bo_mapped - check if
virtual address range mapped already + * @p: current kfd_process + * @start: range start address, in pages + * @last: range last address, in pages + * + * The purpose is to avoid virtual address ranges already allocated by + * kfd_ioctl_alloc_memory_of_gpu ioctl. + * It looks for each pdd in the kfd_process. + * + * Context: Process context + * + * Return 0 - OK, if the range is not mapped. + * Otherwise error code: + * -EADDRINUSE - if address is mapped already by kfd_ioctl_alloc_memory_of_gpu + * -ERESTARTSYS - A wait for the buffer to become unreserved was interrupted by + * a signal. Release all buffer reservations and return to user-space. + */ +static int +svm_range_is_vm_bo_mapped(struct kfd_process *p, uint64_t start, uint64_t last) +{ + uint32_t i; + int r; + + for (i = 0; i < p->n_pdds; i++) { + struct amdgpu_vm *vm; + + if (!p->pdds[i]->drm_priv) + continue; + + vm = drm_priv_to_vm(p->pdds[i]->drm_priv); + r = amdgpu_bo_reserve(vm->root.bo, false); + if (r) + return r; + if (interval_tree_iter_first(&vm->va, start, last)) { + pr_debug("Range [0x%llx 0x%llx] already mapped\n", start, last); + amdgpu_bo_unreserve(vm->root.bo); + return -EADDRINUSE; + } + amdgpu_bo_unreserve(vm->root.bo); + } + + return 0; +} + /** * svm_range_is_valid - check if virtual address range is valid - * @mm: current process mm_struct + * @mm: current kfd_process * @start: range start address, in pages * @size: range size, in pages * @@ -2592,28 +2637,27 @@ int svm_range_list_init(struct kfd_process *p) * Context: Process context * * Return: - * true - valid svm range - * false - invalid svm range + * 0 - OK, otherwise error code */ -static bool -svm_range_is_valid(struct mm_struct *mm, uint64_t start, uint64_t size) +static int +svm_range_is_valid(struct kfd_process *p, uint64_t start, uint64_t size) { const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP; struct vm_area_struct *vma; unsigned long end; + unsigned long start_unchg = start; start <<= PAGE_SHIFT; end = 
start + (size << PAGE_SHIFT); - do { - vma = find_vma(mm, start); + vma = find_vma(p->mm, start); if (!vma || start < vma->vm_start || (vma->vm_flags & device_vma)) - return false; + return -EFAULT; start = min(end, vma->vm
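The conflict check this patch adds on both sides reduces to the same inclusive-page overlap test that interval_tree_iter_first() performs. Below is a minimal userspace sketch of that logic; a plain array stands in for the svms->objects tree, and all names are illustrative, not kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Inclusive page range, like the [start, last] pairs kept per svm_range */
struct page_range {
	uint64_t first;
	uint64_t last;
};

/* interval_tree_iter_first() returns a node iff the query overlaps it */
static int ranges_overlap(const struct page_range *r,
			  uint64_t first, uint64_t last)
{
	return r->first <= last && first <= r->last;
}

/*
 * Model of the kfd_ioctl_alloc_memory_of_gpu() check: convert the byte
 * range to an inclusive page range exactly as the patch does, then look
 * for any overlap.  Returns 1 when the range is already in use.
 */
static int va_range_in_use(const struct page_range *objects, size_t n,
			   uint64_t va_addr, uint64_t size)
{
	uint64_t first = va_addr >> PAGE_SHIFT;
	uint64_t last = (va_addr + size - 1) >> PAGE_SHIFT;

	for (size_t i = 0; i < n; i++)
		if (ranges_overlap(&objects[i], first, last))
			return 1;
	return 0;
}
```

With 4 KiB pages, a one-page allocation at 0x100000 lands on page 0x100 and conflicts with an existing SVM range covering pages 0x100-0x1ff, while one at 0x200000 does not.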
Re: [PATCH] drm/amdgpu: Fix documentation for dm_dmub_outbox1_low_irq
On 2021-07-19 1:49 p.m., Anson Jacob wrote: > Fix make htmldocs complaint: > ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:628: warning: Excess > function parameter 'interrupt_params' description in 'DMUB_TRACE_MAX_READ' > > Signed-off-by: Anson Jacob > --- > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 7ed104e3756b..02bba9a202a8 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -625,7 +625,6 @@ static void dm_dcn_vertical_interrupt0_high_irq(void > *interrupt_params) > * Handles the Outbox Interrupt > * event handler. > */ > -#define DMUB_TRACE_MAX_READ 64 > static void dm_dmub_outbox1_low_irq(void *interrupt_params) > { > struct dmub_notification notify; > @@ -635,6 +634,8 @@ static void dm_dmub_outbox1_low_irq(void > *interrupt_params) > struct dmcub_trace_buf_entry entry = { 0 }; > uint32_t count = 0; > > +#define DMUB_TRACE_MAX_READ 64 > + I'd prefer to keep macro definitions separate from functions. A better place for this would be right before the function comment. Harry > if (dc_enable_dmub_notifications(adev->dm.dc)) { > if (irq_params->irq_src == DC_IRQ_SOURCE_DMCUB_OUTBOX) { > do { > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v4 06/13] include/linux/mm.h: helpers to check zone device generic type
Regards, Oak On 2021-07-17, 3:22 PM, "amd-gfx on behalf of Alex Sierra" wrote: Two helpers added. One checks if zone device page is generic type. The other if page is either private or generic type. Signed-off-by: Alex Sierra --- include/linux/mm.h | 8 1 file changed, 8 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index d8d79bb94be8..f5b247a63044 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1125,6 +1125,14 @@ static inline bool is_device_private_page(const struct page *page) page->pgmap->type == MEMORY_DEVICE_PRIVATE; } +static inline bool is_device_page(const struct page *page) The function name here is confusing. Theoretically as long as page's zone number is ZONE_DEVICE, then the page is a device page. You put the condition more strict below just because the kfd svm implementation only uses MEMORY_DEVICE_PRIVATE/GENERIC. But MEMORY_DEVICE_FS_DAX and MEMORY_DEVICE_PCI_P2PDMA is also device memory (compared to normal cpu system memory). +{ + return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && + is_zone_device_page(page) && + (page->pgmap->type == MEMORY_DEVICE_PRIVATE || +page->pgmap->type == MEMORY_DEVICE_GENERIC); +} + static inline bool is_pci_p2pdma_page(const struct page *page) { return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && -- 2.32.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[RFC 0/2] Add generic FPU api similar to x86
This is an attempt to provide generic FPU enable/disable calls similar to x86, so that we can simplify gpu/drm/amd/display/dc/os_types.h. It also adds the FPU correctness logic seen on x86. Anson Jacob (2): ppc/fpu: Add generic FPU api similar to x86 drm/amd/display: Use PPC FPU functions arch/powerpc/include/asm/switch_to.h | 29 ++--- arch/powerpc/kernel/process.c | 130 ++ drivers/gpu/drm/amd/display/dc/os_types.h | 28 + 3 files changed, 139 insertions(+), 48 deletions(-) -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[RFC 1/2] ppc/fpu: Add generic FPU api similar to x86
- Add kernel_fpu_begin & kernel_fpu_end API as x86 - Add logic similar to x86 to ensure fpu begin/end call correctness - Add kernel_fpu_enabled to know if FPU is enabled Signed-off-by: Anson Jacob --- arch/powerpc/include/asm/switch_to.h | 29 ++ arch/powerpc/kernel/process.c| 130 +++ 2 files changed, 137 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h index 9d1fbd8be1c7..aded7aa661c0 100644 --- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -41,10 +41,7 @@ extern void enable_kernel_fp(void); extern void flush_fp_to_thread(struct task_struct *); extern void giveup_fpu(struct task_struct *); extern void save_fpu(struct task_struct *); -static inline void disable_kernel_fp(void) -{ - msr_check_and_clear(MSR_FP); -} +extern void disable_kernel_fp(void); #else static inline void save_fpu(struct task_struct *t) { } static inline void flush_fp_to_thread(struct task_struct *t) { } @@ -55,10 +52,7 @@ extern void enable_kernel_altivec(void); extern void flush_altivec_to_thread(struct task_struct *); extern void giveup_altivec(struct task_struct *); extern void save_altivec(struct task_struct *); -static inline void disable_kernel_altivec(void) -{ - msr_check_and_clear(MSR_VEC); -} +extern void disable_kernel_altivec(void); #else static inline void save_altivec(struct task_struct *t) { } static inline void __giveup_altivec(struct task_struct *t) { } @@ -67,20 +61,7 @@ static inline void __giveup_altivec(struct task_struct *t) { } #ifdef CONFIG_VSX extern void enable_kernel_vsx(void); extern void flush_vsx_to_thread(struct task_struct *); -static inline void disable_kernel_vsx(void) -{ - msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX); -} -#else -static inline void enable_kernel_vsx(void) -{ - BUILD_BUG(); -} - -static inline void disable_kernel_vsx(void) -{ - BUILD_BUG(); -} +extern void disable_kernel_vsx(void); #endif #ifdef CONFIG_SPE @@ -114,4 +95,8 @@ static inline 
void clear_task_ebb(struct task_struct *t) extern int set_thread_tidr(struct task_struct *t); +bool kernel_fpu_enabled(void); +void kernel_fpu_begin(void); +void kernel_fpu_end(void); + #endif /* _ASM_POWERPC_SWITCH_TO_H */ diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 185beb290580..2ced8c6a3fab 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -75,6 +75,17 @@ #define TM_DEBUG(x...) do { } while(0) #endif +/* + * Track whether the kernel is using the FPU state + * currently. + * + * This flag is used: + * + * - kernel_fpu_begin()/end() correctness + * - kernel_fpu_enabled info + */ +static DEFINE_PER_CPU(bool, in_kernel_fpu); + extern unsigned long _get_SP(void); #ifdef CONFIG_PPC_TRANSACTIONAL_MEM @@ -212,6 +223,9 @@ void enable_kernel_fp(void) unsigned long cpumsr; WARN_ON(preemptible()); + WARN_ON_ONCE(this_cpu_read(in_kernel_fpu)); + + this_cpu_write(in_kernel_fpu, true); cpumsr = msr_check_and_set(MSR_FP); @@ -231,6 +245,15 @@ void enable_kernel_fp(void) } } EXPORT_SYMBOL(enable_kernel_fp); + +void disable_kernel_fp(void) +{ + WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu)); + + this_cpu_write(in_kernel_fpu, false); + msr_check_and_clear(MSR_FP); +} +EXPORT_SYMBOL(disable_kernel_fp); #else static inline void __giveup_fpu(struct task_struct *tsk) { } #endif /* CONFIG_PPC_FPU */ @@ -263,6 +286,9 @@ void enable_kernel_altivec(void) unsigned long cpumsr; WARN_ON(preemptible()); + WARN_ON_ONCE(this_cpu_read(in_kernel_fpu)); + + this_cpu_write(in_kernel_fpu, true); cpumsr = msr_check_and_set(MSR_VEC); @@ -283,6 +309,14 @@ void enable_kernel_altivec(void) } EXPORT_SYMBOL(enable_kernel_altivec); +void disable_kernel_altivec(void) +{ + WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu)); + + this_cpu_write(in_kernel_fpu, false); + msr_check_and_clear(MSR_VEC); +} +EXPORT_SYMBOL(disable_kernel_altivec); /* * Make sure the VMX/Altivec register state in the * the thread_struct is up to date for task tsk. 
@@ -333,6 +367,9 @@ void enable_kernel_vsx(void) unsigned long cpumsr; WARN_ON(preemptible()); + WARN_ON_ONCE(this_cpu_read(in_kernel_fpu)); + + this_cpu_write(in_kernel_fpu, true); cpumsr = msr_check_and_set(MSR_FP|MSR_VEC|MSR_VSX); @@ -354,6 +391,15 @@ void enable_kernel_vsx(void) } EXPORT_SYMBOL(enable_kernel_vsx); +void disable_kernel_vsx(void) +{ + WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu)); + + this_cpu_write(in_kernel_fpu, false); + msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX); +} +EXPORT_SYMBOL(disable_kernel_vsx); + void flush_vsx_to_thread(struct task_struct *tsk) { if (tsk->thread.regs) { @@ -406,6 +452,90 @@ void flush_spe_to_thread(struct task_struct *tsk) } #en
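The correctness logic this patch borrows from x86 comes down to a per-CPU flag that must toggle strictly begin-then-end. Here is a single-CPU userspace model of it: plain globals and a warning counter stand in for the per-CPU variable and WARN_ON_ONCE(), and the MSR manipulation is reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>

static bool in_kernel_fpu;	/* models DEFINE_PER_CPU(bool, in_kernel_fpu) */
static int warn_count;		/* counts would-be WARN_ON_ONCE() hits */

static void warn_on(bool cond)
{
	if (cond)
		warn_count++;
}

static void enable_kernel_fp(void)
{
	warn_on(in_kernel_fpu);		/* nested begin is a bug */
	in_kernel_fpu = true;
	/* msr_check_and_set(MSR_FP) would happen here */
}

static void disable_kernel_fp(void)
{
	warn_on(!in_kernel_fpu);	/* end without begin is a bug */
	in_kernel_fpu = false;
	/* msr_check_and_clear(MSR_FP) would happen here */
}

static bool kernel_fpu_enabled(void)
{
	return in_kernel_fpu;
}
```

A balanced begin/end pair leaves the flag clear and triggers no warning; a second begin before the matching end is flagged, which is exactly what the WARN_ON_ONCE(this_cpu_read(in_kernel_fpu)) in the patch catches.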
[RFC 2/2] drm/amd/display: Use PPC FPU functions
Use kernel_fpu_begin & kernel_fpu_end for PPC Depends on "ppc/fpu: Add generic FPU api similar to x86" Signed-off-by: Anson Jacob --- drivers/gpu/drm/amd/display/dc/os_types.h | 28 ++- 1 file changed, 2 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h b/drivers/gpu/drm/amd/display/dc/os_types.h index 126c2f3a4dd3..999c5103357e 100644 --- a/drivers/gpu/drm/amd/display/dc/os_types.h +++ b/drivers/gpu/drm/amd/display/dc/os_types.h @@ -57,32 +57,8 @@ #define DC_FP_END() kernel_fpu_end() #elif defined(CONFIG_PPC64) #include -#include -#define DC_FP_START() { \ - if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ - preempt_disable(); \ - enable_kernel_vsx(); \ - } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ - preempt_disable(); \ - enable_kernel_altivec(); \ - } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ - preempt_disable(); \ - enable_kernel_fp(); \ - } \ -} -#define DC_FP_END() { \ - if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ - disable_kernel_vsx(); \ - preempt_enable(); \ - } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ - disable_kernel_altivec(); \ - preempt_enable(); \ - } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ - disable_kernel_fp(); \ - preempt_enable(); \ - } \ -} -#endif +#define DC_FP_START() kernel_fpu_begin() +#define DC_FP_END() kernel_fpu_end() #endif /* -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
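Because DC invokes DC_FP_START/END from multiple layers, a thin depth-counted wrapper (the approach the related dc_fpu_begin/end patch takes on the x86 side) keeps nested calls balanced so only the outermost pair touches the hardware state. A userspace sketch of that idea; the counter would be per-CPU in the kernel, and the begin/end stubs here only record calls:

```c
#include <assert.h>

static int fpu_recursion_depth;	/* per-CPU in the real code */
static int hw_begin_calls, hw_end_calls;

/* stand-ins for the arch's kernel_fpu_begin()/kernel_fpu_end() */
static void kernel_fpu_begin(void) { hw_begin_calls++; }
static void kernel_fpu_end(void) { hw_end_calls++; }

static void dc_fpu_begin(void)
{
	if (fpu_recursion_depth++ == 0)
		kernel_fpu_begin();	/* only the outermost begin */
}

static void dc_fpu_end(void)
{
	if (--fpu_recursion_depth == 0)
		kernel_fpu_end();	/* only the outermost end */
}
```

Two nested begin/end pairs then produce exactly one hardware begin and one hardware end, instead of the double enable/early disable the unguarded macros allow.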
[PATCH] drm/amdgpu: Fix documentation for dm_dmub_outbox1_low_irq
Fix make htmldocs complaint: ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:628: warning: Excess function parameter 'interrupt_params' description in 'DMUB_TRACE_MAX_READ' Signed-off-by: Anson Jacob --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 7ed104e3756b..02bba9a202a8 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -625,7 +625,6 @@ static void dm_dcn_vertical_interrupt0_high_irq(void *interrupt_params) * Handles the Outbox Interrupt * event handler. */ -#define DMUB_TRACE_MAX_READ 64 static void dm_dmub_outbox1_low_irq(void *interrupt_params) { struct dmub_notification notify; @@ -635,6 +634,8 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params) struct dmcub_trace_buf_entry entry = { 0 }; uint32_t count = 0; +#define DMUB_TRACE_MAX_READ 64 + if (dc_enable_dmub_notifications(adev->dm.dc)) { if (irq_params->irq_src == DC_IRQ_SOURCE_DMCUB_OUTBOX) { do { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v4)
On Mon, Jul 19, 2021 at 12:16 PM Luben Tuikov wrote: > > This fixes a bug which if we probe a non-existing > I2C device, and the SMU returns 0xFF, from then on > we can never communicate with the SMU, because the > code before this patch reads and interprets 0xFF > as a terminal error, and thus we never write 0 > into register 90 to clear the status (and > subsequently send a new command to the SMU.) > > It is not an error that the SMU returns status > 0xFF. This means that the SMU executed the last > command successfully (execution status), but the > command result is an error of some sort (execution > result), depending on what the command was. > > When doing a status check of the SMU, before we > send a new command, the only status which > precludes us from sending a new command is 0--the > SMU hasn't finished executing a previous command, > and 0xFC--the SMU is busy. > > This bug was seen as the following line in the > kernel log, > > amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right > state! > > when subsequent SMU commands, not necessarily > related to I2C, were sent to the SMU. > > This patch fixes this bug. > > v2: Add a comment to the description of > __smu_cmn_poll_stat() to explain why we're NOT > defining the SMU FW return codes as macros, but > are instead hard-coding them. Such a change, can > be followed up by a subsequent patch. > > v3: The changes are, > a) Add comments to break labels in >__smu_cmn_reg2errno(). > > b) When an unknown/unspecified/undefined result is >returned back from the SMU, map that to >-EREMOTEIO, to distinguish failure at the SMU >FW. > > c) Add kernel-doc to >smu_cmn_send_msg_without_waiting(), >smu_cmn_wait_for_response(), >smu_cmn_send_smc_msg_with_param(). > > d) In smu_cmn_send_smc_msg_with_param(), since we >wait for completion of the command, if the >result of the completion is >undefined/unknown/unspecified, we print that to >the kernel log. 
> > v4: a) Add macros as requested, though redundant, to > be removed when SMU consolidates for all > ASICs--see comment in code. > b) Get out if the SMU code is unknown. > > Cc: Alex Deucher > Cc: Evan Quan > Cc: Lijo Lazar > Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before > issuing message") > Signed-off-by: Luben Tuikov > --- > drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 294 + > drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 3 +- > 2 files changed, 250 insertions(+), 47 deletions(-) > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c > b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c > index c902fdf322c1be..9b30737bd541d6 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c > @@ -55,7 +55,7 @@ > > #undef __SMU_DUMMY_MAP > #define __SMU_DUMMY_MAP(type) #type > -static const char* __smu_message_names[] = { > +static const char * const __smu_message_names[] = { > SMU_MESSAGE_TYPES > }; > > @@ -76,55 +76,264 @@ static void smu_cmn_read_arg(struct smu_context *smu, > *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82); > } > > -int smu_cmn_wait_for_response(struct smu_context *smu) > +/* Redefine the SMU error codes here. > + * > + * Note that these definitions are redundant and should be removed > + * when the SMU has exported a unified header file containing these > + * macros, which header file we can just include and use the SMU's > + * macros. At the moment, these error codes are defined by the SMU > + * per-ASIC unfortunately, yet we're a one driver for all ASICs. > + */ > + > +#define SMU_BUSY 0 This is really a response per se, it's just that the SMU hasn't updated the response register yet. Maybe SMU_RESP_NONE > +#define SMU_GOOD 1 > +#define SMU_CMD_FAIL 0xFF > +#define SMU_CMD_UNKNOWN0xFE > +#define SMU_CMD_BAD_PREREQ 0xFD > +#define SMU_BUSY_OTHER 0xFC > +#define SMU_DEBUG_END 0xFB I'd change the macros to include RESP for consistency E.g., #define SMU_RESP_OK 1 #define SMU_RESP_CMD_FAIL 0xFF etc. 
> + > +/** > + * __smu_cmn_poll_stat -- poll for a status from the SMU > + * smu: a pointer to SMU context > + * > + * Returns the status of the SMU, which could be, > + *0, the SMU is busy with your previous command; > + *1, execution status: success, execution result: success; > + * 0xFF, execution status: success, execution result: failure; > + * 0xFE, unknown command; > + * 0xFD, valid command, but bad (command) prerequisites; > + * 0xFC, the command was rejected as the SMU is busy; > + * 0xFB, "SMC_Result_DebugDataDumpEnd". > + * > + * The values here are not defined by macros, because I'd rather we > + * include a single header file which defines them, which is > + * maintained by the SMU FW team, so that we're impervious to firmware > + * changes. At the moment those values are defined in various header > + * files, one for each ASIC, yet here we're a s
[PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v4)
This fixes a bug which if we probe a non-existing I2C device, and the SMU returns 0xFF, from then on we can never communicate with the SMU, because the code before this patch reads and interprets 0xFF as a terminal error, and thus we never write 0 into register 90 to clear the status (and subsequently send a new command to the SMU.) It is not an error that the SMU returns status 0xFF. This means that the SMU executed the last command successfully (execution status), but the command result is an error of some sort (execution result), depending on what the command was. When doing a status check of the SMU, before we send a new command, the only status which precludes us from sending a new command is 0--the SMU hasn't finished executing a previous command, and 0xFC--the SMU is busy. This bug was seen as the following line in the kernel log, amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state! when subsequent SMU commands, not necessarily related to I2C, were sent to the SMU. This patch fixes this bug. v2: Add a comment to the description of __smu_cmn_poll_stat() to explain why we're NOT defining the SMU FW return codes as macros, but are instead hard-coding them. Such a change, can be followed up by a subsequent patch. v3: The changes are, a) Add comments to break labels in __smu_cmn_reg2errno(). b) When an unknown/unspecified/undefined result is returned back from the SMU, map that to -EREMOTEIO, to distinguish failure at the SMU FW. c) Add kernel-doc to smu_cmn_send_msg_without_waiting(), smu_cmn_wait_for_response(), smu_cmn_send_smc_msg_with_param(). d) In smu_cmn_send_smc_msg_with_param(), since we wait for completion of the command, if the result of the completion is undefined/unknown/unspecified, we print that to the kernel log. v4: a) Add macros as requested, though redundant, to be removed when SMU consolidates for all ASICs--see comment in code. b) Get out if the SMU code is unknown. 
Cc: Alex Deucher Cc: Evan Quan Cc: Lijo Lazar Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 294 + drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 3 +- 2 files changed, 250 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c index c902fdf322c1be..9b30737bd541d6 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c @@ -55,7 +55,7 @@ #undef __SMU_DUMMY_MAP #define __SMU_DUMMY_MAP(type) #type -static const char* __smu_message_names[] = { +static const char * const __smu_message_names[] = { SMU_MESSAGE_TYPES }; @@ -76,55 +76,264 @@ static void smu_cmn_read_arg(struct smu_context *smu, *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82); } -int smu_cmn_wait_for_response(struct smu_context *smu) +/* Redefine the SMU error codes here. + * + * Note that these definitions are redundant and should be removed + * when the SMU has exported a unified header file containing these + * macros, which header file we can just include and use the SMU's + * macros. At the moment, these error codes are defined by the SMU + * per-ASIC unfortunately, yet we're a one driver for all ASICs. 
+ */ + +#define SMU_BUSY 0 +#define SMU_GOOD 1 +#define SMU_CMD_FAIL 0xFF +#define SMU_CMD_UNKNOWN0xFE +#define SMU_CMD_BAD_PREREQ 0xFD +#define SMU_BUSY_OTHER 0xFC +#define SMU_DEBUG_END 0xFB + +/** + * __smu_cmn_poll_stat -- poll for a status from the SMU + * smu: a pointer to SMU context + * + * Returns the status of the SMU, which could be, + *0, the SMU is busy with your previous command; + *1, execution status: success, execution result: success; + * 0xFF, execution status: success, execution result: failure; + * 0xFE, unknown command; + * 0xFD, valid command, but bad (command) prerequisites; + * 0xFC, the command was rejected as the SMU is busy; + * 0xFB, "SMC_Result_DebugDataDumpEnd". + * + * The values here are not defined by macros, because I'd rather we + * include a single header file which defines them, which is + * maintained by the SMU FW team, so that we're impervious to firmware + * changes. At the moment those values are defined in various header + * files, one for each ASIC, yet here we're a single ASIC-agnostic + * interface. Such a change can be followed-up by a subsequent patch. + */ +static u32 __smu_cmn_poll_stat(struct smu_context *smu) { struct amdgpu_device *adev = smu->adev; - uint32_t cur_value, i, timeout = adev->usec_timeout * 20; + int timeout = adev->usec_timeout * 20; + u32 reg; - for (i = 0; i < timeout; i++) { - cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90); - if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0) - return c
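The status values listed above ultimately have to be mapped onto errno codes (__smu_cmn_reg2errno() in the patch). The commit message only pins down two mappings for certain — success maps to 0, and an unknown/undefined result maps to -EREMOTEIO — so the other assignments in this sketch are illustrative assumptions, not the patch's actual table:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SMU_BUSY		0
#define SMU_GOOD		1
#define SMU_CMD_FAIL		0xFF
#define SMU_CMD_UNKNOWN		0xFE
#define SMU_CMD_BAD_PREREQ	0xFD
#define SMU_BUSY_OTHER		0xFC

static int smu_resp_to_errno(uint32_t reg)
{
	switch (reg) {
	case SMU_GOOD:
		return 0;
	case SMU_BUSY:
	case SMU_BUSY_OTHER:
		return -EBUSY;		/* assumption */
	case SMU_CMD_FAIL:
		return -EIO;		/* assumption */
	case SMU_CMD_UNKNOWN:
		return -EOPNOTSUPP;	/* assumption */
	case SMU_CMD_BAD_PREREQ:
		return -EINVAL;		/* assumption */
	default:
		/* undefined result: distinguish failure at the SMU FW,
		 * per the v3 change notes */
		return -EREMOTEIO;
	}
}
```

The key point of the bug fix survives any particular mapping: 0xFF is a completed command whose result was an error, so it must not be treated as a terminal driver state.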
RE: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd
[AMD Official Use Only] > -Original Message- > From: Kuehling, Felix > Sent: Saturday, July 17, 2021 1:47 AM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between > direct peers to the kfd > > Am 2021-07-16 um 12:43 p.m. schrieb Jonathan Kim: > > Report the min/max bandwidth in megabytes to the kfd for direct xgmi > > connections only. > > By "direct XGMI connections", you mean this doesn't work for links with > more than one hop? Will that spew out DRM_ERROR messages for such links? > Then it's probably better to downgrade that to an INFO. No DRM_ERROR only happens if psp fails on invoke. I've added footnotes to the description and code to clear this up. Non-adjacent peers return num_links as 0 since indirect route is unknown and linkage is asymmetrical. > > > > > > v2: change reporting from num links to bandwidth > > > > Signed-off-by: Jonathan Kim > > This patch is OK to provide bandwidth information on Aldebaran. What can > we do on older GPUs? Can we assume num_links = 1? Or maybe have some > hard-coded numbers depending on the number of nodes in the hive? We could assume num_links = 1 but that wouldn't represent certain non-Aldebaran setups well. For non-Aldebaran min/max bandwidth, we may be able to get away with setting non-zero values on non-adjacent peers since setup is symmetrical to date but that may raise questions on why Aldebaran indirect min/max-bandwidth is 0. For consistency, we'd have to use num_hops then to check directness. Maybe it's worth making a bid to the FW team to support all other chips moving forward ... 
Thanks, Jon > > Either way, patch 1 and 2 are > > Reviewed-by: Felix Kuehling > > > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 23 > > ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + > > drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 12 +++ > > drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 2 ++ > > drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 12 +++ > > 5 files changed, 50 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > index bfab2f9fdd17..3978578a1c49 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > @@ -553,6 +553,29 @@ uint8_t > amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev > *s > > return (uint8_t)ret; > > } > > > > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, > > +struct kgd_dev *src, bool is_min) { > > + struct amdgpu_device *adev = (struct amdgpu_device *)dst, > *peer_adev; > > + int num_links; > > + > > + if (adev->asic_type != CHIP_ALDEBARAN) > > + return 0; > > + > > + if (src) > > + peer_adev = (struct amdgpu_device *)src; > > + > > + num_links = is_min ? 1 : amdgpu_xgmi_get_num_links(adev, > peer_adev); > > + if (num_links < 0) { > > + DRM_ERROR("amdgpu: failed to get xgmi num links between > node %d and %d. ret = %d\n", > > + adev->gmc.xgmi.physical_node_id, > > + peer_adev->gmc.xgmi.physical_node_id, num_links); > > + num_links = 0; > > + } > > + > > + /* Aldebaran xGMI DPM is defeatured so assume x16 x 25Gbps for > bandwidth. 
*/ > > + return (num_links * 16 * 25000)/BITS_PER_BYTE; } > > + > > uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev > *kgd) > > { > > struct amdgpu_device *adev = (struct amdgpu_device *)kgd; diff --git > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > index 81264517d532..e12fccb2d2c4 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > @@ -226,6 +226,7 @@ uint32_t amdgpu_amdkfd_get_num_gws(struct > kgd_dev > > *kgd); uint32_t amdgpu_amdkfd_get_asic_rev_id(struct kgd_dev *kgd); > > int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd); uint8_t > > amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct > kgd_dev > > *src); > > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, > > +struct kgd_dev *src, bool is_min); > > > > /* Read user wptr from a specified user address space with page fault > > * disabled. The memory must be pinned and mapped to the hardware > > when diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > index 8567d5d77346..258cf86b32f6 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > @@ -486,6 +486,18 @@ int amdgpu_xgmi_get_hops_count(struct > amdgpu_device *adev, > > return -EINVAL; > > } > > > > +int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev, > > + struct amdgpu_device *peer_adev) > > +{ > > + struct psp_xgmi_topology_info *top = &adev- > >psp.xgmi_context.top_info; > > + int i; >
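The bandwidth formula in the patch is simple enough to check by hand: under the stated Aldebaran assumption of x16 lanes at 25 Gbps per lane, one link is 16 × 25000 Mbit/s ÷ 8 = 50000 MB/s. As a standalone sketch of the same arithmetic:

```c
#include <assert.h>

#define BITS_PER_BYTE 8

/* Mirrors the patch's Aldebaran assumption: x16 lanes, 25 Gbps each.
 * num_links is 1 for the minimum bandwidth, or the per-peer link count
 * (from the PSP topology info) for the maximum. */
static int xgmi_bandwidth_mbytes(int num_links)
{
	return num_links * 16 * 25000 / BITS_PER_BYTE;
}
```

A failed num_links query is clamped to 0 in the patch, which this formula correctly reports as 0 MB/s.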
RE: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd
[AMD Official Use Only] > -Original Message- > From: Lazar, Lijo > Sent: Monday, July 19, 2021 3:22 AM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Cc: Kuehling, Felix > Subject: Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between > direct peers to the kfd > > > > On 7/16/2021 10:13 PM, Jonathan Kim wrote: > > Report the min/max bandwidth in megabytes to the kfd for direct xgmi > > connections only. > > > > v2: change reporting from num links to bandwidth > > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 23 > ++ > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + > > drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 12 +++ > > drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 2 ++ > > drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 12 +++ > > 5 files changed, 50 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > index bfab2f9fdd17..3978578a1c49 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > @@ -553,6 +553,29 @@ uint8_t > amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev > *s > > return (uint8_t)ret; > > } > > > > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, > > +struct kgd_dev *src, bool is_min) { > > + struct amdgpu_device *adev = (struct amdgpu_device *)dst, > *peer_adev; > > + int num_links; > > + > > + if (adev->asic_type != CHIP_ALDEBARAN) > > + return 0; > > + > > + if (src) > > + peer_adev = (struct amdgpu_device *)src; > > + > > + num_links = is_min ? 1 : amdgpu_xgmi_get_num_links(adev, > peer_adev); > > + if (num_links < 0) { > > + DRM_ERROR("amdgpu: failed to get xgmi num links between > node %d and %d. ret = %d\n", > > + adev->gmc.xgmi.physical_node_id, > > + peer_adev->gmc.xgmi.physical_node_id, num_links); > > + num_links = 0; > > + } > > + > > + /* Aldebaran xGMI DPM is defeatured so assume x16 x 25Gbps for > bandwidth. 
*/ > > + return (num_links * 16 * 25000)/BITS_PER_BYTE; > > Instead of having ASIC family checks and bandwidth info in interface > functions, better to have this info come from base layer (amdgpu_xgmi or > xgmi ip). That will help to handle other ASICs. Ok. We can revisit this as a follow up. Maybe the full solution is a link width/speed support mask analogous to pcie. Thanks, Jon > > Thanks, > Lijo > > > uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev > *kgd) > > { > > struct amdgpu_device *adev = (struct amdgpu_device *)kgd; diff > > --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > index 81264517d532..e12fccb2d2c4 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > @@ -226,6 +226,7 @@ uint32_t amdgpu_amdkfd_get_num_gws(struct > kgd_dev *kgd); > > uint32_t amdgpu_amdkfd_get_asic_rev_id(struct kgd_dev *kgd); > > int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd); > > uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, > > struct kgd_dev *src); > > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, > > +struct kgd_dev *src, bool is_min); > > > > /* Read user wptr from a specified user address space with page fault > >* disabled. 
The memory must be pinned and mapped to the hardware > > when diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > index 8567d5d77346..258cf86b32f6 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c > > @@ -486,6 +486,18 @@ int amdgpu_xgmi_get_hops_count(struct > amdgpu_device *adev, > > return -EINVAL; > > } > > > > +int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev, > > + struct amdgpu_device *peer_adev) > > +{ > > + struct psp_xgmi_topology_info *top = &adev- > >psp.xgmi_context.top_info; > > + int i; > > + > > + for (i = 0 ; i < top->num_nodes; ++i) > > + if (top->nodes[i].node_id == peer_adev->gmc.xgmi.node_id) > > + return top->nodes[i].num_links; > > + return -EINVAL; > > +} > > + > > int amdgpu_xgmi_add_device(struct amdgpu_device *adev) > > { > > struct psp_xgmi_topology_info *top_info; diff --git > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h > > index 12969c0830d5..d2189bf7d428 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h > > @@ -59,6 +59,8 @@ int amdgpu_xgmi_remove_device(struct > amdgpu_device *adev); > > int amdgpu_xgmi_set_pstate(struct amdgpu_device *adev, int pstate); > > int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev, > > struct amdgpu_device *
RE: [PATCH 3/3] drm/amdkfd: report pcie bandwidth to the kfd
[AMD Official Use Only] > -Original Message- > From: Kuehling, Felix > Sent: Saturday, July 17, 2021 1:37 AM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 3/3] drm/amdkfd: report pcie bandwidth to the kfd > > Am 2021-07-16 um 12:43 p.m. schrieb Jonathan Kim: > > Similar to xGMI reporting the min/max bandwidth between direct peers, > > PCIe will report the min/max bandwidth to the KFD. > > > > v2: change to bandwidth > > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 61 > > ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + > > drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 4 ++ > > 3 files changed, 66 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > index 3978578a1c49..b7db52f1a9d1 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > @@ -21,6 +21,7 @@ > > */ > > > > #include "amdgpu_amdkfd.h" > > +#include "amd_pcie.h" > > #include "amd_shared.h" > > > > #include "amdgpu.h" > > @@ -576,6 +577,66 @@ int > amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, struct > kgd_dev > > return (num_links * 16 * 25000)/BITS_PER_BYTE; } > > > > +int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct kgd_dev *dev, > bool > > +is_min) { > > + struct amdgpu_device *adev = (struct amdgpu_device *)dev; > > + int num_lanes_shift = is_min ? ffs(adev->pm.pcie_mlw_mask >> > > + > CAIL_PCIE_LINK_WIDTH_SUPPORT_SHIFT) - 1 : > > + fls(adev->pm.pcie_mlw_mask >> > > + > CAIL_PCIE_LINK_WIDTH_SUPPORT_SHIFT) - 1; > > + int gen_speed_shift = is_min ? ffs(adev->pm.pcie_gen_mask >> > > + > CAIL_PCIE_LINK_SPEED_SUPPORT_SHIFT) - 1 : > > + fls(adev->pm.pcie_gen_mask >> > > + > CAIL_PCIE_LINK_SPEED_SUPPORT_SHIFT) - 1; > > The shifting is not necessary because you undo it below. I think this would > do the trick and be more readable: > > int num_lanes_shift = (is_min ? 
ffs(adev->pm.pcie_mlw_mask) : > fls(adev->pm.pcie_mlw_mask)) - 1; > int gen_speed_shift = (is_min ? ffs(adev->pm.pcie_gen_mask) : > fls(adev->pm.pcie_gen_mask)) - 1; Ok thanks for the review and suggestion. I've adjusted your suggestion by masking pcie_gen_mask with CAIL_PCIE_LINK_SPEED_SUPPORT_MASK as the mask sets some non-speed related lower bits. Thanks, Jon > uint32_t num_lanes_mask = 1 << num_lanes_shift; > uint32_t gen_speed_mask = 1 << gen_speed_shift; > > With that fixed, this patch is > > Reviewed-by: Felix Kuehling > > > > + uint32_t num_lanes_mask = (1UL << num_lanes_shift) << > CAIL_PCIE_LINK_WIDTH_SUPPORT_SHIFT; > > + uint32_t gen_speed_mask = (1UL << gen_speed_shift) << > CAIL_PCIE_LINK_SPEED_SUPPORT_SHIFT; > > + int num_lanes_factor = 0, gen_speed_mbits_factor = 0; > > + > > + switch (num_lanes_mask) { > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X1: > > + num_lanes_factor = 1; > > + break; > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X2: > > + num_lanes_factor = 2; > > + break; > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X4: > > + num_lanes_factor = 4; > > + break; > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X8: > > + num_lanes_factor = 8; > > + break; > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X12: > > + num_lanes_factor = 12; > > + break; > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X16: > > + num_lanes_factor = 16; > > + break; > > + case CAIL_PCIE_LINK_WIDTH_SUPPORT_X32: > > + num_lanes_factor = 32; > > + break; > > + } > > + > > + switch (gen_speed_mask) { > > + case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1: > > + gen_speed_mbits_factor = 2500; > > + break; > > + case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2: > > + gen_speed_mbits_factor = 5000; > > + break; > > + case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3: > > + gen_speed_mbits_factor = 8000; > > + break; > > + case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4: > > + gen_speed_mbits_factor = 16000; > > + break; > > + case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN5: > > + gen_speed_mbits_factor = 32000; > > + break; > > + } > > + > > + return 
(num_lanes_factor * > gen_speed_mbits_factor)/BITS_PER_BYTE; > > +} > > + > > uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev > *kgd) > > { > > struct amdgpu_device *adev = (struct amdgpu_device *)kgd; diff --git > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > index e12fccb2d2c4..5d73f3108d13 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h > > @@ -227,6 +227,7 @@ uint3
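Felix's simplification — take ffs()/fls() of the raw support masks and map the selected bit back to a lane count and link speed — can be sketched outside the driver. The bit layout and speed table below are illustrative assumptions, not the real CAIL_PCIE_* encoding from amd_pcie.h; the sketch only mirrors the min/max selection and the lanes × Mbit/s ÷ BITS_PER_BYTE arithmetic discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h> /* ffs() */

/* Position of the highest set bit, 1-based (like the kernel's fls()). */
static int fls_u32(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Illustrative tables: bit i of the lane mask means "width lane_tbl[i]
 * supported", bit i of the gen mask means "Gen i+1 supported". The real
 * CAIL_PCIE_* bit assignments differ. */
static const int lane_tbl[] = { 1, 2, 4, 8, 12, 16, 32 };
static const int speed_mbit[] = { 2500, 5000, 8000, 16000, 32000 };

/* Both masks are assumed non-zero, as in the driver code under review. */
static int pcie_bandwidth_mbytes(uint32_t lane_mask, uint32_t gen_mask,
				 int is_min)
{
	/* ffs() picks the lowest supported setting, fls() the highest. */
	int lane_bit = (is_min ? ffs((int)lane_mask) : fls_u32(lane_mask)) - 1;
	int gen_bit = (is_min ? ffs((int)gen_mask) : fls_u32(gen_mask)) - 1;

	return lane_tbl[lane_bit] * speed_mbit[gen_bit] / 8; /* BITS_PER_BYTE */
}
```

With lane bits 2..5 set (x4..x16) and Gen1..Gen3 supported, the minimum is x4 at 2500 Mbit/s per lane (1250 MB/s) and the maximum x16 at 8000 Mbit/s per lane (16000 MB/s).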
Re: [PATCH] drm/amdgpu: Fix documentation for amdgpu_bo_add_to_shadow_list
[Public] Reviewed-by: Alex Deucher From: amd-gfx on behalf of Anson Jacob Sent: Monday, July 19, 2021 11:19 AM To: amd-gfx@lists.freedesktop.org Cc: Jacob, Anson Subject: [PATCH] drm/amdgpu: Fix documentation for amdgpu_bo_add_to_shadow_list make htmldocs complains about a parameter of amdgpu_bo_add_to_shadow_list ./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list' ./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Function parameter or member 'vmbo' not described in 'amdgpu_bo_add_to_shadow_list' ./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list' Signed-off-by: Anson Jacob --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index ea339eaac399..4e2c0270208f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -755,7 +755,7 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo) /** * amdgpu_bo_add_to_shadow_list - add a BO to the shadow list * - * @bo: BO that will be inserted into the shadow list + * @vmbo: BO that will be inserted into the shadow list * * Insert a BO to the shadow list. */ -- 2.25.1
[PATCH] drm/amdgpu: Fix documentation for amdgpu_bo_add_to_shadow_list
make htmldocs complains about a parameter of amdgpu_bo_add_to_shadow_list ./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list' ./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Function parameter or member 'vmbo' not described in 'amdgpu_bo_add_to_shadow_list' ./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list' Signed-off-by: Anson Jacob --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index ea339eaac399..4e2c0270208f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -755,7 +755,7 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo) /** * amdgpu_bo_add_to_shadow_list - add a BO to the shadow list * - * @bo: BO that will be inserted into the shadow list + * @vmbo: BO that will be inserted into the shadow list * * Insert a BO to the shadow list. */ -- 2.25.1
Re: [PATCH 1/1] drm/amdkfd: Allow querying SVM attributes that are clear
On 2021-07-16 10:46 p.m., Felix Kuehling wrote: Currently the SVM get_attr call allows querying, which flags are set in the entire address range. Add the opposite query, which flags are clear in the entire address range. Both queries can be combined in a single get_attr call, which allows answering questions such as, "is this address range coherent, non-coherent, or a mix of both"? Signed-off-by: Felix Kuehling Reviewed-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 19 +-- include/uapi/linux/kfd_ioctl.h | 16 +--- 2 files changed, 22 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index e7e99c5070b9..c66221a9c02d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -3019,7 +3019,8 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size, struct svm_range *prange; uint32_t prefetch_loc = KFD_IOCTL_SVM_LOCATION_UNDEFINED; uint32_t location = KFD_IOCTL_SVM_LOCATION_UNDEFINED; - uint32_t flags = 0x; + uint32_t flags_and = 0x; + uint32_t flags_or = 0; int gpuidx; uint32_t i; @@ -3046,12 +3047,12 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size, get_accessible = true; break; case KFD_IOCTL_SVM_ATTR_SET_FLAGS: + case KFD_IOCTL_SVM_ATTR_CLR_FLAGS: get_flags = true; break; case KFD_IOCTL_SVM_ATTR_GRANULARITY: get_granularity = true; break; - case KFD_IOCTL_SVM_ATTR_CLR_FLAGS: case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE: case KFD_IOCTL_SVM_ATTR_NO_ACCESS: fallthrough; @@ -3069,7 +3070,8 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size, if (!node) { pr_debug("range attrs not found return default values\n"); svm_range_set_default_attributes(&location, &prefetch_loc, - &granularity, &flags); + &granularity, &flags_and); + flags_or = flags_and; if (p->xnack_enabled) bitmap_copy(bitmap_access, svms->bitmap_supported, MAX_GPU_INSTANCE); @@ -3115,8 +3117,10 @@ svm_range_get_attr(struct 
kfd_process *p, uint64_t start, uint64_t size, bitmap_and(bitmap_aip, bitmap_aip, prange->bitmap_aip, MAX_GPU_INSTANCE); } - if (get_flags) - flags &= prange->flags; + if (get_flags) { + flags_and &= prange->flags; + flags_or |= prange->flags; + } if (get_granularity && prange->granularity < granularity) granularity = prange->granularity; @@ -3150,7 +3154,10 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size, attrs[i].type = KFD_IOCTL_SVM_ATTR_NO_ACCESS; break; case KFD_IOCTL_SVM_ATTR_SET_FLAGS: - attrs[i].value = flags; + attrs[i].value = flags_and; + break; + case KFD_IOCTL_SVM_ATTR_CLR_FLAGS: + attrs[i].value = ~flags_or; break; case KFD_IOCTL_SVM_ATTR_GRANULARITY: attrs[i].value = (uint32_t)granularity; diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h index 3cb5b5dd9f77..af96af174dc4 100644 --- a/include/uapi/linux/kfd_ioctl.h +++ b/include/uapi/linux/kfd_ioctl.h @@ -31,9 +31,10 @@ * - 1.3 - Add SMI events support * - 1.4 - Indicate new SRAM EDC bit in device properties * - 1.5 - Add SVM API + * - 1.6 - Query clear flags in SVM get_attr API */ #define KFD_IOCTL_MAJOR_VERSION 1 -#define KFD_IOCTL_MINOR_VERSION 5 +#define KFD_IOCTL_MINOR_VERSION 6 struct kfd_ioctl_get_version_args { __u32 major_version; /* from KFD */ @@ -575,18 +576,19 @@ struct kfd_ioctl_svm_attribute { * @KFD_IOCTL_SVM_ATTR_PREFERRED_LOC or * @KFD_IOCTL_SVM_ATTR_PREFETCH_LOC resepctively. For * @KFD_IOCTL_SVM_ATTR_SET_FLAGS, flags of all pages will be - * aggregated by bitwise AND. The minimum migration granularity - * throughout the range will be returned for - * @KFD_IOCTL_SVM_ATTR_GRANULARITY. + * aggregated by bitwise AND. That means, a flag will be set in the + * output, if that flag is set for all pages in the range. For + * @KFD_IOCTL_SVM_ATTR_CLR_FLAGS, flags of all pages will be + * aggregated by bitwise NOR. That means, a flag will be set in the + * output, if that flag is clear for all pages in the range. 
+ * The minimum migration granularity throughout the range will be + * returned for @KFD_IOCTL_SVM_ATTR_GRANULARITY. * * Querying of accessibility attributes works by initializing the * attribute type to @KFD_IOCTL_SVM_ATTR_ACCESS and the value to the * GPUID being queried. Multiple attributes can be given to allow * querying multiple GPUIDs. The ioctl function overwrites the * attribute type to indicate the access for the specified GPU. - * - * @KFD_IOCTL_SVM_ATTR_CLR_FLAGS is invalid for - * @KFD_IOCTL_SVM_OP_GET_ATTR. */ struct kfd_ioctl_svm_args { __u64 start_addr;
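The AND/NOR aggregation this patch introduces can be seen in isolation: a flag survives the AND iff it is set on every range in the queried interval, and a flag is set in the complement of the OR (the NOR) iff it is clear on every range. A minimal sketch in plain C, not the driver code:

```c
#include <assert.h>
#include <stdint.h>

/* flags_and answers "is this flag set on every range?", matching the
 * KFD_IOCTL_SVM_ATTR_SET_FLAGS output in the patch. */
static uint32_t flags_set_everywhere(const uint32_t *range_flags, int n)
{
	uint32_t flags_and = 0xffffffffu;

	for (int i = 0; i < n; i++)
		flags_and &= range_flags[i];
	return flags_and;
}

/* ~flags_or (bitwise NOR) answers "is this flag clear on every range?",
 * matching the KFD_IOCTL_SVM_ATTR_CLR_FLAGS output. */
static uint32_t flags_clear_everywhere(const uint32_t *range_flags, int n)
{
	uint32_t flags_or = 0;

	for (int i = 0; i < n; i++)
		flags_or |= range_flags[i];
	return ~flags_or;
}
```

Combining both queries answers the question from the commit message: a flag that appears in neither output is set on some ranges and clear on others (a "mix of both").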
[PATCH v2] drm/amdgpu - Corrected the video codecs array name for yellow carp
From: Veerabadhran Gopalakrishnan Signed-off-by: Veerabadhran Gopalakrishnan --- drivers/gpu/drm/amd/amdgpu/nv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index cf73a6923..751c7b8b1 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -173,8 +173,8 @@ static const struct amdgpu_video_codec_info yc_video_codecs_decode_array[] = { }; static const struct amdgpu_video_codecs yc_video_codecs_decode = { - .codec_count = ARRAY_SIZE(bg_video_codecs_decode_array), - .codec_array = bg_video_codecs_decode_array, + .codec_count = ARRAY_SIZE(yc_video_codecs_decode_array), + .codec_array = yc_video_codecs_decode_array, }; static int nv_query_video_codecs(struct amdgpu_device *adev, bool encode, -- 2.25.1
RE: [PATCH] drm/amdgpu - Corrected the video codecs array name for yellow carp
[Public] .codec_count = ARRAY_SIZE(bg_video_codecs_decode_array), You need to change bg_xxx to yc_xxx as well. Regards, Guchun -Original Message- From: amd-gfx On Behalf Of veerabadhran.gopalakrish...@amd.com Sent: Monday, July 19, 2021 9:39 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Boyuan ; Gopalakrishnan, Veerabadhran (Veera) ; Zhu, James ; Liu, Leo ; Rao, Srinath Subject: [PATCH] drm/amdgpu - Corrected the video codecs array name for yellow carp From: Veerabadhran Gopalakrishnan Signed-off-by: Veerabadhran Gopalakrishnan --- drivers/gpu/drm/amd/amdgpu/nv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index cf73a6923..3027b44df 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -174,7 +174,7 @@ static const struct amdgpu_video_codec_info yc_video_codecs_decode_array[] = { static const struct amdgpu_video_codecs yc_video_codecs_decode = { .codec_count = ARRAY_SIZE(bg_video_codecs_decode_array), - .codec_array = bg_video_codecs_decode_array, + .codec_array = yc_video_codecs_decode_array, }; static int nv_query_video_codecs(struct amdgpu_device *adev, bool encode, -- 2.25.1
[PATCH] drm/amdgpu - Corrected the video codecs array name for yellow carp
From: Veerabadhran Gopalakrishnan Signed-off-by: Veerabadhran Gopalakrishnan --- drivers/gpu/drm/amd/amdgpu/nv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index cf73a6923..3027b44df 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -174,7 +174,7 @@ static const struct amdgpu_video_codec_info yc_video_codecs_decode_array[] = { static const struct amdgpu_video_codecs yc_video_codecs_decode = { .codec_count = ARRAY_SIZE(bg_video_codecs_decode_array), - .codec_array = bg_video_codecs_decode_array, + .codec_array = yc_video_codecs_decode_array, }; static int nv_query_video_codecs(struct amdgpu_device *adev, bool encode, -- 2.25.1
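The bug under review is a classic copy-paste hazard: `.codec_count` was taken from one array (`bg_video_codecs_decode_array`) while `.codec_array` pointed at another (`yc_video_codecs_decode_array`), so the reported count can exceed the real array length. A minimal sketch of why both fields must be derived from the same array (stand-in types, not the real amdgpu structs):

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct codec_info { int type; };
struct codecs {
	size_t codec_count;
	const struct codec_info *codec_array;
};

/* Hypothetical arrays of different lengths, like bg_* vs yc_*. */
static const struct codec_info bg_decode[] = { {1}, {2}, {3} };
static const struct codec_info yc_decode[] = { {1}, {2} };

/* The buggy pairing (count from bg_decode, pointer to yc_decode) would
 * claim 3 entries for a 2-entry array and read past its end; the fix
 * keeps count and pointer tied to the same array. */
static const struct codecs yc_codecs = {
	.codec_count = ARRAY_SIZE(yc_decode),
	.codec_array = yc_decode,
};
```

With `ARRAY_SIZE` computed next to the matching array name, a mismatch is visible at a glance — which is exactly what Guchun's review caught.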
Re: [PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v3)
On 2021-07-19 4:24 a.m., Lazar, Lijo wrote: On 7/14/2021 11:28 PM, Luben Tuikov wrote: This fixes a bug which if we probe a non-existing I2C device, and the SMU returns 0xFF, from then on we can never communicate with the SMU, because the code before this patch reads and interprets 0xFF as a terminal error, and thus we never write 0 into register 90 to clear the status (and subsequently send a new command to the SMU.) It is not an error that the SMU returns status 0xFF. This means that the SMU executed the last command successfully (execution status), but the command result is an error of some sort (execution result), depending on what the command was. When doing a status check of the SMU, before we send a new command, the only status which precludes us from sending a new command is 0--the SMU hasn't finished executing a previous command, and 0xFC--the SMU is busy. This bug was seen as the following line in the kernel log, amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state! when subsequent SMU commands, not necessarily related to I2C, were sent to the SMU. This patch fixes this bug. v2: Add a comment to the description of __smu_cmn_poll_stat() to explain why we're NOT defining the SMU FW return codes as macros, but are instead hard-coding them. Such a change, can be followed up by a subsequent patch. v3: The changes are, a) Add comments to break labels in __smu_cmn_reg2errno(). b) When an unknown/unspecified/undefined result is returned back from the SMU, map that to -EREMOTEIO, to distinguish failure at the SMU FW. c) Add kernel-doc to smu_cmn_send_msg_without_waiting(), smu_cmn_wait_for_response(), smu_cmn_send_smc_msg_with_param(). d) In smu_cmn_send_smc_msg_with_param(), since we wait for completion of the command, if the result of the completion is undefined/unknown/unspecified, we print that to the kernel log. 
Cc: Alex Deucher Cc: Evan Quan Cc: Lijo Lazar Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 274 - drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h | 3 +- 2 files changed, 230 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c index c902fdf322c1be..dde10c51daa106 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c @@ -55,7 +55,7 @@ #undef __SMU_DUMMY_MAP #define __SMU_DUMMY_MAP(type) #type -static const char* __smu_message_names[] = { +static const char * const __smu_message_names[] = { SMU_MESSAGE_TYPES }; @@ -76,55 +76,246 @@ static void smu_cmn_read_arg(struct smu_context *smu, *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82); } -int smu_cmn_wait_for_response(struct smu_context *smu) +/** + * __smu_cmn_poll_stat -- poll for a status from the SMU + * smu: a pointer to SMU context + * + * Returns the status of the SMU, which could be, + *0, the SMU is busy with your previous command; + *1, execution status: success, execution result: success; + * 0xFF, execution status: success, execution result: failure; ^ right here + * 0xFE, unknown command; + * 0xFD, valid command, but bad (command) prerequisites; + * 0xFC, the command was rejected as the SMU is busy; + * 0xFB, "SMC_Result_DebugDataDumpEnd". + * + * The values here are not defined by macros, because I'd rather we + * include a single header file which defines them, which is + * maintained by the SMU FW team, so that we're impervious to firmware + * changes. At the moment those values are defined in various header + * files, one for each ASIC, yet here we're a single ASIC-agnostic + * interface. Such a change can be followed-up by a subsequent patch. 
+ */ +static u32 __smu_cmn_poll_stat(struct smu_context *smu) { struct amdgpu_device *adev = smu->adev; - uint32_t cur_value, i, timeout = adev->usec_timeout * 20; + int timeout = adev->usec_timeout * 20; + u32 reg; - for (i = 0; i < timeout; i++) { - cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90); - if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0) - return cur_value; + for ( ; timeout > 0; timeout--) { + reg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90); + if ((reg & MP1_C2PMSG_90__CONTENT_MASK) != 0) + break; udelay(1); } - /* timeout means wrong logic */ - if (i == timeout) - return -ETIME; - - return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90); + return reg; } -int smu_cmn_send_msg_without_waiting(struct smu_context *smu, - uint16_t msg, uint32_t param) +static void __smu_cmn_reg_print_error(struct smu_context *smu, + u32 reg_c2pmsg_90, + int msg_index, + u32 param, + enum smu_message_type msg) {
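The core of the fix can be sketched independently of the driver: of the status codes documented in `__smu_cmn_poll_stat()`, only 0x00 (previous command still executing) and 0xFC (SMU busy) forbid sending a new message; 0xFF means the last command completed with a failing result and must not poison later commands. The patch only names -EREMOTEIO for undefined results — the other errno choices below are illustrative assumptions, not the patch's exact mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Status values hard-coded as in the commit message, pending a shared
 * firmware header. */
enum {
	SMU_RESP_NONE = 0x00,           /* SMU still busy with prior command */
	SMU_RESP_OK = 0x01,             /* executed, result: success */
	SMU_RESP_DEBUG_END = 0xFB,
	SMU_RESP_BUSY_OTHER = 0xFC,     /* command rejected, SMU busy */
	SMU_RESP_CMD_BAD_PREREQ = 0xFD,
	SMU_RESP_CMD_UNKNOWN = 0xFE,
	SMU_RESP_CMD_FAIL = 0xFF,       /* executed, result: failure */
};

/* May we issue a new command? Only "still executing" and "busy" say no;
 * in particular 0xFF does NOT block the next command. */
static int smu_can_send(uint32_t status)
{
	return status != SMU_RESP_NONE && status != SMU_RESP_BUSY_OTHER;
}

static int smu_status_to_errno(uint32_t status)
{
	switch (status) {
	case SMU_RESP_OK:             return 0;
	case SMU_RESP_NONE:           return -ETIME;  /* never completed */
	case SMU_RESP_BUSY_OTHER:     return -EBUSY;
	case SMU_RESP_CMD_UNKNOWN:    return -EINVAL;
	case SMU_RESP_CMD_BAD_PREREQ: return -EPERM;
	case SMU_RESP_CMD_FAIL:       return -EIO;    /* ran, result failed */
	default:                      return -EREMOTEIO; /* undefined at SMU FW */
	}
}
```

This is why the pre-patch code hung after an I2C probe of a missing device: it treated 0xFF like 0x00/0xFC, never cleared register 90, and so refused every subsequent message.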
RE: [PATCH 00/32] DC Patches July 18, 2021
[Public] Hi all, This week this patchset was tested on the following systems: HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) AMD Ryzen 9 5900H, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) Sapphire Pulse RX5700XT with the following display types: 4k 60hz (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA) Reference AMD RX6800 with the following display types: 4k 60hz (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA) Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz, and 3x 1080p 60hz on all systems. Tested-by: Daniel Wheeler Thank you, Dan Wheeler Technologist | AMD SW Display -- 1 Commerce Valley Dr E, Thornhill, ON L3T 7X6 Facebook | Twitter | amd.com -Original Message- From: Siqueira, Rodrigo Sent: July 18, 2021 10:06 AM To: amd-gfx@lists.freedesktop.org Cc: Wentland, Harry ; Li, Sun peng (Leo) ; Lakha, Bhawanpreet ; Siqueira, Rodrigo ; Pillai, Aurabindo ; Zhuo, Qingqing ; R, Bindu ; Jacob, Anson ; Wheeler, Daniel Subject: [PATCH 00/32] DC Patches July 18, 2021 DC version 3.2.144 brings improvements in multiple areas. 
In summary, we highlight: - Code improvements for passive - Cursor manipulation enhancements - Expand debug in some areas - Fix problems in DML - Other minor code refactors Cc: Daniel Wheeler Anthony Koo (2): drm/amd/display: [FW Promotion] Release 0.0.73 drm/amd/display: [FW Promotion] Release 0.0.75 Aric Cyr (3): drm/amd/display: 3.2.143 drm/amd/display: 3.2.144 drm/amd/display: 3.2.145 Aurabindo Pillai (1): drm/amd/display: add debug print for DCC validation failure Bindu Ramamurthy (2): drm/amd/display: Populate socclk entries for dcn3.02/3.03 drm/amd/display: Populate dtbclk entries for dcn3.02/3.03 Camille Cho (1): drm/amd/display: Only set default brightness for OLED Charlene Liu (1): drm/amd/display: reset dpcd_cap.dpcd_rev for passive dongle. Dmytro Laktyushkin (1): drm/amd/display: remove compbuf size wait Eric Yang (3): drm/amd/display: implement workaround for riommu related hang drm/amd/display: add workaround for riommu invalidation request hang drm/amd/display: change zstate allow msg condition Ian Chen (1): drm/amd/display: Extend dmub_cmd_psr_copy_settings_data struct Jake Wang (1): drm/amd/display: Fixed hardware power down bypass during headless boot Josip Pavic (1): drm/amd/display: log additional register state for debug Krunoslav Kovac (2): drm/amd/display: Assume active upper layer owns the HW cursor drm/amd/display: Refine condition for cursor visibility Michael Strauss (1): drm/amd/display: Enable eDP ILR on DCN2.1 Mikita Lipski (2): drm/amd/display: Prevent Diags from entering S2 drm/amd/display: Remove MALL function from DCN3.1 Nevenko Stupar (1): drm/amd/display: Line Buffer changes Nicholas Kazlauskas (3): drm/amd/display: Fix max vstartup calculation for modes with borders drm/amd/display: Query VCO frequency from register for DCN3.1 drm/amd/display: Update bounding box for DCN3.1 Oliver Logush (1): drm/amd/display: Fix timer_per_pixel unit error Stylon Wang (1): drm/amd/display: Re-enable "Guard ASSR with internal display flag" Victor 
Lu (1): drm/amd/display: Fix comparison error in dcn21 DML Wesley Chalmers (1): drm/amd/display: Add copyright notice to new files Zhan Liu (1): drm/amd/display: Reduce delay when sink device not able to ACK 00340h write sunglee (1): drm/amd/display: DCN2X Prefer ODM over bottom pipe to find second pipe .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +- .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 ++ .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c| 2 +- .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 59 --- .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.h | 54 -- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 8 +++ .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 72 +-- .../drm/amd/display/dc/core/dc_link_dpcd.c| 25 +++ drivers/gpu/drm/amd/display/dc/dc.h | 12 ++-- drivers/gpu/drm/amd/display/dc/dc_dp_types.h | 1 + .../gpu/drm/amd/display/dc/dce/dce_hwseq.h| 4 +- .../drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c | 7 +- .../drm/amd/display/dc/dcn10/dcn10_hubbub.h | 19 - .../gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c | 29 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h | 4 ++ .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 42 ++- .../gpu/drm/amd/display/dc/dcn10
Re: [PATCH v4 03/13] kernel: resource: lookup_resource as exported symbol
On Sat, Jul 17, 2021 at 02:21:25PM -0500, Alex Sierra wrote: > return res; > } > - > +EXPORT_SYMBOL_GPL(lookup_resource); > /* Please keep this empty line (after the EXPORT_SYMBOL).
Re: [PATCH v4 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages
On Sat, Jul 17, 2021 at 02:21:30PM -0500, Alex Sierra wrote: > Add MEMORY_DEVICE_GENERIC case to free_zone_device_page > callback. > Device generic type memory case is now able to free its > pages properly. No need for the early line breaks: Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. Device generic type memory case is now able to free its pages properly.
Re: [PATCH v4 05/13] drm/amdkfd: generic type as sys mem on migration to ram
On Sat, Jul 17, 2021 at 02:21:27PM -0500, Alex Sierra wrote: > + migrate.flags = adev->gmc.xgmi.connected_to_cpu ? > + MIGRATE_VMA_SELECT_SYSTEM : > MIGRATE_VMA_SELECT_DEVICE_PRIVATE; Please just use a good old if/else to keep the code readable.
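The rewrite Christoph asks for might look like the following sketch. The struct and the flag values are stand-ins so the snippet compiles on its own; the real `adev->gmc.xgmi.connected_to_cpu` path and the kernel's `MIGRATE_VMA_SELECT_*` definitions differ.

```c
#include <assert.h>

/* Stand-in flag values; the real ones live in <linux/migrate.h>. */
enum migrate_select {
	MIGRATE_VMA_SELECT_SYSTEM = 1,
	MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 2,
};

/* Stand-in for the adev->gmc.xgmi field queried in the quoted hunk. */
struct fake_adev {
	int connected_to_cpu;
};

/* The same selection as the quoted ternary, written as if/else so the
 * two outcomes read on separate lines. */
static enum migrate_select pick_migrate_flags(const struct fake_adev *adev)
{
	if (adev->connected_to_cpu)
		return MIGRATE_VMA_SELECT_SYSTEM;
	return MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
}
```

The behavior is identical; the gain is purely that a reviewer sees each branch on its own line instead of a wrapped conditional expression.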
RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority
[AMD Official Use Only] Besides, I think our current KMD have three types of kernel sdma jobs: 1) adev->mman.entity, it is already a KERNEL priority entity 2) vm->immediate 3) vm->delay Do you mean now vm->immediate or delay are used as moving jobs instead of mman.entity ? Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Liu, Monk Sent: Monday, July 19, 2021 5:40 PM To: 'Christian König' ; Chen, JingWen ; amd-gfx@lists.freedesktop.org Cc: Chen, Horace Subject: RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority [AMD Official Use Only] If there is move jobs clashing there we probably need to fix the bugs of those move jobs Previously I believe you also remember that we agreed to always trust kernel jobs especially paging jobs, Without set paging jobs' priority to KERNEL level how can we keep that protocol ? do you have a better idea? Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Christian König Sent: Monday, July 19, 2021 4:25 PM To: Chen, JingWen ; amd-gfx@lists.freedesktop.org Cc: Chen, Horace ; Liu, Monk Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority Am 19.07.21 um 07:57 schrieb Jingwen Chen: > [Why] > Current vm_pte entities have NORMAL priority, in SRIOV multi-vf use > case, the vf flr happens first and then job time out is found. > There can be several jobs timeout during a very small time slice. > And if the innocent sdma job time out is found before the real bad > job, then the innocent sdma job will be set to guilty as it only has > NORMAL priority. This will lead to a page fault after resubmitting > job. > > [How] > sdma should always have KERNEL priority. The kernel job will always be > resubmitted. I'm not sure if that is a good idea. We intentionally didn't gave the page table updates kernel priority to avoid clashing with the move jobs. Christian. 
> > Signed-off-by: Jingwen Chen > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 358316d6a38c..f7526b67cc5d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct > amdgpu_vm *vm) > INIT_LIST_HEAD(&vm->done); > > /* create scheduler entities for page table updates */ > - r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL, > + r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL, > adev->vm_manager.vm_pte_scheds, > adev->vm_manager.vm_pte_num_scheds, NULL); > if (r) > return r; > > - r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL, > + r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL, > adev->vm_manager.vm_pte_scheds, > adev->vm_manager.vm_pte_num_scheds, NULL); > if (r) ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v3)
On 7/14/2021 11:28 PM, Luben Tuikov wrote: This fixes a bug which if we probe a non-existing I2C device, and the SMU returns 0xFF, from then on we can never communicate with the SMU, because the code before this patch reads and interprets 0xFF as a terminal error, and thus we never write 0 into register 90 to clear the status (and subsequently send a new command to the SMU.) It is not an error that the SMU returns status 0xFF. This means that the SMU executed the last command successfully (execution status), but the command result is an error of some sort (execution result), depending on what the command was. When doing a status check of the SMU, before we send a new command, the only status which precludes us from sending a new command is 0--the SMU hasn't finished executing a previous command, and 0xFC--the SMU is busy. This bug was seen as the following line in the kernel log, amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state! when subsequent SMU commands, not necessarily related to I2C, were sent to the SMU. This patch fixes this bug. v2: Add a comment to the description of __smu_cmn_poll_stat() to explain why we're NOT defining the SMU FW return codes as macros, but are instead hard-coding them. Such a change, can be followed up by a subsequent patch. v3: The changes are, a) Add comments to break labels in __smu_cmn_reg2errno(). b) When an unknown/unspecified/undefined result is returned back from the SMU, map that to -EREMOTEIO, to distinguish failure at the SMU FW. c) Add kernel-doc to smu_cmn_send_msg_without_waiting(), smu_cmn_wait_for_response(), smu_cmn_send_smc_msg_with_param(). d) In smu_cmn_send_smc_msg_with_param(), since we wait for completion of the command, if the result of the completion is undefined/unknown/unspecified, we print that to the kernel log. 
Cc: Alex Deucher
Cc: Evan Quan
Cc: Lijo Lazar
Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message")
Signed-off-by: Luben Tuikov
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 274 -
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   3 +-
 2 files changed, 230 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c902fdf322c1be..dde10c51daa106 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -55,7 +55,7 @@
 #undef __SMU_DUMMY_MAP
 #define __SMU_DUMMY_MAP(type)	#type

-static const char* __smu_message_names[] = {
+static const char * const __smu_message_names[] = {
 	SMU_MESSAGE_TYPES
 };

@@ -76,55 +76,246 @@ static void smu_cmn_read_arg(struct smu_context *smu,
 	*arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
 }

-int smu_cmn_wait_for_response(struct smu_context *smu)
+/**
+ * __smu_cmn_poll_stat -- poll for a status from the SMU
+ * smu: a pointer to SMU context
+ *
+ * Returns the status of the SMU, which could be,
+ *    0, the SMU is busy with your previous command;
+ *    1, execution status: success, execution result: success;
+ * 0xFF, execution status: success, execution result: failure;
+ * 0xFE, unknown command;
+ * 0xFD, valid command, but bad (command) prerequisites;
+ * 0xFC, the command was rejected as the SMU is busy;
+ * 0xFB, "SMC_Result_DebugDataDumpEnd".
+ *
+ * The values here are not defined by macros, because I'd rather we
+ * include a single header file which defines them, which is
+ * maintained by the SMU FW team, so that we're impervious to firmware
+ * changes. At the moment those values are defined in various header
+ * files, one for each ASIC, yet here we're a single ASIC-agnostic
+ * interface. Such a change can be followed-up by a subsequent patch.
+ */
+static u32 __smu_cmn_poll_stat(struct smu_context *smu)
 {
 	struct amdgpu_device *adev = smu->adev;
-	uint32_t cur_value, i, timeout = adev->usec_timeout * 20;
+	int timeout = adev->usec_timeout * 20;
+	u32 reg;

-	for (i = 0; i < timeout; i++) {
-		cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
-		if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0)
-			return cur_value;
+	for ( ; timeout > 0; timeout--) {
+		reg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
+		if ((reg & MP1_C2PMSG_90__CONTENT_MASK) != 0)
+			break;

 		udelay(1);
 	}

-	/* timeout means wrong logic */
-	if (i == timeout)
-		return -ETIME;
-
-	return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
+	return reg;
 }

-int smu_cmn_send_msg_without_waiting(struct smu_context *smu,
-				     uint16_t msg, uint32_t param)
+static void __smu_cmn_reg_print_error(struct smu_context *smu,
+				      u32 reg_c2pmsg_90,
+				      int msg_ind
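The status-handling rules the commit message spells out (0xFF is a command result, not a terminal error; only 0x00 and 0xFC block a new command; anything undefined maps to -EREMOTEIO) can be sketched in plain userspace C. The helper names below are illustrative, the register read is stubbed out entirely, and the specific errno chosen for each recognized status is an assumption for this sketch rather than the patch's exact table:

```c
#include <assert.h>
#include <errno.h>

/* SMU response codes as listed in the __smu_cmn_poll_stat() kernel-doc
 * above; hard-coded here just as in the patch. */
#define SMU_RESP_NONE            0x00 /* still busy with the previous command */
#define SMU_RESP_OK              0x01 /* executed, result: success */
#define SMU_RESP_CMD_FAIL        0xFF /* executed, result: failure */
#define SMU_RESP_CMD_UNKNOWN     0xFE
#define SMU_RESP_CMD_BAD_PREREQ  0xFD
#define SMU_RESP_BUSY_OTHER      0xFC

/* Map a polled status to an errno.  The key v3 rule is the default
 * branch: an unrecognized result becomes -EREMOTEIO, marking failure
 * on the SMU firmware side.  The other errno choices are illustrative. */
static int smu_reg2errno(unsigned int reg)
{
	switch (reg) {
	case SMU_RESP_NONE:
	case SMU_RESP_BUSY_OTHER:
		return -EBUSY;		/* cannot send a new command yet */
	case SMU_RESP_OK:
		return 0;
	case SMU_RESP_CMD_FAIL:
		return -EIO;		/* executed, but the result is an error */
	case SMU_RESP_CMD_UNKNOWN:
		return -EOPNOTSUPP;
	case SMU_RESP_CMD_BAD_PREREQ:
		return -EINVAL;
	default:
		return -EREMOTEIO;	/* undefined result from the firmware */
	}
}

/* Pre-check before sending a new command: only 0x00 and 0xFC preclude
 * it.  In particular 0xFF must NOT block further traffic -- treating it
 * as terminal is exactly the bug this patch fixes. */
static int smu_may_send(unsigned int reg)
{
	return reg != SMU_RESP_NONE && reg != SMU_RESP_BUSY_OTHER;
}
```

With this split, a 0xFF from an I2C probe of a non-existing device fails that one command (-EIO) but leaves the mailbox usable for the next one.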
Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd
On 7/16/2021 10:13 PM, Jonathan Kim wrote:

Report the min/max bandwidth in megabytes to the kfd for direct xgmi connections only.

v2: change reporting from num links to bandwidth

Signed-off-by: Jonathan Kim
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 23 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 12 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c      | 12 +++
 5 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index bfab2f9fdd17..3978578a1c49 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -553,6 +553,29 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev *src)
 	return (uint8_t)ret;
 }

+int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, struct kgd_dev *src, bool is_min)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)dst, *peer_adev;
+	int num_links;
+
+	if (adev->asic_type != CHIP_ALDEBARAN)
+		return 0;
+
+	if (src)
+		peer_adev = (struct amdgpu_device *)src;
+
+	num_links = is_min ? 1 : amdgpu_xgmi_get_num_links(adev, peer_adev);
+	if (num_links < 0) {
+		DRM_ERROR("amdgpu: failed to get xgmi num links between node %d and %d. ret = %d\n",
+			  adev->gmc.xgmi.physical_node_id,
+			  peer_adev->gmc.xgmi.physical_node_id, num_links);
+		num_links = 0;
+	}
+
+	/* Aldebaran xGMI DPM is defeatured so assume x16 x 25Gbps for bandwidth. */
+	return (num_links * 16 * 25000)/BITS_PER_BYTE;

Instead of having ASIC family checks and bandwidth info in interface functions, better to have this info come from base layer (amdgpu_xgmi or xgmi ip). That will help to handle other ASICs.
Thanks,
Lijo

 uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev *kgd)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 81264517d532..e12fccb2d2c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -226,6 +226,7 @@ uint32_t amdgpu_amdkfd_get_num_gws(struct kgd_dev *kgd);
 uint32_t amdgpu_amdkfd_get_asic_rev_id(struct kgd_dev *kgd);
 int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd);
 uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev *src);
+int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, struct kgd_dev *src, bool is_min);

 /* Read user wptr from a specified user address space with page fault
  * disabled. The memory must be pinned and mapped to the hardware when
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 8567d5d77346..258cf86b32f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -486,6 +486,18 @@ int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev,
 	return -EINVAL;
 }

+int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
+			      struct amdgpu_device *peer_adev)
+{
+	struct psp_xgmi_topology_info *top = &adev->psp.xgmi_context.top_info;
+	int i;
+
+	for (i = 0 ; i < top->num_nodes; ++i)
+		if (top->nodes[i].node_id == peer_adev->gmc.xgmi.node_id)
+			return top->nodes[i].num_links;
+	return -EINVAL;
+}
+
 int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
 {
 	struct psp_xgmi_topology_info *top_info;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index 12969c0830d5..d2189bf7d428 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -59,6 +59,8 @@ int amdgpu_xgmi_remove_device(struct amdgpu_device *adev);
 int amdgpu_xgmi_set_pstate(struct amdgpu_device *adev, int pstate);
 int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev,
 		struct amdgpu_device *peer_adev);
+int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
+		struct amdgpu_device *peer_adev);
 uint64_t amdgpu_xgmi_get_relative_phy_addr(struct amdgpu_device *adev,
 					   uint64_t addr);
 static inline bool amdgpu_xgmi_same_hive(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index c6b02aee4993..40ce6239c813 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1989,6 +1989,13 @@ static int kfd_fill_gpu_direct_io_link_to_cpu(int *avail_size,
 		sub_type_hdr->flags |= CRAT_IOLINK_FLAGS_BI_DIRECTIONAL;
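The bandwidth arithmetic in the patch above (each Aldebaran xGMI link assumed x16 at 25 Gbps per lane, divided by BITS_PER_BYTE to get MB/s, with is_min selecting a single link and a failed topology query reported as 0) can be checked in isolation. This is a standalone sketch of that formula, not the driver function itself:

```c
#include <assert.h>

#define BITS_PER_BYTE 8

/* Peak xGMI bandwidth in MB/s as assumed in the patch: xGMI DPM is
 * defeatured on Aldebaran, so each link is taken as x16 at 25 Gbps per
 * lane, i.e. 16 * 25000 Mbit/s per link.  is_min collapses the count
 * to one link; a negative link count (topology lookup failure) is
 * reported as zero bandwidth, mirroring the error path above. */
static int xgmi_bandwidth_mbytes(int num_links, int is_min)
{
	if (num_links < 0)
		num_links = 0;		/* amdgpu_xgmi_get_num_links() failed */
	if (is_min && num_links > 0)
		num_links = 1;		/* minimum: a single direct link */
	return (num_links * 16 * 25000) / BITS_PER_BYTE;
}
```

So one link reports 50000 MB/s and the max scales linearly with the link count, which is the min/max pair the kfd receives.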
Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority
On 19.07.21 at 11:42, Liu, Monk wrote:

[AMD Official Use Only]

Besides, I think our current KMD has three types of kernel sdma jobs:
1) adev->mman.entity, it is already a KERNEL priority entity
2) vm->immediate
3) vm->delay
Do you mean vm->immediate or delay are now used for moving jobs instead of mman.entity?

No, exactly that's the point. vm->immediate and vm->delayed are not for kernel paging jobs. Those are used for userspace page table updates. I agree that those should probably not be considered guilty, but modifying their priority is not the right way of doing that.

Regards,
Christian.

Thanks

--
Monk Liu | Cloud-GPU Core team
--

-----Original Message-----
From: Liu, Monk
Sent: Monday, July 19, 2021 5:40 PM
To: 'Christian König' ; Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace
Subject: RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

[AMD Official Use Only]

If there are move jobs clashing there, we probably need to fix the bugs of those move jobs. Previously, I believe you also remember that we agreed to always trust kernel jobs, especially paging jobs. Without setting paging jobs' priority to KERNEL level, how can we keep that protocol? Do you have a better idea?

Thanks

--
Monk Liu | Cloud-GPU Core team
--

-----Original Message-----
From: Christian König
Sent: Monday, July 19, 2021 4:25 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

On 19.07.21 at 07:57, Jingwen Chen wrote:

[Why]
Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf use case, the vf flr happens first and then the job timeout is found. There can be several job timeouts during a very small time slice, and if the innocent sdma job timeout is found before the real bad job, then the innocent sdma job will be set to guilty as it only has NORMAL priority. This will lead to a page fault after resubmitting the job.

[How]
sdma should always have KERNEL priority.
The kernel job will always be resubmitted.

I'm not sure if that is a good idea. We intentionally didn't give the page table updates kernel priority to avoid clashing with the move jobs.

Christian.

Signed-off-by: Jingwen Chen
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 358316d6a38c..f7526b67cc5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	INIT_LIST_HEAD(&vm->done);

 	/* create scheduler entities for page table updates */
-	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
+	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
				  adev->vm_manager.vm_pte_scheds,
				  adev->vm_manager.vm_pte_num_scheds, NULL);
 	if (r)
 		return r;

-	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
+	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
				  adev->vm_manager.vm_pte_scheds,
				  adev->vm_manager.vm_pte_num_scheds, NULL);
 	if (r)
[PATCH] drm/amd/amdgpu: vm entities should have kernel priority
[Why]
Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf use case, the vf flr happens first and then the job timeout is found. There can be several job timeouts during a very small time slice, and if the innocent sdma job timeout is found before the real bad job, then the innocent sdma job will be set to guilty as it only has NORMAL priority. This will lead to a page fault after resubmitting the job.

[How]
sdma should always have KERNEL priority. The kernel job will always be resubmitted.

Signed-off-by: Jingwen Chen
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 358316d6a38c..f7526b67cc5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	INIT_LIST_HEAD(&vm->done);

 	/* create scheduler entities for page table updates */
-	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
+	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
				  adev->vm_manager.vm_pte_scheds,
				  adev->vm_manager.vm_pte_num_scheds, NULL);
 	if (r)
 		return r;

-	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
+	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
				  adev->vm_manager.vm_pte_scheds,
				  adev->vm_manager.vm_pte_num_scheds, NULL);
 	if (r)
--
2.25.1
Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority
On 19.07.21 at 07:57, Jingwen Chen wrote:

[Why]
Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf use case, the vf flr happens first and then the job timeout is found. There can be several job timeouts during a very small time slice, and if the innocent sdma job timeout is found before the real bad job, then the innocent sdma job will be set to guilty as it only has NORMAL priority. This will lead to a page fault after resubmitting the job.

[How]
sdma should always have KERNEL priority. The kernel job will always be resubmitted.

I'm not sure if that is a good idea. We intentionally didn't give the page table updates kernel priority to avoid clashing with the move jobs.

Christian.

Signed-off-by: Jingwen Chen
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 358316d6a38c..f7526b67cc5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	INIT_LIST_HEAD(&vm->done);

 	/* create scheduler entities for page table updates */
-	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
+	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
				  adev->vm_manager.vm_pte_scheds,
				  adev->vm_manager.vm_pte_num_scheds, NULL);
 	if (r)
 		return r;

-	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
+	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
				  adev->vm_manager.vm_pte_scheds,
				  adev->vm_manager.vm_pte_num_scheds, NULL);
 	if (r)
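The behavior the thread is debating can be reduced to a toy model: on a timeout, a NORMAL-priority job may be blamed (marked guilty), while a KERNEL-priority job is trusted and resubmitted. The real decision is spread across drm_sched and amdgpu job handling; everything below is an illustrative assumption, not the scheduler's actual code, and it only shows why raising vm->immediate/vm->delayed to KERNEL would shield innocent page-table updates from a mis-attributed timeout:

```c
#include <assert.h>

enum prio { PRIO_NORMAL, PRIO_KERNEL };

struct job {
	enum prio prio;
	int guilty;		/* blamed for the hang */
	int resubmitted;	/* trusted and retried */
};

/* Toy timeout handler: a job below KERNEL priority can be marked
 * guilty (and its resubmission would then page-fault, per the commit
 * message), while a KERNEL-priority job is always resubmitted. */
static void on_timeout(struct job *j)
{
	if (j->prio == PRIO_KERNEL)
		j->resubmitted = 1;
	else
		j->guilty = 1;
}
```

Christian's counterpoint is that this knob is the wrong fix for userspace page-table updates, since KERNEL priority is reserved for paging work and would clash with the move jobs.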
RE: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
[AMD Official Use Only]

Hi Leo

Can you help to review this patch?

--
BW
Pengju Zhou

> -----Original Message-----
> From: amd-gfx On Behalf Of Zhou, Peng Ju
> Sent: Friday, July 16, 2021 10:15 AM
> To: Liu, Monk ; Alex Deucher ; Liu, Leo
> Cc: amd-gfx list
> Subject: RE: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
>
> [AMD Official Use Only]
>
> Hi @Liu, Leo
>
> Can you help to review this patch?
> Monk and Alex have reviewed it.
>
> > -----Original Message-----
> > From: Liu, Monk
> > Sent: Thursday, July 15, 2021 7:54 AM
> > To: Alex Deucher ; Zhou, Peng Ju ; Liu, Leo
> > Cc: amd-gfx list
> > Subject: RE: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
> >
> > [AMD Official Use Only]
> >
> > Reviewed-by: Monk Liu
> >
> > You might need @Liu, Leo's review as well
> >
> > Thanks
> >
> > --
> > Monk Liu | Cloud-GPU Core team
> > --
> >
> > -----Original Message-----
> > From: amd-gfx On Behalf Of Alex Deucher
> > Sent: Wednesday, July 14, 2021 10:49 PM
> > To: Zhou, Peng Ju
> > Cc: amd-gfx list
> > Subject: Re: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
> >
> > On Tue, Jul 13, 2021 at 6:31 AM Peng Ju Zhou wrote:
> > >
> > > The previous logic was recording the number of valid vcn instances to
> > > use them on SRIOV, which is a hard task because vcn access is
> > > based on the index of the vcn instance.
> > >
> > > Check if the vcn instance is enabled before doing instance init.
> > >
> > > Signed-off-by: Peng Ju Zhou
> >
> > Acked-by: Alex Deucher
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 33 ---
> > >  1 file changed, 20 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > index c3580de3ea9c..d11fea2c9d90 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > @@ -88,9 +88,7 @@ static int vcn_v3_0_early_init(void *handle)
> > >         int i;
> > >
> > >         if (amdgpu_sriov_vf(adev)) {
> > > -               for (i = 0; i < VCN_INSTANCES_SIENNA_CICHLID; i++)
> > > -                       if (amdgpu_vcn_is_disabled_vcn(adev, VCN_DECODE_RING, i))
> > > -                               adev->vcn.num_vcn_inst++;
> > > +               adev->vcn.num_vcn_inst = VCN_INSTANCES_SIENNA_CICHLID;
> > >                 adev->vcn.harvest_config = 0;
> > >                 adev->vcn.num_enc_rings = 1;
> > >
> > > @@ -151,8 +149,7 @@ static int vcn_v3_0_sw_init(void *handle)
> > >                 adev->firmware.fw_size += ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
> > >
> > > -       if ((adev->vcn.num_vcn_inst == VCN_INSTANCES_SIENNA_CICHLID) ||
> > > -           (amdgpu_sriov_vf(adev) && adev->asic_type == CHIP_SIENNA_CICHLID)) {
> > > +       if (adev->vcn.num_vcn_inst == VCN_INSTANCES_SIENNA_CICHLID) {
> > >                 adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].ucode_id = AMDGPU_UCODE_ID_VCN1;
> > >                 adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].fw = adev->vcn.fw;
> > >                 adev->firmware.fw_size +=
> > > @@ -322,18 +319,28 @@ static int vcn_v3_0_hw_init(void *handle)
> > >                         continue;
> > >
> > >                 ring = &adev->vcn.inst[i].ring_dec;
> > > -               ring->wptr = 0;
> > > -               ring->wptr_old = 0;
> > > -               vcn_v3_0_dec_ring_set_wptr(ring);
> > > -               ring->sched.ready = true;
> > > -
> > > -               for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
> > > -                       ring = &adev->vcn.inst[i].ring_enc[j];
> > > +               if (amdgpu_vcn_is_disabled_vcn(adev, VCN_DECODE_RING, i)) {
> > > +                       ring->sched.ready = false;
> > > +                       dev_info(adev->dev, "ring %s is disabled by hypervisor\n", ring->name);
> > > +               } else {
> > >                         ring->wptr = 0;
> > >                         ring->wptr_old = 0;
> > > -                       vcn_v3_0_enc_ring_set_wptr(ring);
> > > +                       vcn_v3_0_dec_ring_set_wptr(ring);
> > >                         ring->sched.ready = true;
> > >                 }
> > > +
> > > +               for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
> > > +                       ring = &adev->vcn.inst[i].ring_enc[j];
> > > +                       if (amdgpu_v
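The v2 approach above keeps num_vcn_inst at the hardware maximum and checks each instance for hypervisor disablement at init time, rather than compressing the instance count (which breaks index-based access). A standalone sketch of that iteration, with a plain bitmask standing in for amdgpu_vcn_is_disabled_vcn() and a bare ready flag standing in for the ring bring-up:

```c
#include <assert.h>

#define VCN_INSTANCES_SIENNA_CICHLID 2

struct ring {
	int ready;	/* stands in for ring->sched.ready */
};

/* Toy model of the per-instance check: iterate over ALL instances so
 * that indices stay aligned with the hardware, and mark a ring not
 * ready (instead of skipping it and shifting later indices) when the
 * hypervisor has disabled that vcn instance. */
static void vcn_hw_init(struct ring rings[], int num_inst,
			unsigned int disabled_mask)
{
	for (int i = 0; i < num_inst; i++)
		rings[i].ready = !(disabled_mask & (1u << i));
}
```

The key property is that a disabled instance still occupies its slot, so index-based vcn access on SRIOV remains valid.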