Re: [RFC 0/2] Add generic FPU api similar to x86

2021-07-19 Thread Christian König
While you already CCed a bunch of people, changes like this need to go to
the appropriate mailing lists, not just amd-gfx.


Especially LKML so that other core devs can take a look as well.

Regards,
Christian.

Am 19.07.21 um 21:52 schrieb Anson Jacob:

This is an attempt to have generic FPU enable/disable
calls similar to x86, so that we can simplify
gpu/drm/amd/display/dc/os_types.h.

Also adds FPU correctness logic seen in x86.

Anson Jacob (2):
   ppc/fpu: Add generic FPU api similar to x86
   drm/amd/display: Use PPC FPU functions

  arch/powerpc/include/asm/switch_to.h  |  29 ++---
  arch/powerpc/kernel/process.c | 130 ++
  drivers/gpu/drm/amd/display/dc/os_types.h |  28 +
  3 files changed, 139 insertions(+), 48 deletions(-)



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v3 3/4] drm/amd/display: Add control mechanism for FPU utilization

2021-07-19 Thread Rodrigo Siqueira
DC invokes DC_FPU_START/END in multiple parts of the code; this can
create a situation where we invoke this FPU operation in a nested way or
exit too early. To avoid this situation, this commit adds a
mechanism where dc_fpu_begin/end manages the access to
kernel_fpu_begin/end.
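The nesting guard can be sketched in plain C (a userspace mock: the kernel code keeps the counter per-CPU and guards it with get_cpu_ptr/put_cpu_ptr; the kernel_fpu_begin/end calls are stubbed here):

```c
#include <assert.h>

/* Userspace sketch of the dc_fpu_begin/end nesting guard. The mocked
 * kernel_fpu_begin/end and the plain global counter are stand-ins; the
 * kernel code uses a per-CPU counter bracketed by get_cpu_ptr/put_cpu_ptr. */
int fpu_recursion_depth;
int fpu_hw_enabled;

static void mock_kernel_fpu_begin(void) { fpu_hw_enabled = 1; }
static void mock_kernel_fpu_end(void)   { fpu_hw_enabled = 0; }

void dc_fpu_begin_sketch(void)
{
	fpu_recursion_depth += 1;
	if (fpu_recursion_depth == 1)	/* only the outermost call enables the FPU */
		mock_kernel_fpu_begin();
}

void dc_fpu_end_sketch(void)
{
	fpu_recursion_depth -= 1;
	if (fpu_recursion_depth <= 0)	/* only the outermost call disables it */
		mock_kernel_fpu_end();
}
```

With this shape, a nested begin/end pair becomes a no-op on the hardware state instead of a double kernel_fpu_begin() or a premature kernel_fpu_end().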

Changes since V2:
- Christian: Do not use this_cpu_* between get/put_cpu_ptr().

Changes since V1:
- Use better variable names
- Use get_cpu_ptr and put_cpu_ptr to better balance preemption enable
and disable

Cc: Harry Wentland 
Cc: Anson Jacob 
Cc: Christian König 
Cc: Hersen Wu 
Cc: Aric Cyr 
Signed-off-by: Rodrigo Siqueira 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   | 13 ---
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 36 ---
 drivers/gpu/drm/amd/display/dc/dc_trace.h |  4 +--
 3 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
index 230bb12c405e..fdcaea22b456 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
@@ -638,23 +638,26 @@ TRACE_EVENT(amdgpu_refresh_rate_track,
 );
 
 TRACE_EVENT(dcn_fpu,
-   TP_PROTO(bool begin, const char *function, const int line),
-   TP_ARGS(begin, function, line),
+   TP_PROTO(bool begin, const char *function, const int line, const int recursion_depth),
+   TP_ARGS(begin, function, line, recursion_depth),
 
TP_STRUCT__entry(
 __field(bool, begin)
 __field(const char *, function)
 __field(int, line)
+__field(int, recursion_depth)
),
TP_fast_assign(
   __entry->begin = begin;
   __entry->function = function;
   __entry->line = line;
+  __entry->recursion_depth = recursion_depth;
),
-   TP_printk("%s()+%d: %s",
+   TP_printk("%s: recursion_depth: %d: %s()+%d:",
+ __entry->begin ? "begin" : "end",
+ __entry->recursion_depth,
  __entry->function,
- __entry->line,
- __entry->begin ? "begin" : "end"
+ __entry->line
)
 );
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index d5d156a4517e..d0d3e8a34db5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -28,6 +28,19 @@
 
 #include 
 
+/**
+ * DOC: DC FPU manipulation overview
+ *
+ * DC core uses FPU operations in multiple parts of the code, which requires a
+ * more specialized way to manage these areas' entrance. To fulfill this
+ * requirement, we created some wrapper functions that encapsulate
+ * kernel_fpu_begin/end to better fit our need in the display component. In
+ * summary, in this file, you can find functions related to FPU operation
+ * management.
+ */
+
+static DEFINE_PER_CPU(int, fpu_recursion_depth);
+
 /**
  * dc_fpu_begin - Enables FPU protection
  * @function_name: A string containing the function name for debug purposes
@@ -43,8 +56,16 @@
  */
 void dc_fpu_begin(const char *function_name, const int line)
 {
-   TRACE_DCN_FPU(true, function_name, line);
-   kernel_fpu_begin();
+   int *pcpu;
+
+   pcpu = get_cpu_ptr(&fpu_recursion_depth);
+   *pcpu += 1;
+
+   if (*pcpu == 1)
+   kernel_fpu_begin();
+
+   TRACE_DCN_FPU(true, function_name, line, *pcpu);
+   put_cpu_ptr(&fpu_recursion_depth);
 }
 
 /**
@@ -59,6 +80,13 @@ void dc_fpu_begin(const char *function_name, const int line)
  */
 void dc_fpu_end(const char *function_name, const int line)
 {
-   TRACE_DCN_FPU(false, function_name, line);
-   kernel_fpu_end();
+   int *pcpu;
+
+   pcpu = get_cpu_ptr(&fpu_recursion_depth);
+   *pcpu -= 1;
+   if (*pcpu <= 0)
+   kernel_fpu_end();
+
+   TRACE_DCN_FPU(false, function_name, line, *pcpu);
+   put_cpu_ptr(&fpu_recursion_depth);
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dc_trace.h b/drivers/gpu/drm/amd/display/dc/dc_trace.h
index d598ba697e45..c711797e5c9e 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_trace.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_trace.h
@@ -38,5 +38,5 @@
 #define TRACE_DCN_CLOCK_STATE(dcn_clocks) \
trace_amdgpu_dm_dc_clocks_state(dcn_clocks)
 
-#define TRACE_DCN_FPU(begin, function, line) \
-   trace_dcn_fpu(begin, function, line)
+#define TRACE_DCN_FPU(begin, function, line, ref_count) \
+   trace_dcn_fpu(begin, function, line, ref_count)
-- 
2.25.1



[PATCH v3 1/4] drm/amd/display: Introduce FPU directory inside DC

2021-07-19 Thread Rodrigo Siqueira
The display core files rely on FPU, which requires them to be compiled with
special flags. Ideally, we don't want these FPU operations spread around
the DC code; nevertheless, it happens in the current source. This commit
introduces a new directory named fpu_operations that intends to
centralize all files that require the FPU compilation flag. As part of
this new component, this patch also moves one of the functions
(dcn20_populate_dml_writeback_from_context) that requires FPU access to a
single shared file. Notice that this is the first part of the work, and
it does not fix the FPU issue yet; we still need other patches to
achieve complete FPU isolation.
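The centralized compile flags this patch enables can be sketched as a hypothetical Makefile fragment (modeled on the existing DML Makefile pattern; the variable names, `AMDDALPATH`, and the x86 flags here are assumptions, not the exact content of the new fpu_operations/Makefile):

```make
# Hypothetical sketch only: give every object under fpu_operations/ the
# FPU-capable compile flags in one place. Exact flags vary by arch and
# toolchain.
ifdef CONFIG_X86
fpu_ccflags := -mhard-float -msse -msse2
endif

FPU_OPERATIONS = dcn2x.o

AMD_DAL_FPU_OPERATIONS = $(addprefix $(AMDDALPATH)/dc/fpu_operations/, $(FPU_OPERATIONS))
AMD_DISPLAY_FILES += $(AMD_DAL_FPU_OPERATIONS)

CFLAGS_$(AMDDALPATH)/dc/fpu_operations/dcn2x.o := $(fpu_ccflags)
```

The point of the pattern is that no other directory needs the FPU flags once all FP code has migrated here.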

Changes since V2:
- Christian: Remove unnecessary wrapper.
- lkp: Add missing prototype.
- Only compile the FPU operations if the DCN option is enabled.

Changes since V1:
- Update documentation and rebase.

Cc: Harry Wentland 
Cc: Anson Jacob 
Cc: Christian König 
Cc: Hersen Wu 
Cc: Aric Cyr 
Reported-by: kernel test robot 
Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/Makefile   |  1 +
 .../drm/amd/display/dc/dcn20/dcn20_resource.c | 39 +
 .../drm/amd/display/dc/dcn20/dcn20_resource.h |  2 -
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |  2 +
 .../amd/display/dc/fpu_operations/Makefile| 58 +
 .../drm/amd/display/dc/fpu_operations/dcn2x.c | 84 +++
 .../drm/amd/display/dc/fpu_operations/dcn2x.h | 33 
 7 files changed, 180 insertions(+), 39 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/Makefile
 create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.h

diff --git a/drivers/gpu/drm/amd/display/dc/Makefile b/drivers/gpu/drm/amd/display/dc/Makefile
index 943fcb164876..0de4baefa91e 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -35,6 +35,7 @@ DC_LIBS += dcn301
 DC_LIBS += dcn302
 DC_LIBS += dcn303
 DC_LIBS += dcn31
+DC_LIBS += fpu_operations
 endif
 
 DC_LIBS += dce120
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 1b05a37b674d..f99b09643a52 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -35,6 +35,8 @@
 #include "include/irq_service_interface.h"
 #include "dcn20/dcn20_resource.h"
 
+#include "fpu_operations/dcn2x.h"
+
 #include "dcn10/dcn10_hubp.h"
 #include "dcn10/dcn10_ipp.h"
 #include "dcn20_hubbub.h"
@@ -1974,43 +1976,6 @@ void dcn20_split_stream_for_mpc(
ASSERT(primary_pipe->plane_state);
 }
 
-void dcn20_populate_dml_writeback_from_context(
-   struct dc *dc, struct resource_context *res_ctx, display_e2e_pipe_params_st *pipes)
-{
-   int pipe_cnt, i;
-
-   for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) {
-   struct dc_writeback_info *wb_info = &res_ctx->pipe_ctx[i].stream->writeback_info[0];
-
-   if (!res_ctx->pipe_ctx[i].stream)
-   continue;
-
-   /* Set writeback information */
-   pipes[pipe_cnt].dout.wb_enable = (wb_info->wb_enabled == true) ? 1 : 0;
-   pipes[pipe_cnt].dout.num_active_wb++;
-   pipes[pipe_cnt].dout.wb.wb_src_height = wb_info->dwb_params.cnv_params.crop_height;
-   pipes[pipe_cnt].dout.wb.wb_src_width = wb_info->dwb_params.cnv_params.crop_width;
-   pipes[pipe_cnt].dout.wb.wb_dst_width = wb_info->dwb_params.dest_width;
-   pipes[pipe_cnt].dout.wb.wb_dst_height = wb_info->dwb_params.dest_height;
-   pipes[pipe_cnt].dout.wb.wb_htaps_luma = 1;
-   pipes[pipe_cnt].dout.wb.wb_vtaps_luma = 1;
-   pipes[pipe_cnt].dout.wb.wb_htaps_chroma = wb_info->dwb_params.scaler_taps.h_taps_c;
-   pipes[pipe_cnt].dout.wb.wb_vtaps_chroma = wb_info->dwb_params.scaler_taps.v_taps_c;
-   pipes[pipe_cnt].dout.wb.wb_hratio = 1.0;
-   pipes[pipe_cnt].dout.wb.wb_vratio = 1.0;
-   if (wb_info->dwb_params.out_format == dwb_scaler_mode_yuv420) {
-   if (wb_info->dwb_params.output_depth == DWB_OUTPUT_PIXEL_DEPTH_8BPC)
-   pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_420_8;
-   else
-   pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_420_10;
-   } else
-   pipes[pipe_cnt].dout.wb.wb_pixel_format = dm_444_32;
-
-   pipe_cnt++;
-   }
-
-}
-
 int dcn20_populate_dml_pipes_from_context(
struct dc *dc,
struct dc_state *context,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.h
index c8f3127bbcdf..6ec8ff45f0f7 100644
--- a/drivers/gpu/drm/amd/display/dc/

[PATCH v3 2/4] drm/amd/display: Add FPU event trace

2021-07-19 Thread Rodrigo Siqueira
We don't have any mechanism for tracing FPU operations inside the
display core, making debug work a little bit tricky. This commit
introduces a trace mechanism inside our DC_FP_START/END macros to
try to alleviate this problem.
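On the Linux side, the DC_FP_START/END expansion introduced here can be sketched as follows (userspace stand-ins: the real helpers call TRACE_DCN_FPU() and kernel_fpu_begin/end, and the counter below is only for demonstration):

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the traced helpers added by this
 * patch; in the kernel they emit the dcn_fpu trace event and call
 * kernel_fpu_begin/end. */
int trace_calls;

void dc_fpu_begin(const char *function_name, const int line)
{
	(void)function_name; (void)line;
	trace_calls++;	/* stands in for TRACE_DCN_FPU(true, ...) */
}

void dc_fpu_end(const char *function_name, const int line)
{
	(void)function_name; (void)line;
	trace_calls++;	/* stands in for TRACE_DCN_FPU(false, ...) */
}

/* The DC_FP_* wrappers expand to the helpers, passing call-site info
 * (__func__/__LINE__) so every FPU entrance/exit lands in the trace. */
#define DC_FP_START() dc_fpu_begin(__func__, __LINE__)
#define DC_FP_END()   dc_fpu_end(__func__, __LINE__)
```

Because the call site is captured in the macro itself, callers need no changes beyond already using DC_FP_START/END.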

Changes since V2:
- Make sure that we compile FPU operation only in the context that DCN
  is enabled.

Cc: Harry Wentland 
Cc: Anson Jacob 
Cc: Christian König 
Cc: Hersen Wu 
Cc: Aric Cyr 
Signed-off-by: Rodrigo Siqueira 
---
 .../gpu/drm/amd/display/amdgpu_dm/Makefile|  4 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   | 21 ++
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 64 +++
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h| 33 ++
 drivers/gpu/drm/amd/display/dc/dc_trace.h |  3 +
 drivers/gpu/drm/amd/display/dc/os_types.h |  6 +-
 6 files changed, 128 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
 create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
index 91fb72c96545..718e123a3230 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
@@ -27,6 +27,10 @@
 
 AMDGPUDM = amdgpu_dm.o amdgpu_dm_irq.o amdgpu_dm_mst_types.o amdgpu_dm_color.o
 
+ifdef CONFIG_DRM_AMD_DC_DCN
+AMDGPUDM += dc_fpu.o
+endif
+
 ifneq ($(CONFIG_DRM_AMD_DC),)
AMDGPUDM += amdgpu_dm_services.o amdgpu_dm_helpers.o amdgpu_dm_pp_smu.o amdgpu_dm_psr.o
 endif
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
index 46a33f64cf8e..230bb12c405e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_trace.h
@@ -637,6 +637,27 @@ TRACE_EVENT(amdgpu_refresh_rate_track,
  __entry->refresh_rate_ns)
 );
 
+TRACE_EVENT(dcn_fpu,
+   TP_PROTO(bool begin, const char *function, const int line),
+   TP_ARGS(begin, function, line),
+
+   TP_STRUCT__entry(
+__field(bool, begin)
+__field(const char *, function)
+__field(int, line)
+   ),
+   TP_fast_assign(
+  __entry->begin = begin;
+  __entry->function = function;
+  __entry->line = line;
+   ),
+   TP_printk("%s()+%d: %s",
+ __entry->function,
+ __entry->line,
+ __entry->begin ? "begin" : "end"
+   )
+);
+
 #endif /* _AMDGPU_DM_TRACE_H_ */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
new file mode 100644
index ..d5d156a4517e
--- /dev/null
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: AMD
+ *
+ */
+
+#include "dc_trace.h"
+
+#include 
+
+/**
+ * dc_fpu_begin - Enables FPU protection
+ * @function_name: A string containing the function name for debug purposes
+ *   (usually __func__)
+ *
+ * @line: A line number where DC_FP_START was invoked for debug purpose
+ *   (usually __LINE__)
+ *
+ * This function is responsible for managing the use of kernel_fpu_begin() with
+ * the advantage of providing an event trace for debugging.
+ *
+ * Note: Do not call this function directly; always use DC_FP_START().
+ */
+void dc_fpu_begin(const char *function_name, const int line)
+{
+   TRACE_DCN_FPU(true, function_name, line);
+   kernel_fpu_begin();
+}
+
+/**
+ * dc_fpu_end - Disable FPU protection
+ * @function_name: A string containing the function name for debug pur

[PATCH v3 4/4] drm/amd/display: Add DC_FP helper to check FPU state

2021-07-19 Thread Rodrigo Siqueira
To fully isolate FPU operations in a single place, we must avoid
situations where compilers spill FP values to registers due to FP enable
in a specific C file. Note that even if we isolate all FPU functions in
a single file and call its interface from other files, the compiler
might enable the use of FPU before we call DC_FP_START. Nevertheless, it
is the programmer's responsibility to invoke DC_FP_START/END in the
correct place. To highlight situations where developers forgot to use
the FP protection before calling the DC FPU interface functions, we
introduce a helper that checks if the function is invoked under FP
protection. If not, it will trigger a kernel warning.
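The check can be sketched in userspace C (a hypothetical stand-in: a global counter replaces the per-CPU variable, and the kernel helper raises an ASSERT warning instead of returning a value):

```c
#include <assert.h>

/* Userspace sketch: the helper only needs to read the recursion depth
 * maintained by dc_fpu_begin/end. A depth of zero at the check point
 * means the caller forgot DC_FP_START(). */
int fpu_recursion_depth;

int dc_assert_fp_enabled_sketch(void)
{
	/* The kernel version triggers an ASSERT warning instead of
	 * returning; returning the verdict keeps this sketch testable. */
	return fpu_recursion_depth >= 1;
}
```

Any public function in the isolated FPU files calls this first, so a missing DC_FP_START() shows up as a loud warning rather than silent FPU state corruption.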

Changes since V2 (Christian):
- Do not use this_cpu_* between get/put_cpu_ptr().
- In the kernel documentation, better describe restrictions.
- Make dc_assert_fp_enabled trigger the ASSERT message.

Changes since V1:
- Remove fp_enable variables
- Rename dc_is_fp_enabled to dc_assert_fp_enabled
- Replace wrong variable type

Cc: Harry Wentland 
Cc: Anson Jacob 
Cc: Christian König 
Cc: Hersen Wu 
Cc: Aric Cyr 
Signed-off-by: Rodrigo Siqueira 
---
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 19 +++
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h|  1 +
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |  2 ++
 .../drm/amd/display/dc/fpu_operations/dcn2x.c | 18 ++
 4 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index d0d3e8a34db5..107e1c50576e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -41,6 +41,25 @@
 
 static DEFINE_PER_CPU(int, fpu_recursion_depth);
 
+/**
+ * dc_assert_fp_enabled - Check if FPU protection is enabled
+ *
+ * This function tells if the code is already under FPU protection or not. A
+ * function that works as an API for a set of FPU operations can use this
+ * function for checking if the caller invoked it after DC_FP_START(). For
+ * example, take a look at dcn2x.c file.
+ */
+inline void dc_assert_fp_enabled(void)
+{
+   int *pcpu, depth = 0;
+
+   pcpu = get_cpu_ptr(&fpu_recursion_depth);
+   depth = *pcpu;
+   put_cpu_ptr(&fpu_recursion_depth);
+
+   ASSERT(depth >= 1);
+}
+
 /**
  * dc_fpu_begin - Enables FPU protection
  * @function_name: A string containing the function name for debug purposes
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
index fb54983c5c60..b8275b397920 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.h
@@ -27,6 +27,7 @@
 #ifndef __DC_FPU_H__
 #define __DC_FPU_H__
 
+void dc_assert_fp_enabled(void);
 void dc_fpu_begin(const char *function_name, const int line);
 void dc_fpu_end(const char *function_name, const int line);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index f99b09643a52..d0b34c7f99dc 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2355,7 +2355,9 @@ int dcn20_populate_dml_pipes_from_context(
}
 
/* populate writeback information */
+   DC_FP_START();
dc->res_pool->funcs->populate_dml_writeback_from_context(dc, res_ctx, pipes);
+   DC_FP_END();
 
return pipe_cnt;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c b/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
index 95a0b89302a9..3ea90090264c 100644
--- a/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
+++ b/drivers/gpu/drm/amd/display/dc/fpu_operations/dcn2x.c
@@ -43,6 +43,22 @@
  *that deals with FP register is contained within this call.
  * 3. All function that needs to be accessed outside this file requires a
  *public interface that not uses any FPU reference.
+ * 4. Developers **must not** use DC_FP_START/END in this file, but they need
+ *to ensure that the caller invokes it before accessing any function available
+ *in this file. For this reason, public functions in this file must invoke
+ *dc_assert_fp_enabled();
+ *
+ * Let's expand a little bit more the idea in the code pattern. To fully
+ * isolate FPU operations in a single place, we must avoid situations where
+ * compilers spill FP values to registers due to FP enable in a specific C
+ * file. Note that even if we isolate all FPU functions in a single file and
+ * call its interface from other files, the compiler might enable the use of
+ * FPU before we call DC_FP_START. Nevertheless, it is the programmer's
+ * responsibility to invoke DC_FP_START/END in the correct place. To highlight
+ * situations where developers forgot to use the FP protection before calling
+ * the DC FPU interface functions, we introduce a helper that checks if the
+ * function is invoked under FP protecti

[PATCH v3 0/4] Base changes for isolating FPU operation in a single place

2021-07-19 Thread Rodrigo Siqueira
Hi,

In the display core, we use float and double units for calculating
modesetting parameters. One side effect of using double precision is
that we spread multiple FPU accesses across our driver, which means
that we can accidentally clobber user space FPU state.

# Challenges

1. Keep in mind that this FPU code is ingrained in our display driver and
performs several crucial tasks. Additionally, we already have multiple
architectures available in the kernel and a large set of users; in other words,
we prefer to avoid a radical approach that might break our users' systems.

2. We share our display code with other OSs; thus, we need to maintain the
interoperability between these two systems.

3. We need a mechanism for identifying which function uses FPU registers;
fortunately, Peter Zijlstra wrote a series a couple of months ago where he
introduced an FPU check for objtool. I used the following command for
identifying the potential FPU usage:

 ./tools/objtool/objtool check -Ffa "drivers/gpu/drm/amd/display/dc/ANY_FILE.o"

4. Since our code heavily relies on FPU and the fact that we spread
kernel_fpu_begin/end across multiple functions, we can have some complex
scenarios that will require code refactoring. However, we want to avoid
complicated changes since this is a formula to introduce regressions; we want
something that allows us to fix it in small, safe, and reliable steps.

5. Unfortunately, for legacy reasons, we have some problems in how we program
our FPU access, which in some weird scenarios can generate situations where we
try to enter FPU mode multiple times or exit too early.

# Our approach

For trying to solve this problem, we came up with the following strategy:

1. Keep in mind that we are using kernel_fpu_begin/end spread in various areas
and sometimes across multiple functions. If we try to move some of the
functions to an isolated place, we can generate a situation where we can call
the FPU protection more than once, causing multiple warnings. We can deal with
this problem by adding a thin management layer around the kernel_fpu_begin/end
used inside the display.

2. We will need a trace mechanism for this FPU management inside our display
code.

3. After we get the thin layer that manages FPU, we can start to move each
function that uses FPU to a centralized place. Our DQE runs multiple tests in
different ASICs every week; we can take advantage of this to ensure that our
FPU work does not introduce any regression. The idea is to work on a
specific part of the code every week (e.g., week 1: DCN2, week 2: DCN2.1,
etc.).

4. Finally, after we can isolate the FPU operations in a single place, we can
altogether remove the FPU flags from other files and eliminate the unnecessary
code introduced to deal with this problem. We can also remove the thin layer
added earlier.

# This series

To maintain the interoperability between multiple OSes, we already have a
define named DC_FP_START/END, which is a straightforward wrapper around
kernel_fpu_begin/end on the Linux side. In this series, I decided to expand the
scope of this DC_FP_* wrapper to trace FPU entrances and exits in the display
code, and I also added a mechanism for managing the entrance and exit of
kernel_fpu_begin/end. You can see the details of how I did that in the last two
patches.

I also isolate a simple function that requires FPU access to demonstrate my
strategy for isolating this FPU access in a single place. If this series gets
accepted, the following steps consist of moving all FPU functions weekly until
we isolate everything in the fpu_operation folder.

Changes since V2:
- Make sure to compile FPU operation only when DCN is enabled
  (officially, we only enable it for x86).
- Check cross-compile with ARM and x86_32. Everything looks fine.
- Avoid calling this_cpu_* operations between get/put_cpu_ptr.
- Fix GCC warnings.
- Update documentation.
- Update our assert mechanism.
- Remove unnecessary wrappers.

Changes since V1:
- Use a better name for variables.
- Update documentation.
- Avoid preemption.

* See update details per commit message

Best Regards
Rodrigo Siqueira

Rodrigo Siqueira (4):
  drm/amd/display: Introduce FPU directory inside DC
  drm/amd/display: Add FPU event trace
  drm/amd/display: Add control mechanism for FPU utilization
  drm/amd/display: Add DC_FP helper to check FPU state

 .../gpu/drm/amd/display/amdgpu_dm/Makefile|   4 +
 .../amd/display/amdgpu_dm/amdgpu_dm_trace.h   |  24 
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 111 ++
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.h|  34 ++
 drivers/gpu/drm/amd/display/dc/Makefile   |   1 +
 drivers/gpu/drm/amd/display/dc/dc_trace.h |   3 +
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |  41 +--
 .../drm/amd/display/dc/dcn20/dcn20_resource.h |   2 -
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |   2 +
 .../amd/display/dc/fpu_operations/Makefile|  58 +
 .../drm/amd/dis

[PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v5)

2021-07-19 Thread Luben Tuikov
This fixes a bug which if we probe a non-existing
I2C device, and the SMU returns 0xFF, from then on
we can never communicate with the SMU, because the
code before this patch reads and interprets 0xFF
as a terminal error, and thus we never write 0
into register 90 to clear the status (and
subsequently send a new command to the SMU.)

It is not an error that the SMU returns status
0xFF. This means that the SMU executed the last
command successfully (execution status), but the
command result is an error of some sort (execution
result), depending on what the command was.

When doing a status check of the SMU, before we
send a new command, the only statuses which
preclude us from sending a new command are 0--the
SMU hasn't finished executing a previous command,
and 0xFC--the SMU is busy.

This bug was seen as the following line in the
kernel log,

amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!

when subsequent SMU commands, not necessarily
related to I2C, were sent to the SMU.

This patch fixes this bug.
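The corrected pre-check can be sketched in plain C (a userspace sketch: `smu_may_send_new_command` is a hypothetical helper name, and the code values mirror the macros introduced further down in the patch):

```c
#include <assert.h>

/* Response codes as named later in this patch. */
#define SMU_RESP_NONE           0x00	/* previous command still executing */
#define SMU_RESP_OK             0x01
#define SMU_RESP_CMD_FAIL       0xFF	/* executed OK; command *result* is an error */
#define SMU_RESP_CMD_UNKNOWN    0xFE
#define SMU_RESP_CMD_BAD_PREREQ 0xFD
#define SMU_RESP_BUSY_OTHER     0xFC
#define SMU_RESP_DEBUG_END      0xFB

/* Sketch of the fixed pre-check: only the "busy" states block a new
 * command; 0xFF is a per-command result, not a terminal SMU state. */
int smu_may_send_new_command(unsigned int resp)
{
	return resp != SMU_RESP_NONE && resp != SMU_RESP_BUSY_OTHER;
}
```

The pre-patch code treated 0xFF like a stuck state and never cleared register 90, which is exactly what this predicate avoids.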

v2: Add a comment to the description of
__smu_cmn_poll_stat() to explain why we're NOT
defining the SMU FW return codes as macros, but
are instead hard-coding them. Such a change, can
be followed up by a subsequent patch.

v3: The changes are,
a) Add comments to break labels in
   __smu_cmn_reg2errno().

b) When an unknown/unspecified/undefined result is
   returned back from the SMU, map that to
   -EREMOTEIO, to distinguish failure at the SMU
   FW.

c) Add kernel-doc to
   smu_cmn_send_msg_without_waiting(),
   smu_cmn_wait_for_response(),
   smu_cmn_send_smc_msg_with_param().

d) In smu_cmn_send_smc_msg_with_param(), since we
   wait for completion of the command, if the
   result of the completion is
   undefined/unknown/unspecified, we print that to
   the kernel log.

v4: a) Add macros as requested, though redundant, to
be removed when SMU consolidates for all
ASICs--see comment in code.
b) Get out if the SMU code is unknown.

v5: Rename the macro names.

Cc: Alex Deucher 
Cc: Evan Quan 
Cc: Lijo Lazar 
Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 288 +
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   3 +-
 2 files changed, 244 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c902fdf322c1be..baf26d5bfb8ff4 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -55,7 +55,7 @@
 
 #undef __SMU_DUMMY_MAP
 #define __SMU_DUMMY_MAP(type)  #type
-static const char* __smu_message_names[] = {
+static const char * const __smu_message_names[] = {
SMU_MESSAGE_TYPES
 };
 
@@ -76,55 +76,258 @@ static void smu_cmn_read_arg(struct smu_context *smu,
*arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
 }
 
-int smu_cmn_wait_for_response(struct smu_context *smu)
+/* Redefine the SMU error codes here.
+ *
+ * Note that these definitions are redundant and should be removed
+ * when the SMU has exported a unified header file containing these
+ * macros, which header file we can just include and use the SMU's
+ * macros. At the moment, these error codes are defined by the SMU
+ * per-ASIC unfortunately, yet we're a one driver for all ASICs.
+ */
+#define SMU_RESP_NONE   0
+#define SMU_RESP_OK 1
+#define SMU_RESP_CMD_FAIL   0xFF
+#define SMU_RESP_CMD_UNKNOWN0xFE
+#define SMU_RESP_CMD_BAD_PREREQ 0xFD
+#define SMU_RESP_BUSY_OTHER 0xFC
+#define SMU_RESP_DEBUG_END  0xFB
+
+/**
+ * __smu_cmn_poll_stat -- poll for a status from the SMU
+ * @smu: a pointer to SMU context
+ *
+ * Returns the status of the SMU, which could be,
+ *0, the SMU is busy with your previous command;
+ *1, execution status: success, execution result: success;
+ * 0xFF, execution status: success, execution result: failure;
+ * 0xFE, unknown command;
+ * 0xFD, valid command, but bad (command) prerequisites;
+ * 0xFC, the command was rejected as the SMU is busy;
+ * 0xFB, "SMC_Result_DebugDataDumpEnd".
+ *
+ * The values here are not defined by macros, because I'd rather we
+ * include a single header file which defines them, which is
+ * maintained by the SMU FW team, so that we're impervious to firmware
+ * changes. At the moment those values are defined in various header
+ * files, one for each ASIC, yet here we're a single ASIC-agnostic
+ * interface. Such a change can be followed-up by a subsequent patch.
+ */
+static u32 __smu_cmn_poll_stat(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
-   uint32_t cur_value, i, timeout = adev->usec_timeout * 20;
+   int timeout = adev->usec_timeout * 20;
+   u32 reg;
 
-   for (i = 0; i < timeout; i++) {
-   cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
-   if ((cur_value & MP1_C

[PATCH] drm/amdkfd: avoid conflicting address mappings

2021-07-19 Thread Alex Sierra
[Why]
Avoid conflicts with address ranges already mapped
by the SVM mechanism that are requested again
through ioctl_alloc in the same process, and vice
versa.
[How]
For ioctl_alloc_memory_of_gpu allocations,
check that the address range passed into the ioctl
does not already exist in the kfd_process
svms->objects interval tree.

For SVM allocations,
look up the address range in the VA interval tree
of the VM inside each pdd used by the kfd_process.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 13 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 79 +++-
 2 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 67541c30327a..f39baaa22a62 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1251,6 +1251,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,
struct kfd_process_device *pdd;
void *mem;
struct kfd_dev *dev;
+   struct svm_range_list *svms = &p->svms;
int idr_handle;
long err;
uint64_t offset = args->mmap_offset;
@@ -1259,6 +1260,18 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,
if (args->size == 0)
return -EINVAL;
 
+#if IS_ENABLED(CONFIG_HSA_AMD_SVM)
+   mutex_lock(&svms->lock);
+   if (interval_tree_iter_first(&svms->objects,
+args->va_addr >> PAGE_SHIFT,
+(args->va_addr + args->size - 1) >> PAGE_SHIFT)) {
+   pr_err("Address: 0x%llx already allocated by SVM\n",
+   args->va_addr);
+   mutex_unlock(&svms->lock);
+   return -EADDRINUSE;
+   }
+   mutex_unlock(&svms->lock);
+#endif
dev = kfd_device_by_id(args->gpu_id);
if (!dev)
return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 31f3f24cef6a..043ee0467916 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2581,9 +2581,54 @@ int svm_range_list_init(struct kfd_process *p)
return 0;
 }
 
+/**
+ * svm_range_is_vm_bo_mapped - check if virtual address range mapped already
+ * @p: current kfd_process
+ * @start: range start address, in pages
+ * @last: range last address, in pages
+ *
+ * The purpose is to avoid virtual address ranges already allocated by
+ * kfd_ioctl_alloc_memory_of_gpu ioctl.
+ * It looks for each pdd in the kfd_process.
+ *
+ * Context: Process context
+ *
+ * Return 0 - OK, if the range is not mapped.
+ * Otherwise error code:
+ * -EADDRINUSE - if address is mapped already by kfd_ioctl_alloc_memory_of_gpu
+ * -ERESTARTSYS - A wait for the buffer to become unreserved was interrupted by
+ * a signal. Release all buffer reservations and return to user-space.
+ */
+static int
+svm_range_is_vm_bo_mapped(struct kfd_process *p, uint64_t start, uint64_t last)
+{
+   uint32_t i;
+   int r;
+
+   for (i = 0; i < p->n_pdds; i++) {
+   struct amdgpu_vm *vm;
+
+   if (!p->pdds[i]->drm_priv)
+   continue;
+
+   vm = drm_priv_to_vm(p->pdds[i]->drm_priv);
+   r = amdgpu_bo_reserve(vm->root.bo, false);
+   if (r)
+   return r;
+   if (interval_tree_iter_first(&vm->va, start, last)) {
+   pr_debug("Range [0x%llx 0x%llx] already mapped\n", 
start, last);
+   amdgpu_bo_unreserve(vm->root.bo);
+   return -EADDRINUSE;
+   }
+   amdgpu_bo_unreserve(vm->root.bo);
+   }
+
+   return 0;
+}
+
 /**
  * svm_range_is_valid - check if virtual address range is valid
- * @mm: current process mm_struct
+ * @p: current kfd_process
  * @start: range start address, in pages
  * @size: range size, in pages
  *
@@ -2592,28 +2637,27 @@ int svm_range_list_init(struct kfd_process *p)
  * Context: Process context
  *
  * Return:
- *  true - valid svm range
- *  false - invalid svm range
+ *  0 - OK, otherwise error code
  */
-static bool
-svm_range_is_valid(struct mm_struct *mm, uint64_t start, uint64_t size)
+static int
+svm_range_is_valid(struct kfd_process *p, uint64_t start, uint64_t size)
 {
const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP;
struct vm_area_struct *vma;
unsigned long end;
+   unsigned long start_unchg = start;
 
start <<= PAGE_SHIFT;
end = start + (size << PAGE_SHIFT);
-
do {
-   vma = find_vma(mm, start);
+   vma = find_vma(p->mm, start);
if (!vma || start < vma->vm_start ||
(vma->vm_flags & device_vma))
-   return false;
+   return -EFAULT;
		start = min(end, vma->vm_end);

Re: [PATCH] drm/amdgpu: Fix documentation for dm_dmub_outbox1_low_irq

2021-07-19 Thread Harry Wentland



On 2021-07-19 1:49 p.m., Anson Jacob wrote:
> Fix make htmldocs complaint:
> ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:628: warning: Excess 
> function parameter 'interrupt_params' description in 'DMUB_TRACE_MAX_READ'
> 
> Signed-off-by: Anson Jacob 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 7ed104e3756b..02bba9a202a8 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -625,7 +625,6 @@ static void dm_dcn_vertical_interrupt0_high_irq(void 
> *interrupt_params)
>   * Handles the Outbox Interrupt
>   * event handler.
>   */
> -#define DMUB_TRACE_MAX_READ 64
>  static void dm_dmub_outbox1_low_irq(void *interrupt_params)
>  {
>   struct dmub_notification notify;
> @@ -635,6 +634,8 @@ static void dm_dmub_outbox1_low_irq(void 
> *interrupt_params)
>   struct dmcub_trace_buf_entry entry = { 0 };
>   uint32_t count = 0;
>  
> +#define DMUB_TRACE_MAX_READ 64
> +

I'd prefer to keep macro definitions separate from functions. A better
place for this would be right before the function comment.

Harry

>   if (dc_enable_dmub_notifications(adev->dm.dc)) {
>   if (irq_params->irq_src == DC_IRQ_SOURCE_DMCUB_OUTBOX) {
>   do {
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v4 06/13] include/linux/mm.h: helpers to check zone device generic type

2021-07-19 Thread Zeng, Oak


Regards,
Oak 

 

On 2021-07-17, 3:22 PM, "amd-gfx on behalf of Alex Sierra" 
 wrote:

Two helpers added. One checks if zone device page is generic
type. The other if page is either private or generic type.

Signed-off-by: Alex Sierra 
---
 include/linux/mm.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8d79bb94be8..f5b247a63044 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1125,6 +1125,14 @@ static inline bool is_device_private_page(const 
struct page *page)
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }

+static inline bool is_device_page(const struct page *page)

The function name here is confusing. Theoretically, as long as a page's zone
number is ZONE_DEVICE, the page is a device page. The condition below is
stricter only because the kfd svm implementation uses just
MEMORY_DEVICE_PRIVATE/GENERIC. But MEMORY_DEVICE_FS_DAX and
MEMORY_DEVICE_PCI_P2PDMA are also device memory (as opposed to normal cpu
system memory).

+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   (page->pgmap->type == MEMORY_DEVICE_PRIVATE ||
+page->pgmap->type == MEMORY_DEVICE_GENERIC);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
-- 
2.32.0



[RFC 0/2] Add generic FPU api similar to x86

2021-07-19 Thread Anson Jacob
This is an attempt to have generic FPU enable/disable
calls similar to x86. 
So that we can simplify gpu/drm/amd/display/dc/os_types.h

Also adds FPU correctness logic seen in x86.

Anson Jacob (2):
  ppc/fpu: Add generic FPU api similar to x86
  drm/amd/display: Use PPC FPU functions

 arch/powerpc/include/asm/switch_to.h  |  29 ++---
 arch/powerpc/kernel/process.c | 130 ++
 drivers/gpu/drm/amd/display/dc/os_types.h |  28 +
 3 files changed, 139 insertions(+), 48 deletions(-)

-- 
2.25.1



[RFC 1/2] ppc/fpu: Add generic FPU api similar to x86

2021-07-19 Thread Anson Jacob
- Add kernel_fpu_begin & kernel_fpu_end API as x86
- Add logic similar to x86 to ensure fpu
  begin/end call correctness
- Add kernel_fpu_enabled to know if FPU is enabled

Signed-off-by: Anson Jacob 
---
 arch/powerpc/include/asm/switch_to.h |  29 ++
 arch/powerpc/kernel/process.c| 130 +++
 2 files changed, 137 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/switch_to.h 
b/arch/powerpc/include/asm/switch_to.h
index 9d1fbd8be1c7..aded7aa661c0 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -41,10 +41,7 @@ extern void enable_kernel_fp(void);
 extern void flush_fp_to_thread(struct task_struct *);
 extern void giveup_fpu(struct task_struct *);
 extern void save_fpu(struct task_struct *);
-static inline void disable_kernel_fp(void)
-{
-   msr_check_and_clear(MSR_FP);
-}
+extern void disable_kernel_fp(void);
 #else
 static inline void save_fpu(struct task_struct *t) { }
 static inline void flush_fp_to_thread(struct task_struct *t) { }
@@ -55,10 +52,7 @@ extern void enable_kernel_altivec(void);
 extern void flush_altivec_to_thread(struct task_struct *);
 extern void giveup_altivec(struct task_struct *);
 extern void save_altivec(struct task_struct *);
-static inline void disable_kernel_altivec(void)
-{
-   msr_check_and_clear(MSR_VEC);
-}
+extern void disable_kernel_altivec(void);
 #else
 static inline void save_altivec(struct task_struct *t) { }
 static inline void __giveup_altivec(struct task_struct *t) { }
@@ -67,20 +61,7 @@ static inline void __giveup_altivec(struct task_struct *t) { 
}
 #ifdef CONFIG_VSX
 extern void enable_kernel_vsx(void);
 extern void flush_vsx_to_thread(struct task_struct *);
-static inline void disable_kernel_vsx(void)
-{
-   msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX);
-}
-#else
-static inline void enable_kernel_vsx(void)
-{
-   BUILD_BUG();
-}
-
-static inline void disable_kernel_vsx(void)
-{
-   BUILD_BUG();
-}
+extern void disable_kernel_vsx(void);
 #endif
 
 #ifdef CONFIG_SPE
@@ -114,4 +95,8 @@ static inline void clear_task_ebb(struct task_struct *t)
 
 extern int set_thread_tidr(struct task_struct *t);
 
+bool kernel_fpu_enabled(void);
+void kernel_fpu_begin(void);
+void kernel_fpu_end(void);
+
 #endif /* _ASM_POWERPC_SWITCH_TO_H */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 185beb290580..2ced8c6a3fab 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -75,6 +75,17 @@
 #define TM_DEBUG(x...) do { } while(0)
 #endif
 
+/*
+ * Track whether the kernel is using the FPU state
+ * currently.
+ *
+ * This flag is used:
+ *
+ *   - kernel_fpu_begin()/end() correctness
+ *   - kernel_fpu_enabled info
+ */
+static DEFINE_PER_CPU(bool, in_kernel_fpu);
+
 extern unsigned long _get_SP(void);
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -212,6 +223,9 @@ void enable_kernel_fp(void)
unsigned long cpumsr;
 
WARN_ON(preemptible());
+   WARN_ON_ONCE(this_cpu_read(in_kernel_fpu));
+
+   this_cpu_write(in_kernel_fpu, true);
 
cpumsr = msr_check_and_set(MSR_FP);
 
@@ -231,6 +245,15 @@ void enable_kernel_fp(void)
}
 }
 EXPORT_SYMBOL(enable_kernel_fp);
+
+void disable_kernel_fp(void)
+{
+   WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu));
+
+   this_cpu_write(in_kernel_fpu, false);
+   msr_check_and_clear(MSR_FP);
+}
+EXPORT_SYMBOL(disable_kernel_fp);
 #else
 static inline void __giveup_fpu(struct task_struct *tsk) { }
 #endif /* CONFIG_PPC_FPU */
@@ -263,6 +286,9 @@ void enable_kernel_altivec(void)
unsigned long cpumsr;
 
WARN_ON(preemptible());
+   WARN_ON_ONCE(this_cpu_read(in_kernel_fpu));
+
+   this_cpu_write(in_kernel_fpu, true);
 
cpumsr = msr_check_and_set(MSR_VEC);
 
@@ -283,6 +309,14 @@ void enable_kernel_altivec(void)
 }
 EXPORT_SYMBOL(enable_kernel_altivec);
 
+void disable_kernel_altivec(void)
+{
+   WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu));
+
+   this_cpu_write(in_kernel_fpu, false);
+   msr_check_and_clear(MSR_VEC);
+}
+EXPORT_SYMBOL(disable_kernel_altivec);
 /*
  * Make sure the VMX/Altivec register state in the
  * the thread_struct is up to date for task tsk.
@@ -333,6 +367,9 @@ void enable_kernel_vsx(void)
unsigned long cpumsr;
 
WARN_ON(preemptible());
+   WARN_ON_ONCE(this_cpu_read(in_kernel_fpu));
+
+   this_cpu_write(in_kernel_fpu, true);
 
cpumsr = msr_check_and_set(MSR_FP|MSR_VEC|MSR_VSX);
 
@@ -354,6 +391,15 @@ void enable_kernel_vsx(void)
 }
 EXPORT_SYMBOL(enable_kernel_vsx);
 
+void disable_kernel_vsx(void)
+{
+   WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu));
+
+   this_cpu_write(in_kernel_fpu, false);
+   msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX);
+}
+EXPORT_SYMBOL(disable_kernel_vsx);
+
 void flush_vsx_to_thread(struct task_struct *tsk)
 {
if (tsk->thread.regs) {
@@ -406,6 +452,90 @@ void flush_spe_to_thread(struct task_struct *tsk)
 }
#endif

[RFC 2/2] drm/amd/display: Use PPC FPU functions

2021-07-19 Thread Anson Jacob
Use kernel_fpu_begin & kernel_fpu_end for PPC

Depends on "ppc/fpu: Add generic FPU api similar to x86"

Signed-off-by: Anson Jacob 
---
 drivers/gpu/drm/amd/display/dc/os_types.h | 28 ++-
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h 
b/drivers/gpu/drm/amd/display/dc/os_types.h
index 126c2f3a4dd3..999c5103357e 100644
--- a/drivers/gpu/drm/amd/display/dc/os_types.h
+++ b/drivers/gpu/drm/amd/display/dc/os_types.h
@@ -57,32 +57,8 @@
 #define DC_FP_END() kernel_fpu_end()
 #elif defined(CONFIG_PPC64)
 #include 
-#include 
-#define DC_FP_START() { \
-   if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \
-   preempt_disable(); \
-   enable_kernel_vsx(); \
-   } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \
-   preempt_disable(); \
-   enable_kernel_altivec(); \
-   } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \
-   preempt_disable(); \
-   enable_kernel_fp(); \
-   } \
-}
-#define DC_FP_END() { \
-   if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \
-   disable_kernel_vsx(); \
-   preempt_enable(); \
-   } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \
-   disable_kernel_altivec(); \
-   preempt_enable(); \
-   } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \
-   disable_kernel_fp(); \
-   preempt_enable(); \
-   } \
-}
-#endif
+#define DC_FP_START() kernel_fpu_begin()
+#define DC_FP_END() kernel_fpu_end()
 #endif
 
 /*
-- 
2.25.1



[PATCH] drm/amdgpu: Fix documentation for dm_dmub_outbox1_low_irq

2021-07-19 Thread Anson Jacob
Fix make htmldocs complaint:
./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:628: warning: Excess 
function parameter 'interrupt_params' description in 'DMUB_TRACE_MAX_READ'

Signed-off-by: Anson Jacob 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 7ed104e3756b..02bba9a202a8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -625,7 +625,6 @@ static void dm_dcn_vertical_interrupt0_high_irq(void 
*interrupt_params)
  * Handles the Outbox Interrupt
  * event handler.
  */
-#define DMUB_TRACE_MAX_READ 64
 static void dm_dmub_outbox1_low_irq(void *interrupt_params)
 {
struct dmub_notification notify;
@@ -635,6 +634,8 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params)
struct dmcub_trace_buf_entry entry = { 0 };
uint32_t count = 0;
 
+#define DMUB_TRACE_MAX_READ 64
+
if (dc_enable_dmub_notifications(adev->dm.dc)) {
if (irq_params->irq_src == DC_IRQ_SOURCE_DMCUB_OUTBOX) {
do {
-- 
2.25.1



Re: [PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v4)

2021-07-19 Thread Alex Deucher
On Mon, Jul 19, 2021 at 12:16 PM Luben Tuikov  wrote:
>
> This fixes a bug which if we probe a non-existing
> I2C device, and the SMU returns 0xFF, from then on
> we can never communicate with the SMU, because the
> code before this patch reads and interprets 0xFF
> as a terminal error, and thus we never write 0
> into register 90 to clear the status (and
> subsequently send a new command to the SMU.)
>
> It is not an error that the SMU returns status
> 0xFF. This means that the SMU executed the last
> command successfully (execution status), but the
> command result is an error of some sort (execution
> result), depending on what the command was.
>
> When doing a status check of the SMU, before we
> send a new command, the only status which
> precludes us from sending a new command is 0--the
> SMU hasn't finished executing a previous command,
> and 0xFC--the SMU is busy.
>
> This bug was seen as the following line in the
> kernel log,
>
> amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right 
> state!
>
> when subsequent SMU commands, not necessarily
> related to I2C, were sent to the SMU.
>
> This patch fixes this bug.
>
> v2: Add a comment to the description of
> __smu_cmn_poll_stat() to explain why we're NOT
> defining the SMU FW return codes as macros, but
> are instead hard-coding them. Such a change, can
> be followed up by a subsequent patch.
>
> v3: The changes are,
> a) Add comments to break labels in
>__smu_cmn_reg2errno().
>
> b) When an unknown/unspecified/undefined result is
>returned back from the SMU, map that to
>-EREMOTEIO, to distinguish failure at the SMU
>FW.
>
> c) Add kernel-doc to
>smu_cmn_send_msg_without_waiting(),
>smu_cmn_wait_for_response(),
>smu_cmn_send_smc_msg_with_param().
>
> d) In smu_cmn_send_smc_msg_with_param(), since we
>wait for completion of the command, if the
>result of the completion is
>undefined/unknown/unspecified, we print that to
>the kernel log.
>
> v4: a) Add macros as requested, though redundant, to
> be removed when SMU consolidates for all
> ASICs--see comment in code.
> b) Get out if the SMU code is unknown.
>
> Cc: Alex Deucher 
> Cc: Evan Quan 
> Cc: Lijo Lazar 
> Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before 
> issuing message")
> Signed-off-by: Luben Tuikov 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 294 +
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   3 +-
>  2 files changed, 250 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index c902fdf322c1be..9b30737bd541d6 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -55,7 +55,7 @@
>
>  #undef __SMU_DUMMY_MAP
>  #define __SMU_DUMMY_MAP(type)  #type
> -static const char* __smu_message_names[] = {
> +static const char * const __smu_message_names[] = {
> SMU_MESSAGE_TYPES
>  };
>
> @@ -76,55 +76,264 @@ static void smu_cmn_read_arg(struct smu_context *smu,
> *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
>  }
>
> -int smu_cmn_wait_for_response(struct smu_context *smu)
> +/* Redefine the SMU error codes here.
> + *
> + * Note that these definitions are redundant and should be removed
> + * when the SMU has exported a unified header file containing these
> + * macros, which header file we can just include and use the SMU's
> + * macros. At the moment, these error codes are defined by the SMU
> + * per-ASIC unfortunately, yet we're one driver for all ASICs.
> + */
> +
> +#define SMU_BUSY   0

This is not really a response per se; it's just that the SMU hasn't
updated the response register yet.
Maybe SMU_RESP_NONE

> +#define SMU_GOOD   1
> +#define SMU_CMD_FAIL   0xFF
> +#define SMU_CMD_UNKNOWN0xFE
> +#define SMU_CMD_BAD_PREREQ 0xFD
> +#define SMU_BUSY_OTHER 0xFC
> +#define SMU_DEBUG_END  0xFB

I'd change the macros to include RESP for consistency. E.g.,
#define SMU_RESP_OK   1
#define SMU_RESP_CMD_FAIL   0xFF
etc.

> +
> +/**
> + * __smu_cmn_poll_stat -- poll for a status from the SMU
> + * smu: a pointer to SMU context
> + *
> + * Returns the status of the SMU, which could be,
> + *0, the SMU is busy with your previous command;
> + *1, execution status: success, execution result: success;
> + * 0xFF, execution status: success, execution result: failure;
> + * 0xFE, unknown command;
> + * 0xFD, valid command, but bad (command) prerequisites;
> + * 0xFC, the command was rejected as the SMU is busy;
> + * 0xFB, "SMC_Result_DebugDataDumpEnd".
> + *
> + * The values here are not defined by macros, because I'd rather we
> + * include a single header file which defines them, which is
> + * maintained by the SMU FW team, so that we're impervious to firmware
> + * changes. At the moment those values are defined in various header
> + * files, one for each ASIC, yet here we're a single ASIC-agnostic
> + * interface. Such a change can be followed-up by a subsequent patch.

[PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v4)

2021-07-19 Thread Luben Tuikov
This fixes a bug which if we probe a non-existing
I2C device, and the SMU returns 0xFF, from then on
we can never communicate with the SMU, because the
code before this patch reads and interprets 0xFF
as a terminal error, and thus we never write 0
into register 90 to clear the status (and
subsequently send a new command to the SMU.)

It is not an error that the SMU returns status
0xFF. This means that the SMU executed the last
command successfully (execution status), but the
command result is an error of some sort (execution
result), depending on what the command was.

When doing a status check of the SMU, before we
send a new command, the only status which
precludes us from sending a new command is 0--the
SMU hasn't finished executing a previous command,
and 0xFC--the SMU is busy.

This bug was seen as the following line in the
kernel log,

amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right 
state!

when subsequent SMU commands, not necessarily
related to I2C, were sent to the SMU.

This patch fixes this bug.

v2: Add a comment to the description of
__smu_cmn_poll_stat() to explain why we're NOT
defining the SMU FW return codes as macros, but
are instead hard-coding them. Such a change, can
be followed up by a subsequent patch.

v3: The changes are,
a) Add comments to break labels in
   __smu_cmn_reg2errno().

b) When an unknown/unspecified/undefined result is
   returned back from the SMU, map that to
   -EREMOTEIO, to distinguish failure at the SMU
   FW.

c) Add kernel-doc to
   smu_cmn_send_msg_without_waiting(),
   smu_cmn_wait_for_response(),
   smu_cmn_send_smc_msg_with_param().

d) In smu_cmn_send_smc_msg_with_param(), since we
   wait for completion of the command, if the
   result of the completion is
   undefined/unknown/unspecified, we print that to
   the kernel log.

v4: a) Add macros as requested, though redundant, to
be removed when SMU consolidates for all
ASICs--see comment in code.
b) Get out if the SMU code is unknown.

Cc: Alex Deucher 
Cc: Evan Quan 
Cc: Lijo Lazar 
Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before 
issuing message")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 294 +
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   3 +-
 2 files changed, 250 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c902fdf322c1be..9b30737bd541d6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -55,7 +55,7 @@
 
 #undef __SMU_DUMMY_MAP
 #define __SMU_DUMMY_MAP(type)  #type
-static const char* __smu_message_names[] = {
+static const char * const __smu_message_names[] = {
SMU_MESSAGE_TYPES
 };
 
@@ -76,55 +76,264 @@ static void smu_cmn_read_arg(struct smu_context *smu,
*arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
 }
 
-int smu_cmn_wait_for_response(struct smu_context *smu)
+/* Redefine the SMU error codes here.
+ *
+ * Note that these definitions are redundant and should be removed
+ * when the SMU has exported a unified header file containing these
+ * macros, which header file we can just include and use the SMU's
+ * macros. At the moment, these error codes are defined by the SMU
+ * per-ASIC unfortunately, yet we're one driver for all ASICs.
+ */
+
+#define SMU_BUSY   0
+#define SMU_GOOD   1
+#define SMU_CMD_FAIL   0xFF
+#define SMU_CMD_UNKNOWN0xFE
+#define SMU_CMD_BAD_PREREQ 0xFD
+#define SMU_BUSY_OTHER 0xFC
+#define SMU_DEBUG_END  0xFB
+
+/**
+ * __smu_cmn_poll_stat -- poll for a status from the SMU
+ * smu: a pointer to SMU context
+ *
+ * Returns the status of the SMU, which could be,
+ *0, the SMU is busy with your previous command;
+ *1, execution status: success, execution result: success;
+ * 0xFF, execution status: success, execution result: failure;
+ * 0xFE, unknown command;
+ * 0xFD, valid command, but bad (command) prerequisites;
+ * 0xFC, the command was rejected as the SMU is busy;
+ * 0xFB, "SMC_Result_DebugDataDumpEnd".
+ *
+ * The values here are not defined by macros, because I'd rather we
+ * include a single header file which defines them, which is
+ * maintained by the SMU FW team, so that we're impervious to firmware
+ * changes. At the moment those values are defined in various header
+ * files, one for each ASIC, yet here we're a single ASIC-agnostic
+ * interface. Such a change can be followed-up by a subsequent patch.
+ */
+static u32 __smu_cmn_poll_stat(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
-   uint32_t cur_value, i, timeout = adev->usec_timeout * 20;
+   int timeout = adev->usec_timeout * 20;
+   u32 reg;
 
-   for (i = 0; i < timeout; i++) {
-   cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
-   if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0)
-   return c

RE: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd

2021-07-19 Thread Kim, Jonathan
[AMD Official Use Only]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Saturday, July 17, 2021 1:47 AM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org
> Subject: Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between
> direct peers to the kfd
>
> Am 2021-07-16 um 12:43 p.m. schrieb Jonathan Kim:
> > Report the min/max bandwidth in megabytes to the kfd for direct xgmi
> > connections only.
>
> By "direct XGMI connections", you mean this doesn't work for links with
> more than one hop? Will that spew out DRM_ERROR messages for such links?
> Then it's probably better to downgrade that to an INFO.

No DRM_ERROR only happens if psp fails on invoke.
I've added footnotes to the description and code to clear this up.
Non-adjacent peers return num_links as 0 since indirect route is unknown and 
linkage is asymmetrical.

>
>
> >
> > v2: change reporting from num links to bandwidth
> >
> > Signed-off-by: Jonathan Kim 
>
> This patch is OK to provide bandwidth information on Aldebaran. What can
> we do on older GPUs? Can we assume num_links = 1? Or maybe have some
> hard-coded numbers depending on the number of nodes in the hive?

We could assume num_links = 1 but that wouldn't represent certain non-Aldebaran 
setups well.
For non-Aldebaran min/max bandwidth, we may be able to get away with setting 
non-zero values on non-adjacent peers since setup is symmetrical to date but 
that may raise questions on why Aldebaran indirect min/max-bandwidth is 0.  For 
consistency, we'd have to use num_hops then to check directness.
Maybe it's worth making a bid to the FW team to support all other chips moving 
forward  ...

Thanks,

Jon

>
> Either way, patch 1 and 2 are
>
> Reviewed-by: Felix Kuehling 
>
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 23
> > ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 12 +++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
> >  drivers/gpu/drm/amd/amdkfd/kfd_crat.c  | 12 +++
> >  5 files changed, 50 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index bfab2f9fdd17..3978578a1c49 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -553,6 +553,29 @@ uint8_t
> amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev
> *s
> > return  (uint8_t)ret;
> >  }
> >
> > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst,
> > +struct kgd_dev *src, bool is_min) {
> > +   struct amdgpu_device *adev = (struct amdgpu_device *)dst,
> *peer_adev;
> > +   int num_links;
> > +
> > +   if (adev->asic_type != CHIP_ALDEBARAN)
> > +   return 0;
> > +
> > +   if (src)
> > +   peer_adev = (struct amdgpu_device *)src;
> > +
> > +   num_links = is_min ? 1 : amdgpu_xgmi_get_num_links(adev,
> peer_adev);
> > +   if (num_links < 0) {
> > +   DRM_ERROR("amdgpu: failed to get xgmi num links between
> node %d and %d. ret = %d\n",
> > +   adev->gmc.xgmi.physical_node_id,
> > +   peer_adev->gmc.xgmi.physical_node_id, num_links);
> > +   num_links = 0;
> > +   }
> > +
> > +   /* Aldebaran xGMI DPM is defeatured so assume x16 x 25Gbps for
> bandwidth. */
> > +   return (num_links * 16 * 25000)/BITS_PER_BYTE; }
> > +
> >  uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev
> *kgd)
> > {
> > struct amdgpu_device *adev = (struct amdgpu_device *)kgd; diff --git
> > a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > index 81264517d532..e12fccb2d2c4 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > @@ -226,6 +226,7 @@ uint32_t amdgpu_amdkfd_get_num_gws(struct
> kgd_dev
> > *kgd);  uint32_t amdgpu_amdkfd_get_asic_rev_id(struct kgd_dev *kgd);
> > int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd);  uint8_t
> > amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct
> kgd_dev
> > *src);
> > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst,
> > +struct kgd_dev *src, bool is_min);
> >
> >  /* Read user wptr from a specified user address space with page fault
> >   * disabled. The memory must be pinned and mapped to the hardware
> > when diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > index 8567d5d77346..258cf86b32f6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > @@ -486,6 +486,18 @@ int amdgpu_xgmi_get_hops_count(struct
> amdgpu_device *adev,
> > return  -EINVAL;
> >  }
> >
> > +int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
> > +   struct amdgpu_device *peer_adev)
> > +{
> > +   struct psp_xgmi_topology_info *top = &adev-
> >psp.xgmi_context.top_info;
> > +   int i;
>

RE: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd

2021-07-19 Thread Kim, Jonathan
[AMD Official Use Only]

> -Original Message-
> From: Lazar, Lijo 
> Sent: Monday, July 19, 2021 3:22 AM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org
> Cc: Kuehling, Felix 
> Subject: Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between
> direct peers to the kfd
>
>
>
> On 7/16/2021 10:13 PM, Jonathan Kim wrote:
> > Report the min/max bandwidth in megabytes to the kfd for direct xgmi
> > connections only.
> >
> > v2: change reporting from num links to bandwidth
> >
> > Signed-off-by: Jonathan Kim 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 23
> ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 12 +++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
> >   drivers/gpu/drm/amd/amdkfd/kfd_crat.c  | 12 +++
> >   5 files changed, 50 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index bfab2f9fdd17..3978578a1c49 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -553,6 +553,29 @@ uint8_t
> amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev
> *s
> > return  (uint8_t)ret;
> >   }
> >
> > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst,
> > +struct kgd_dev *src, bool is_min) {
> > +   struct amdgpu_device *adev = (struct amdgpu_device *)dst,
> *peer_adev;
> > +   int num_links;
> > +
> > +   if (adev->asic_type != CHIP_ALDEBARAN)
> > +   return 0;
> > +
> > +   if (src)
> > +   peer_adev = (struct amdgpu_device *)src;
> > +
> > +   num_links = is_min ? 1 : amdgpu_xgmi_get_num_links(adev,
> peer_adev);
> > +   if (num_links < 0) {
> > +   DRM_ERROR("amdgpu: failed to get xgmi num links between
> node %d and %d. ret = %d\n",
> > +   adev->gmc.xgmi.physical_node_id,
> > +   peer_adev->gmc.xgmi.physical_node_id, num_links);
> > +   num_links = 0;
> > +   }
> > +
> > +   /* Aldebaran xGMI DPM is defeatured so assume x16 x 25Gbps for
> bandwidth. */
> > +   return (num_links * 16 * 25000)/BITS_PER_BYTE;
>
> Instead of having ASIC family checks and bandwidth info in interface
> functions, better to have this info come from base layer (amdgpu_xgmi or
> xgmi ip). That will help to handle other ASICs.

Ok.  We can revisit this as a follow up.  Maybe the full solution is a link 
width/speed support mask analogous to pcie.

Thanks,

Jon

>
> Thanks,
> Lijo
>
> >   uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev
> *kgd)
> >   {
> > struct amdgpu_device *adev = (struct amdgpu_device *)kgd; diff
> > --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > index 81264517d532..e12fccb2d2c4 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > @@ -226,6 +226,7 @@ uint32_t amdgpu_amdkfd_get_num_gws(struct
> kgd_dev *kgd);
> >   uint32_t amdgpu_amdkfd_get_asic_rev_id(struct kgd_dev *kgd);
> >   int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd);
> >   uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst,
> > struct kgd_dev *src);
> > +int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst,
> > +struct kgd_dev *src, bool is_min);
> >
> >   /* Read user wptr from a specified user address space with page fault
> >* disabled. The memory must be pinned and mapped to the hardware
> > when diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > index 8567d5d77346..258cf86b32f6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> > @@ -486,6 +486,18 @@ int amdgpu_xgmi_get_hops_count(struct
> amdgpu_device *adev,
> > return  -EINVAL;
> >   }
> >
> > +int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
> > +   struct amdgpu_device *peer_adev)
> > +{
> > +   struct psp_xgmi_topology_info *top = &adev-
> >psp.xgmi_context.top_info;
> > +   int i;
> > +
> > +   for (i = 0 ; i < top->num_nodes; ++i)
> > +   if (top->nodes[i].node_id == peer_adev->gmc.xgmi.node_id)
> > +   return top->nodes[i].num_links;
> > +   return  -EINVAL;
> > +}
> > +
> >   int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
> >   {
> > struct psp_xgmi_topology_info *top_info; diff --git
> > a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > index 12969c0830d5..d2189bf7d428 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > @@ -59,6 +59,8 @@ int amdgpu_xgmi_remove_device(struct
> amdgpu_device *adev);
> >   int amdgpu_xgmi_set_pstate(struct amdgpu_device *adev, int pstate);
> >   int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev,
> > struct amdgpu_device *

RE: [PATCH 3/3] drm/amdkfd: report pcie bandwidth to the kfd

2021-07-19 Thread Kim, Jonathan
[AMD Official Use Only]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Saturday, July 17, 2021 1:37 AM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org
> Subject: Re: [PATCH 3/3] drm/amdkfd: report pcie bandwidth to the kfd
>
> Am 2021-07-16 um 12:43 p.m. schrieb Jonathan Kim:
> > Similar to xGMI reporting the min/max bandwidth between direct peers,
> > PCIe will report the min/max bandwidth to the KFD.
> >
> > v2: change to bandwidth
> >
> > Signed-off-by: Jonathan Kim 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 61
> > ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
> >  drivers/gpu/drm/amd/amdkfd/kfd_crat.c  |  4 ++
> >  3 files changed, 66 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > index 3978578a1c49..b7db52f1a9d1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> > @@ -21,6 +21,7 @@
> >   */
> >
> >  #include "amdgpu_amdkfd.h"
> > +#include "amd_pcie.h"
> >  #include "amd_shared.h"
> >
> >  #include "amdgpu.h"
> > @@ -576,6 +577,66 @@ int
> amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, struct
> kgd_dev
> > return (num_links * 16 * 25000)/BITS_PER_BYTE;  }
> >
> > +int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct kgd_dev *dev,
> bool
> > +is_min) {
> > +   struct amdgpu_device *adev = (struct amdgpu_device *)dev;
> > +   int num_lanes_shift = is_min ? ffs(adev->pm.pcie_mlw_mask >>
> > +
>   CAIL_PCIE_LINK_WIDTH_SUPPORT_SHIFT) - 1 :
> > +   fls(adev->pm.pcie_mlw_mask >>
> > +
>   CAIL_PCIE_LINK_WIDTH_SUPPORT_SHIFT) - 1;
> > +   int gen_speed_shift = is_min ? ffs(adev->pm.pcie_gen_mask >>
> > +
>   CAIL_PCIE_LINK_SPEED_SUPPORT_SHIFT) - 1 :
> > +   fls(adev->pm.pcie_gen_mask >>
> > +
>   CAIL_PCIE_LINK_SPEED_SUPPORT_SHIFT) - 1;
>
> The shifting is not necessary because you undo it below. I think this would
> do the trick and be more readable:
>
>   int num_lanes_shift = (is_min ? ffs(adev->pm.pcie_mlw_mask) :
>   fls(adev->pm.pcie_mlw_mask)) - 1;
>   int gen_speed_shift = (is_min ? ffs(adev->pm.pcie_gen_mask) :
>   fls(adev->pm.pcie_gen_mask)) - 1;

Ok, thanks for the review and suggestion. I've adjusted it by masking 
pcie_gen_mask with CAIL_PCIE_LINK_SPEED_SUPPORT_MASK, since the mask sets 
some non-speed-related lower bits.

Thanks,

Jon

>   uint32_t num_lanes_mask = 1 << num_lanes_shift;
>   uint32_t gen_speed_mask = 1 << gen_speed_shift;
>
> With that fixed, this patch is
>
> Reviewed-by: Felix Kuehling 
>
>
> > +   uint32_t num_lanes_mask = (1UL << num_lanes_shift) <<
> CAIL_PCIE_LINK_WIDTH_SUPPORT_SHIFT;
> > +   uint32_t gen_speed_mask = (1UL << gen_speed_shift) <<
> CAIL_PCIE_LINK_SPEED_SUPPORT_SHIFT;
> > +   int num_lanes_factor = 0, gen_speed_mbits_factor = 0;
> > +
> > +   switch (num_lanes_mask) {
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X1:
> > +   num_lanes_factor = 1;
> > +   break;
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X2:
> > +   num_lanes_factor = 2;
> > +   break;
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X4:
> > +   num_lanes_factor = 4;
> > +   break;
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X8:
> > +   num_lanes_factor = 8;
> > +   break;
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X12:
> > +   num_lanes_factor = 12;
> > +   break;
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X16:
> > +   num_lanes_factor = 16;
> > +   break;
> > +   case CAIL_PCIE_LINK_WIDTH_SUPPORT_X32:
> > +   num_lanes_factor = 32;
> > +   break;
> > +   }
> > +
> > +   switch (gen_speed_mask) {
> > +   case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1:
> > +   gen_speed_mbits_factor = 2500;
> > +   break;
> > +   case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2:
> > +   gen_speed_mbits_factor = 5000;
> > +   break;
> > +   case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3:
> > +   gen_speed_mbits_factor = 8000;
> > +   break;
> > +   case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4:
> > +   gen_speed_mbits_factor = 16000;
> > +   break;
> > +   case CAIL_PCIE_LINK_SPEED_SUPPORT_GEN5:
> > +   gen_speed_mbits_factor = 32000;
> > +   break;
> > +   }
> > +
> > +   return (num_lanes_factor *
> gen_speed_mbits_factor)/BITS_PER_BYTE;
> > +}
> > +
> >  uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev
> *kgd)
> > {
> > struct amdgpu_device *adev = (struct amdgpu_device *)kgd; diff --git
> > a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > index e12fccb2d2c4..5d73f3108d13 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > @@ -227,6 +227,7 @@ uint3

Re: [PATCH] drm/amdgpu: Fix documentation for amdgpu_bo_add_to_shadow_list

2021-07-19 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: amd-gfx  on behalf of Anson Jacob 

Sent: Monday, July 19, 2021 11:19 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Jacob, Anson 
Subject: [PATCH] drm/amdgpu: Fix documentation for amdgpu_bo_add_to_shadow_list

make htmldocs complains about a parameter of amdgpu_bo_add_to_shadow_list

./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function 
parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Function parameter 
or member 'vmbo' not described in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function 
parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'

Signed-off-by: Anson Jacob 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index ea339eaac399..4e2c0270208f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -755,7 +755,7 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
 /**
  * amdgpu_bo_add_to_shadow_list - add a BO to the shadow list
  *
- * @bo: BO that will be inserted into the shadow list
+ * @vmbo: BO that will be inserted into the shadow list
  *
  * Insert a BO to the shadow list.
  */
--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: Fix documentation for amdgpu_bo_add_to_shadow_list

2021-07-19 Thread Anson Jacob
make htmldocs complains about a parameter of amdgpu_bo_add_to_shadow_list

./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function 
parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Function parameter 
or member 'vmbo' not described in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function 
parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'

Signed-off-by: Anson Jacob 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index ea339eaac399..4e2c0270208f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -755,7 +755,7 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
 /**
  * amdgpu_bo_add_to_shadow_list - add a BO to the shadow list
  *
- * @bo: BO that will be inserted into the shadow list
+ * @vmbo: BO that will be inserted into the shadow list
  *
  * Insert a BO to the shadow list.
  */
-- 
2.25.1



Re: [PATCH 1/1] drm/amdkfd: Allow querying SVM attributes that are clear

2021-07-19 Thread philip yang

  


On 2021-07-16 10:46 p.m., Felix Kuehling wrote:


  Currently the SVM get_attr call allows querying which flags are set
in the entire address range. Add the opposite query: which flags are
clear in the entire address range. Both queries can be combined in a
single get_attr call, which allows answering questions such as "is
this address range coherent, non-coherent, or a mix of both?"

Signed-off-by: Felix Kuehling 

Reviewed-by: Philip Yang 

  
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 19 +--
 include/uapi/linux/kfd_ioctl.h   | 16 +---
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e7e99c5070b9..c66221a9c02d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3019,7 +3019,8 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 	struct svm_range *prange;
 	uint32_t prefetch_loc = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
 	uint32_t location = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
-	uint32_t flags = 0x;
+	uint32_t flags_and = 0x;
+	uint32_t flags_or = 0;
 	int gpuidx;
 	uint32_t i;
 
@@ -3046,12 +3047,12 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 			get_accessible = true;
 			break;
 		case KFD_IOCTL_SVM_ATTR_SET_FLAGS:
+		case KFD_IOCTL_SVM_ATTR_CLR_FLAGS:
 			get_flags = true;
 			break;
 		case KFD_IOCTL_SVM_ATTR_GRANULARITY:
 			get_granularity = true;
 			break;
-		case KFD_IOCTL_SVM_ATTR_CLR_FLAGS:
 		case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE:
 		case KFD_IOCTL_SVM_ATTR_NO_ACCESS:
 			fallthrough;
@@ -3069,7 +3070,8 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 	if (!node) {
 		pr_debug("range attrs not found return default values\n");
 		svm_range_set_default_attributes(&location, &prefetch_loc,
-		 &granularity, &flags);
+		 &granularity, &flags_and);
+		flags_or = flags_and;
 		if (p->xnack_enabled)
 			bitmap_copy(bitmap_access, svms->bitmap_supported,
 MAX_GPU_INSTANCE);
@@ -3115,8 +3117,10 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 			bitmap_and(bitmap_aip, bitmap_aip,
    prange->bitmap_aip, MAX_GPU_INSTANCE);
 		}
-		if (get_flags)
-			flags &= prange->flags;
+		if (get_flags) {
+			flags_and &= prange->flags;
+			flags_or |= prange->flags;
+		}
 
 		if (get_granularity && prange->granularity < granularity)
 			granularity = prange->granularity;
@@ -3150,7 +3154,10 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 attrs[i].type = KFD_IOCTL_SVM_ATTR_NO_ACCESS;
 			break;
 		case KFD_IOCTL_SVM_ATTR_SET_FLAGS:
-			attrs[i].value = flags;
+			attrs[i].value = flags_and;
+			break;
+		case KFD_IOCTL_SVM_ATTR_CLR_FLAGS:
+			attrs[i].value = ~flags_or;
 			break;
 		case KFD_IOCTL_SVM_ATTR_GRANULARITY:
 			attrs[i].value = (uint32_t)granularity;
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 3cb5b5dd9f77..af96af174dc4 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -31,9 +31,10 @@
  * - 1.3 - Add SMI events support
  * - 1.4 - Indicate new SRAM EDC bit in device properties
  * - 1.5 - Add SVM API
+ * - 1.6 - Query clear flags in SVM get_attr API
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 5
+#define KFD_IOCTL_MINOR_VERSION 6
 
 struct kfd_ioctl_get_version_args {
 	__u32 major_version;	/* from KFD */
@@ -575,18 +576,19 @@ struct kfd_ioctl_svm_attribute {
  * @KFD_IOCTL_SVM_ATTR_PREFERRED_LOC or
  * @KFD_IOCTL_SVM_ATTR_PREFETCH_LOC resepctively. For
  * @KFD_IOCTL_SVM_ATTR_SET_FLAGS, flags of all pages will be
- * aggregated by bitwise AND. The minimum  migration granularity
- * throughout the range will be returned for
- * @KFD_IOCTL_SVM_ATTR_GRANULARITY.
+ * aggregated by bitwise AND. That means, a flag will be set in the
+ * output, if that flag is set for all pages in the range. For
+ * @KFD_IOCTL_SVM_ATTR_CLR_FLAGS, flags of all pages will be
+ * aggregated by bitwise NOR. That means, a flag will be set in the
+ * output, if that flag is clear for all pages in the range.
+ * The minimum migration granularity throughout the range will be
+ * returned for @KFD_IOCTL_SVM_ATTR_GRANULARITY.
  *
  * Querying of accessibility attributes works by initializing the
  * attribute type to @KFD_IOCTL_SVM_ATTR_ACCESS and the value to the
  * GPUID being queried. Multiple attributes can be given to allow
  * querying multiple GPUIDs. The ioctl function overwrites the
  * attribute type to indicate the access for the specified GPU.
- *
- * @KFD_IOCTL_SVM_ATTR_CLR_FLAGS is invalid for
- * @KFD_IOCTL_SVM_OP_GET_ATTR.
  */
 struct kfd_ioctl_svm_args {
 	__u64 start_addr;


  


[PATCH v2] drm/amdgpu - Corrected the video codecs array name for yellow carp

2021-07-19 Thread veerabadhran.gopalakrishnan
From: Veerabadhran Gopalakrishnan 

Signed-off-by: Veerabadhran Gopalakrishnan 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index cf73a6923..751c7b8b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -173,8 +173,8 @@ static const struct amdgpu_video_codec_info 
yc_video_codecs_decode_array[] = {
 };
 
 static const struct amdgpu_video_codecs yc_video_codecs_decode = {
-   .codec_count = ARRAY_SIZE(bg_video_codecs_decode_array),
-   .codec_array = bg_video_codecs_decode_array,
+   .codec_count = ARRAY_SIZE(yc_video_codecs_decode_array),
+   .codec_array = yc_video_codecs_decode_array,
 };
 
 static int nv_query_video_codecs(struct amdgpu_device *adev, bool encode,
-- 
2.25.1



RE: [PATCH] drm/amdgpu - Corrected the video codecs array name for yellow carp

2021-07-19 Thread Chen, Guchun
[Public]

.codec_count = ARRAY_SIZE(bg_video_codecs_decode_array),

You need to change bg_xxx to yc_xxx as well.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of 
veerabadhran.gopalakrish...@amd.com
Sent: Monday, July 19, 2021 9:39 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Boyuan ; Gopalakrishnan, Veerabadhran (Veera) 
; Zhu, James ; Liu, Leo 
; Rao, Srinath 
Subject: [PATCH] drm/amdgpu - Corrected the video codecs array name for yellow 
carp

From: Veerabadhran Gopalakrishnan 

Signed-off-by: Veerabadhran Gopalakrishnan 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c 
index cf73a6923..3027b44df 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -174,7 +174,7 @@ static const struct amdgpu_video_codec_info 
yc_video_codecs_decode_array[] = {
 
 static const struct amdgpu_video_codecs yc_video_codecs_decode = {
.codec_count = ARRAY_SIZE(bg_video_codecs_decode_array),
-   .codec_array = bg_video_codecs_decode_array,
+   .codec_array = yc_video_codecs_decode_array,
 };
 
 static int nv_query_video_codecs(struct amdgpu_device *adev, bool encode,
--
2.25.1



[PATCH] drm/amdgpu - Corrected the video codecs array name for yellow carp

2021-07-19 Thread veerabadhran.gopalakrishnan
From: Veerabadhran Gopalakrishnan 

Signed-off-by: Veerabadhran Gopalakrishnan 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index cf73a6923..3027b44df 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -174,7 +174,7 @@ static const struct amdgpu_video_codec_info 
yc_video_codecs_decode_array[] = {
 
 static const struct amdgpu_video_codecs yc_video_codecs_decode = {
.codec_count = ARRAY_SIZE(bg_video_codecs_decode_array),
-   .codec_array = bg_video_codecs_decode_array,
+   .codec_array = yc_video_codecs_decode_array,
 };
 
 static int nv_query_video_codecs(struct amdgpu_device *adev, bool encode,
-- 
2.25.1



Re: [PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v3)

2021-07-19 Thread Luben Tuikov

  
On 2021-07-19 4:24 a.m., Lazar, Lijo wrote:


  

On 7/14/2021 11:28 PM, Luben Tuikov wrote:

  
  This fixes a bug where, if we probe a non-existing
I2C device and the SMU returns 0xFF, from then on
we can never communicate with the SMU, because the
code before this patch reads and interprets 0xFF
as a terminal error, and thus we never write 0
into register 90 to clear the status (and
subsequently send a new command to the SMU).

It is not an error that the SMU returns status
0xFF. This means that the SMU executed the last
command successfully (execution status), but the
command result is an error of some sort (execution
result), depending on what the command was.

When doing a status check of the SMU, before we
send a new command, the only status which
precludes us from sending a new command is 0--the
SMU hasn't finished executing a previous command,
and 0xFC--the SMU is busy.

This bug was seen as the following line in the
kernel log,

amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!

when subsequent SMU commands, not necessarily
related to I2C, were sent to the SMU.

This patch fixes this bug.

v2: Add a comment to the description of
__smu_cmn_poll_stat() to explain why we're NOT
defining the SMU FW return codes as macros, but
are instead hard-coding them. Such a change, can
be followed up by a subsequent patch.

v3: The changes are,
a) Add comments to break labels in
__smu_cmn_reg2errno().

b) When an unknown/unspecified/undefined result is
returned back from the SMU, map that to
-EREMOTEIO, to distinguish failure at the SMU
FW.

c) Add kernel-doc to
smu_cmn_send_msg_without_waiting(),
smu_cmn_wait_for_response(),
smu_cmn_send_smc_msg_with_param().

d) In smu_cmn_send_smc_msg_with_param(), since we
wait for completion of the command, if the
result of the completion is
undefined/unknown/unspecified, we print that to
the kernel log.

Cc: Alex Deucher 
Cc: Evan Quan 
Cc: Lijo Lazar 
Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message")
Signed-off-by: Luben Tuikov 
---
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 274 -
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   3 +-
  2 files changed, 230 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c902fdf322c1be..dde10c51daa106 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -55,7 +55,7 @@
  
  #undef __SMU_DUMMY_MAP
  #define __SMU_DUMMY_MAP(type)	#type
-static const char* __smu_message_names[] = {
+static const char * const __smu_message_names[] = {
  	SMU_MESSAGE_TYPES
  };
  
@@ -76,55 +76,246 @@ static void smu_cmn_read_arg(struct smu_context *smu,
  	*arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
  }
  
-int smu_cmn_wait_for_response(struct smu_context *smu)
+/**
+ * __smu_cmn_poll_stat -- poll for a status from the SMU
+ * smu: a pointer to SMU context
+ *
+ * Returns the status of the SMU, which could be,
+ *0, the SMU is busy with your previous command;
+ *1, execution status: success, execution result: success;
+ * 0xFF, execution status: success, execution result: failure;

^ right here

+ * 0xFE, unknown command;
+ * 0xFD, valid command, but bad (command) prerequisites;
+ * 0xFC, the command was rejected as the SMU is busy;
+ * 0xFB, "SMC_Result_DebugDataDumpEnd".
+ *
+ * The values here are not defined by macros, because I'd rather we
+ * include a single header file which defines them, which is
+ * maintained by the SMU FW team, so that we're impervious to firmware
+ * changes. At the moment those values are defined in various header
+ * files, one for each ASIC, yet here we're a single ASIC-agnostic
+ * interface. Such a change can be followed-up by a subsequent patch.
+ */
+static u32 __smu_cmn_poll_stat(struct smu_context *smu)
  {
  	struct amdgpu_device *adev = smu->adev;
-	uint32_t cur_value, i, timeout = adev->usec_timeout * 20;
+	int timeout = adev->usec_timeout * 20;
+	u32 reg;
  
-	for (i = 0; i < timeout; i++) {
-		cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
-		if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0)
-			return cur_value;
+	for ( ; timeout > 0; timeout--) {
+		reg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
+		if ((reg & MP1_C2PMSG_90__CONTENT_MASK) != 0)
+			break;
  
  		udelay(1);
  	}
  
-	/* timeout means wrong logic */
-	if (i == timeout)
-		return -ETIME;
-
-	return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
+	return reg;
  }
  
-int smu_cmn_send_msg_without_waiting(struct smu_context *smu,
- uint16_t msg, uint32_t param)
+static void __smu_cmn_reg_print_error(struct smu_context *smu,
+  u32 reg_c2pmsg_90,
+  int msg_index,
+  u32 param,
+  enum smu_message_type msg)
  {
  

RE: [PATCH 00/32] DC Patches July 18, 2021

2021-07-19 Thread Wheeler, Daniel
[Public]

Hi all,
 
This week this patchset was tested on the following systems:
 
HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 
60hz, 4k 60hz  (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 
1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA)
 
AMD Ryzen 9 5900H, with the following display types: eDP 1080p 60hz, 4k 60hz  
(via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via 
USB-C to DP and then DP to DVI/VGA)
 
Sapphire Pulse RX5700XT with the following display types:
4k 60hz  (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to 
DVI/VGA)
 
Reference AMD RX6800 with the following display types:
4k 60hz  (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI 
and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA)
 
Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz, and 3x 1080p 
60hz on all systems.
 
 
Tested-by: Daniel Wheeler 
 
 
Thank you,
 
Dan Wheeler
Technologist  |  AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
Facebook |  Twitter |  amd.com  

-Original Message-
From: Siqueira, Rodrigo  
Sent: July 18, 2021 10:06 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wentland, Harry ; Li, Sun peng (Leo) 
; Lakha, Bhawanpreet ; Siqueira, 
Rodrigo ; Pillai, Aurabindo 
; Zhuo, Qingqing ; R, Bindu 
; Jacob, Anson ; Wheeler, Daniel 

Subject: [PATCH 00/32] DC Patches July 18, 2021

DC version 3.2.144 brings improvements in multiple areas. In summary, we
highlight:

- Code improvements for passive
- Cursor manipulation enhancements
- Expand debug in some areas
- Fix problems in DML
- Other minor code refactors

Cc: Daniel Wheeler 

Anthony Koo (2):
  drm/amd/display: [FW Promotion] Release 0.0.73
  drm/amd/display: [FW Promotion] Release 0.0.75

Aric Cyr (3):
  drm/amd/display: 3.2.143
  drm/amd/display: 3.2.144
  drm/amd/display: 3.2.145

Aurabindo Pillai (1):
  drm/amd/display: add debug print for DCC validation failure

Bindu Ramamurthy (2):
  drm/amd/display: Populate socclk entries for dcn3.02/3.03
  drm/amd/display: Populate dtbclk entries for dcn3.02/3.03

Camille Cho (1):
  drm/amd/display: Only set default brightness for OLED

Charlene Liu (1):
  drm/amd/display: reset dpcd_cap.dpcd_rev for passive dongle.

Dmytro Laktyushkin (1):
  drm/amd/display: remove compbuf size wait

Eric Yang (3):
  drm/amd/display: implement workaround for riommu related hang
  drm/amd/display: add workaround for riommu invalidation request hang
  drm/amd/display: change zstate allow msg condition

Ian Chen (1):
  drm/amd/display: Extend dmub_cmd_psr_copy_settings_data struct

Jake Wang (1):
  drm/amd/display: Fixed hardware power down bypass during headless boot

Josip Pavic (1):
  drm/amd/display: log additional register state for debug

Krunoslav Kovac (2):
  drm/amd/display: Assume active upper layer owns the HW cursor
  drm/amd/display: Refine condition for cursor visibility

Michael Strauss (1):
  drm/amd/display: Enable eDP ILR on DCN2.1

Mikita Lipski (2):
  drm/amd/display: Prevent Diags from entering S2
  drm/amd/display: Remove MALL function from DCN3.1

Nevenko Stupar (1):
  drm/amd/display: Line Buffer changes

Nicholas Kazlauskas (3):
  drm/amd/display: Fix max vstartup calculation for modes with borders
  drm/amd/display: Query VCO frequency from register for DCN3.1
  drm/amd/display: Update bounding box for DCN3.1

Oliver Logush (1):
  drm/amd/display: Fix timer_per_pixel unit error

Stylon Wang (1):
  drm/amd/display: Re-enable "Guard ASSR with internal display flag"

Victor Lu (1):
  drm/amd/display: Fix comparison error in dcn21 DML

Wesley Chalmers (1):
  drm/amd/display: Add copyright notice to new files

Zhan Liu (1):
  drm/amd/display: Reduce delay when sink device not able to ACK 00340h
write

sunglee (1):
  drm/amd/display: DCN2X Prefer ODM over bottom pipe to find second pipe

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  6 +-  
.../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c  |  4 ++
 .../display/dc/clk_mgr/dcn301/vg_clk_mgr.c|  2 +-
 .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  | 59 ---  
.../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.h  | 54 --  
drivers/gpu/drm/amd/display/dc/core/dc_link.c |  8 +++  
.../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 72 +--
 .../drm/amd/display/dc/core/dc_link_dpcd.c| 25 +++
 drivers/gpu/drm/amd/display/dc/dc.h   | 12 ++--
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |  1 +
 .../gpu/drm/amd/display/dc/dce/dce_hwseq.h|  4 +-
 .../drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c |  7 +-
 .../drm/amd/display/dc/dcn10/dcn10_hubbub.h   | 19 -
 .../gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c | 29   
.../gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h |  4 ++  
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 42 ++-  
.../gpu/drm/amd/display/dc/dcn10

Re: [PATCH v4 03/13] kernel: resource: lookup_resource as exported symbol

2021-07-19 Thread Christoph Hellwig
On Sat, Jul 17, 2021 at 02:21:25PM -0500, Alex Sierra wrote:
>   return res;
>  }
> -
> +EXPORT_SYMBOL_GPL(lookup_resource);
>  /*

Please keep this empty line (after the EXPORT_SYMBOL).


Re: [PATCH v4 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-07-19 Thread Christoph Hellwig
On Sat, Jul 17, 2021 at 02:21:30PM -0500, Alex Sierra wrote:
> Add MEMORY_DEVICE_GENERIC case to free_zone_device_page
> callback.
> Device generic type memory case is now able to free its
> pages properly.

No need for the early line breaks:

Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback.
Device generic type memory case is now able to free its pages properly.


Re: [PATCH v4 05/13] drm/amdkfd: generic type as sys mem on migration to ram

2021-07-19 Thread Christoph Hellwig
On Sat, Jul 17, 2021 at 02:21:27PM -0500, Alex Sierra wrote:
> + migrate.flags = adev->gmc.xgmi.connected_to_cpu ?
> + MIGRATE_VMA_SELECT_SYSTEM : 
> MIGRATE_VMA_SELECT_DEVICE_PRIVATE;

Please just use a good old if/else to keep the code readable.


RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-19 Thread Liu, Monk
[AMD Official Use Only]

Besides, I think our current KMD has three types of kernel SDMA jobs:
1) adev->mman.entity, which is already a KERNEL-priority entity
2) vm->immediate
3) vm->delayed

Do you mean vm->immediate and vm->delayed are now used for move jobs instead of 
mman.entity?

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Liu, Monk 
Sent: Monday, July 19, 2021 5:40 PM
To: 'Christian König' ; Chen, JingWen 
; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

[AMD Official Use Only]

If there are move jobs clashing there, we probably need to fix the bugs in those 
move jobs.

Previously, as I believe you also remember, we agreed to always trust kernel 
jobs, especially paging jobs.

Without setting paging jobs' priority to KERNEL level, how can we keep that 
protocol? Do you have a better idea?

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König 
Sent: Monday, July 19, 2021 4:25 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk 
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

Am 19.07.21 um 07:57 schrieb Jingwen Chen:
> [Why]
> Current vm_pte entities have NORMAL priority. In the SRIOV multi-VF use 
> case, the VF FLR happens first and then the job timeout is found.
> There can be several job timeouts during a very small time slice.
> And if an innocent SDMA job timeout is found before the real bad 
> job, the innocent SDMA job will be set to guilty, as it only has 
> NORMAL priority. This will lead to a page fault after resubmitting the 
> job.
>
> [How]
> sdma should always have KERNEL priority. The kernel job will always be 
> resubmitted.

I'm not sure if that is a good idea. We intentionally didn't give the page 
table updates kernel priority to avoid clashing with the move jobs.

Christian.

>
> Signed-off-by: Jingwen Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 358316d6a38c..f7526b67cc5d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm)
>   INIT_LIST_HEAD(&vm->done);
>   
>   /* create scheduler entities for page table updates */
> - r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
> + r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
> adev->vm_manager.vm_pte_scheds,
> adev->vm_manager.vm_pte_num_scheds, NULL);
>   if (r)
>   return r;
>   
> - r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
> + r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
> adev->vm_manager.vm_pte_scheds,
> adev->vm_manager.vm_pte_num_scheds, NULL);
>   if (r)


RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-19 Thread Liu, Monk
[AMD Official Use Only]

If there are move jobs clashing there, we probably need to fix the bugs in 
those move jobs.

Previously, I believe you also remember, we agreed to always trust kernel 
jobs, especially paging jobs.

Without setting paging jobs' priority to KERNEL level, how can we keep that 
protocol? Do you have a better idea?

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König  
Sent: Monday, July 19, 2021 4:25 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk 
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

On 19.07.21 at 07:57, Jingwen Chen wrote:
> [Why]
> Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf use 
> case, the vf flr happens first and then the job timeout is found.
> Several jobs can time out during a very small time slice,
> and if the innocent sdma job's timeout is found before the real bad 
> job's, the innocent sdma job will be set to guilty as it only has 
> NORMAL priority. This will lead to a page fault after the job is 
> resubmitted.
>
> [How]
> sdma should always have KERNEL priority. The kernel job will always be 
> resubmitted.

I'm not sure if that is a good idea. We intentionally didn't give the page 
table updates kernel priority to avoid clashing with the move jobs.

Christian.

>
> Signed-off-by: Jingwen Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 358316d6a38c..f7526b67cc5d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm)
>   INIT_LIST_HEAD(&vm->done);
>   
>   /* create scheduler entities for page table updates */
> - r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
> + r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
> adev->vm_manager.vm_pte_scheds,
> adev->vm_manager.vm_pte_num_scheds, NULL);
>   if (r)
>   return r;
>   
> - r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
> + r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
> adev->vm_manager.vm_pte_scheds,
> adev->vm_manager.vm_pte_num_scheds, NULL);
>   if (r)


Re: [PATCH] drm/amd/pm: Fix a bug communicating with the SMU (v3)

2021-07-19 Thread Lazar, Lijo




On 7/14/2021 11:28 PM, Luben Tuikov wrote:

This fixes a bug where, if we probe a
non-existing I2C device and the SMU returns
0xFF, from then on we can never communicate
with the SMU, because the code before this
patch reads and interprets 0xFF as a terminal
error, and thus we never write 0 into register
90 to clear the status (and subsequently send
a new command to the SMU.)

It is not an error that the SMU returns status
0xFF. This means that the SMU executed the last
command successfully (execution status), but the
command result is an error of some sort (execution
result), depending on what the command was.

When doing a status check of the SMU, before
we send a new command, the only statuses which
preclude us from sending a new command are
0--the SMU hasn't finished executing a
previous command--and 0xFC--the SMU is busy.

This bug was seen as the following line in the
kernel log,

amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right 
state!

when subsequent SMU commands, not necessarily
related to I2C, were sent to the SMU.

This patch fixes this bug.

v2: Add a comment to the description of
__smu_cmn_poll_stat() to explain why we're NOT
defining the SMU FW return codes as macros, but
are instead hard-coding them. Such a change, can
be followed up by a subsequent patch.

v3: The changes are,
a) Add comments to break labels in
__smu_cmn_reg2errno().

b) When an unknown/unspecified/undefined result is
returned back from the SMU, map that to
-EREMOTEIO, to distinguish failure at the SMU
FW.

c) Add kernel-doc to
smu_cmn_send_msg_without_waiting(),
smu_cmn_wait_for_response(),
smu_cmn_send_smc_msg_with_param().

d) In smu_cmn_send_smc_msg_with_param(), since we
wait for completion of the command, if the
result of the completion is
undefined/unknown/unspecified, we print that to
the kernel log.

Cc: Alex Deucher 
Cc: Evan Quan 
Cc: Lijo Lazar 
Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing 
message")
Signed-off-by: Luben Tuikov 
---
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 274 -
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h |   3 +-
  2 files changed, 230 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c902fdf322c1be..dde10c51daa106 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -55,7 +55,7 @@
  
  #undef __SMU_DUMMY_MAP

  #define __SMU_DUMMY_MAP(type) #type
-static const char* __smu_message_names[] = {
+static const char * const __smu_message_names[] = {
SMU_MESSAGE_TYPES
  };
  
@@ -76,55 +76,246 @@ static void smu_cmn_read_arg(struct smu_context *smu,

*arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
  }
  
-int smu_cmn_wait_for_response(struct smu_context *smu)

+/**
+ * __smu_cmn_poll_stat -- poll for a status from the SMU
+ * smu: a pointer to SMU context
+ *
+ * Returns the status of the SMU, which could be,
+ *0, the SMU is busy with your previous command;
+ *1, execution status: success, execution result: success;
+ * 0xFF, execution status: success, execution result: failure;
+ * 0xFE, unknown command;
+ * 0xFD, valid command, but bad (command) prerequisites;
+ * 0xFC, the command was rejected as the SMU is busy;
+ * 0xFB, "SMC_Result_DebugDataDumpEnd".
+ *
+ * The values here are not defined by macros, because I'd rather we
+ * include a single header file which defines them, which is
+ * maintained by the SMU FW team, so that we're impervious to firmware
+ * changes. At the moment those values are defined in various header
+ * files, one for each ASIC, yet here we're a single ASIC-agnostic
+ * interface. Such a change can be followed-up by a subsequent patch.
+ */
+static u32 __smu_cmn_poll_stat(struct smu_context *smu)
  {
struct amdgpu_device *adev = smu->adev;
-   uint32_t cur_value, i, timeout = adev->usec_timeout * 20;
+   int timeout = adev->usec_timeout * 20;
+   u32 reg;
  
-	for (i = 0; i < timeout; i++) {

-   cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
-   if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0)
-   return cur_value;
+   for ( ; timeout > 0; timeout--) {
+   reg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
+   if ((reg & MP1_C2PMSG_90__CONTENT_MASK) != 0)
+   break;
  
  		udelay(1);

}
  
-	/* timeout means wrong logic */

-   if (i == timeout)
-   return -ETIME;
-
-   return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
+   return reg;
  }
  
-int smu_cmn_send_msg_without_waiting(struct smu_context *smu,

-uint16_t msg, uint32_t param)
+static void __smu_cmn_reg_print_error(struct smu_context *smu,
+ u32 reg_c2pmsg_90,
+ int msg_ind

Re: [PATCH 2/3] drm/amdkfd: report xgmi bandwidth between direct peers to the kfd

2021-07-19 Thread Lazar, Lijo




On 7/16/2021 10:13 PM, Jonathan Kim wrote:

Report the min/max bandwidth in megabytes to the kfd for direct
xgmi connections only.

v2: change reporting from num links to bandwidth

Signed-off-by: Jonathan Kim 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 23 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 12 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
  drivers/gpu/drm/amd/amdkfd/kfd_crat.c  | 12 +++
  5 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index bfab2f9fdd17..3978578a1c49 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -553,6 +553,29 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *s
return  (uint8_t)ret;
  }
  
+int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, struct kgd_dev *src, bool is_min)

+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)dst, *peer_adev;
+   int num_links;
+
+   if (adev->asic_type != CHIP_ALDEBARAN)
+   return 0;
+
+   if (src)
+   peer_adev = (struct amdgpu_device *)src;
+
+   num_links = is_min ? 1 : amdgpu_xgmi_get_num_links(adev, peer_adev);
+   if (num_links < 0) {
+   DRM_ERROR("amdgpu: failed to get xgmi num links between node %d and 
%d. ret = %d\n",
+   adev->gmc.xgmi.physical_node_id,
+   peer_adev->gmc.xgmi.physical_node_id, num_links);
+   num_links = 0;
+   }
+
+   /* Aldebaran xGMI DPM is defeatured so assume x16 x 25Gbps for 
bandwidth. */
+   return (num_links * 16 * 25000)/BITS_PER_BYTE;


Instead of having ASIC family checks and bandwidth info in the interface 
functions, it would be better to have this info come from the base layer 
(amdgpu_xgmi or the xgmi ip). That will help handle other ASICs.


Thanks,
Lijo


  uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev *kgd)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 81264517d532..e12fccb2d2c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -226,6 +226,7 @@ uint32_t amdgpu_amdkfd_get_num_gws(struct kgd_dev *kgd);
  uint32_t amdgpu_amdkfd_get_asic_rev_id(struct kgd_dev *kgd);
  int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd);
  uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev 
*src);
+int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct kgd_dev *dst, struct 
kgd_dev *src, bool is_min);
  
  /* Read user wptr from a specified user address space with page fault

   * disabled. The memory must be pinned and mapped to the hardware when
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 8567d5d77346..258cf86b32f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -486,6 +486,18 @@ int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev,
return  -EINVAL;
  }
  
+int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,

+   struct amdgpu_device *peer_adev)
+{
+   struct psp_xgmi_topology_info *top = &adev->psp.xgmi_context.top_info;
+   int i;
+
+   for (i = 0 ; i < top->num_nodes; ++i)
+   if (top->nodes[i].node_id == peer_adev->gmc.xgmi.node_id)
+   return top->nodes[i].num_links;
+   return  -EINVAL;
+}
+
  int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
  {
struct psp_xgmi_topology_info *top_info;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index 12969c0830d5..d2189bf7d428 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -59,6 +59,8 @@ int amdgpu_xgmi_remove_device(struct amdgpu_device *adev);
  int amdgpu_xgmi_set_pstate(struct amdgpu_device *adev, int pstate);
  int amdgpu_xgmi_get_hops_count(struct amdgpu_device *adev,
struct amdgpu_device *peer_adev);
+int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
+   struct amdgpu_device *peer_adev);
  uint64_t amdgpu_xgmi_get_relative_phy_addr(struct amdgpu_device *adev,
   uint64_t addr);
  static inline bool amdgpu_xgmi_same_hive(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index c6b02aee4993..40ce6239c813 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1989,6 +1989,13 @@ static int kfd_fill_gpu_direct_io_link_to_cpu(int 
*avail_size,
sub_type_hdr->flags |= CRAT_IOLINK_FLAGS_BI_DIRECTIONAL;
 

Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-19 Thread Christian König

On 19.07.21 at 11:42, Liu, Monk wrote:

[AMD Official Use Only]

Besides, I think our current KMD has three types of kernel sdma jobs:
1) adev->mman.entity, which is already a KERNEL priority entity
2) vm->immediate
3) vm->delayed

Do you mean vm->immediate or vm->delayed are now used for move jobs instead of 
mman.entity?


No, exactly that's the point. vm->immediate and vm->delayed are not for 
kernel paging jobs.


Those are used for userspace page table updates.

I agree that those should probably not be considered guilty, but modifying 
their priority is not the right way of doing that.


Regards,
Christian.



Thanks

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Liu, Monk
Sent: Monday, July 19, 2021 5:40 PM
To: 'Christian König' ; Chen, JingWen 
; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

[AMD Official Use Only]

If there are move jobs clashing there, we probably need to fix the bugs in 
those move jobs.

Previously, I believe you also remember, we agreed to always trust kernel 
jobs, especially paging jobs.

Without setting paging jobs' priority to KERNEL level, how can we keep that 
protocol? Do you have a better idea?

Thanks

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König 
Sent: Monday, July 19, 2021 4:25 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk 
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

On 19.07.21 at 07:57, Jingwen Chen wrote:

[Why]
Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf use
case, the vf flr happens first and then the job timeout is found.
Several jobs can time out during a very small time slice,
and if the innocent sdma job's timeout is found before the real bad
job's, the innocent sdma job will be set to guilty as it only has
NORMAL priority. This will lead to a page fault after the job is
resubmitted.

[How]
sdma should always have KERNEL priority. The kernel job will always be
resubmitted.

I'm not sure if that is a good idea. We intentionally didn't give the page 
table updates kernel priority to avoid clashing with the move jobs.

Christian.


Signed-off-by: Jingwen Chen 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 358316d6a38c..f7526b67cc5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
INIT_LIST_HEAD(&vm->done);
   
   	/* create scheduler entities for page table updates */

-   r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
+   r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
  adev->vm_manager.vm_pte_scheds,
  adev->vm_manager.vm_pte_num_scheds, NULL);
if (r)
return r;
   
-	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,

+   r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
  adev->vm_manager.vm_pte_scheds,
  adev->vm_manager.vm_pte_num_scheds, NULL);
if (r)




[PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-19 Thread Jingwen Chen
[Why]
Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf
use case, the vf flr happens first and then the job timeout is found.
Several jobs can time out during a very small time slice,
and if the innocent sdma job's timeout is found before the real bad
job's, the innocent sdma job will be set to guilty as it only
has NORMAL priority. This will lead to a page fault after the job
is resubmitted.

[How]
sdma should always have KERNEL priority. The kernel job will always
be resubmitted.

Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 358316d6a38c..f7526b67cc5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
INIT_LIST_HEAD(&vm->done);
 
/* create scheduler entities for page table updates */
-   r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
+   r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
  adev->vm_manager.vm_pte_scheds,
  adev->vm_manager.vm_pte_num_scheds, NULL);
if (r)
return r;
 
-   r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
+   r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
  adev->vm_manager.vm_pte_scheds,
  adev->vm_manager.vm_pte_num_scheds, NULL);
if (r)
-- 
2.25.1



Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-19 Thread Christian König

On 19.07.21 at 07:57, Jingwen Chen wrote:

[Why]
Current vm_pte entities have NORMAL priority. In the SRIOV multi-vf
use case, the vf flr happens first and then the job timeout is found.
Several jobs can time out during a very small time slice,
and if the innocent sdma job's timeout is found before the real bad
job's, the innocent sdma job will be set to guilty as it only
has NORMAL priority. This will lead to a page fault after the job
is resubmitted.

[How]
sdma should always have KERNEL priority. The kernel job will always
be resubmitted.


I'm not sure if that is a good idea. We intentionally didn't give the 
page table updates kernel priority to avoid clashing with the move jobs.


Christian.



Signed-off-by: Jingwen Chen 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 358316d6a38c..f7526b67cc5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
INIT_LIST_HEAD(&vm->done);
  
  	/* create scheduler entities for page table updates */

-   r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
+   r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
  adev->vm_manager.vm_pte_scheds,
  adev->vm_manager.vm_pte_num_scheds, NULL);
if (r)
return r;
  
-	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,

+   r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
  adev->vm_manager.vm_pte_scheds,
  adev->vm_manager.vm_pte_num_scheds, NULL);
if (r)




RE: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.

2021-07-19 Thread Zhou, Peng Ju
[AMD Official Use Only]

Hi Leo
Can you help to review this patch?


-- 
BW
Pengju Zhou




> -Original Message-
> From: amd-gfx  On Behalf Of Zhou,
> Peng Ju
> Sent: Friday, July 16, 2021 10:15 AM
> To: Liu, Monk ; Alex Deucher
> ; Liu, Leo 
> Cc: amd-gfx list 
> Subject: RE: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
> 
> [AMD Official Use Only]
> 
> Hi @Liu, Leo
> 
> Can you help to review this patch?
> Monk and Alex have reviewed it.
> 
> 
> --
> BW
> Pengju Zhou
> 
> 
> 
> > -Original Message-
> > From: Liu, Monk 
> > Sent: Thursday, July 15, 2021 7:54 AM
> > To: Alex Deucher ; Zhou, Peng Ju
> > ; Liu, Leo 
> > Cc: amd-gfx list 
> > Subject: RE: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
> >
> > [AMD Official Use Only]
> >
> > Reviewed-by: Monk Liu 
> >
> > You might need @Liu, Leo's review as well
> >
> > Thanks
> >
> > --
> > Monk Liu | Cloud-GPU Core team
> > --
> >
> > -Original Message-
> > From: amd-gfx  On Behalf Of Alex
> > Deucher
> > Sent: Wednesday, July 14, 2021 10:49 PM
> > To: Zhou, Peng Ju 
> > Cc: amd-gfx list 
> > Subject: Re: [PATCH v2] drm/amd/amdgpu: Recovery vcn instance iterate.
> >
> > On Tue, Jul 13, 2021 at 6:31 AM Peng Ju Zhou 
> wrote:
> > >
> > > The previous logic recorded the number of valid vcn instances for
> > > use on SRIOV; this is hard to maintain because vcn access is based
> > > on the index of the vcn instance.
> > >
> > > Check whether the vcn instance is enabled before doing instance init.
> > >
> > > Signed-off-by: Peng Ju Zhou 
> >
> > Acked-by: Alex Deucher 
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 33
> > > ---
> > >  1 file changed, 20 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > index c3580de3ea9c..d11fea2c9d90 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
> > > @@ -88,9 +88,7 @@ static int vcn_v3_0_early_init(void *handle)
> > > int i;
> > >
> > > if (amdgpu_sriov_vf(adev)) {
> > > -   for (i = 0; i < VCN_INSTANCES_SIENNA_CICHLID; i++)
> > > -   if (amdgpu_vcn_is_disabled_vcn(adev, 
> > > VCN_DECODE_RING, i))
> > > -   adev->vcn.num_vcn_inst++;
> > > +   adev->vcn.num_vcn_inst = VCN_INSTANCES_SIENNA_CICHLID;
> > > adev->vcn.harvest_config = 0;
> > > adev->vcn.num_enc_rings = 1;
> > >
> > > @@ -151,8 +149,7 @@ static int vcn_v3_0_sw_init(void *handle)
> > > adev->firmware.fw_size +=
> > > ALIGN(le32_to_cpu(hdr->ucode_size_bytes),
> > > PAGE_SIZE);
> > >
> > > -   if ((adev->vcn.num_vcn_inst ==
> VCN_INSTANCES_SIENNA_CICHLID)
> > ||
> > > -   (amdgpu_sriov_vf(adev) && adev->asic_type ==
> > CHIP_SIENNA_CICHLID)) {
> > > +   if (adev->vcn.num_vcn_inst ==
> > > + VCN_INSTANCES_SIENNA_CICHLID) {
> > > 
> > > adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].ucode_id
> =
> > AMDGPU_UCODE_ID_VCN1;
> > > adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].fw =
> adev-
> > >vcn.fw;
> > > adev->firmware.fw_size += @@ -322,18 +319,28
> > > @@ static int vcn_v3_0_hw_init(void *handle)
> > > continue;
> > >
> > > ring = &adev->vcn.inst[i].ring_dec;
> > > -   ring->wptr = 0;
> > > -   ring->wptr_old = 0;
> > > -   vcn_v3_0_dec_ring_set_wptr(ring);
> > > -   ring->sched.ready = true;
> > > -
> > > -   for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
> > > -   ring = &adev->vcn.inst[i].ring_enc[j];
> > > +   if (amdgpu_vcn_is_disabled_vcn(adev, 
> > > VCN_DECODE_RING,
> i))
> > {
> > > +   ring->sched.ready = false;
> > > +   dev_info(adev->dev, "ring %s is disabled 
> > > by
> hypervisor\n",
> > ring->name);
> > > +   } else {
> > > ring->wptr = 0;
> > > ring->wptr_old = 0;
> > > -   vcn_v3_0_enc_ring_set_wptr(ring);
> > > +   vcn_v3_0_dec_ring_set_wptr(ring);
> > > ring->sched.ready = true;
> > > }
> > > +
> > > +   for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
> > > +   ring = &adev->vcn.inst[i].ring_enc[j];
> > > +   if (amdgpu_v