from:"Nathan Chancellor"

[PATCH] drm/amd/display: Reapply 2fde4fdddc1f

2024-07-24 Thread Nathan Chancellor

Commit 2563391e57b5 ("drm/amd/display: DML2.1 resynchronization") blew
away the compiler warning fix from commit 2fde4fdddc1f
("drm/amd/display: Avoid -Wenum-float-conversion in
add_margin_and_round_to_dfs_grainularity()"), causing the warning to
reappear.

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion]
    183 |         divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
        |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Apply the fix again to resolve the warning.

Fixes: 2563391e57b5 ("drm/amd/display: DML2.1 resynchronization")
Signed-off-by: Nathan Chancellor 
---
 .../gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c
index 0021bbaa4b91..f19f6ebaae13 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c
@@ -180,7 +180,7 @@ static bool add_margin_and_round_to_dfs_grainularity(double clock_khz, double ma
 
 	clock_khz *= 1.0 + margin;
 
-	divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
+	divider = (unsigned int)((int)DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
 
 	/* we want to floor here to get higher clock than required rather than lower */
 	if (divider < DFS_DIVIDER_RANGE_2_START) {

---
base-commit: bd4bea5ab2bda37ddb092a978218c4d9b46927e6
change-id: 20240724-amdgpu-dml2-fix-float-enum-warning-again-647a33789138

Best regards,
-- 
Nathan Chancellor

[PATCH] drm/amd/display: Disable CONFIG_DRM_AMD_DC_FP for RISC-V with clang

2024-06-14 Thread Nathan Chancellor

Commit 77acc6b55ae4 ("riscv: add support for kernel-mode FPU") and
commit a28e4b672f04 ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT")
enabled support for CONFIG_DRM_AMD_DC_FP with RISC-V. Unfortunately,
this exposed -Wframe-larger-than warnings (which become fatal with
CONFIG_WERROR=y) when building ARCH=riscv allmodconfig with clang:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: error: stack frame size (2448) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Werror,-Wframe-larger-than]
     58 | static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation(
        |             ^
  1 error generated.

Many functions in this file use a large number of parameters, which must
be passed on the stack at a certain point due to register exhaustion.
This can cause high stack usage once inlining and issues with stack
slot analysis get involved. While the compiler can and should do better
(as GCC uses less than half the amount of stack space for the same
function), it is not as simple a fix as adjusting the functions not
to take a large number of parameters.

Unfortunately, modifying these files to avoid the problem is difficult
to justify because any revisions to the files in the kernel
tree never make it back to the original source (so copies of the code
for newer hardware revisions just reintroduce the issue) and the files
are hard to read/modify due to being "gcc-parsable HW gospel, coming
straight from HW engineers".

Avoid building the problematic code for RISC-V by modifying the existing
condition for arm64 that exists for the same reason. Factor out the
logical not so that the condition reads a little more naturally.
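
As a sanity check on the rewritten condition, here is a small sketch in plain C (not Kconfig) that models the three symbols as booleans. The function names are made up for illustration, and ARCH_HAS_KERNEL_FPU_SUPPORT is left out because it is common to every form. It compares the new condition against the old one and against the same logic without the not factored out:

```c
#include <stdbool.h>

typedef bool (*cond_fn)(bool arm64, bool riscv, bool clang);

/* Old condition: (!ARM64 || !CC_IS_CLANG) */
bool dc_fp_old(bool arm64, bool riscv, bool clang)
{
	(void)riscv; /* RISC-V was not part of the old condition */
	return !arm64 || !clang;
}

/* New condition: !(CC_IS_CLANG && (ARM64 || RISCV)) */
bool dc_fp_new(bool arm64, bool riscv, bool clang)
{
	return !(clang && (arm64 || riscv));
}

/* Same logic without factoring out the not: (!(ARM64 || RISCV) || !CC_IS_CLANG) */
bool dc_fp_unfactored(bool arm64, bool riscv, bool clang)
{
	return !(arm64 || riscv) || !clang;
}

/* Count how many of the 8 input combinations make a and b disagree. */
int count_mismatches(cond_fn a, cond_fn b)
{
	int mismatches = 0;

	for (int bits = 0; bits < 8; bits++) {
		bool arm64 = bits & 1, riscv = bits & 2, clang = bits & 4;

		if (a(arm64, riscv, clang) != b(arm64, riscv, clang))
			mismatches++;
	}
	return mismatches;
}
```

The factored and unfactored forms agree on every input, while the new and old conditions differ in exactly one case: RISC-V built with clang (and not arm64), which is the configuration this change is meant to exclude.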

Fixes: a28e4b672f04 ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT")
Reported-by: Palmer Dabbelt 
Closes: https://lore.kernel.org/20240530145741.7506-2-pal...@rivosinc.com/
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
index 5fcd4f778dc3..47b8b49da8a7 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,7 @@ config DRM_AMD_DC
 	depends on BROKEN || !CC_IS_CLANG || ARM64 || RISCV || SPARC64 || X86_64
 	select SND_HDA_COMPONENT if SND_HDA_CORE
 	# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
-	select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && (!ARM64 || !CC_IS_CLANG)
+	select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && !(CC_IS_CLANG && (ARM64 || RISCV))
 	help
 	  Choose this option if you want to use the new display engine
 	  support for AMDGPU. This adds required support for Vega and

---
base-commit: c6c4dd54012551cce5cde408b35468f2c62b0cce
change-id: 20240614-amdgpu-disable-drm-amd-dc-fp-riscv-clang-31c84f6b990d

Best regards,
-- 
Nathan Chancellor

Re: [PATCH] drm/amd/display: Increase frame-larger-than warning limit

2024-06-13 Thread Nathan Chancellor

Hi Palmer (and AMD folks),

On Tue, Jun 04, 2024 at 09:04:23AM -0700, Palmer Dabbelt wrote:
> On Mon, 03 Jun 2024 15:29:48 PDT (-0700), nat...@kernel.org wrote:
> > On Thu, May 30, 2024 at 07:57:42AM -0700, Palmer Dabbelt wrote:
> > > From: Palmer Dabbelt 
> > > 
> > > I get a handful of build errors along the lines of
> > > 
> > > 
> > > linux/drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: error: stack frame size (2352) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Werror,-Wframe-larger-than]
> > > static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation(
> > >             ^
> > > 
> > > linux/drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (2096) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> > > void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> > >      ^
> > 
> > Judging from the message, this is clang/LLVM? What version?
> 
> Yes, LLVM.  Looks like I'm on 16.0.6.  Probably time for an update, so I'll
> give it a shot.

FWIW, I can reproduce this with tip of tree, I was just curious in case
that ended up mattering.

> > I assume
> > this showed up in 6.10-rc1 because of commit 77acc6b55ae4 ("riscv: add
> > support for kernel-mode FPU"), which allows this driver to be built for
> > RISC-V.
> 
> Seems reasonable.  This didn't show up until post-merge, not 100% sure why.
> I didn't really dig any farther.

Perhaps you fast forwarded your tree to include that commit?

> > Is this allmodconfig or some other configuration?
> 
> IIRC both "allmodconfig" and "allyesconfig" show it, but I don't have a
> build tree sitting around to be 100% sure.

Yeah, allmodconfig triggers it.

I was able to come up with a "trivial" reproducer using cvise (attached
to this mail if you are curious) in which clang's stack usage is worse
than GCC's by a rough factor of 2:

  $ clang --target=riscv64-linux-gnu -O2 -Wall -Wframe-larger-than=512 -c -o /dev/null display_mode_vba_32.i
  display_mode_vba_32.i:598:6: warning: stack frame size (1264) exceeds limit (512) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Wframe-larger-than]
    598 | void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation() {
        |      ^
  1 warning generated.

  $ riscv64-linux-gcc -O2 -Wall -Wframe-larger-than=512 -c -o /dev/null display_mode_vba_32.i
  display_mode_vba_32.i: In function 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation':
  display_mode_vba_32.i:1729:1: warning: the frame size of 528 bytes is larger than 512 bytes [-Wframe-larger-than=]
   1729 | }
        | ^

I have not done too much further investigation but this is almost
certainly the same issue that has come up before [1][2] with the AMD
display code using functions with a large number of parameters, such
that they have to be passed on the stack, coupled with inlining (if I
remember correctly, LLVM gives more of an inlining discount the less a
function is used in a file).

While clang does poorly with that code, I am not interested in
continuing to fix this code new hardware revision after new hardware
revision. We could just avoid this code like we do for arm64 for a
similar reason:

diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
index 5fcd4f778dc3..64df713df878 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,7 @@ config DRM_AMD_DC
 	depends on BROKEN || !CC_IS_CLANG || ARM64 || RISCV || SPARC64 || X86_64
 	select SND_HDA_COMPONENT if SND_HDA_CORE
 	# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
-	select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && (!ARM64 || !CC_IS_CLANG)
+	select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && (!(ARM64 || RISCV) || !CC_IS_CLANG)
 	help
 	  Choose this option if you want to use the new display engine
 	  support for AMDGPU. This adds required support for Vega and

[1]: https://lore.kernel.org/20231019205117.GA839902@dev-arch.thelio-3990X/
[2]: https://lore.kernel.org/20220830203409.3491379-1-nat...@kernel.org/

Cheers,
Nathan
enum { false, true };
enum output_encoder_class { dm_dp2p0 };
enum output_format_class { dm_420 };
enum source_format_class { dm_444_32 };
enum scan_direction_class { dm_vert };
enum dm_swizzle_mode { dm_sw_linear };
enum clock_change_support { dm_std_cvt };
enum odm_combine_mode { dm_odm_combine_mode_2to1dm_odm_combine_mode_4to1 };
enum immediate_flip_requirement { dm_immediate_flip_not_required };
enum unbounded_requesting_policy { dm_unbounded_requesting_disable };
enum dm_rotation_angle { dm_rotation_270m };

Re: [PATCH] drm/amd/display: Increase frame-larger-than warning limit

2024-06-03 Thread Nathan Chancellor

Hi Palmer,

On Thu, May 30, 2024 at 07:57:42AM -0700, Palmer Dabbelt wrote:
> From: Palmer Dabbelt 
> 
> I get a handful of build errors along the lines of
> 
> 
> linux/drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: error: stack frame size (2352) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Werror,-Wframe-larger-than]
> static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation(
>             ^
> 
> linux/drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (2096) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
>      ^

Judging from the message, this is clang/LLVM? What version? I assume
this showed up in 6.10-rc1 because of commit 77acc6b55ae4 ("riscv: add
support for kernel-mode FPU"), which allows this driver to be built for
RISC-V. Is this allmodconfig or some other configuration?

> as of 6.10-rc1.
> 
> Signed-off-by: Palmer Dabbelt 
> ---
>  drivers/gpu/drm/amd/display/dc/dml/Makefile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> index c4a5efd2dda5..b2bd72e63734 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> @@ -62,9 +62,9 @@ endif
>  
>  ifneq ($(CONFIG_FRAME_WARN),0)
>  ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)
> -frame_warn_flag := -Wframe-larger-than=3072
> +frame_warn_flag := -Wframe-larger-than=4096
>  else
> -frame_warn_flag := -Wframe-larger-than=2048
> +frame_warn_flag := -Wframe-larger-than=3072
>  endif
>  endif
>  
> -- 
> 2.45.1
>

[PATCH] drm/radeon: Remove __counted_by from StateArray.states[]

2024-05-29 Thread Nathan Chancellor

From: Bill Wendling 

Work for __counted_by on generic pointers in structures (not just
flexible array members) has started landing in Clang 19 (current tip of
tree). During the development of this feature, a restriction was added
to __counted_by to prevent the flexible array member's element type from
including a flexible array member itself such as:

  struct foo {
      int count;
      char buf[];
  };

  struct bar {
      int count;
      struct foo data[] __counted_by(count);
  };

because the size of data cannot be calculated with the standard array
size formula:

  sizeof(struct foo) * count
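
To make that concrete, here is a minimal sketch in standard C. The helper names are hypothetical; struct foo is the one from above. sizeof only ever sees the fixed part of struct foo, so no stride computed from it can account for the bytes each element's buf[] actually carries:

```c
#include <stddef.h>

struct foo {
	int count;
	char buf[];	/* flexible array member, not counted by sizeof */
};

/* What the standard array size formula assumes n elements occupy. */
size_t foo_array_size_by_formula(size_t n)
{
	return sizeof(struct foo) * n;
}

/* What a single element really occupies when it carries buf_len bytes. */
size_t foo_element_real_size(size_t buf_len)
{
	return sizeof(struct foo) + buf_len;
}
```

With 16 bytes stored in buf, one element is 16 bytes larger than the formula assumes for a single element, and the gap depends on data the compiler cannot see from the type alone.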

This restriction was downgraded to a warning but due to CONFIG_WERROR,
it can still break the build. The application of __counted_by on the
states member of 'struct _StateArray' triggers this restriction,
resulting in:

  drivers/gpu/drm/radeon/pptable.h:442:5: error: 'counted_by' should not be applied to an array with element of unknown size because 'ATOM_PPLIB_STATE_V2' (aka 'struct _ATOM_PPLIB_STATE_V2') is a struct type with a flexible array member. This will be an error in a future compiler version [-Werror,-Wbounds-safety-counted-by-elt-type-unknown-size]
    442 |     ATOM_PPLIB_STATE_V2 states[] __counted_by(ucNumEntries);
        |     ^~~~
  1 error generated.

Remove this use of __counted_by to fix the warning/error. However,
rather than remove it altogether, leave it commented, as it may be
possible to support this in future compiler releases.

Cc: sta...@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2028
Fixes: efade6fe50e7 ("drm/radeon: silence UBSAN warning (v3)")
Signed-off-by: Bill Wendling 
Co-developed-by: Nathan Chancellor 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/radeon/pptable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/pptable.h b/drivers/gpu/drm/radeon/pptable.h
index b7f22597ee95..969a8fb0ee9e 100644
--- a/drivers/gpu/drm/radeon/pptable.h
+++ b/drivers/gpu/drm/radeon/pptable.h
@@ -439,7 +439,7 @@ typedef struct _StateArray{
 //how many states we have 
 UCHAR ucNumEntries;
 
-ATOM_PPLIB_STATE_V2 states[] __counted_by(ucNumEntries);
+ATOM_PPLIB_STATE_V2 states[] /* __counted_by(ucNumEntries) */;
 }StateArray;
 
 

---
base-commit: e64e8f7c178e5228e0b2dbb504b9dc75953a319f
change-id: 20240529-drop-counted-by-radeon-states-state-array-01936ded4c18

Best regards,
-- 
Nathan Chancellor

[PATCH 2/2] drm/amd/display: Fix CFLAGS for dml2_core_dcn4_calcs.o

2024-04-24 Thread Nathan Chancellor

-Wframe-larger-than=2048 is a part of both CFLAGS and CFLAGS_REMOVE for
dml2_core_dcn4_calcs.o, which means that it ultimately gets removed
altogether for 64-bit targets, as 2048 is the default FRAME_WARN value
for 64-bit platforms, resulting in no -Wframe-larger-than coverage for
this file.
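
A rough model of that interaction, written as a C sketch rather than make syntax. It assumes, as described above, that the per-file CFLAGS are part of the list that CFLAGS_REMOVE is filtered out of. The helper names are hypothetical and only mimic the effect of make's $(filter-out ...) on whole words:

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if flag appears in list[0..n), 0 otherwise. */
int flag_in_list(const char *flag, const char *const *list, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(flag, list[i]) == 0)
			return 1;
	return 0;
}

/*
 * Copy every entry of flags[] that is not in remove[] to out[] and
 * return how many entries were kept. out[] must have room for nflags
 * entries.
 */
size_t filter_out(const char *const *flags, size_t nflags,
		  const char *const *remove, size_t nremove,
		  const char **out)
{
	size_t kept = 0;

	for (size_t i = 0; i < nflags; i++)
		if (!flag_in_list(flags[i], remove, nremove))
			out[kept++] = flags[i];
	return kept;
}
```

If the flag list contains -Wframe-larger-than=2048 twice (once from the global flags and once from the per-file CFLAGS) and the remove list contains the same string, no copy survives the filter, which is the loss of coverage described above.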

Remove -Wframe-larger-than from CFLAGS_REMOVE_dml2_core_dcn4_calcs.o and
move to $(frame_warn_flag) for CFLAGS_dml2_core_dcn4_calcs.o, as that
accounts for the fact that -Wframe-larger-than may need to be larger
than 2048 in certain situations, such as when the sanitizers are
enabled.

Fixes: d546a39c6b10 ("drm/amd/display: Add misc DC changes for DCN401")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dml2/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index c35212a4a968..904a2d419638 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -111,7 +111,7 @@ CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top_mcache.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_optimization := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.o := $(dml2_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_ccflags) -Wframe-larger-than=2048
+CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_factory.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_shared.o := $(dml2_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.o := $(dml2_ccflags)
@@ -134,7 +134,7 @@ CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top.o := $(dml2_rcfla
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml_top_mcache.o := $(dml2_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_optimization.o := $(dml2_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.o := $(dml2_rcflags)
-CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_rcflags) -Wframe-larger-than=2048
+CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_factory.o := $(dml2_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_shared.o := $(dml2_rcflags)
 CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.o := $(dml2_rcflags)

-- 
2.44.0

[PATCH 1/2] drm/amd/display: Add frame_warn_flag to dml2_core_shared.o

2024-04-24 Thread Nathan Chancellor

When building with tip of tree Clang, there are some new instances of
-Wframe-larger-than from the new display code (which become fatal with
CONFIG_WERROR=y):

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c:754:6: error: stack frame size (2488) exceeds limit (2048) in 'dml2_core_shared_mode_support' [-Werror,-Wframe-larger-than]
    754 | bool dml2_core_shared_mode_support(struct dml2_core_calcs_mode_support_ex *in_out_params)
        |      ^
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c:9834:6: error: stack frame size (2152) exceeds limit (2048) in 'dml2_core_shared_mode_programming' [-Werror,-Wframe-larger-than]
   9834 | bool dml2_core_shared_mode_programming(struct dml2_core_calcs_mode_programming_ex *in_out_params)
        |      ^
  2 errors generated.

These warnings do not occur when CONFIG_K{A,C,M}SAN are disabled, so add
$(frame_warn_flag) to dml2_core_shared.o's CFLAGS, which was added in
commit 6740ec97bcdb ("drm/amd/display: Increase frame warning limit with
KASAN or KCSAN in dml2") to account for this situation.

Fixes: d546a39c6b10 ("drm/amd/display: Add misc DC changes for DCN401")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dml2/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index 6c76f346b237..c35212a4a968 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -113,7 +113,7 @@ CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_top/dml2_top_optimization := $(dml2_
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.o := $(dml2_ccflags) -Wframe-larger-than=2048
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_factory.o := $(dml2_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_shared.o := $(dml2_ccflags)
+CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_core/dml2_core_shared.o := $(dml2_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_factory.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml21/src/dml2_mcg/dml2_mcg_dcn4.o := $(dml2_ccflags)

-- 
2.44.0

[PATCH 0/2] drm/amd/display: Use frame_warn_flag consistently in dml2 Makefile

2024-04-24 Thread Nathan Chancellor

Hi all,

This series resolves a couple of instances of -Wframe-larger-than from
the new display code that appear with newer versions of clang, along
with another inconsistency I noticed while fixing this. Both are
accounted for with the $(frame_warn_flag) variable.

---
Nathan Chancellor (2):
  drm/amd/display: Add frame_warn_flag to dml2_core_shared.o
  drm/amd/display: Fix CFLAGS for dml2_core_dcn4_calcs.o

 drivers/gpu/drm/amd/display/dc/dml2/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
---
base-commit: d60dc4dd72412d5d9566fdf391e4202b05f88912
change-id: 20240424-amdgpu-dml2-fix-frame-larger-than-dcn401-48ff7e1f51ea

Best regards,
-- 
Nathan Chancellor

[PATCH] drm/amd/display: Avoid -Wenum-float-conversion in add_margin_and_round_to_dfs_grainularity()

2024-04-24 Thread Nathan Chancellor

When building with clang 19 or newer (which strengthened some of the
enum conversion warnings for C), there is a warning (or error with
CONFIG_WERROR=y) around doing arithmetic with an enumerated type and a
floating point expression.

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:181:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion]
    181 |         divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
        |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

This conversion is expected due to the nature of the enumerated value
and definition, so silence the warning by casting the enumeration to an
integer explicitly to make it clear to the compiler.

Fixes: 3df48ddedee4 ("drm/amd/display: Add new DCN401 sources")
Signed-off-by: Nathan Chancellor 
---
Alternatively, perhaps the potential truncation could happen before the
multiplication?

  divider = DFS_DIVIDER_RANGE_SCALE_FACTOR * (unsigned int)(vco_freq_khz / clock_khz);

I suspect the result of the division is probably not very large
(certainly nowhere near UINT_MAX / 4), so I would not expect the
multiplication to overflow, but I was not sure, so I decided to take the
safer, NFC change. I am happy to respin as necessary.
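
For reference, a small sketch of how the two orderings differ. The enum and function names are hypothetical stand-ins, and the scale factor of 4 is an assumption taken from the UINT_MAX / 4 remark above, not copied from the driver:

```c
/* Hypothetical stand-in for the driver's enum; the value 4 is assumed. */
enum example_divider_range {
	EXAMPLE_SCALE_FACTOR = 4,
};

/* Ordering used by the patch: scale in floating point, then truncate. */
unsigned int divider_scale_then_truncate(double vco_freq_khz, double clock_khz)
{
	return (unsigned int)((int)EXAMPLE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
}

/* Alternative: truncate the ratio first, then scale with integer math. */
unsigned int divider_truncate_then_scale(double vco_freq_khz, double clock_khz)
{
	return (unsigned int)EXAMPLE_SCALE_FACTOR * (unsigned int)(vco_freq_khz / clock_khz);
}
```

For a ratio of 3.5 (for example 3500000.0 / 1000000.0), the first function returns 14 and the second returns 12, because the alternative drops the fractional part of the ratio before scaling. The two only agree when the ratio is a whole number, so the alternative would not be a no-functional-change rewrite.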
---
 .../gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c
index e6698ee65843..65eb0187e965 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c
@@ -178,7 +178,7 @@ static bool add_margin_and_round_to_dfs_grainularity(double clock_khz, double ma
 
 	clock_khz *= 1.0 + margin;
 
-	divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
+	divider = (unsigned int)((int)DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
 
 	/* we want to floor here to get higher clock than required rather than lower */
 	if (divider < DFS_DIVIDER_RANGE_2_START) {

---
base-commit: d60dc4dd72412d5d9566fdf391e4202b05f88912
change-id: 20240424-amdgpu-display-dcn401-enum-float-conversion-c09cc1826ea2

Best regards,
-- 
Nathan Chancellor

Please apply commit 5b750b22530fe53bf7fd6a30baacd53ada26911b to linux-6.1.y

2024-02-27 Thread Nathan Chancellor

Hi Greg and Sasha,

Please apply upstream commit 5b750b22530f ("drm/amd/display: Increase
frame warning limit with KASAN or KCSAN in dml") to linux-6.1.y, as it
is needed to avoid instances of -Wframe-larger-than in allmodconfig,
which has -Werror enabled. It applies cleanly for me and it is already
in 6.6 and 6.7. The fixes tag is not entirely accurate and commit
e63e35f0164c ("drm/amd/display: Increase frame-larger-than for all
display_mode_vba files"), which was recently applied to that tree,
depends on it (I should have made that clearer in the patch).

If there are any issues, please let me know.

Cheers,
Nathan

[PATCH] drm/amd/display: Increase frame-larger-than for all display_mode_vba files

2024-02-05 Thread Nathan Chancellor

After a recent change in LLVM, allmodconfig (which has CONFIG_KCSAN=y
and CONFIG_WERROR=y enabled) has a few new instances of
-Wframe-larger-than for the mode support and system configuration
functions:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_mode_vba_20v2.c:3393:6: error: stack frame size (2144) exceeds limit (2048) in 'dml20v2_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
   3393 | void dml20v2_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
        |      ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3520:6: error: stack frame size (2192) exceeds limit (2048) in 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
   3520 | void dml21_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
        |      ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_mode_vba_20.c:3286:6: error: stack frame size (2128) exceeds limit (2048) in 'dml20_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
   3286 | void dml20_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
        |      ^
  1 error generated.

Without the sanitizers enabled, there are no warnings.

This was the catalyst for commit 6740ec97bcdb ("drm/amd/display:
Increase frame warning limit with KASAN or KCSAN in dml2") and that same
change was made to dml in commit 5b750b22530f ("drm/amd/display:
Increase frame warning limit with KASAN or KCSAN in dml") but the
frame_warn_flag variable was not applied to all files. Do so now to
clear up the warnings and make all these files consistent.

Cc: sta...@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issue/1990
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dml/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 6042a5a6a44f..59ade76ffb18 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -72,11 +72,11 @@ CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := $(dml_ccflags)
+CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := $(dml_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_rq_dlg_calc_20.o := $(dml_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20v2.o := $(dml_ccflags)
+CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20v2.o := $(dml_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_rq_dlg_calc_20v2.o := $(dml_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml/dcn21/display_mode_vba_21.o := $(dml_ccflags)
+CFLAGS_$(AMDDALPATH)/dc/dml/dcn21/display_mode_vba_21.o := $(dml_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn21/display_rq_dlg_calc_21.o := $(dml_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn30/display_mode_vba_30.o := $(dml_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml/dcn30/display_rq_dlg_calc_30.o := $(dml_ccflags)

---
base-commit: 6813cdca4ab94a238f8eb0cef3d3f3fcbdfb0ee0
change-id: 20240205-amdgpu-raise-flt-for-dml-vba-files-ee5b5a9c5e43

Best regards,
-- 
Nathan Chancellor

Re: [PATCH 1/3] selftests/bpf: Update LLVM Phabricator links

2024-01-11 Thread Nathan Chancellor

Hi Alexei,

On Thu, Jan 11, 2024 at 12:00:50PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 11, 2024 at 11:40 AM Nathan Chancellor  wrote:
> >
> > Hi Yonghong,
> >
> > On Wed, Jan 10, 2024 at 08:05:36PM -0800, Yonghong Song wrote:
> > >
> > > On 1/9/24 2:16 PM, Nathan Chancellor wrote:
> > > > > reviews.llvm.org was LLVM's Phabricator instance for code review. It
> > > > has been abandoned in favor of GitHub pull requests. While the majority
> > > > of links in the kernel sources still work because of the work Fangrui
> > > > has done turning the dynamic Phabricator instance into a static archive,
> > > > there are some issues with that work, so preemptively convert all the
> > > > links in the kernel sources to point to the commit on GitHub.
> > > >
> > > > Most of the commits have the corresponding differential review link in
> > > > the commit message itself so there should not be any loss of fidelity in
> > > > the relevant information.
> > > >
> > > > Additionally, fix a typo in the xdpwall.c print ("LLMV" -> "LLVM") while
> > > > in the area.
> > > >
> > > > > Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172
> > > > Signed-off-by: Nathan Chancellor 
> > >
> > > Ack with one nit below.
> > >
> > > Acked-by: Yonghong Song 
> >
> > 
> >
> > > > @@ -304,6 +304,6 @@ from running test_progs will look like:
> > > >   .. code-block:: console
> > > > -  test_xdpwall:FAIL:Does LLVM have https://reviews.llvm.org/D109073? 
> > > > unexpected error: -4007
> > > > +  test_xdpwall:FAIL:Does LLVM have 
> > > > https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d5?
> > > >  unexpected error: -4007
> > > > -__ https://reviews.llvm.org/D109073
> > > > +__ 
> > > > https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d
> > >
> > > To be consistent with other links, could you add the missing last alnum 
> > > '5' to the above link?
> >
> > Thanks a lot for catching this and providing an ack. Andrew, could you
> > squash this update into selftests-bpf-update-llvm-phabricator-links.patch?
> 
> Please send a new patch.
> We'd like to take all bpf patches through the bpf tree to avoid conflicts.

Very well, I've sent a standalone v2 on top of bpf-next:

https://lore.kernel.org/20240111-bpf-update-llvm-phabricator-links-v2-1-9a7ae976b...@kernel.org/

Andrew, just drop selftests-bpf-update-llvm-phabricator-links.patch
altogether in that case; the other two patches are fine to go via -mm, I
think.

Cheers,
Nathan

Re: [PATCH 1/3] selftests/bpf: Update LLVM Phabricator links

2024-01-11 Thread Nathan Chancellor

Hi Yonghong,

On Wed, Jan 10, 2024 at 08:05:36PM -0800, Yonghong Song wrote:
> 
> On 1/9/24 2:16 PM, Nathan Chancellor wrote:
> > reviews.llvm.org was LLVM's Phabricator instance for code review. It
> > has been abandoned in favor of GitHub pull requests. While the majority
> > of links in the kernel sources still work because of the work Fangrui
> > has done turning the dynamic Phabricator instance into a static archive,
> > there are some issues with that work, so preemptively convert all the
> > links in the kernel sources to point to the commit on GitHub.
> > 
> > Most of the commits have the corresponding differential review link in
> > the commit message itself so there should not be any loss of fidelity in
> > the relevant information.
> > 
> > Additionally, fix a typo in the xdpwall.c print ("LLMV" -> "LLVM") while
> > in the area.
> > 
> > Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172
> > Signed-off-by: Nathan Chancellor 
> 
> Ack with one nit below.
> 
> Acked-by: Yonghong Song 



> > @@ -304,6 +304,6 @@ from running test_progs will look like:
> >   .. code-block:: console
> > -  test_xdpwall:FAIL:Does LLVM have https://reviews.llvm.org/D109073? unexpected error: -4007
> > +  test_xdpwall:FAIL:Does LLVM have https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d5? unexpected error: -4007
> > -__ https://reviews.llvm.org/D109073
> > +__ https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d
> 
> To be consistent with other links, could you add the missing last alnum '5' 
> to the above link?

Thanks a lot for catching this and providing an ack. Andrew, could you
squash this update into selftests-bpf-update-llvm-phabricator-links.patch?

diff --git a/tools/testing/selftests/bpf/README.rst b/tools/testing/selftests/bpf/README.rst
index b9a493f66557..e56034abb3c2 100644
--- a/tools/testing/selftests/bpf/README.rst
+++ b/tools/testing/selftests/bpf/README.rst
@@ -306,4 +306,4 @@ from running test_progs will look like:
 
   test_xdpwall:FAIL:Does LLVM have https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d5? unexpected error: -4007
 
-__ https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d
+__ https://github.com/llvm/llvm-project/commit/ea72b0319d7b0f0c2fcf41d121afa5d031b319d5
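
The nit fixed above comes down to one property: a commit link should end
in the full 40-character SHA-1 hash, and the link in v1 had only 39
characters. A minimal sketch of a check for that, assuming the GitHub URL
prefix seen in the README; the helper name is hypothetical and nothing
like it is part of the patch or the kernel tree:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed prefix, taken from the links in the README diff. */
#define LLVM_COMMIT_PREFIX "https://github.com/llvm/llvm-project/commit/"

/* Length of a full SHA-1 commit hash written in hexadecimal. */
#define FULL_SHA1_HEX_LEN 40

/*
 * Hypothetical helper: returns true when url is an llvm-project commit
 * link followed by exactly 40 hexadecimal characters and nothing else.
 */
bool is_full_llvm_commit_link(const char *url)
{
	size_t prefix_len = strlen(LLVM_COMMIT_PREFIX);
	const char *hash;
	size_t i;

	assert(url != NULL);

	if (strncmp(url, LLVM_COMMIT_PREFIX, prefix_len) != 0)
		return false;

	hash = url + prefix_len;
	if (strlen(hash) != FULL_SHA1_HEX_LEN)
		return false;

	/* Every character of the hash must be a hex digit. */
	for (i = 0; i < FULL_SHA1_HEX_LEN; i++) {
		if (!isxdigit((unsigned char)hash[i]))
			return false;
	}

	return true;
}
```

With this check, the v1 link (ending in ...319d) is rejected and the
corrected link (ending in ...319d5) is accepted.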

[PATCH] drm/amd/display: Avoid enum conversion warning

2024-01-10 Thread Nathan Chancellor

Clang warns (or errors with CONFIG_WERROR=y) when performing arithmetic
with different enumerated types, which is usually a bug:


  drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:548:24: error: arithmetic between different enumeration types ('const enum dc_link_rate' and 'const enum dc_lane_count') [-Werror,-Wenum-enum-conversion]
    548 |                         link_cap->link_rate * link_cap->lane_count * LINK_RATE_REF_FREQ_IN_KHZ * 8;
        |                         ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~
  1 error generated.

In this case, there is not a problem because the enumerated types are
basically treated as '#define' values. Add an explicit cast to an
integral type to silence the warning.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1976
Fixes: 5f3bce13266e ("drm/amd/display: Request usb4 bw for mst streams")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
index 4ef1a6a1d129..dd0d2b206462 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
@@ -544,8 +544,9 @@ int link_dp_dpia_get_dp_overhead_in_dp_tunneling(struct dc_link *link)
 		 */
 		const struct dc_link_settings *link_cap =
 			dc_link_get_link_cap(link);
-		uint32_t link_bw_in_kbps =
-			link_cap->link_rate * link_cap->lane_count * LINK_RATE_REF_FREQ_IN_KHZ * 8;
+		uint32_t link_bw_in_kbps = (uint32_t)link_cap->link_rate *
+					   (uint32_t)link_cap->lane_count *
+					   LINK_RATE_REF_FREQ_IN_KHZ * 8;
 		link_mst_overhead = (link_bw_in_kbps / 64) + ((link_bw_in_kbps % 64) ? 1 : 0);
 	}
 

---
base-commit: 6e7a29f011ac03a765f53844f7c3f04cdd421715
change-id: 20240110-amdgpu-display-enum-enum-conversion-e2451adbf4a7

Best regards,
-- 
Nathan Chancellor
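
The fix in the patch above can be sketched outside the kernel as follows.
This is only an illustration: the enum names, their values, and the
reference frequency constant are stand-ins chosen for the example, not
copied from the driver. The point is that each enum operand is cast to an
integer type before the multiplication, so the compiler no longer sees
arithmetic between two different enumeration types. The second function
mirrors the round-up division by 64 from the hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins; the values are assumptions for this example. */
enum demo_link_rate { DEMO_LINK_RATE_LOW = 0x06, DEMO_LINK_RATE_HIGH = 0x0A };
enum demo_lane_count { DEMO_LANE_COUNT_ONE = 1, DEMO_LANE_COUNT_FOUR = 4 };

/* Assumed reference frequency in kHz, used only for the demonstration. */
#define DEMO_REF_FREQ_IN_KHZ 27000U

struct demo_link_settings {
	enum demo_link_rate link_rate;
	enum demo_lane_count lane_count;
};

/*
 * Multiply two values of different enum types. Casting each operand to
 * uint32_t first makes the arithmetic plain integer arithmetic.
 */
uint32_t demo_link_bw_in_kbps(const struct demo_link_settings *cap)
{
	assert(cap != NULL);

	return (uint32_t)cap->link_rate *
	       (uint32_t)cap->lane_count *
	       DEMO_REF_FREQ_IN_KHZ * 8;
}

/* Divide by 64 and round up, as the surrounding code in the hunk does. */
uint32_t demo_mst_overhead(uint32_t link_bw_in_kbps)
{
	return (link_bw_in_kbps / 64) + ((link_bw_in_kbps % 64) ? 1 : 0);
}
```

The casts do not change the computed value; they only state that the
enumerators are being used as plain numbers, which is what the commit
message means by treating them like '#define' values.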

[PATCH 3/3] treewide: Update LLVM Bugzilla links

2024-01-09 Thread Nathan Chancellor

LLVM moved their issue tracker from their own Bugzilla instance to
GitHub issues. While all of the links are still valid, they may not
necessarily show the most up to date information around the issues, as
all updates will occur on GitHub, not Bugzilla.

Another complication is that the Bugzilla issue number is not always the
same as the GitHub issue number. Thankfully, LLVM maintains this mapping
through two shortlinks:

  https://llvm.org/bz<num> -> https://bugs.llvm.org/show_bug.cgi?id=<num>
  https://llvm.org/pr<num> -> https://github.com/llvm/llvm-project/issues/<mapped_num>

Switch all "https://bugs.llvm.org/show_bug.cgi?id=<num>" links to the
"https://llvm.org/pr<num>" shortlink so that the links show the most up
to date information. Each migrated issue links back to the Bugzilla
entry, so there should be no loss of fidelity of information here.

Signed-off-by: Nathan Chancellor 
---
 arch/powerpc/Makefile   | 4 ++--
 arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
 arch/s390/include/asm/ftrace.h  | 2 +-
 arch/x86/power/Makefile | 2 +-
 crypto/blake2b_generic.c| 2 +-
 drivers/firmware/efi/libstub/Makefile   | 2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c| 2 +-
 drivers/media/test-drivers/vicodec/codec-fwht.c | 2 +-
 drivers/regulator/Kconfig   | 2 +-
 include/asm-generic/vmlinux.lds.h   | 2 +-
 lib/Kconfig.kasan   | 2 +-
 lib/raid6/Makefile  | 2 +-
 lib/stackinit_kunit.c   | 2 +-
 mm/slab_common.c| 2 +-
 net/bridge/br_multicast.c   | 2 +-
 security/Kconfig| 2 +-
 16 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index f19dbaa1d541..cd6aaa45f355 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -133,11 +133,11 @@ CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mlong-double-128)
 
 # Clang unconditionally reserves r2 on ppc32 and does not support the flag
-# https://bugs.llvm.org/show_bug.cgi?id=39555
+# https://llvm.org/pr39555
 CFLAGS-$(CONFIG_PPC32) := $(call cc-option, -ffixed-r2)
 
 # Clang doesn't support -mmultiple / -mno-multiple
-# https://bugs.llvm.org/show_bug.cgi?id=39556
+# https://llvm.org/pr39556
 CFLAGS-$(CONFIG_PPC32) += $(call cc-option, $(MULTIPLEWORD))
 
 CFLAGS-$(CONFIG_PPC32) += $(call cc-option,-mno-readonly-in-sdata)
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 3b658b8696bc..3f5970f74c6b 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -55,7 +55,7 @@ void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
hr->dawrx1 = vcpu->arch.dawrx1;
 }
 
-/* Use noinline_for_stack due to https://bugs.llvm.org/show_bug.cgi?id=49610 */
+/* Use noinline_for_stack due to https://llvm.org/pr49610 */
 static noinline_for_stack void byteswap_pt_regs(struct pt_regs *regs)
 {
unsigned long *addr = (unsigned long *) regs;
diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index 5a82b08f03cd..621f23d5ae30 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -9,7 +9,7 @@
 #ifndef __ASSEMBLY__
 
 #ifdef CONFIG_CC_IS_CLANG
-/* https://bugs.llvm.org/show_bug.cgi?id=41424 */
+/* https://llvm.org/pr41424 */
 #define ftrace_return_address(n) 0UL
 #else
 #define ftrace_return_address(n) __builtin_return_address(n)
diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile
index 379777572bc9..e0cd7afd5302 100644
--- a/arch/x86/power/Makefile
+++ b/arch/x86/power/Makefile
@@ -5,7 +5,7 @@
 CFLAGS_cpu.o   := -fno-stack-protector
 
 # Clang may incorrectly inline functions with stack protector enabled into
-# __restore_processor_state(): https://bugs.llvm.org/show_bug.cgi?id=47479
+# __restore_processor_state(): https://llvm.org/pr47479
 CFLAGS_REMOVE_cpu.o := $(CC_FLAGS_LTO)
 
 obj-$(CONFIG_PM_SLEEP) += cpu.o
diff --git a/crypto/blake2b_generic.c b/crypto/blake2b_generic.c
index 6704c0355889..32e380b714b6 100644
--- a/crypto/blake2b_generic.c
+++ b/crypto/blake2b_generic.c
@@ -102,7 +102,7 @@ static void blake2b_compress_one_generic(struct blake2b_state *S,
ROUND(10);
ROUND(11);
 #ifdef CONFIG_CC_IS_CLANG
-#pragma nounroll /* https://bugs.llvm.org/show_bug.cgi?id=45803 */
+#pragma nounroll /* https://llvm.org/pr45803 */
 #endif
for (i = 0; i < 8; ++i)
S->h[i] = S->h[i] ^ v[i] ^ v[i + 8];
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 06964a3c130f..a223bd10564b 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/
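
The rewrite this patch applies across the tree is mechanical: the numeric
id is kept and only the URL prefix changes, with llvm.org doing the
mapping to the GitHub issue number. A minimal sketch of that
transformation, assuming the two prefixes quoted in the commit message;
the function name and its error convention are hypothetical:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Prefixes as quoted in the commit message. */
#define BUGZILLA_PREFIX  "https://bugs.llvm.org/show_bug.cgi?id="
#define SHORTLINK_PREFIX "https://llvm.org/pr"

/*
 * Hypothetical helper: write the shortlink for a Bugzilla URL into out.
 * Returns 0 on success, or -1 if url is not a Bugzilla link with a purely
 * numeric id or if out is too small to hold the result.
 */
int bugzilla_to_shortlink(const char *url, char *out, size_t out_size)
{
	size_t prefix_len = strlen(BUGZILLA_PREFIX);
	const char *id;
	size_t i;
	int n;

	assert(url != NULL && out != NULL);

	if (strncmp(url, BUGZILLA_PREFIX, prefix_len) != 0)
		return -1;

	/* The id must be non-empty and consist of decimal digits only. */
	id = url + prefix_len;
	if (id[0] == '\0')
		return -1;
	for (i = 0; id[i] != '\0'; i++) {
		if (!isdigit((unsigned char)id[i]))
			return -1;
	}

	/* The Bugzilla number is reused unchanged after the new prefix. */
	n = snprintf(out, out_size, "%s%s", SHORTLINK_PREFIX, id);
	if (n < 0 || (size_t)n >= out_size)
		return -1;

	return 0;
}
```

For example, the arch/powerpc/Makefile link with id 39555 becomes
https://llvm.org/pr39555, matching the hunk shown earlier in the patch.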

[PATCH 2/3] arch and include: Update LLVM Phabricator links

2024-01-09 Thread Nathan Chancellor

reviews.llvm.org was LLVM's Phabricator instance for code review. It
has been abandoned in favor of GitHub pull requests. While the majority
of links in the kernel sources still work because of the work Fangrui
has done turning the dynamic Phabricator instance into a static archive,
there are some issues with that work, so preemptively convert all the
links in the kernel sources to point to the commit on GitHub.

Most of the commits have the corresponding differential review link in
the commit message itself so there should not be any loss of fidelity in
the relevant information.

Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172
Signed-off-by: Nathan Chancellor 
---
 arch/arm64/Kconfig  | 4 ++--
 arch/riscv/Kconfig  | 2 +-
 arch/riscv/include/asm/ftrace.h | 2 +-
 include/linux/compiler-clang.h  | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b071a00425d..3304ba7c6c2a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -380,7 +380,7 @@ config BROKEN_GAS_INST
 config BUILTIN_RETURN_ADDRESS_STRIPS_PAC
bool
# Clang's __builtin_return_adddress() strips the PAC since 12.0.0
-   # https://reviews.llvm.org/D75044
+   # https://github.com/llvm/llvm-project/commit/2a96f47c5ffca84cd774ad402cacd137f4bf45e2
default y if CC_IS_CLANG && (CLANG_VERSION >= 12)
# GCC's __builtin_return_address() strips the PAC since 11.1.0,
# and this was backported to 10.2.0, 9.4.0, 8.5.0, but not earlier
@@ -2202,7 +2202,7 @@ config STACKPROTECTOR_PER_TASK
 
 config UNWIND_PATCH_PAC_INTO_SCS
bool "Enable shadow call stack dynamically using code patching"
-   # needs Clang with https://reviews.llvm.org/D111780 incorporated
+   # needs Clang with https://github.com/llvm/llvm-project/commit/de07cde67b5d205d58690be012106022aea6d2b3 incorporated
depends on CC_IS_CLANG && CLANG_VERSION >= 15
depends on ARM64_PTR_AUTH_KERNEL && CC_HAS_BRANCH_PROT_PAC_RET
depends on SHADOW_CALL_STACK
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index cd4c9a204d08..f7453eba0b62 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -291,7 +291,7 @@ config AS_HAS_INSN
def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma) t0$(comma) t0$(comma) zero)
 
 config AS_HAS_OPTION_ARCH
-   # https://reviews.llvm.org/D123515
+   # https://github.com/llvm/llvm-project/commit/9e8ed3403c191ab9c4903e8eeb8f732ff8a43cb4
def_bool y
depends on $(as-instr, .option arch$(comma) +m)
depends on !$(as-instr, .option arch$(comma) -i)
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 2b2f5df7ef2c..3f526404a718 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -15,7 +15,7 @@
 
 /*
  * Clang prior to 13 had "mcount" instead of "_mcount":
- * https://reviews.llvm.org/D98881
+ * https://github.com/llvm/llvm-project/commit/ef58ae86ba778ed7d01cd3f6bd6d08f943abab44
  */
 #if defined(CONFIG_CC_IS_GCC) || CONFIG_CLANG_VERSION >= 13
 #define MCOUNT_NAME _mcount
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index ddab1ef22bee..f0a47afef125 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -9,7 +9,7 @@
  * Clang prior to 17 is being silly and considers many __cleanup() variables
  * as unused (because they are, their sole purpose is to go out of scope).
  *
- * https://reviews.llvm.org/D152180
+ * https://github.com/llvm/llvm-project/commit/877210faa447f4cc7db87812f8ed80e398fedd61
  */
 #undef __cleanup
 #define __cleanup(func) __maybe_unused __attribute__((__cleanup__(func)))

-- 
2.43.0

[PATCH 1/3] selftests/bpf: Update LLVM Phabricator links

2024-01-09 Thread Nathan Chancellor

reviews.llvm.org was LLVM's Phabricator instance for code review. It
has been abandoned in favor of GitHub pull requests. While the majority
of links in the kernel sources still work because of the work Fangrui
has done turning the dynamic Phabricator instance into a static archive,
there are some issues with that work, so preemptively convert all the
links in the kernel sources to point to the commit on GitHub.

Most of the commits have the corresponding differential review link in
the commit message itself so there should not be any loss of fidelity in
the relevant information.

Additionally, fix a typo in the xdpwall.c print ("LLMV" -> "LLVM") while
in the area.

Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172
Signed-off-by: Nathan Chancellor 
---
Cc: a...@kernel.org
Cc: dan...@iogearbox.net
Cc: and...@kernel.org
Cc: myko...@fb.com
Cc: b...@vger.kernel.org
Cc: linux-kselft...@vger.kernel.org
---
 tools/testing/selftests/bpf/README.rst | 32 +++---
 tools/testing/selftests/bpf/prog_tests/xdpwall.c   |  2 +-
 .../selftests/bpf/progs/test_core_reloc_type_id.c  |  2 +-
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/bpf/README.rst b/tools/testing/selftests/bpf/README.rst
index cb9b95702ac6..b9a493f66557 100644
--- a/tools/testing/selftests/bpf/README.rst
+++ b/tools/testing/selftests/bpf/README.rst
@@ -115,7 +115,7 @@ the insn 20 undoes map_value addition. It is currently impossible for the
 verifier to understand such speculative pointer arithmetic.
 Hence `this patch`__ addresses it on the compiler side. It was committed on llvm 12.
 
-__ https://reviews.llvm.org/D85570
+__ https://github.com/llvm/llvm-project/commit/ddf1864ace484035e3cde5e83b3a31ac81e059c6
 
 The corresponding C code
 
@@ -165,7 +165,7 @@ This is due to a llvm BPF backend bug. `The fix`__
 has been pushed to llvm 10.x release branch and will be
 available in 10.0.1. The patch is available in llvm 11.0.0 trunk.
 
-__  https://reviews.llvm.org/D78466
+__  https://github.com/llvm/llvm-project/commit/3cb7e7bf959dcd3b8080986c62e10a75c7af43f0
 
 bpf_verif_scale/loop6.bpf.o test failure with Clang 12
 ======================================================
@@ -204,7 +204,7 @@ r5(w5) is eventually saved on stack at insn #24 for later use.
 This cause later verifier failure. The bug has been `fixed`__ in
 Clang 13.
 
-__  https://reviews.llvm.org/D97479
+__  https://github.com/llvm/llvm-project/commit/1959ead525b8830cc8a345f45e1c3ef9902d3229
 
 BPF CO-RE-based tests and Clang version
 =======================================
@@ -221,11 +221,11 @@ failures:
 - __builtin_btf_type_id() [0_, 1_, 2_];
 - __builtin_preserve_type_info(), __builtin_preserve_enum_value() [3_, 4_].
 
-.. _0: https://reviews.llvm.org/D74572
-.. _1: https://reviews.llvm.org/D74668
-.. _2: https://reviews.llvm.org/D85174
-.. _3: https://reviews.llvm.org/D83878
-.. _4: https://reviews.llvm.org/D83242
+.. _0: https://github.com/llvm/llvm-project/commit/6b01b465388b204d543da3cf49efd6080db094a9
+.. _1: https://github.com/llvm/llvm-project/commit/072cde03aaa13a2c57acf62d79876bf79aa1919f
+.. _2: https://github.com/llvm/llvm-project/commit/00602ee7ef0bf6c68d690a2bd729c12b95c95c99
+.. _3: https://github.com/llvm/llvm-project/commit/6d218b4adb093ff2e9764febbbc89f429412006c
+.. _4: https://github.com/llvm/llvm-project/commit/6d6750696400e7ce988d66a1a00e1d0cb32815f8
 
 Floating-point tests and Clang version
 ======================================
@@ -234,7 +234,7 @@ Certain selftests, e.g. core_reloc, require support for the floating-point
 types, which was introduced in `Clang 13`__. The older Clang versions will
 either crash when compiling these tests, or generate an incorrect BTF.
 
-__  https://reviews.llvm.org/D83289
+__  https://github.com/llvm/llvm-project/commit/a7137b238a07d9399d3ae96c0b461571bd5aa8b2
 
 Kernel function call test and Clang version
 ===========================================
@@ -248,7 +248,7 @@ Without it, the error from compiling bpf selftests looks like:
 
   libbpf: failed to find BTF for extern 'tcp_slow_start' [25] section: -2
 
-__ https://reviews.llvm.org/D93563
+__ https://github.com/llvm/llvm-project/commit/886f9ff53155075bd5f1e994f17b85d1e1b7470c
 
 btf_tag test and Clang version
 ==============================
@@ -264,8 +264,8 @@ Without them, the btf_tag selftest will be skipped and you will observe:
 
   # btf_tag:SKIP
 
-.. _0: https://reviews.llvm.org/D111588
-.. _1: https://reviews.llvm.org/D99
+.. _0: https://github.com/llvm/llvm-project/commit/a162b67c98066218d0d00aa13b99afb95d9bb5e6
+.. _1: https://github.com/llvm/llvm-project/commit/3466e00716e12e32fdb100e3fcfca5c2b3e8d784
 
 Clang dependencies for static linking tests
 ===========================================
@@ -274,7 +274,7 @@ linked_vars, linked_maps, and linked_funcs tests depend on `Clang fix`__ to
 generate valid BTF infor

[PATCH 0/3] Update LLVM Phabricator and Bugzilla links

2024-01-09 Thread Nathan Chancellor

This series updates all instances of LLVM Phabricator and Bugzilla links
to point to GitHub commits directly and LLVM's Bugzilla to GitHub issue
shortlinks respectively.

I split up the Phabricator patch into BPF selftests and the rest of the
kernel in case the BPF folks want to take it separately from the rest of
the series, there are obviously no dependency issues in that case. The
Bugzilla change was mechanical enough and should have no conflicts.

I am aiming this at Andrew and CC'ing other lists, in case maintainers
want to chime in, but I think this is pretty uncontroversial (famous
last words...).

---
Nathan Chancellor (3):
  selftests/bpf: Update LLVM Phabricator links
  arch and include: Update LLVM Phabricator links
  treewide: Update LLVM Bugzilla links

 arch/arm64/Kconfig |  4 +--
 arch/powerpc/Makefile  |  4 +--
 arch/powerpc/kvm/book3s_hv_nested.c|  2 +-
 arch/riscv/Kconfig |  2 +-
 arch/riscv/include/asm/ftrace.h|  2 +-
 arch/s390/include/asm/ftrace.h |  2 +-
 arch/x86/power/Makefile|  2 +-
 crypto/blake2b_generic.c   |  2 +-
 drivers/firmware/efi/libstub/Makefile  |  2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c   |  2 +-
 drivers/media/test-drivers/vicodec/codec-fwht.c|  2 +-
 drivers/regulator/Kconfig  |  2 +-
 include/asm-generic/vmlinux.lds.h  |  2 +-
 include/linux/compiler-clang.h |  2 +-
 lib/Kconfig.kasan  |  2 +-
 lib/raid6/Makefile |  2 +-
 lib/stackinit_kunit.c  |  2 +-
 mm/slab_common.c   |  2 +-
 net/bridge/br_multicast.c  |  2 +-
 security/Kconfig   |  2 +-
 tools/testing/selftests/bpf/README.rst | 32 +++---
 tools/testing/selftests/bpf/prog_tests/xdpwall.c   |  2 +-
 .../selftests/bpf/progs/test_core_reloc_type_id.c  |  2 +-
 23 files changed, 40 insertions(+), 40 deletions(-)
---
base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
change-id: 20240109-update-llvm-links-d03f9d649e1e

Best regards,
-- 
Nathan Chancellor

Re: [PATCH 3/3] drm/amd/display: Support DRM_AMD_DC_FP on RISC-V

2023-11-29 Thread Nathan Chancellor

On Thu, Nov 23, 2023 at 02:23:01PM +, Conor Dooley wrote:
> On Tue, Nov 21, 2023 at 07:05:15PM -0800, Samuel Holland wrote:
> > RISC-V uses kernel_fpu_begin()/kernel_fpu_end() like several other
> > architectures. Enabling hardware FP requires overriding the ISA string
> > for the relevant compilation units.
> 
> Ah yes, bringing the joy of frame-larger-than warnings to RISC-V:
> ../drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13:
>  warning: stack frame size (2416) exceeds limit (2048) in 
> 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation'
>  [-Wframe-larger-than]

:(

> Nathan, have you given up on these being sorted out?

Does your configuration have KASAN (I don't think RISC-V supports
KCSAN)? It is possible that dml/dcn32 needs something similar to commit
6740ec97bcdb ("drm/amd/display: Increase frame warning limit with KASAN
or KCSAN in dml2")?

I am not really interested in playing whack-a-mole with these warnings
like I have done in the past for the reasons I outlined here:

https://lore.kernel.org/20231019205117.GA839902@dev-arch.thelio-3990X/

> Also, what on earth is that function name, it exceeds 80 characters
> before even considering anything else? Actually, I don't think I want
> to know.

Welcome to "gcc-parsable HW gospel, coming straight from HW engineers" :)

Cheers,
Nathan

[PATCH v2] drm/amd/display: Increase frame warning limit with KASAN or KCSAN in dml2

2023-11-02 Thread Nathan Chancellor

When building ARCH=x86_64 allmodconfig with clang, which will typically
have sanitizers enabled, there is a warning about a large stack frame.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6265:13: error: stack frame size (2520) exceeds limit (2048) in 'dml_prefetch_check' [-Werror,-Wframe-larger-than]
   6265 | static void dml_prefetch_check(struct display_mode_lib_st *mode_lib)
        |             ^
  1 error generated.

Notably, GCC 13.2.0 does not do too much of a better job, as it is right
at the current limit of 2048 (and others have reported being over with
older GCC versions):

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_prefetch_check':
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6705:1: error: the frame size of 2048 bytes is larger than 1800 bytes [-Werror=frame-larger-than=]
   6705 | }
        | ^

In the past, these warnings have been avoided by reducing the number of
parameters to various functions so that not as many arguments need to be
passed on the stack. However, these patches take a good amount of effort
to write despite being mechanical due to code structure and complexity
and they are never carried forward to new generations of the code so
that effort has to be expended every new hardware generation, which
becomes harder to justify as time goes on.

To avoid having a noticeable or lengthy breakage in all{mod,yes}config,
which are easy testing targets that have -Werror enabled, increase the
limit for configurations that have KASAN or KCSAN enabled by 50% so that
cases of extremely poor code generation can still be caught while not
breaking the majority of builds. CONFIG_KMSAN also causes high stack
usage but the frame limit is already set to zero when it is enabled,
which is accounted for by the check for CONFIG_FRAME_WARN=0 in the dml2
Makefile.

Signed-off-by: Nathan Chancellor 
---
If there is another DRM pull before 6.7-rc1, it would be much
appreciated if this could make that so that other trees are not
potentially broken by this. If not, no worries, as it was my fault for
not sending this sooner.

Changes in v2:
- Adjust workaround to check for either CONFIG_KASAN=y or
  CONFIG_KCSAN=y, as the same problem has been reported with older
  versions of GCC (Hamza, Alex)
- Link to v1: https://lore.kernel.org/r/20231102-amdgpu-dml2-increase-frame-size-warning-for-clang-v1-1-6eb157352...@kernel.org
---
 drivers/gpu/drm/amd/display/dc/dml2/Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index 70ae5eba624e..acff3449b8d7 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -60,8 +60,12 @@ endif
 endif
 
 ifneq ($(CONFIG_FRAME_WARN),0)
+ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)
+frame_warn_flag := -Wframe-larger-than=3072
+else
 frame_warn_flag := -Wframe-larger-than=2048
 endif
+endif
 
 CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_core.o := $(dml2_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_util.o := $(dml2_ccflags)

---
base-commit: 21e80f3841c01aeaf32d7aee7bbc87b3db1aa0c6
change-id: 20231102-amdgpu-dml2-increase-frame-size-warning-for-clang-c93bd2d6a871

Best regards,
-- 
Nathan Chancellor
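
The selection logic in the v2 Makefile hunk can be modeled in C as shown
below. This is a sketch of the stated intent only, with hypothetical
names: a return value of 0 stands for "no -Wframe-larger-than flag is
added". One detail worth noting is that the Makefile concatenates
$(CONFIG_KASAN) and $(CONFIG_KCSAN) before calling $(filter), so it
matches when exactly one of them is y. The model treats "either one
enabled" as the trigger, which gives the same result as long as the two
sanitizers are not enabled together (to my knowledge Kconfig does not
allow that combination, but that is an assumption here).

```c
#include <assert.h>
#include <stdbool.h>

/* Limits taken from the Makefile hunk above. */
#define DML2_FRAME_LIMIT_DEFAULT   2048
#define DML2_FRAME_LIMIT_SANITIZER 3072 /* 2048 plus 50% */

/*
 * Hypothetical model of the Makefile logic. config_frame_warn is the
 * value of CONFIG_FRAME_WARN; kasan and kcsan say whether CONFIG_KASAN
 * or CONFIG_KCSAN is set to y. Returns the limit to pass to
 * -Wframe-larger-than, or 0 when no flag should be passed at all.
 */
int dml2_frame_warn_limit(int config_frame_warn, bool kasan, bool kcsan)
{
	assert(config_frame_warn >= 0);

	/* CONFIG_FRAME_WARN=0 disables the warning, e.g. with KMSAN. */
	if (config_frame_warn == 0)
		return 0;

	/* Sanitizer instrumentation inflates stack usage noticeably. */
	if (kasan || kcsan)
		return DML2_FRAME_LIMIT_SANITIZER;

	return DML2_FRAME_LIMIT_DEFAULT;
}
```

Note that the per-file limit does not depend on the numeric value of
CONFIG_FRAME_WARN, only on whether it is zero; that matches the existing
behavior of the dml2 Makefile before this patch.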

Re: [PATCH] drm/amd/display: Increase frame warning limit for clang in dml2

2023-11-02 Thread Nathan Chancellor

On Thu, Nov 02, 2023 at 12:59:00PM -0400, Hamza Mahfooz wrote:
> On 11/2/23 12:24, Nathan Chancellor wrote:
> > When building ARCH=x86_64 allmodconfig with clang, which has sanitizers
> > enabled, there is a warning about a large stack frame.
> > 
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6265:13: error: stack frame size (2520) exceeds limit (2048) in 'dml_prefetch_check' [-Werror,-Wframe-larger-than]
> >    6265 | static void dml_prefetch_check(struct display_mode_lib_st *mode_lib)
> >         |             ^
> >   1 error generated.
> > 
> > Notably, GCC 13.2.0 does not do too much of a better job, as it is right
> > at the current limit of 2048:
> > 
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_prefetch_check':
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6705:1: error: the frame size of 2048 bytes is larger than 1800 bytes [-Werror=frame-larger-than=]
> >    6705 | }
> >         | ^
> > 
> > In the past, these warnings have been avoided by reducing the number of
> > parameters to various functions so that not as many arguments need to be
> > passed on the stack. However, these patches take a good amount of effort
> > to write despite being mechanical due to code structure and complexity
> > and they are never carried forward to new generations of the code so
> > that effort has to be expended every new hardware generation, which
> > becomes harder to justify as time goes on.
> > 
> > There is some effort to improve clang's code generation but that may
> > take some time between code review, shifting priorities, and release
> > cycles. To avoid having a noticeable or lengthy breakage in
> > all{mod,yes}config, which are easy testing targets that have -Werror
> > enabled, increase the limit for clang by 50% so that cases of extremely
> > poor code generation can still be caught while not breaking the majority
> > of builds. When clang's code generation improves, the limit increase can
> > be restricted to older clang versions.
> > 
> > Signed-off-by: Nathan Chancellor 
> > ---
> > If there is another DRM pull before 6.7-rc1, it would be much
> > appreciated if this could make that so that other trees are not
> > potentially broken by this. If not, no worries, as it was my fault for
> > not sending this sooner.
> > ---
> >   drivers/gpu/drm/amd/display/dc/dml2/Makefile | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
> > index 70ae5eba624e..dff8237c0999 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
> > +++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
> > @@ -60,7 +60,7 @@ endif
> >   endif
> >   ifneq ($(CONFIG_FRAME_WARN),0)
> > -frame_warn_flag := -Wframe-larger-than=2048
> > +frame_warn_flag := -Wframe-larger-than=$(if $(CONFIG_CC_IS_CLANG),3072,2048)
> 
> I would prefer checking for `CONFIG_KASAN || CONFIG_KCSAN` instead
> since the stack usage shouldn't change much if both of those are disabled.

So something like this? Or were you talking about replacing the clang
check entirely with the KASAN/KCSAN check?

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index 70ae5eba624e..0fc1b13295eb 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -60,8 +60,12 @@ endif
 endif
 
 ifneq ($(CONFIG_FRAME_WARN),0)
+ifeq ($(CONFIG_CC_IS_CLANG)$(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),yy)
+frame_warn_flag := -Wframe-larger-than=3072
+else
 frame_warn_flag := -Wframe-larger-than=2048
 endif
+endif
 
 CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_core.o := $(dml2_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_util.o := $(dml2_ccflags)

> >   endif
> >   CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_core.o := $(dml2_ccflags) $(frame_warn_flag)
> > 
> > ---
> > base-commit: 21e80f3841c01aeaf32d7aee7bbc87b3db1aa0c6
> > change-id: 20231102-amdgpu-dml2-increase-frame-size-warning-for-clang-c93bd2d6a871
> > 
> > Best regards,
> -- 
> Hamza
>

[PATCH] drm/amd/display: Increase frame warning limit for clang in dml2

2023-11-02 Thread Nathan Chancellor

When building ARCH=x86_64 allmodconfig with clang, which has sanitizers
enabled, there is a warning about a large stack frame.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6265:13: error: stack frame size (2520) exceeds limit (2048) in 'dml_prefetch_check' [-Werror,-Wframe-larger-than]
   6265 | static void dml_prefetch_check(struct display_mode_lib_st *mode_lib)
        |             ^
  1 error generated.

Notably, GCC 13.2.0 does not do too much of a better job, as it is right
at the current limit of 2048:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_prefetch_check':
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6705:1: error: the frame size of 2048 bytes is larger than 1800 bytes [-Werror=frame-larger-than=]
   6705 | }
        | ^

In the past, these warnings have been avoided by reducing the number of
parameters to various functions so that not as many arguments need to be
passed on the stack. However, these patches take a good amount of effort
to write despite being mechanical due to code structure and complexity
and they are never carried forward to new generations of the code so
that effort has to be expended every new hardware generation, which
becomes harder to justify as time goes on.

There is some effort to improve clang's code generation but that may
take some time between code review, shifting priorities, and release
cycles. To avoid having a noticeable or lengthy breakage in
all{mod,yes}config, which are easy testing targets that have -Werror
enabled, increase the limit for clang by 50% so that cases of extremely
poor code generation can still be caught while not breaking the majority
of builds. When clang's code generation improves, the limit increase can
be restricted to older clang versions.

Signed-off-by: Nathan Chancellor 
---
If there is another DRM pull before 6.7-rc1, it would be much
appreciated if this could make that so that other trees are not
potentially broken by this. If not, no worries, as it was my fault for
not sending this sooner.
---
 drivers/gpu/drm/amd/display/dc/dml2/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index 70ae5eba624e..dff8237c0999 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -60,7 +60,7 @@ endif
 endif
 
 ifneq ($(CONFIG_FRAME_WARN),0)
-frame_warn_flag := -Wframe-larger-than=2048
+frame_warn_flag := -Wframe-larger-than=$(if $(CONFIG_CC_IS_CLANG),3072,2048)
 endif
 
 CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_core.o := $(dml2_ccflags) $(frame_warn_flag)

---
base-commit: 21e80f3841c01aeaf32d7aee7bbc87b3db1aa0c6
change-id: 20231102-amdgpu-dml2-increase-frame-size-warning-for-clang-c93bd2d6a871

Best regards,
-- 
Nathan Chancellor
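
The commit message mentions that earlier fixes reduced the number of
function parameters so that fewer arguments are passed on the stack. A
minimal sketch of that kind of refactoring, with entirely hypothetical
names and a trivial body: the first function takes twelve scalar
arguments (depending on the calling convention, the later ones are
usually passed on the stack), while the second takes a single pointer to
a structure holding the same inputs.

```c
#include <assert.h>
#include <stddef.h>

/* Before: twelve scalar arguments, several of which may go on the stack. */
double demo_sum_scalar(double a0, double a1, double a2, double a3,
		       double a4, double a5, double a6, double a7,
		       double a8, double a9, double a10, double a11)
{
	return a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9 + a10 + a11;
}

/* Hypothetical bundle holding the same twelve inputs. */
struct demo_inputs {
	double a[12];
};

/* After: one pointer argument, no matter how many inputs there are. */
double demo_sum_struct(const struct demo_inputs *in)
{
	double total = 0.0;
	size_t i;

	assert(in != NULL);

	for (i = 0; i < sizeof(in->a) / sizeof(in->a[0]); i++)
		total += in->a[i];

	return total;
}
```

Both functions compute the same result; only the way the inputs reach
the callee differs. As the message notes, applying this by hand to the
DML code is tedious and has to be redone for each hardware generation,
which is the motivation for raising the limit instead.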

[PATCH] drm/amd/display: Respect CONFIG_FRAME_WARN=0 in DML2

2023-10-18 Thread Nathan Chancellor

display_mode_core.c is unconditionally built with
-Wframe-larger-than=2048, which causes warnings even when
CONFIG_FRAME_WARN has been set to 0, which should show no warnings.

Use the existing $(frame_warn_flag) variable, which handles this
situation. This is basically commit 25f178bbd078 ("drm/amd/display:
Respect CONFIG_FRAME_WARN=0 in dml Makefile") but for DML2.

Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dml2/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index f35ed8de260d..66431525f2a0 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -61,7 +61,7 @@ ifneq ($(CONFIG_FRAME_WARN),0)
 frame_warn_flag := -Wframe-larger-than=2048
 endif
 
-CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_core.o := $(dml2_ccflags) -Wframe-larger-than=2048
+CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_core.o := $(dml2_ccflags) $(frame_warn_flag)
 CFLAGS_$(AMDDALPATH)/dc/dml2/display_mode_util.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml2_wrapper.o := $(dml2_ccflags)
 CFLAGS_$(AMDDALPATH)/dc/dml2/dml2_utils.o := $(dml2_ccflags)

---
base-commit: cd90511557fdfb394bb4ac4c3b539b007383914c
change-id: 20231018-amdgpu-dml2-respect-frame_warn-536e19b69ce0

Best regards,
-- 
Nathan Chancellor

Re: [PATCH 0/2] Reduce stack size for DML2

2023-10-17 Thread Nathan Chancellor

On Tue, Oct 17, 2023 at 11:45:42AM -0600, Rodrigo Siqueira Jordao wrote:
> Hi Nathan,
> (+Hamza)
> 
> First of all, thanks a lot for your feedback. You can see my comments
> inline.
> 
> On 10/17/23 11:22, Nathan Chancellor wrote:
> > Hi Rodrigo,
> > 
> > On Mon, Oct 16, 2023 at 08:19:16AM -0600, Rodrigo Siqueira wrote:
> > > Stephen discovers a stack size issue when compiling the latest amdgpu
> > > code with allmodconfig. This patchset addresses that issue by splitting
> > > a large function into two smaller parts.
> > > 
> > > Thanks
> > > Siqueira
> > > 
> > > Rodrigo Siqueira (2):
> > >drm/amd/display: Reduce stack size by splitting function
> > >drm/amd/display: Fix stack size issue on DML2
> > > 
> > >   .../amd/display/dc/dml2/display_mode_core.c   | 3289 +
> > >   1 file changed, 1653 insertions(+), 1636 deletions(-)
> > > 
> > > -- 
> > > 2.42.0
> > > 
> > 
> > This series appears in -next as commit c587ee30f376 ("drm/amd/display:
> > Reduce stack size by splitting function"); while it may help stack usage
> > for GCC, clang still suffers. All clang versions that the kernel
> > supports show a warning for dml_prefetch_check(), the following is with
> > LLVM 17.0.2 from kernel.org [1].
> > 
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6263:13: 
> > error: stack frame size (2520) exceeds limit (2048) in 'dml_prefetch_check' 
> > [-Werror,-Wframe-larger-than]
> > 6263 | static void dml_prefetch_check(struct display_mode_lib_st 
> > *mode_lib)
> >  | ^
> > 
> > With clang 18.0.0 (tip of tree) and 15.0.7, I see:
> > 
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:8277:6: 
> > error: stack frame size (2056) exceeds limit (2048) in 
> > 'dml_core_mode_programming' [-Werror,-Wframe-larger-than]
> > 8277 | void dml_core_mode_programming(struct display_mode_lib_st 
> > *mode_lib, const struct dml_clk_cfg_st *clk_cfg)
> >  |  ^
> > 
> > For what it's worth, building with GCC 13.2.0 with a slightly lower
> > -Wframe-larger-than value reveals that dml_prefetch_check() is right at
> > the current limit and the stack usage of dml_core_mode_programming()
> > when built with GCC is not too far off clang's, so it seems like there
> > should be a more robust set of fixes, such as the ones that I have
> > already done for older generations of this code.
> > 
> >drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In 
> > function 'dml_prefetch_check':
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6705:1: 
> > error: the frame size of 2048 bytes is larger than 1800 bytes 
> > [-Werror=frame-larger-than=]
> > 6705 | }
> >  | ^
> > 
> >drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In 
> > function 'dml_core_mode_programming':
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:9893:1: 
> > error: the frame size of 1880 bytes is larger than 1800 bytes 
> > [-Werror=frame-larger-than=]
> > 9893 | } // dml_core_mode_programming
> >  | ^
> > 
> > 41012d715d5d drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline 
> > for stack usage
> > 21485d3da659 drm/amd/display: Reduce number of arguments of dml31's 
> > CalculateFlipSchedule()
> > 37934d4118e2 drm/amd/display: Reduce number of arguments of dml31's 
> > CalculateWatermarksAndDRAMSpeedChangeSupport()
> > a3fef74b1d48 drm/amd/display: Reduce number of arguments of 
> > dml32_CalculatePrefetchSchedule()
> > c4be0ac987f2 drm/amd/display: Reduce number of arguments of 
> > dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
> > 25ea501ed85d drm/amd/display: Reduce number of arguments of dml314's 
> > CalculateFlipSchedule()
> > ca07f4f5a98b drm/amd/display: Reduce number of arguments of dml314's 
> > CalculateWatermarksAndDRAMSpeedChangeSupport()
> > 
> > It would be really nice if these would somehow make it back to the
> > original sources so that we stop going through this every time a new
> > version of this code shows up.
> 
> I'm familiar with that approach of reducing the stack size. Correct me if
> I'm wrong, but the idea can be summarized as:
> 
> 1. Move the local variable to a new struct.
> 2. Add the

Re: [PATCH 0/2] Reduce stack size for DML2

2023-10-17 Thread Nathan Chancellor

Hi Rodrigo,

On Mon, Oct 16, 2023 at 08:19:16AM -0600, Rodrigo Siqueira wrote:
> Stephen discovers a stack size issue when compiling the latest amdgpu
> code with allmodconfig. This patchset addresses that issue by splitting
> a large function into two smaller parts.
> 
> Thanks
> Siqueira
> 
> Rodrigo Siqueira (2):
>   drm/amd/display: Reduce stack size by splitting function
>   drm/amd/display: Fix stack size issue on DML2
> 
>  .../amd/display/dc/dml2/display_mode_core.c   | 3289 +
>  1 file changed, 1653 insertions(+), 1636 deletions(-)
> 
> -- 
> 2.42.0
> 

This series appears in -next as commit c587ee30f376 ("drm/amd/display:
Reduce stack size by splitting function"); while it may help stack usage
for GCC, clang still suffers. All clang versions that the kernel
supports show a warning for dml_prefetch_check(); the following is with
LLVM 17.0.2 from kernel.org [1].

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6263:13: error: stack frame size (2520) exceeds limit (2048) in 'dml_prefetch_check' [-Werror,-Wframe-larger-than]
   6263 | static void dml_prefetch_check(struct display_mode_lib_st *mode_lib)
        |             ^

With clang 18.0.0 (tip of tree) and 15.0.7, I see:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:8277:6: error: stack frame size (2056) exceeds limit (2048) in 'dml_core_mode_programming' [-Werror,-Wframe-larger-than]
   8277 | void dml_core_mode_programming(struct display_mode_lib_st *mode_lib, const struct dml_clk_cfg_st *clk_cfg)
        |      ^

For what it's worth, building with GCC 13.2.0 with a slightly lower
-Wframe-larger-than value reveals that dml_prefetch_check() is right at
the current limit and the stack usage of dml_core_mode_programming()
when built with GCC is not too far off clang's, so it seems like there
should be a more robust set of fixes, such as the ones that I have
already done for older generations of this code.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_prefetch_check':
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6705:1: error: the frame size of 2048 bytes is larger than 1800 bytes [-Werror=frame-larger-than=]
   6705 | }
        | ^

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_core_mode_programming':
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:9893:1: error: the frame size of 1880 bytes is larger than 1800 bytes [-Werror=frame-larger-than=]
   9893 | } // dml_core_mode_programming
        | ^

41012d715d5d drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage
21485d3da659 drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()
37934d4118e2 drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()
a3fef74b1d48 drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule()
c4be0ac987f2 drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
25ea501ed85d drm/amd/display: Reduce number of arguments of dml314's CalculateFlipSchedule()
ca07f4f5a98b drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()

It would be really nice if these would somehow make it back to the
original sources so that we stop going through this every time a new
version of this code shows up. I thought that AMD has started testing
with clang, how were these warnings not caught before the code was
merged? If you are unable to look into these warnings, I can try to
double back to this once I look into the other fires in -next...

[1]: https://mirrors.edge.kernel.org/pub/tools/llvm/

Cheers,
Nathan
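
Most of the older fixes listed above reduce the number of arguments a
function takes. As a rough illustration of that pattern only, here is a
standalone C sketch; calc_params, calc_schedule() and MAX_PLANES are
hypothetical names and are not taken from the DML sources. The inputs are
bundled into one structure that lives outside the stack frame, and the
callee receives a single pointer instead of many by-value arguments.

```c
#include <stddef.h>

#define MAX_PLANES 8 /* hypothetical plane count, for illustration only */

/* Hypothetical parameter bundle. In a driver this would typically live in a
 * long-lived, heap-allocated context rather than in a local variable. */
struct calc_params {
	double pixel_clock_mhz[MAX_PLANES];
	double line_time_us[MAX_PLANES];
	unsigned int num_planes;
};

/* One pointer argument replaces many by-value arguments, so the call does
 * not need stack space for copies of the inputs. */
double calc_schedule(const struct calc_params *p)
{
	double total = 0.0;
	size_t i;

	for (i = 0; i < p->num_planes && i < MAX_PLANES; i++)
		total += p->pixel_clock_mhz[i] * p->line_time_us[i];

	return total;
}

/* File-scope storage stands in for the long-lived context here. */
static struct calc_params example_params = {
	.pixel_clock_mhz = { 100.0, 200.0 },
	.line_time_us = { 2.0, 3.0 },
	.num_planes = 2,
};

double calc_schedule_example(void)
{
	return calc_schedule(&example_params);
}
```

Whether this actually lowers the reported frame size depends on the
compiler and its inlining decisions, so any such change still needs to be
checked with -Wframe-larger-than on both GCC and clang.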

[PATCH] drm/amd/display: Fix -Wuninitialized in dm_helpers_dp_mst_send_payload_allocation()

2023-09-13 Thread Nathan Chancellor

When building with clang, there is a warning (or error when
CONFIG_WERROR is set):

  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_helpers.c:368:21: error: variable 'old_payload' is uninitialized when used here [-Werror,-Wuninitialized]
    368 |                                                  new_payload, old_payload);
        |                                                               ^~~~~~~~~~~
  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_helpers.c:344:61: note: initialize the variable 'old_payload' to silence this warning
    344 |         struct drm_dp_mst_atomic_payload *new_payload, *old_payload;
        |                                                                    ^
        |                                                                     = NULL
  1 error generated.

This variable is not required outside of this function so allocate
old_payload on the stack and pass it by reference to
dm_helpers_construct_old_payload(), resolving the warning.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1931
Fixes: 5aa1dfcdf0a4 ("drm/mst: Refactor the flow for payload allocation/removement")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 9ad509279b0a..c4c35f6844f4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -341,7 +341,7 @@ bool dm_helpers_dp_mst_send_payload_allocation(
 	struct amdgpu_dm_connector *aconnector;
 	struct drm_dp_mst_topology_state *mst_state;
 	struct drm_dp_mst_topology_mgr *mst_mgr;
-	struct drm_dp_mst_atomic_payload *new_payload, *old_payload;
+	struct drm_dp_mst_atomic_payload *new_payload, old_payload;
 	enum mst_progress_status set_flag = MST_ALLOCATE_NEW_PAYLOAD;
 	enum mst_progress_status clr_flag = MST_CLEAR_ALLOCATED_PAYLOAD;
 	int ret = 0;
@@ -365,8 +365,8 @@ bool dm_helpers_dp_mst_send_payload_allocation(
 		ret = drm_dp_add_payload_part2(mst_mgr, mst_state->base.state, new_payload);
 	} else {
 		dm_helpers_construct_old_payload(stream->link, mst_state->pbn_div,
-						 new_payload, old_payload);
-		drm_dp_remove_payload_part2(mst_mgr, mst_state, old_payload, new_payload);
+						 new_payload, &old_payload);
+		drm_dp_remove_payload_part2(mst_mgr, mst_state, &old_payload, new_payload);
 	}
 
 	if (ret) {

---
base-commit: 8569c31545385195bdb0c021124e68336e91c693
change-id: 20230913-fix-wuninitialized-dm_helpers_dp_mst_send_payload_allocation-c37b33aaad18

Best regards,
-- 
Nathan Chancellor
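
The shape of the fix in the patch above can be shown in isolation. The
following is a minimal standalone sketch, not the driver code: struct
payload and construct_old_payload() are hypothetical stand-ins (the body of
the real dm_helpers_construct_old_payload() is not part of the diff). The
point is only that the callee fills in an object owned by the caller, so
the caller declares the object itself and passes its address, rather than
passing a pointer that was never pointed at anything.

```c
/* Hypothetical stand-in for struct drm_dp_mst_atomic_payload. */
struct payload {
	int vcpi;
	int time_slots;
};

/* Hypothetical stand-in for the helper: it writes through 'old_payload',
 * so the pointer must refer to valid storage. */
static void construct_old_payload(const struct payload *new_payload,
				  struct payload *old_payload)
{
	*old_payload = *new_payload;
	old_payload->time_slots = 0;
}

int old_payload_vcpi(int vcpi, int time_slots)
{
	struct payload new_payload = { .vcpi = vcpi, .time_slots = time_slots };
	/* A real object on the stack, not an uninitialized pointer. */
	struct payload old_payload;

	construct_old_payload(&new_payload, &old_payload);

	return old_payload.vcpi;
}
```

With the pre-patch form (a bare "struct payload *old_payload;" passed by
value), the helper would write through an indeterminate pointer, which is
what -Wuninitialized is pointing out.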

Re: [PATCH] drm/radeon: Prefer 'unsigned int' to bare use of 'unsigned'

2023-07-29 Thread Nathan Chancellor

On Sat, Jul 29, 2023 at 09:12:05PM +0700, Bagas Sanjaya wrote:
> On Fri, Jul 28, 2023 at 10:35:19PM +0800, 孙冉 wrote:
> > WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
> > 
> > Signed-off-by: Ran Sun 
> 
> Your From: address != SoB identity

While the comment below is a completely valid complaint, I think this
comment is being rather pedantic. Google Translate will tell you that
孙冉 is Sun Ran, so while the name does not strictly match, it is
clearly the same...

> > ---
> >  drivers/gpu/drm/radeon/radeon_object.h | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/radeon/radeon_object.h 
> > b/drivers/gpu/drm/radeon/radeon_object.h
> > index 39cc87a59a9a..9b55a7103cfd 100644
> > --- a/drivers/gpu/drm/radeon/radeon_object.h
> > +++ b/drivers/gpu/drm/radeon/radeon_object.h
> > @@ -37,7 +37,7 @@
> >   *
> >   * Returns corresponding domain of the ttm mem_type
> >   */
> > -static inline unsigned radeon_mem_type_to_domain(u32 mem_type)
> > +static inline unsigned int radeon_mem_type_to_domain(u32 mem_type)
> >  {
> >   switch (mem_type) {
> >   case TTM_PL_VRAM:
> > @@ -112,12 +112,12 @@ static inline unsigned long radeon_bo_size(struct 
> > radeon_bo *bo)
> >   return bo->tbo.base.size;
> >  }
> >  
> > -static inline unsigned radeon_bo_ngpu_pages(struct radeon_bo *bo)
> > +static inline unsigned int radeon_bo_ngpu_pages(struct radeon_bo *bo)
> >  {
> >   return bo->tbo.base.size / RADEON_GPU_PAGE_SIZE;
> >  }
> >  
> > -static inline unsigned radeon_bo_gpu_page_alignment(struct radeon_bo *bo)
> > +static inline unsigned int radeon_bo_gpu_page_alignment(struct radeon_bo 
> > *bo)
> >  {
> >   return (bo->tbo.page_alignment << PAGE_SHIFT) / RADEON_GPU_PAGE_SIZE;
> >  }
> > @@ -189,7 +189,7 @@ static inline void *radeon_sa_bo_cpu_addr(struct 
> > drm_suballoc *sa_bo)
> >  
> >  extern int radeon_sa_bo_manager_init(struct radeon_device *rdev,
> >   struct radeon_sa_manager *sa_manager,
> > - unsigned size, u32 align, u32 domain,
> > + unsigned int size, u32 align, u32 domain,
> >   u32 flags);
> >  extern void radeon_sa_bo_manager_fini(struct radeon_device *rdev,
> >struct radeon_sa_manager *sa_manager);
> 
> The patch is whitespace-corrupted. Use git-send-email(1) to submit patches.
> Also, your patch is also MIME-encoded, hence the corruption.
> 
> To Alex: Please don't apply this patch due to reasons above.
> 
> Thanks.
> 
> -- 
> An old man doll... just what I always wanted! - Clara

Re: [PATCH 1/6] drm: execution context for GEM buffers v7

2023-07-22 Thread Nathan Chancellor

Hi Christian,

On Tue, Jul 11, 2023 at 03:31:17PM +0200, Christian König wrote:
> This adds the infrastructure for an execution context for GEM buffers
> which is similar to the existing TTMs execbuf util and intended to replace
> it in the long term.
> 
> The basic functionality is that we abstracts the necessary loop to lock
> many different GEM buffers with automated deadlock and duplicate handling.
> 
> v2: drop xarray and use dynamic resized array instead, the locking
> overhead is unnecessary and measurable.
> v3: drop duplicate tracking, radeon is really the only one needing that.
> v4: fixes issues pointed out by Danilo, some typos in comments and a
> helper for lock arrays of GEM objects.
> v5: some suggestions by Boris Brezillon, especially just use one retry
> macro, drop loop in prepare_array, use flags instead of bool
> v6: minor changes suggested by Thomas, Boris and Danilo
> v7: minor typos pointed out by checkpatch.pl fixed
> 
> Signed-off-by: Christian König 
> Reviewed-by: Boris Brezillon 
> Reviewed-by: Danilo Krummrich 
> Tested-by: Danilo Krummrich 



> diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h
> new file mode 100644
> index ..73205afec162
> --- /dev/null
> +++ b/include/drm/drm_exec.h



> + * Since labels can't be defined local to the loops body we use a jump 
> pointer
> + * to make sure that the retry is only used from within the loops body.
> + */
> +#define drm_exec_until_all_locked(exec)  \
> + for (void *__drm_exec_retry_ptr; ({ \
> + __label__ __drm_exec_retry; \
> +__drm_exec_retry:\
> + __drm_exec_retry_ptr = &&__drm_exec_retry;  \
> + (void)__drm_exec_retry_ptr; \
> + drm_exec_cleanup(exec); \
> + });)
> +
> +/**
> + * drm_exec_retry_on_contention - restart the loop to grap all locks
> + * @exec: drm_exec object
> + *
> + * Control flow helper to continue when a contention was detected and we 
> need to
> + * clean up and re-start the loop to prepare all GEM objects.
> + */
> +#define drm_exec_retry_on_contention(exec)   \
> + do {\
> + if (unlikely(drm_exec_is_contended(exec)))  \
> + goto *__drm_exec_retry_ptr; \
> + } while (0)

This construct of using a label as a value and goto to jump into a GNU
C statement expression is explicitly documented in GCC's manual [1] as
undefined behavior:

"Jumping into a statement expression with a computed goto (see Labels as
Values [2]) has undefined behavior."

A recent change in clang [3] to prohibit this altogether points this out, so
builds using clang-17 and newer will break with this change applied:

  drivers/gpu/drm/tests/drm_exec_test.c:41:3: error: cannot jump from this indirect goto statement to one of its possible targets
     41 | drm_exec_retry_on_contention(&exec);
        | ^
  include/drm/drm_exec.h:96:4: note: expanded from macro 'drm_exec_retry_on_contention'
     96 | goto *__drm_exec_retry_ptr; \
        | ^
  drivers/gpu/drm/tests/drm_exec_test.c:39:2: note: possible target of indirect goto statement
     39 | drm_exec_until_all_locked(&exec) {
        | ^
  include/drm/drm_exec.h:79:33: note: expanded from macro 'drm_exec_until_all_locked'
     79 | __label__ __drm_exec_retry; \
        | ^
  drivers/gpu/drm/tests/drm_exec_test.c:39:2: note: jump enters a statement expression

It seems like if this construct works, it is by chance, although I am
not sure if there is another solution.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
[2]: https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
[3]: 
https://github.com/llvm/llvm-project/commit/20219106060208f0c2f5d096eb3aed7b712f5067

Cheers,
Nathan
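
For readers less familiar with the extension discussed above, here is a
minimal GNU C sketch of labels as values in the form that is not a jump
into a statement expression: the label sits directly in the function body,
outside any ({ ... }) block. This is only an illustration of the language
feature, not a proposed change to drm_exec.h, and passes_until_locked() is
a made-up name.

```c
/* GNU C only: uses labels as values (&&label) and computed goto. */

/* Hypothetical retry loop: returns how many passes were needed when the
 * first 'contended' attempts have to be retried. */
int passes_until_locked(int contended)
{
	void *retry_ptr;
	int passes = 0;

retry:
	/* The label lives directly in the function body, so the computed
	 * goto below never jumps into a statement expression. */
	retry_ptr = &&retry;
	passes++;

	if (contended-- > 0)
		goto *retry_ptr;

	return passes;
}
```

The macro in the quoted patch differs in exactly one respect: its label is
declared inside the statement expression that forms the loop condition, so
the goto in the loop body has to jump into that expression.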

Re: [PATCH] drm/amd/display: Allow building DC with clang on RISC-V

2023-07-18 Thread Nathan Chancellor

On Mon, Jul 17, 2023 at 03:29:23PM -0700, Samuel Holland wrote:
> clang on RISC-V appears to be unaffected by the bug causing excessive
> stack usage in calculate_bandwidth(). clang 16 with -fstack-usage
> reports a 304 byte stack frame size with CONFIG_ARCH_RV32I, and 512
> bytes with CONFIG_ARCH_RV64I.
> 
> Signed-off-by: Samuel Holland 

I built ARCH=riscv allmodconfig drivers/gpu/drm/amd/amdgpu/ (confirming
that CONFIG_DRM_AMD_DC gets enabled) with LLVM 11 through 17 with and
without CONFIG_KASAN=y and I never saw the -Wframe-larger-than instance
that this was disabled for, so I agree.

Reviewed-by: Nathan Chancellor 
Tested-by: Nathan Chancellor 

> 
>  drivers/gpu/drm/amd/display/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> b/drivers/gpu/drm/amd/display/Kconfig
> index bf0a655d009e..901d1961b739 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -5,7 +5,7 @@ menu "Display Engine Configuration"
>  config DRM_AMD_DC
>   bool "AMD DC - Enable new display engine"
>   default y
> - depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64
> + depends on BROKEN || !CC_IS_CLANG || ARM64 || RISCV || SPARC64 || X86_64
>   select SND_HDA_COMPONENT if SND_HDA_CORE
>   # !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
>   select DRM_AMD_DC_FP if (X86 || LOONGARCH || (PPC64 && ALTIVEC) || 
> (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
> -- 
> 2.40.1
>

Re: [PATCH v4 18/21] compiler.h: RFC - s/LINE/COUNTER/ in __UNIQUE_ID fallback

2023-07-13 Thread Nathan Chancellor

Hi Jim,

On Thu, Jul 13, 2023 at 10:36:23AM -0600, Jim Cromie wrote:
> We currently have 3 defns for __UNIQUE_ID(); gcc and clang are using
> __COUNTER__ for real uniqueness, 3rd just uses __LINE__, which should
> fail on this (and harder to avoid situations):
> 
>   DECLARE_FOO(); DECLARE_FOO();
> 
> Its 2023, can we haz a no-fallback __UNIQUE_ID ?

Yeah, I fail to see how this fallback definition can actually be used
after commit 95207db8166a ("Remove Intel compiler support"); even before
that, it would be pretty unlikely since icc usage has not been visible
for a long time. The kernel only officially supports clang or GCC now,
so the definitions of __UNIQUE_ID() in include/linux/compiler-clang.h
and include/linux/compiler-gcc.h should always be used because of the
include in include/linux/compiler_types.h, right?

I think the correct clean up is to just hoist the definition of
__UNIQUE_ID() out of the individual compiler headers into the common one
here but...

> NOTE:
> 
> This also changes __UNIQUE_ID_ to _kaUID_.  Ive been getting
> lkp-reports of collisions on names which should be unique; this
> shouldnt happen on gcc & clang, but does on some older ones, on some
> platforms, on some allyes & rand-configs.  Like this:
> 
> mips64-linux-ld:
> drivers/gpu/drm/display/drm_dp_helper.o:(__dyndbg_class_users+0x0):
> multiple definition of `__UNIQUE_ID_ddebug_class_user405';
> drivers/gpu/drm/drm_gem_shmem_helper.o:(__dyndbg_class_users+0x0):
> first defined here

This problem cannot be addressed with this patch given the above
information, no? Seems like that might mean that __COUNTER__ has issues
in earlier compilers?

Cheers,
Nathan

> Like above, the collision reports appear to always be 3-digit
> counters, which look like line-numbers.  Changing to _kaUID_ in this
> defn should make it more obvious (in *.i file) when a fallback has
> happened.  To be clear, I havent seen it yet.  Nor have I seen the
> multiple-defn problem above since adding this patch.
> 
> Lets see what lkp-robot says about this.
> 
> CC: Luc Van Oostenryck  (maintainer:SPARSE 
> CHECKER)
> CC: Nathan Chancellor  (supporter:CLANG/LLVM BUILD SUPPORT)
> CC: Nick Desaulniers  (supporter:CLANG/LLVM BUILD 
> SUPPORT)
> CC: Tom Rix  (reviewer:CLANG/LLVM BUILD SUPPORT)
> CC: linux-spa...@vger.kernel.org (open list:SPARSE CHECKER)
> CC: linux-ker...@vger.kernel.org (open list)
> CC: l...@lists.linux.dev (open list:CLANG/LLVM BUILD SUPPORT)
> Signed-off-by: Jim Cromie 
> ---
>  include/linux/compiler.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index d7779a18b24f..677d6c47cd9e 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -177,9 +177,9 @@ void ftrace_likely_update(struct ftrace_likely_data *f, 
> int val,
>   __asm__ ("" : "=r" (var) : "0" (var))
>  #endif
>  
> -/* Not-quite-unique ID. */
> +/* JFTI: to fix Not-quite-unique ID */
>  #ifndef __UNIQUE_ID
> -# define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
> +# define __UNIQUE_ID(prefix) __PASTE(__PASTE(_kaUID_, prefix), __COUNTER__)
>  #endif
>  
>  /**
> -- 
> 2.41.0
>
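
To make the __LINE__ versus __COUNTER__ point in this thread concrete,
here is a small standalone sketch. PASTE(), UID_LINE(), UID_COUNTER() and
the uid_ prefix are made up for the example and are not the kernel's
definitions; DECLARE_FOO() mirrors the placeholder used in the quoted
commit message.

```c
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)

/* Sketch of the two flavours discussed; the uid_ prefix is made up. */
#define UID_LINE(prefix)    PASTE(PASTE(uid_, prefix), __LINE__)
#define UID_COUNTER(prefix) PASTE(PASTE(uid_, prefix), __COUNTER__)

#define DECLARE_FOO() int UID_COUNTER(foo) = 0

/* Two expansions on one source line get distinct names with __COUNTER__.
 * With UID_LINE both would expand to the same identifier and the second
 * definition would be rejected as a redefinition. */
DECLARE_FOO(); DECLARE_FOO();

/* Each expansion of __COUNTER__ yields the next integer. */
int counter_step(void)
{
	int first = __COUNTER__;
	int second = __COUNTER__;

	return second - first;
}
```

Note that __COUNTER__ restarts in every translation unit, so on its own it
only guarantees uniqueness within one object file; it does not prevent two
different files from producing the same symbol name.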

[PATCH 2/2] drm/amdgpu: Move clocks closer to its only usage in amdgpu_parse_cg_state()

2023-06-15 Thread Nathan Chancellor

After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is
disabled:

  drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_pm.c:38:34: error: unused variable 'clocks' [-Werror,-Wunused-const-variable]
     38 | static const struct cg_flag_name clocks[] = {
        |                                  ^
  1 error generated.

clocks is only used when CONFIG_DEBUG_FS is set, so move the definition
into the CONFIG_DEBUG_FS block right above its only usage to clear up
the warning.

Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 76 +++---
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index a57952b93e73..386ccf11e657 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -35,44 +35,6 @@
 #include 
 #include 
 
-static const struct cg_flag_name clocks[] = {
-   {AMD_CG_SUPPORT_GFX_FGCG, "Graphics Fine Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_RLC_LS, "Graphics Run List Controller Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_3D_CGCG, "Graphics 3D Coarse Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_3D_CGLS, "Graphics 3D Coarse Grain memory Light Sleep"},
-   {AMD_CG_SUPPORT_MC_LS, "Memory Controller Light Sleep"},
-   {AMD_CG_SUPPORT_MC_MGCG, "Memory Controller Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_SDMA_LS, "System Direct Memory Access Light Sleep"},
-   {AMD_CG_SUPPORT_SDMA_MGCG, "System Direct Memory Access Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_BIF_MGCG, "Bus Interface Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_BIF_LS, "Bus Interface Light Sleep"},
-   {AMD_CG_SUPPORT_UVD_MGCG, "Unified Video Decoder Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_VCE_MGCG, "Video Compression Engine Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_HDP_LS, "Host Data Path Light Sleep"},
-   {AMD_CG_SUPPORT_HDP_MGCG, "Host Data Path Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_DRM_MGCG, "Digital Right Management Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_DRM_LS, "Digital Right Management Light Sleep"},
-   {AMD_CG_SUPPORT_ROM_MGCG, "Rom Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_DF_MGCG, "Data Fabric Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_VCN_MGCG, "VCN Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_HDP_DS, "Host Data Path Deep Sleep"},
-   {AMD_CG_SUPPORT_HDP_SD, "Host Data Path Shutdown"},
-   {AMD_CG_SUPPORT_IH_CG, "Interrupt Handler Clock Gating"},
-   {AMD_CG_SUPPORT_JPEG_MGCG, "JPEG Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_REPEATER_FGCG, "Repeater Fine Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_PERF_CLK, "Perfmon Clock Gating"},
-   {AMD_CG_SUPPORT_ATHUB_MGCG, "Address Translation Hub Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_ATHUB_LS, "Address Translation Hub Light Sleep"},
-   {0, NULL},
-};
-
 static const struct hwmon_temp_label {
enum PP_HWMON_TEMP channel;
const char *label;
@@ -3684,6 +3646,44 @@ static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, struct amdgpu_device *a
return 0;
 }
 
+static const struct cg_flag_name clocks[] = {
+   {AMD_CG_SUPPORT_GFX_FGCG, "Graphics Fine Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
+   {AMD_CG_SUP

[PATCH 0/2] drm/amdgpu: Fix instances of -Wunused-const-variable with CONFIG_DEBUG_FS=n

2023-06-15 Thread Nathan Chancellor

Hi all,

After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
I see a few instances of -Wunused-const-variable with configurations
that do not enable CONFIG_DEBUG_FS, such as Alpine Linux's. This series
includes two patches to resolve each warning I see.

---
Nathan Chancellor (2):
  drm/amdgpu: Remove CONFIG_DEBUG_FS guard around body of amdgpu_rap_debugfs_init()
  drm/amdgpu: Move clocks closer to its only usage in amdgpu_parse_cg_state()

 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c |  2 -
 drivers/gpu/drm/amd/pm/amdgpu_pm.c  | 76 -
 2 files changed, 38 insertions(+), 40 deletions(-)
---
base-commit: d297eedf83f5af96751c0da1e4355c19244a55a2
change-id: 20230615-amdgpu-wunused-const-variable-wo-debugfs-308ce8e17329

Best regards,
-- 
Nathan Chancellor

[PATCH 1/2] drm/amdgpu: Remove CONFIG_DEBUG_FS guard around body of amdgpu_rap_debugfs_init()

2023-06-15 Thread Nathan Chancellor

After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is
disabled:

  drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c:110:37: error: unused variable 'amdgpu_rap_debugfs_ops' [-Werror,-Wunused-const-variable]
    110 | static const struct file_operations amdgpu_rap_debugfs_ops = {
        |                                     ^
  1 error generated.

There is no reason for the body of this function to be guarded when
CONFIG_DEBUG_FS is disabled, as debugfs_create_file() is a stub that
just returns an error pointer in that situation. Remove the preprocessor
guards so that the variable never appears unused, while not changing
anything at run time.

Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
index 12010c988c8b..123bcf5c2bb1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
@@ -116,7 +116,6 @@ static const struct file_operations amdgpu_rap_debugfs_ops = {
 
 void amdgpu_rap_debugfs_init(struct amdgpu_device *adev)
 {
-#if defined(CONFIG_DEBUG_FS)
struct drm_minor *minor = adev_to_drm(adev)->primary;
 
if (!adev->psp.rap_context.context.initialized)
@@ -124,5 +123,4 @@ void amdgpu_rap_debugfs_init(struct amdgpu_device *adev)
 
debugfs_create_file("rap_test", S_IWUSR, minor->debugfs_root,
adev, &amdgpu_rap_debugfs_ops);
-#endif
 }

-- 
2.41.0

Re: [PATCH] drm/amdgpu: Wrap -Wunused-but-set-variable in cc-option

2023-06-12 Thread Nathan Chancellor

On Sat, Jun 10, 2023 at 10:14:05AM +0300, Jani Nikula wrote:
> On Thu, 08 Jun 2023, Nathan Chancellor  wrote:
> > -Wunused-but-set-variable was only supported in clang starting with
> > 13.0.0, so earlier versions will emit a warning, which is turned into a
> > hard error for the kernel to mirror GCC:
> >
> >   error: unknown warning option '-Wunused-but-set-variable'; did you mean '-Wunused-const-variable'? [-Werror,-Wunknown-warning-option]
> >
> > The minimum supported version of clang for building the kernel is
> > 11.0.0, so match the rest of the kernel and wrap
> > -Wunused-but-set-variable in a cc-option call, so that it is only used
> > when supported by the compiler.
> 
> I wonder if there's a table somewhere listing all the warning options,
> which GCC and Clang versions support them, and which versions have them
> in -Wall and -Wextra. Would be really useful.

I don't think there is anything other than the official documentation
for each compiler listing all the warning options. I know each version
has its own documentation for comparing warnings between releases but
that is obviously tedious.

The clang -Wall question is easy enough to answer based on the test
case:

https://github.com/llvm/llvm-project/blob/llvmorg-16.0.0/clang/test/Misc/warning-wall.c
https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/clang/test/Misc/warning-wall.c
https://github.com/llvm/llvm-project/blob/llvmorg-14.0.0/clang/test/Misc/warning-wall.c
https://github.com/llvm/llvm-project/blob/llvmorg-13.0.0/clang/test/Misc/warning-wall.c
https://github.com/llvm/llvm-project/blob/llvmorg-12.0.0/clang/test/Misc/warning-wall.c
https://github.com/llvm/llvm-project/blob/llvmorg-11.0.0/clang/test/Misc/warning-wall.c

Clang has a tool, diagtool, that can print information about -Wextra,
but I do not ship it with the kernel.org LLVM releases, nor does Debian
it seems. On a recent clang-17 (the colors don't matter for this
exercise):

$ diagtool tree -Wextra

GREEN = enabled by default
YELLOW = disabled by default
RED = unimplemented (accepted for GCC compatibility)

-Wextra
  -Wdeprecated-copy
    -Wdeprecated-copy-with-user-provided-copy
  -Wmissing-field-initializers
  -Wignored-qualifiers
    -Wignored-reference-qualifiers
  -Winitializer-overrides
  -Wsemicolon-before-method-body
  -Wmissing-method-return-type
  -Wsign-compare
  -Wunused-parameter
  -Wunused-but-set-parameter
  -Wnull-pointer-arithmetic
    -Wgnu-null-pointer-arithmetic
  -Wnull-pointer-subtraction
  -Wempty-init-stmt
  -Wstring-concatenation
  -Wfuse-ld-path

Maybe some of that can be useful for future travelers.

> If there isn't one, it would be really helpful. *wink*.

Heh, that does sound like an interesting project but I am not sure I
have the bandwidth at the moment to do something like that, especially
since the number of warnings that differ between GCC and clang
continues to dwindle :)

Cheers,
Nathan

> > Closes: https://github.com/ClangBuiltLinux/linux/issues/1869
> > Fixes: a0fd5a5f676c ("drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR")
> > Signed-off-by: Nathan Chancellor 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> > b/drivers/gpu/drm/amd/amdgpu/Makefile
> > index 7ee68b1bbfed..86b833085f19 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> > @@ -40,7 +40,7 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
> > -I$(FULL_AMD_PATH)/amdkfd
> >  
> >  subdir-ccflags-y := -Wextra
> > -subdir-ccflags-y += -Wunused-but-set-variable
> > +subdir-ccflags-y += $(call cc-option, -Wunused-but-set-variable)
> >  subdir-ccflags-y += -Wno-unused-parameter
> >  subdir-ccflags-y += -Wno-type-limits
> >  subdir-ccflags-y += -Wno-sign-compare
> >
> > ---
> > base-commit: 6bd4b01e8938779b0d959bdf33949a9aa258a363
> > change-id: 
> > 20230608-amdgpu-wrap-wunused-but-set-variable-in-cc-option-0be9528ac5c8
> >
> > Best regards,
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

Re: [PATCH] drm/amd/amdgpu: enable W=1 for amdgpu

2023-06-09 Thread Nathan Chancellor

+ Masahiro and linux-kbuild

On Fri, Jun 09, 2023 at 12:42:06PM -0400, Hamza Mahfooz wrote:
> We have a clean build with W=1 as of
> commit 12a15dd589ac ("drm/amd/display/amdgpu_dm/amdgpu_dm_helpers: Move
> SYNAPTICS_DEVICE_ID into CONFIG_DRM_AMD_DC_DCN ifdef"). So, let's enable
> these checks unconditionally for the entire module to catch these errors
> during development.
> 
> Cc: Alex Deucher 
> Cc: Nathan Chancellor 
> Signed-off-by: Hamza Mahfooz 

I think this is fine, especially since it will help catch issues in
amdgpu quickly and hopefully encourage developers to fix their problems
before they make it to a tree with wider impact like -next.

However, this is now the third place that W=1 has been effectively
enabled (i915 and btrfs are the other two I know of) and it would be
nice if this was a little more unified, especially since it is not
uncommon for the warnings under W=1 to shift around and keeping them
unified will make maintenance over the longer term a little easier. I
am not sure if this has been brought up in the past and I don't want to
hold up this change but I suspect this sentiment of wanting to enable
W=1 on a per-subsystem basis is going to continue to grow.

Regardless, for clang 11.1.0 to 16.0.5, I see no warnings when building
drivers/gpu/drm/amd/amdgpu/ with Arch Linux's configuration or
allmodconfig.

Reviewed-by: Nathan Chancellor 
Tested-by: Nathan Chancellor 

> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 86b833085f19..8d16f280b695 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -40,7 +40,18 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
>   -I$(FULL_AMD_PATH)/amdkfd
>  
>  subdir-ccflags-y := -Wextra
> -subdir-ccflags-y += $(call cc-option, -Wunused-but-set-variable)
> +subdir-ccflags-y += -Wunused
> +subdir-ccflags-y += -Wmissing-prototypes
> +subdir-ccflags-y += -Wmissing-declarations
> +subdir-ccflags-y += -Wmissing-include-dirs
> +subdir-ccflags-y += -Wold-style-definition
> +subdir-ccflags-y += -Wmissing-format-attribute
> +# Need this to avoid recursive variable evaluation issues
> +cond-flags := $(call cc-option, -Wunused-but-set-variable) \
> + $(call cc-option, -Wunused-const-variable) \
> + $(call cc-option, -Wstringop-truncation) \
> + $(call cc-option, -Wpacked-not-aligned)
> +subdir-ccflags-y += $(cond-flags)
>  subdir-ccflags-y += -Wno-unused-parameter
>  subdir-ccflags-y += -Wno-type-limits
>  subdir-ccflags-y += -Wno-sign-compare
> -- 
> 2.40.1
>

[PATCH] drm/amdgpu: Wrap -Wunused-but-set-variable in cc-option

2023-06-08 Thread Nathan Chancellor

-Wunused-but-set-variable was only supported in clang starting with
13.0.0, so earlier versions will emit a warning, which is turned into a
hard error for the kernel to mirror GCC:

  error: unknown warning option '-Wunused-but-set-variable'; did you mean '-Wunused-const-variable'? [-Werror,-Wunknown-warning-option]

The minimum supported version of clang for building the kernel is
11.0.0, so match the rest of the kernel and wrap
-Wunused-but-set-variable in a cc-option call, so that it is only used
when supported by the compiler.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1869
Fixes: a0fd5a5f676c ("drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 7ee68b1bbfed..86b833085f19 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -40,7 +40,7 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
-I$(FULL_AMD_PATH)/amdkfd
 
 subdir-ccflags-y := -Wextra
-subdir-ccflags-y += -Wunused-but-set-variable
+subdir-ccflags-y += $(call cc-option, -Wunused-but-set-variable)
 subdir-ccflags-y += -Wno-unused-parameter
 subdir-ccflags-y += -Wno-type-limits
 subdir-ccflags-y += -Wno-sign-compare

---
base-commit: 6bd4b01e8938779b0d959bdf33949a9aa258a363
change-id: 20230608-amdgpu-wrap-wunused-but-set-variable-in-cc-option-0be9528ac5c8

Best regards,
-- 
Nathan Chancellor

Re: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 04:07:59PM -0400, Tom Rix wrote:
> clang with W=1 reports
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error:
>   unused function 'get_reserved_sdma_queues_bitmap' 
> [-Werror,-Wunused-function]
> static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
> device_queue_manager *dqm)
>^
> This function is not used so remove it.
> 
> Signed-off-by: Tom Rix 

Caused by commit 09a95a85cf3e ("drm/amdkfd: Update SDMA queue management
for GFX9.4.3") it seems.

You can actually go a step farther and remove the
reserved_sdma_queues_bitmap member from 'struct kfd_device_info' because
it is now only assigned, never read.

$ git grep reserved_sdma_queues_bitmap next-20230525
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c: kfd->device_info.reserved_sdma_queues_bitmap = 0xFULL;
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c: kfd->device_info.reserved_sdma_queues_bitmap = 0x3ULL;
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm)
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c: return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_priv.h: uint64_t reserved_sdma_queues_bitmap;

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 493b4b66f180..2fbd0a96424f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct 
> device_queue_manager *dqm)
>   dqm->dev->kfd->device_info.num_sdma_queues_per_engine;
>  }
>  
> -static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
> device_queue_manager *dqm)
> -{
> - return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
> -}
> -
>  static void init_sdma_bitmaps(struct device_queue_manager *dqm)
>  {
>   bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);
> -- 
> 2.27.0
>

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 12:45:13PM -0400, Luben Tuikov wrote:
> On 2023-05-25 12:29, Nathan Chancellor wrote:
> > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> >> On 2023-05-25 11:22, Nathan Chancellor wrote:
> >>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> >>>> Silencing the compiler from below compilation error:
> >>>>
> >>>> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> >>>> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> >>>> [-Werror,-Wunneeded-internal-declaration]
> >>>> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> >>>>   ^
> >>>> 1 error generated.
> >>>>
> >>>> Mark the variable as __maybe_unused to make it clear to clang that this
> >>>> is expected, so there is no more warning.
> >>>>
> >>>> Cc: Christian König 
> >>>> Cc: Lijo Lazar 
> >>>> Cc: Luben Tuikov 
> >>>> Cc: Alex Deucher 
> >>>> Signed-off-by: Srinivasan Shanmugam 
> >>>
> >>> Traditionally, this attribute would go between the [] and =, but that is
> >>> a nit. Can someone please pick this up to unblock our builds on -next?
> >>>
> >>> Reviewed-by: Nathan Chancellor 
> >>
> >> I'll pick this up, fix it, and submit to amd-staging-drm-next.
> > 
> > Thanks a lot :)
> > 
> >> Which -next are you referring to, Nathan?
> > 
> > linux-next, this warning breaks the build when -Werror is enabled, such
> > as with allmodconfig:
> > 
> > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
> > 
> 
> Hi Nathan,
> 
> Thanks for the pointers.
> 
> Srinivasan has already submitted it to amd-staging-drm-next.
> 
> Seems Alex will push it upstream.
> 
> Not sure how fast you need it, we can send you the commit itself
> for you to git-am if you cannot wait.

Thanks for that extra info. We can just wait for the patch to end up in
-next naturally, we try to avoid applying extra patches when possible.

Cheers,
Nathan

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 12:42:05PM -0400, Alex Deucher wrote:
> On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor  wrote:
> >
> > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> > > On 2023-05-25 11:22, Nathan Chancellor wrote:
> > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> > > >> Silencing the compiler from below compilation error:
> > > >>
> > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> > > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> > > >> [-Werror,-Wunneeded-internal-declaration]
> > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> > > >>   ^
> > > >> 1 error generated.
> > > >>
> > > >> Mark the variable as __maybe_unused to make it clear to clang that this
> > > >> is expected, so there is no more warning.
> > > >>
> > > >> Cc: Christian König 
> > > >> Cc: Lijo Lazar 
> > > >> Cc: Luben Tuikov 
> > > >> Cc: Alex Deucher 
> > > >> Signed-off-by: Srinivasan Shanmugam 
> > > >
> > > > Traditionally, this attribute would go between the [] and =, but that is
> > > > a nit. Can someone please pick this up to unblock our builds on -next?
> > > >
> > > > Reviewed-by: Nathan Chancellor 
> > >
> > > I'll pick this up, fix it, and submit to amd-staging-drm-next.
> >
> > Thanks a lot :)
> >
> > > Which -next are you referring to, Nathan?
> >
> > linux-next, this warning breaks the build when -Werror is enabled, such
> > as with allmodconfig:
> >
> > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
> >
> 
> Srinivasan has already pushed it.  I'll push it out once CI has
> completed.  We are trying to figure out the best way to enable -Werror
> in our CI system as it is almost always broken depending on what
> compiler you are using.  Also, I'm not sure fixing these is always
> better.  A lot of these warnings seem spurious and in a lot of cases
> the "fix" doesn't really improve the code, it just silences a warning.
> As one of my coworkers put it, there is a reason warnings are not
> errors.

I do not necessarily disagree with that final sentiment but at the end
of the day, it is pointing out a potential problem ("this variable is
only used in a compile time context, is that what you intended or not?")
and the solution is either to fix the code so that it works as initially
intended or you silence the warning because you know it is not actually
a problem. There are always going to be false positives, otherwise they
would just always be hard errors, but that does not mean that they are
not worth listening to, which is why Linus insists on -Werror being a
thing. We can opt out of -Werror for our CI but that does not change the
fact it is default enabled with allmodconfig, so that is how most people
will test.
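
To make that concrete, here is a rough sketch of the pattern as I
understand the warning: a static array that is only ever referenced
inside sizeof(). The names and values are hypothetical, and
__maybe_unused and ARRAY_SIZE are defined locally as stand-ins for the
kernel macros:

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros (assumption: same meaning). */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/*
 * Hypothetical register list. Nothing below reads its elements, so the
 * compiler never needs to emit it. Without the attribute, this is the
 * kind of declaration -Wunneeded-internal-declaration points at.
 */
static const uint32_t fake_status_regs[] __maybe_unused = {
	0x100, 0x104, 0x108,
};

/* sizeof() is an unevaluated context: only the array's type is used. */
size_t fake_num_status_regs(void)
{
	return ARRAY_SIZE(fake_status_regs);
}
```

Either the array should also be read somewhere at run time (the code is
incomplete) or the compile-time-only use is intentional and the
attribute documents that.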

Cheers,
Nathan

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> On 2023-05-25 11:22, Nathan Chancellor wrote:
> > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> >> Silencing the compiler from below compilation error:
> >>
> >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> >> [-Werror,-Wunneeded-internal-declaration]
> >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> >>   ^
> >> 1 error generated.
> >>
> >> Mark the variable as __maybe_unused to make it clear to clang that this
> >> is expected, so there is no more warning.
> >>
> >> Cc: Christian König 
> >> Cc: Lijo Lazar 
> >> Cc: Luben Tuikov 
> >> Cc: Alex Deucher 
> >> Signed-off-by: Srinivasan Shanmugam 
> > 
> > Traditionally, this attribute would go between the [] and =, but that is
> > a nit. Can someone please pick this up to unblock our builds on -next?
> > 
> > Reviewed-by: Nathan Chancellor 
> 
> I'll pick this up, fix it, and submit to amd-staging-drm-next.

Thanks a lot :)

> Which -next are you referring to, Nathan?

linux-next, this warning breaks the build when -Werror is enabled, such
as with allmodconfig:

https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log

Cheers,
Nathan

> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
> >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> >> index 3648994724c2..cba087e529c0 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct 
> >> amdgpu_device *adev)
> >>mmhub_v1_8_inst_reset_ras_error_count(adev, i);
> >>  }
> >>  
> >> +__maybe_unused
> >>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> >>regMMEA0_ERR_STATUS,
> >>regMMEA1_ERR_STATUS,
> >> -- 
> >> 2.25.1
> >>
>

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 08:37:07AM -0700, Kees Cook wrote:
> Hi!
> 
> On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote:
> > + Kees
> > 
> > On 5/24/23 15:50, Alex Deucher wrote:
> > > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling  
> > > wrote:
> > > > 
> > > > Sure, I think we tried enabling warnings as errors before and had to
> > > > revert it because of weird compiler quirks or the variety of compiler
> > > > versions that need to be supported.
> > > > 
> > > > Alex, are you planning to upstream this, or is this only to enforce more
> > > > internal discipline about not ignoring warnings?
> > > 
> > > I'd like to upstream it.  Upstream already has CONFIG_WERROR as a
> > > config option, but it's been problematic to enable in CI because of
> > > various breakages outside of the driver and in different compilers.
> > > That said, I don't know how much trouble enabling it will cause with
> > > various compilers in the wild.
> 
> -Wmisleading-indentation is already part of -Wall, so this is globally
> enabled already.
> 
> -Wunused is enabled under W=1, and it's pretty noisy still. If you can
> get builds clean in drm, that'll be a good step towards getting it
> enabled globally. (A middle ground with less to clean up might be
> -Wunused-but-set-variable)
> 
> I agree about -Werror: just stick with CONFIG_WERROR instead.

There is also W=e, added by commit c77d06e70d59 ("kbuild: support W=e
to make build abort in case of warning") in 5.19, which works well for
building with configurations that do not have CONFIG_WERROR enabled and
avoiding dipping into menuconfig.

Unconditionally enabling -Werror with no way to turn it off is just
asking for problems over time with new compiler versions, either due to
new warnings in -Wall or warnings that have been improved or changed.
Should that still be desired, consider doing what i915 and PowerPC have
done and add a Kconfig option that can be disabled.

Cheers,
Nathan

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> Silencing the compiler from below compilation error:
> 
> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> [-Werror,-Wunneeded-internal-declaration]
> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>   ^
> 1 error generated.
> 
> Mark the variable as __maybe_unused to make it clear to clang that this
> is expected, so there is no more warning.
> 
> Cc: Christian König 
> Cc: Lijo Lazar 
> Cc: Luben Tuikov 
> Cc: Alex Deucher 
> Signed-off-by: Srinivasan Shanmugam 

Traditionally, this attribute would go between the [] and =, but that is
a nit. Can someone please pick this up to unblock our builds on -next?
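
For reference, a minimal userspace sketch of the two placements. The
array names and values are made up, and the macro is a local stand-in
for the kernel's __maybe_unused, so treat it as an illustration rather
than the actual mmhub code:

```c
#include <stdint.h>

/* Local stand-in for the kernel's __maybe_unused (assumption). */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Placement used in the patch: attribute ahead of the declaration. */
__maybe_unused
static const uint32_t fake_err_status_reg_before[] = { 0x10, 0x20 };

/* Placement described above: after the declarator, between [] and =. */
static const uint32_t fake_err_status_reg_after[] __maybe_unused = { 0x10, 0x20 };
```

Both forms are accepted by GCC and clang, which is why this is only a
nit.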

Reviewed-by: Nathan Chancellor 

> ---
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> index 3648994724c2..cba087e529c0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct 
> amdgpu_device *adev)
>   mmhub_v1_8_inst_reset_ras_error_count(adev, i);
>  }
>  
> +__maybe_unused
>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>   regMMEA0_ERR_STATUS,
>   regMMEA1_ERR_STATUS,
> -- 
> 2.25.1
>

[PATCH] drm/amdgpu: Fix return types of certain NBIOv7.9 callbacks

2023-05-24 Thread Nathan Chancellor

When building with clang's -Wincompatible-function-pointer-types-strict,
which ensures that function pointer signatures match exactly to avoid
tripping clang's Control Flow Integrity (kCFI) checks at run time and
will eventually be turned on for the kernel, the following instances
appear in the NBIOv7.9 code:

  drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c:465:32: error: incompatible function pointer types initializing 'int (*)(struct amdgpu_device *)' with an expression of type 'enum amdgpu_gfx_partition (struct amdgpu_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
  .get_compute_partition_mode = nbio_v7_9_get_compute_partition_mode,
^~~~
  drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c:467:31: error: incompatible function pointer types initializing 'u32 (*)(struct amdgpu_device *, u32 *)' (aka 'unsigned int (*)(struct amdgpu_device *, unsigned int *)') with an expression of type 'enum amdgpu_memory_partition (struct amdgpu_device *, u32 *)' (aka 'enum amdgpu_memory_partition (struct amdgpu_device *, unsigned int *)') [-Werror,-Wincompatible-function-pointer-types-strict]
  .get_memory_partition_mode = nbio_v7_9_get_memory_partition_mode,
   ^~~
  2 errors generated.

Change the return types of these callbacks to match the prototypes to
clear up the warning and avoid tripping kCFI at run time. Both functions
return a value from ffs(), which is an integer that can fit into either
int or unsigned int.
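
As a rough sketch of the pattern (the struct, enum, and function names
below are simplified stand-ins, not the real amdgpu definitions), the
fix amounts to making the callback's return type match the function
pointer member exactly:

```c
/* Simplified, hypothetical stand-ins for the amdgpu types. */
enum fake_gfx_partition {
	FAKE_PARTITION_MODE_A = 1,
	FAKE_PARTITION_MODE_B = 2,
};

struct fake_device {
	unsigned int partition_status;
};

struct fake_nbio_funcs {
	/* The prototype returns int, so implementations must too. */
	int (*get_compute_partition_mode)(struct fake_device *dev);
};

/*
 * Declaring this as 'static enum fake_gfx_partition ...' is the shape
 * of code the strict warning complains about. Returning int matches
 * the member above exactly.
 */
static int fake_get_compute_partition_mode(struct fake_device *dev)
{
	return (int)(dev->partition_status & 0x3);
}

static const struct fake_nbio_funcs fake_funcs = {
	.get_compute_partition_mode = fake_get_compute_partition_mode,
};

/* Indirect call through the table, the kind of call kCFI checks. */
int fake_query_partition_mode(struct fake_device *dev)
{
	return fake_funcs.get_compute_partition_mode(dev);
}
```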

Fixes: 11f64eb1472f ("drm/amdgpu: add sysfs node for compute partition mode")
Fixes: 41a717ea8afc ("drm/amdgpu: detect current GPU memory partition mode")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
index e082f6343d20..d19325476752 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
@@ -382,7 +382,7 @@ static void nbio_v7_9_enable_doorbell_interrupt(struct amdgpu_device *adev,
  DOORBELL_INTERRUPT_DISABLE, enable ? 0 : 1);
 }
 
-static enum amdgpu_gfx_partition nbio_v7_9_get_compute_partition_mode(struct amdgpu_device *adev)
+static int nbio_v7_9_get_compute_partition_mode(struct amdgpu_device *adev)
 {
u32 tmp, px;
 
@@ -408,8 +408,8 @@ static void nbio_v7_9_set_compute_partition_mode(struct amdgpu_device *adev,
WREG32_SOC15(NBIO, 0, regBIF_BX_PF0_PARTITION_COMPUTE_STATUS, tmp);
 }
 
-static enum amdgpu_memory_partition
-nbio_v7_9_get_memory_partition_mode(struct amdgpu_device *adev, u32 *supp_modes)
+static u32 nbio_v7_9_get_memory_partition_mode(struct amdgpu_device *adev,
+  u32 *supp_modes)
 {
u32 tmp;
 

---
base-commit: fd8f7bb391fa9c1979232cb5ab5bedb08abc855d
change-id: 20230524-nbio_v7_9-wincompatible-function-pointer-types-strict-c894636ce146

Best regards,
-- 
Nathan Chancellor

Re: [PATCH v2 00/14] Remove clang's -Qunused-arguments from KBUILD_CPPFLAGS

2023-01-23 Thread Nathan Chancellor

Hi Naresh,

On Mon, Jan 23, 2023 at 07:28:10PM +0530, Naresh Kamboju wrote:
> FYI,
> [ please provide comments, feedback and improvements on build/ ltp smoke 
> tests ]
> 
> LKFT test farm have fetched your patch series [1]
> [PATCH v2 00/14] Remove clang's -Qunused-arguments from KBUILD_CPPFLAGS
>  [1] 
> https://lore.kernel.org/llvm/20221228-drop-qunused-arguments-v2-0-9adbddd20...@kernel.org/

Thank you a lot for testing this series, it is much appreciated!

It looks like this was applied on top of 6.2-rc3 if I am reading your
logs right but your mainline testing is recent, 6.2-rc5. I think the
errors you are seeing here are just existing mainline regressions that
were later fixed.

> Following build warnings and errors reported.
> 
> sh:
> gcc-11-defconfig — FAIL
> gcc-11-shx3_defconfig — FAIL
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/https___lore_kernel_org_llvm_20221228-drop-qunused-arguments-v2-1-9adbddd20d86_kernel_org/testrun/14221835/suite/build/tests/
> 
> mainline getting passed.
> https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.2-rc5/testrun/14298156/suite/build/test/gcc-11-defconfig/history/
> https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.2-rc5/testrun/14298156/suite/build/test/gcc-11-shx3_defconfig/history/
> 
> Build error:
> In function 'follow_pmd_mask',
> inlined from 'follow_pud_mask' at /builds/linux/mm/gup.c:735:9,
> inlined from 'follow_p4d_mask' at /builds/linux/mm/gup.c:752:9,
> inlined from 'follow_page_mask' at /builds/linux/mm/gup.c:809:9:
> /builds/linux/include/linux/compiler_types.h:358:45: error: call to
> '__compiletime_assert_263' declared with attribute error: Unsupported
> access size for {READ,WRITE}_ONCE().
>   358 | _compiletime_assert(condition, msg,
> __compiletime_assert_, __COUNTER__)

I think this was fixed with mainline commit 526970be53d5 ("sh/mm: Fix
pmd_t for real"), released in 6.2-rc4. You can see a previous build
failing in the same manner:

https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.2-rc3-9-g5a41237ad1d4/testrun/14056384/suite/build/tests/

> s390:
> clang-15-defconfig — FAIL
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/https___lore_kernel_org_llvm_20221228-drop-qunused-arguments-v2-1-9adbddd20d86_kernel_org/testrun/14221913/suite/build/tests/
> 
> mainline getting passed.
> https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.2-rc5/testrun/14300495/suite/build/test/clang-15-defconfig/history/
> 
> Build error:
> make --silent --keep-going --jobs=8
> O=/home/tuxbuild/.cache/tuxmake/builds/1/build LLVM_IAS=0 ARCH=s390
> CROSS_COMPILE=s390x-linux-gnu- 'HOSTCC=sccache clang' 'CC=sccache
> clang'
> `.exit.text' referenced in section `__jump_table' of fs/fuse/inode.o:
> defined in discarded section `.exit.text' of fs/fuse/inode.o
> `.exit.text' referenced in section `__jump_table' of fs/fuse/inode.o:
> defined in discarded section `.exit.text' of fs/fuse/inode.o
> `.exit.text' referenced in section `__bug_table' of crypto/algboss.o:
> defined in discarded section `.exit.text' of crypto/algboss.o
> `.exit.text' referenced in section `__bug_table' of drivers/scsi/sd.o:
> defined in discarded section `.exit.text' of drivers/scsi/sd.o
> `.exit.text' referenced in section `__jump_table' of drivers/md/md.o:
> defined in discarded section `.exit.text' of drivers/md/md.o
> `.exit.text' referenced in section `__jump_table' of drivers/md/md.o:
> defined in discarded section `.exit.text' of drivers/md/md.o
> `.exit.text' referenced in section `.altinstructions' of
> drivers/md/md.o: defined in discarded section `.exit.text' of
> drivers/md/md.o
> `.exit.text' referenced in section `.altinstructions' of
> drivers/md/md.o: defined in discarded section `.exit.text' of
> drivers/md/md.o
> `.exit.text' referenced in section `.altinstructions' of
> net/iucv/iucv.o: defined in discarded section `.exit.text' of
> net/iucv/iucv.o
> `.exit.text' referenced in section `__bug_table' of
> drivers/s390/cio/qdio_thinint.o: defined in discarded section
> `.exit.text' of drivers/s390/cio/qdio_thinint.o
> `.exit.text' referenced in section `__bug_table' of
> drivers/s390/net/qeth_l3_main.o: defined in discarded section
> `.exit.text' of drivers/s390/net/qeth_l3_main.o
> `.exit.text' referenced in section `__bug_table' of
> drivers/s390/net/qeth_l3_main.o: defined in discarded section
> `.exit.text' of drivers/s390/net/qeth_l3_main.o
> s390x-linux-gnu-ld: BFD (GNU Binutils for Debian) 2.35.2 assertion
> fail ../../bfd/elf64-s390.c:3349
> make[2]: *** [/builds/linux/scripts/Makefile.vmlinux:34: vmlinux] Error 1

This should be fixed with mainline commit a494398bde27 ("s390: define
RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36"), released in
6.2-rc4 as well. Same as before, visible in mainline at one point
without this series:

https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.2-rc3-9-g5a41237ad1d4/testrun/1

[PATCH v2 12/14] drm/amd/display: Do not add '-mhard-float' to dml_ccflags for clang

2023-01-11 Thread Nathan Chancellor

When clang's -Qunused-arguments is dropped from KBUILD_CPPFLAGS, it
warns:

  clang-16: error: argument unused during compilation: '-mhard-float' [-Werror,-Wunused-command-line-argument]

Similar to commit 84edc2eff827 ("selftest/fpu: avoid clang warning"),
only add this flag for GCC builds. Commit 0f0727d971f6 ("drm/amd/display:
readd -msse2 to prevent Clang from emitting libcalls to undefined SW FP
routines") added '-msse2' to prevent clang from emitting software
floating point routines.

Signed-off-by: Nathan Chancellor 
Acked-by: Alex Deucher 
---
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/amd/display/dc/dml/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 0ecea87cf48f..9d0f79dff2e3 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -26,7 +26,8 @@
 # subcomponents.
 
 ifdef CONFIG_X86
-dml_ccflags := -mhard-float -msse
+dml_ccflags-$(CONFIG_CC_IS_GCC) := -mhard-float
+dml_ccflags := $(dml_ccflags-y) -msse
 endif
 
 ifdef CONFIG_PPC64

-- 
2.39.0

[PATCH v2 00/14] Remove clang's -Qunused-arguments from KBUILD_CPPFLAGS

2023-01-11 Thread Nathan Chancellor

Hi all,

Clang can emit a few different warnings when it encounters a flag that it
recognizes but does not support internally. These warnings are elevated to
errors within {as,cc}-option via -Werror to catch unsupported flags that should
not be added to KBUILD_{A,C}FLAGS; see commit c3f0d0bc5b01 ("kbuild, LLVMLinux:
Add -Werror to cc-option to support clang").

If an unsupported flag is unconditionally added to KBUILD_{A,C}FLAGS, all
subsequent {as,cc}-option calls will fail, preventing supported and even
potentially necessary flags from getting added to the tool flags.

One would expect these warnings to be visible in the kernel build logs since
they are added to KBUILD_{A,C}FLAGS but unfortunately, these warnings are
hidden with clang's -Qunused-arguments flag, which is added to KBUILD_CPPFLAGS
and used for both compiling and assembling files.

Patches 1-4 address the internal inconsistencies of invoking the assembler
within kbuild by using KBUILD_AFLAGS consistently and using '-x
assembler-with-cpp' over '-x assembler'. This matches how assembly files are
built across the kernel and helps avoid problems in situations where macro
definitions or warning flags are present in KBUILD_AFLAGS, which cause
instances of -Wunused-command-line-argument when the preprocessor is not called
to consume them. There were a couple of places in architecture code where this
change would break things so those are fixed first.

Patches 5-12 clean up warnings that will show up when -Qunused-arguments is
dropped. I hope none of these are controversial.

Patch 13 turns two warnings into errors so that the presence of unused flags
cannot be easily ignored.

Patch 14 drops -Qunused-arguments. This is done last so that it can be easily
reverted if need be.

This series has seen my personal test framework, which tests several different
configurations and architectures, with LLVM tip of tree (16.0.0). I have done
defconfig, allmodconfig, and allnoconfig builds for arm, arm64, i386, mips,
powerpc, riscv, s390, and x86_64 with GCC 12.2.0 as well but I am hoping the
rest of the test infrastructure will catch any lurking problems.

I would like this series to stay together so that there is no opportunity for
breakage so please consider giving acks so that this can be carried via the
kbuild tree (and many thanks to the people who have already provided such
tags).

---
Changes in v2:
- Pick up tags where provided (thank you everyone!)
- Patch 6 and 9: Clarify that '-s' is a compiler flag that is only relevant to
  the linking phase and remove all mention of the assembler's '-s' flag, as the
  assembler is never directly invoked (Nick, Segher)
- Patch 7: Move '-z noexecstack' into new ldflags-y variable (Nick)
- Patch 8: Reword commit message to explain the problem in a clearer manner
  (Nick)
- Link to v1: 
https://lore.kernel.org/r/20221228-drop-qunused-arguments-v1-0-658cbc8fc...@kernel.org

---
Nathan Chancellor (12):
  MIPS: Always use -Wa,-msoft-float and eliminate GAS_HAS_SET_HARDFLOAT
  MIPS: Prefer cc-option for additions to cflags
  powerpc: Remove linker flag from KBUILD_AFLAGS
  powerpc/vdso: Remove unused '-s' flag from ASFLAGS
  powerpc/vdso: Improve linker flags
  powerpc/vdso: Remove an unsupported flag from vgettimeofday-32.o with clang
  s390/vdso: Drop unused '-s' flag from KBUILD_AFLAGS_64
  s390/vdso: Drop '-shared' from KBUILD_CFLAGS_64
  s390/purgatory: Remove unused '-MD' and unnecessary '-c' flags
  drm/amd/display: Do not add '-mhard-float' to dml_ccflags for clang
  kbuild: Turn a couple more of clang's unused option warnings into errors
  kbuild: Stop using '-Qunused-arguments' with clang

Nick Desaulniers (2):
  x86/boot/compressed: prefer cc-option for CFLAGS additions
  kbuild: Update assembler calls to use proper flags and language target

 Makefile|  1 -
 arch/mips/Makefile  | 13 ++---
 arch/mips/include/asm/asmmacro-32.h |  4 +--
 arch/mips/include/asm/asmmacro.h| 42 ++---
 arch/mips/include/asm/fpregdef.h| 14 --
 arch/mips/include/asm/mipsregs.h| 20 +++---
 arch/mips/kernel/genex.S|  2 +-
 arch/mips/kernel/r2300_fpu.S|  4 +--
 arch/mips/kernel/r4k_fpu.S  | 12 -
 arch/mips/kvm/fpu.S |  6 ++---
 arch/mips/loongson2ef/Platform  |  2 +-
 arch/powerpc/Makefile   |  2 +-
 arch/powerpc/kernel/vdso/Makefile   | 25 +++--
 arch/s390/kernel/vdso64/Makefile|  4 +--
 arch/s390/purgatory/Makefile|  2 +-
 arch/x86/boot/compressed/Makefile   |  2 +-
 drivers/gpu/drm/amd/display/dc/dml/Makefile |  3 ++-
 scripts/K

[PATCH 12/14] drm/amd/display: Do not add '-mhard-float' to dml_ccflags for clang

2023-01-04 Thread Nathan Chancellor

When clang's -Qunused-arguments is dropped from KBUILD_CPPFLAGS, it
warns:

  clang-16: error: argument unused during compilation: '-mhard-float' [-Werror,-Wunused-command-line-argument]

Similar to commit 84edc2eff827 ("selftest/fpu: avoid clang warning"),
just add this flag to GCC builds. Commit 0f0727d971f6 ("drm/amd/display:
readd -msse2 to prevent Clang from emitting libcalls to undefined SW FP
routines") added '-msse2' to prevent clang from emitting software
floating point routines.

Signed-off-by: Nathan Chancellor 
---
Cc: harry.wentl...@amd.com
Cc: sunpeng...@amd.com
Cc: rodrigo.sique...@amd.com
Cc: alexander.deuc...@amd.com
Cc: christian.koe...@amd.com
Cc: xinhui@amd.com
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/amd/display/dc/dml/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 0ecea87cf48f..9d0f79dff2e3 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -26,7 +26,8 @@
 # subcomponents.
 
 ifdef CONFIG_X86
-dml_ccflags := -mhard-float -msse
+dml_ccflags-$(CONFIG_CC_IS_GCC) := -mhard-float
+dml_ccflags := $(dml_ccflags-y) -msse
 endif
 
 ifdef CONFIG_PPC64

-- 
2.39.0

[PATCH 00/14] Remove clang's -Qunused-arguments from KBUILD_CPPFLAGS

2023-01-04 Thread Nathan Chancellor

Hi all,

Clang can emit a few different warnings when it encounters a flag that it
recognizes but does not support internally. These warnings are elevated to
errors within {as,cc}-option via -Werror to catch unsupported flags that should
not be added to KBUILD_{A,C}FLAGS; see commit c3f0d0bc5b01 ("kbuild, LLVMLinux:
Add -Werror to cc-option to support clang").

If an unsupported flag is unconditionally added to KBUILD_{A,C}FLAGS, all
subsequent {as,cc}-option calls will always fail, preventing supported and even
potentially necessary flags from getting added to the tool flags.

One would expect these warnings to be visible in the kernel build logs since
they are added to KBUILD_{A,C}FLAGS but unfortunately, these warnings are
hidden with clang's -Qunused-arguments flag, which is added to KBUILD_CPPFLAGS
and used for both compiling and assembling files.

Patches 1-4 address the internal inconsistencies of invoking the assembler
within kbuild by using KBUILD_AFLAGS consistently and using '-x
assembler-with-cpp' over '-x assembler'. This matches how assembly files are
built across the kernel and helps avoid problems in situations where macro
definitions or warning flags are present in KBUILD_AFLAGS, which cause
instances of -Wunused-command-line-argument when the preprocessor is not called
to consume them. There were a couple of places in architecture code where this
change would break things so those are fixed first.

Patches 5-12 clean up warnings that will show up when -Qunused-arguments is
dropped. I hope none of these are controversial.

Patch 13 turns two warnings into errors so that the presence of unused flags
cannot be easily ignored.

Patch 14 drops -Qunused-arguments. This is done last so that it can be easily
reverted if need be.

This series has been run through my personal test framework, which tests several
different configurations and architectures, with LLVM tip of tree (16.0.0). I have done
defconfig, allmodconfig, and allnoconfig builds for arm, arm64, i386, mips,
powerpc, riscv, s390, and x86_64 with GCC 12.2.0 as well but I am hoping the
rest of the test infrastructure will catch any lurking problems.

I would like this series to stay together so that there is no opportunity for
breakage so please consider giving acks so that this can be carried via the
kbuild tree.

---
Nathan Chancellor (12):
  MIPS: Always use -Wa,-msoft-float and eliminate GAS_HAS_SET_HARDFLOAT
  MIPS: Prefer cc-option for additions to cflags
  powerpc: Remove linker flag from KBUILD_AFLAGS
  powerpc/vdso: Remove unused '-s' flag from ASFLAGS
  powerpc/vdso: Improve linker flags
  powerpc/vdso: Remove an unsupported flag from vgettimeofday-32.o with clang
  s390/vdso: Drop unused '-s' flag from KBUILD_AFLAGS_64
  s390/vdso: Drop '-shared' from KBUILD_CFLAGS_64
  s390/purgatory: Remove unused '-MD' and unnecessary '-c' flags
  drm/amd/display: Do not add '-mhard-float' to dml_ccflags for clang
  kbuild: Turn a couple more of clang's unused option warnings into errors
  kbuild: Stop using '-Qunused-arguments' with clang

Nick Desaulniers (2):
  x86/boot/compressed: prefer cc-option for CFLAGS additions
  kbuild: Update assembler calls to use proper flags and language target

 Makefile|  1 -
 arch/mips/Makefile  | 13 ++---
 arch/mips/include/asm/asmmacro-32.h |  4 +--
 arch/mips/include/asm/asmmacro.h| 42 ++---
 arch/mips/include/asm/fpregdef.h| 14 --
 arch/mips/include/asm/mipsregs.h| 20 +++---
 arch/mips/kernel/genex.S|  2 +-
 arch/mips/kernel/r2300_fpu.S|  4 +--
 arch/mips/kernel/r4k_fpu.S  | 12 -
 arch/mips/kvm/fpu.S |  6 ++---
 arch/mips/loongson2ef/Platform  |  2 +-
 arch/powerpc/Makefile   |  2 +-
 arch/powerpc/kernel/vdso/Makefile   | 25 +++--
 arch/s390/kernel/vdso64/Makefile|  4 +--
 arch/s390/purgatory/Makefile|  2 +-
 arch/x86/boot/compressed/Makefile   |  2 +-
 drivers/gpu/drm/amd/display/dc/dml/Makefile |  3 ++-
 scripts/Kconfig.include |  2 +-
 scripts/Makefile.clang  |  2 ++
 scripts/Makefile.compiler   |  8 +++---
 scripts/as-version.sh   |  2 +-
 21 files changed, 74 insertions(+), 98 deletions(-)
---
base-commit: 88603b6dc419445847923fcb7fe5080067a30f98
change-id: 20221228-drop-qunused-arguments-0c5c7dae54fb

Best regards,
-- 
Nathan Chancellor

Re: [PATCH v2 2/2] Kconfig.debug: Provide a little extra FRAME_WARN leeway when KASAN is enabled

2022-11-27 Thread Nathan Chancellor

On Fri, Nov 25, 2022 at 12:07:50PM +, Lee Jones wrote:
> When enabled, KASAN enlarges function's stack-frames.  Pushing quite a
> few over the current threshold.  This can mainly be seen on 32-bit
> architectures where the present limit (when !GCC) is a lowly
> 1024-Bytes.
> 
> Signed-off-by: Lee Jones 

Reviewed-by: Nathan Chancellor 

> ---
>  lib/Kconfig.debug | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index c3c0b077ade33..82d475168db95 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -399,6 +399,7 @@ config FRAME_WARN
>   default 2048 if GCC_PLUGIN_LATENT_ENTROPY
>   default 2048 if PARISC
>   default 1536 if (!64BIT && XTENSA)
> + default 1280 if KASAN && !64BIT
>   default 1024 if !64BIT
>   default 2048 if 64BIT
>   help
> -- 
> 2.38.1.584.g0f3c55d4c2-goog
>

Re: [PATCH v2 1/2] drm/amdgpu: Temporarily disable broken Clang builds due to blown stack-frame

2022-11-27 Thread Nathan Chancellor

On Fri, Nov 25, 2022 at 12:07:49PM +, Lee Jones wrote:
> calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64)
> architectures built with Clang (all released versions), whereby the stack
> frame gets blown up to well over 5k.  This would cause an immediate kernel
> panic on most architectures.  We'll revert this when the following bug report
> has been resolved: https://github.com/llvm/llvm-project/issues/41896.
> 
> Suggested-by: Arnd Bergmann 
> Signed-off-by: Lee Jones 

Reviewed-by: Nathan Chancellor 

> ---
>  drivers/gpu/drm/amd/display/Kconfig | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
> index 6925e0280dbe6..f4f3d2665a6b2 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -5,6 +5,7 @@ menu "Display Engine Configuration"
>  config DRM_AMD_DC
>   bool "AMD DC - Enable new display engine"
>   default y
> + depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64
>   select SND_HDA_COMPONENT if SND_HDA_CORE
>   select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128)
>   help
> @@ -12,6 +13,12 @@ config DRM_AMD_DC
> support for AMDGPU. This adds required support for Vega and
> Raven ASICs.
>  
> +   calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64)
> +   architectures built with Clang (all released versions), whereby the stack
> +   frame gets blown up to well over 5k.  This would cause an immediate kernel
> +   panic on most architectures.  We'll revert this when the following bug report
> +   has been resolved: https://github.com/llvm/llvm-project/issues/41896.
> +
>  config DRM_AMD_DC_DCN
>   def_bool n
>   help
> -- 
> 2.38.1.584.g0f3c55d4c2-goog
>

Re: [PATCH 11/22] drm/amd/display: Disable phantom OTG after enable for plane disable

2022-11-10 Thread Nathan Chancellor

Hi Alan,

On Thu, Nov 03, 2022 at 12:01:06AM +0800, Alan Liu wrote:
> From: Alvin Lee 
> 
> [Description]
> - Need to disable phantom OTG after it's enabled
>   in order to restore it to it's original state.
> - If it's enabled and then an MCLK switch comes in
>   we may not prefetch the correct data since the phantom
>   OTG could already be in the middle of the frame.
> 
> Reviewed-by: Jun Lei 
> Acked-by: Alan Liu 
> Signed-off-by: Alvin Lee 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc.c   | 14 +-
>  drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c  |  8 
>  .../drm/amd/display/dc/inc/hw/timing_generator.h   |  1 +
>  3 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> index da808996e21d..9c3704c4d7e4 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> @@ -1055,6 +1055,7 @@ static void disable_dangling_plane(struct dc *dc, 
> struct dc_state *context)
>   struct dc_state *dangling_context = dc_create_state(dc);
>   struct dc_state *current_ctx;
>   struct pipe_ctx *pipe;
> + struct timing_generator *tg;
>  
>   if (dangling_context == NULL)
>   return;
> @@ -1098,6 +1099,7 @@ static void disable_dangling_plane(struct dc *dc, 
> struct dc_state *context)
>  
>   if (should_disable && old_stream) {
>   pipe = &dc->current_state->res_ctx.pipe_ctx[i];
> + tg = pipe->stream_res.tg;
>   /* When disabling plane for a phantom pipe, we must 
> turn on the
>* phantom OTG so the disable programming gets the 
> double buffer
>* update. Otherwise the pipe will be left in a 
> partially disabled
> @@ -1105,7 +1107,8 @@ static void disable_dangling_plane(struct dc *dc, 
> struct dc_state *context)
>* again for different use.
>*/
>   if (old_stream->mall_stream_config.type == 
> SUBVP_PHANTOM) {
> - 
> pipe->stream_res.tg->funcs->enable_crtc(pipe->stream_res.tg);
> + if (tg->funcs->enable_crtc)
> + tg->funcs->enable_crtc(tg);
>   }
>   dc_rem_all_planes_for_stream(dc, old_stream, 
> dangling_context);
>   disable_all_writeback_pipes_for_stream(dc, old_stream, 
> dangling_context);
> @@ -1122,6 +1125,15 @@ static void disable_dangling_plane(struct dc *dc, 
> struct dc_state *context)
>   dc->hwss.interdependent_update_lock(dc, 
> dc->current_state, false);
>   dc->hwss.post_unlock_program_front_end(dc, 
> dangling_context);
>   }
> + /* We need to put the phantom OTG back into it's 
> default (disabled) state or we
> +  * can get corruption when transition from one SubVP 
> config to a different one.
> +  * The OTG is set to disable on falling edge of VUPDATE 
> so the plane disable
> +  * will still get it's double buffer update.
> +  */
> + if (old_stream->mall_stream_config.type == 
> SUBVP_PHANTOM) {
> + if (tg->funcs->disable_phantom_crtc)
> + tg->funcs->disable_phantom_crtc(tg);
> + }
>   }
>   }
>  
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c 
> b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
> index 2b33eeb213e2..2ee798965bc2 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
> @@ -167,6 +167,13 @@ static void optc32_phantom_crtc_post_enable(struct 
> timing_generator *optc)
>   REG_WAIT(OTG_CLOCK_CONTROL, OTG_BUSY, 0, 1, 10);
>  }
>  
> +static void optc32_disable_phantom_otg(struct timing_generator *optc)
> +{
> + struct optc *optc1 = DCN10TG_FROM_TG(optc);
> +
> + REG_UPDATE(OTG_CONTROL, OTG_MASTER_EN, 0);
> +}
> +
>  static void optc32_set_odm_bypass(struct timing_generator *optc,
>   const struct dc_crtc_timing *dc_crtc_timing)
>  {
> @@ -260,6 +267,7 @@ static struct timing_generator_funcs dcn32_tg_funcs = {
>   .enable_crtc = optc32_enable_crtc,
>   .disable_crtc = optc32_disable_crtc,
>   .phantom_crtc_post_enable = optc32_phantom_crtc_post_enable,
> + .disable_phantom_crtc = optc32_disable_phantom_otg,
>   /* used by enable_timing_synchronization. Not need for FPGA */
>   .is_counter_moving = optc1_is_counter_moving,
>   .get_position = optc1_get_position,
> diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h 
> b/drivers/gpu/drm/amd/dis

[PATCH 1/2] drm/amdgpu: Fix type of second parameter in trans_msg() callback

2022-11-02 Thread Nathan Chancellor

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:412:15: error: incompatible function pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
  .trans_msg = xgpu_ai_mailbox_trans_msg,
  ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:435:15: error: incompatible function pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
  .trans_msg = xgpu_nv_mailbox_trans_msg,
  ^
  1 error generated.

The type of the second parameter in the prototype should be 'enum
idh_request' instead of 'u32'. Update it to clear up the warnings.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index d94c31e68a14..bc4f079fd48c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -74,6 +74,8 @@ struct amdgpu_vf_error_buffer {
uint64_t data[AMDGPU_VF_ERROR_ENTRY_SIZE];
 };
 
+enum idh_request;
+
 /**
  * struct amdgpu_virt_ops - amdgpu device virt operations
  */
@@ -83,7 +85,8 @@ struct amdgpu_virt_ops {
int (*req_init_data)(struct amdgpu_device *adev);
int (*reset_gpu)(struct amdgpu_device *adev);
int (*wait_reset)(struct amdgpu_device *adev);
-   void (*trans_msg)(struct amdgpu_device *adev, u32 req, u32 data1, u32 data2, u32 data3);
+   void (*trans_msg)(struct amdgpu_device *adev, enum idh_request req,
+ u32 data1, u32 data2, u32 data3);
 };
 
 /*

base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
-- 
2.38.1
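
For readers who want the shape of the fix in isolation, here is a minimal
userspace sketch, not the driver code. Every demo_* name and the DEMO_REQ_*
enumerators are hypothetical stand-ins; the only point is that the callback
prototype in the ops structure and the function assigned to it spell the
second parameter the same way ('enum', not 'u32'), which is what the strict
warning and kCFI both want.

```c
#include <stdint.h>

/* Hypothetical stand-in for 'enum idh_request'; the values are made up. */
enum demo_request {
	DEMO_REQ_INIT_ACCESS = 1,
	DEMO_REQ_FINI_ACCESS,
};

/* Hypothetical stand-in for 'struct amdgpu_device'. */
struct demo_device {
	enum demo_request last_req;
	uint32_t last_data;
};

struct demo_virt_ops {
	/* Second parameter is the enum, matching the implementation exactly. */
	void (*trans_msg)(struct demo_device *dev, enum demo_request req,
			  uint32_t data1, uint32_t data2, uint32_t data3);
};

/* Implementation with the same prototype as the callback above. */
static void demo_mailbox_trans_msg(struct demo_device *dev,
				   enum demo_request req,
				   uint32_t data1, uint32_t data2,
				   uint32_t data3)
{
	dev->last_req = req;
	dev->last_data = data1 + data2 + data3;
}

static const struct demo_virt_ops demo_ops = {
	.trans_msg = demo_mailbox_trans_msg,
};

/* Indirect call through the ops table, as a driver core would do. */
static uint32_t demo_send(struct demo_device *dev)
{
	demo_ops.trans_msg(dev, DEMO_REQ_INIT_ACCESS, 1, 2, 3);
	return dev->last_data;
}
```

Had the pointer in demo_virt_ops used uint32_t for 'req', the initializer
would be the kind of mismatch the quoted diagnostic reports.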

[PATCH 2/2] drm/amdgpu: Fix type of second parameter in odn_edit_dpm_table() callback

2022-11-02 Thread Nathan Chancellor

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:3008:29: error: incompatible function pointer types initializing 'int (*)(void *, uint32_t, long *, uint32_t)' (aka 'int (*)(void *, unsigned int, long *, unsigned int)') with an expression of type 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, uint32_t)' (aka 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
  .odn_edit_dpm_table  = smu_od_edit_dpm_table,
 ^
  1 error generated.

There are only two implementations of ->odn_edit_dpm_table() in 'struct
amd_pm_funcs': smu_od_edit_dpm_table() and pp_odn_edit_dpm_table(). One
has a second parameter type of 'enum PP_OD_DPM_TABLE_COMMAND' and the
other uses 'u32'. Ultimately, smu_od_edit_dpm_table() calls
->od_edit_dpm_table() from 'struct pptable_funcs' and
pp_odn_edit_dpm_table() calls ->odn_edit_dpm_table() from 'struct
pp_hwmgr_func', which both have a second parameter type of 'enum
PP_OD_DPM_TABLE_COMMAND'.

Update the type parameter in both the prototype in 'struct amd_pm_funcs'
and pp_odn_edit_dpm_table() to 'enum PP_OD_DPM_TABLE_COMMAND', which
cleans up the warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/include/kgd_pp_interface.h   | 3 ++-
 drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
index a40ead44778a..d18162e9ed1d 100644
--- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
@@ -354,7 +354,8 @@ struct amd_pm_funcs {
int (*get_power_profile_mode)(void *handle, char *buf);
int (*set_power_profile_mode)(void *handle, long *input, uint32_t size);
int (*set_fine_grain_clk_vol)(void *handle, uint32_t type, long *input, uint32_t size);
-   int (*odn_edit_dpm_table)(void *handle, uint32_t type, long *input, uint32_t size);
+   int (*odn_edit_dpm_table)(void *handle, enum PP_OD_DPM_TABLE_COMMAND type,
+ long *input, uint32_t size);
int (*set_mp1_state)(void *handle, enum pp_mp1_state mp1_state);
int (*smu_i2c_bus_access)(void *handle, bool acquire);
int (*gfx_state_change_set)(void *handle, uint32_t state);
diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
index ec055858eb95..1159ae114dd0 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
@@ -838,7 +838,8 @@ static int pp_set_fine_grain_clk_vol(void *handle, uint32_t type, long *input, u
return hwmgr->hwmgr_func->set_fine_grain_clk_vol(hwmgr, type, input, size);
 }
 
-static int pp_odn_edit_dpm_table(void *handle, uint32_t type, long *input, uint32_t size)
+static int pp_odn_edit_dpm_table(void *handle, enum PP_OD_DPM_TABLE_COMMAND type,
+long *input, uint32_t size)
 {
struct pp_hwmgr *hwmgr = handle;
 
-- 
2.38.1
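
The forwarding chain described in the commit message (outer callback taking a
'void *handle', inner callback taking the hardware manager) can be sketched in
plain C as below. This is only an illustration under assumed names: the
demo_* types, the DEMO_OD_* enumerators, and the summing behaviour are
hypothetical and are not taken from the driver. What it shows is the enum type
being carried unchanged from the outer prototype to the inner one, so no
layer has to convert between 'uint32_t' and the enum.

```c
#include <stdint.h>

/* Hypothetical stand-in for 'enum PP_OD_DPM_TABLE_COMMAND'. */
enum demo_od_command {
	DEMO_OD_EDIT_SCLK,
	DEMO_OD_COMMIT,
};

struct demo_hwmgr;

/* Inner layer, playing the role of 'struct pp_hwmgr_func'. */
struct demo_hwmgr_func {
	int (*odn_edit_dpm_table)(struct demo_hwmgr *hwmgr,
				  enum demo_od_command type,
				  long *input, uint32_t size);
};

struct demo_hwmgr {
	const struct demo_hwmgr_func *func;
	long sum;
};

/* Outer layer, playing the role of 'struct amd_pm_funcs'. */
struct demo_pm_funcs {
	int (*odn_edit_dpm_table)(void *handle, enum demo_od_command type,
				  long *input, uint32_t size);
};

/* Inner implementation: made-up behaviour, sums the inputs for one command. */
static int demo_hw_edit(struct demo_hwmgr *hwmgr, enum demo_od_command type,
			long *input, uint32_t size)
{
	uint32_t i;

	if (type != DEMO_OD_EDIT_SCLK)
		return -1;

	hwmgr->sum = 0;
	for (i = 0; i < size; i++)
		hwmgr->sum += input[i];
	return 0;
}

/* Outer implementation: same parameter types as the outer prototype. */
static int demo_pp_odn_edit_dpm_table(void *handle, enum demo_od_command type,
				      long *input, uint32_t size)
{
	struct demo_hwmgr *hwmgr = handle;

	if (!hwmgr || !hwmgr->func || !hwmgr->func->odn_edit_dpm_table)
		return -1;

	return hwmgr->func->odn_edit_dpm_table(hwmgr, type, input, size);
}

static const struct demo_hwmgr_func demo_hw_funcs = {
	.odn_edit_dpm_table = demo_hw_edit,
};

static const struct demo_pm_funcs demo_pm = {
	.odn_edit_dpm_table = demo_pp_odn_edit_dpm_table,
};
```

A caller would set up a 'struct demo_hwmgr' pointing at demo_hw_funcs and
invoke demo_pm.odn_edit_dpm_table() with it as the handle.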

Re: [PATCH v4 1/1] drm/amd/display: add DCN support for ARM64

2022-11-01 Thread Nathan Chancellor

On Tue, Nov 01, 2022 at 10:36:08AM -0400, Rodrigo Siqueira Jordao wrote:
> 
> 
> On 2022-10-31 15:37, Ao Zhong wrote:
> > After moving all FPU code to the DML folder, we can enable DCN support
> > for the ARM64 platform. Remove the -mgeneral-regs-only CFLAG from the
> > code in the DML folder that needs to use hardware FPU, and add a control
> > mechanism for ARM Neon.
> > 
> > Signed-off-by: Ao Zhong 
> > ---
> >   drivers/gpu/drm/amd/display/Kconfig   |  3 ++-
> >   .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c|  6 ++
> >   drivers/gpu/drm/amd/display/dc/dml/Makefile   | 20 +++
> >   3 files changed, 24 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> > b/drivers/gpu/drm/amd/display/Kconfig
> > index 0142affcdaa3..843a55a6a3ac 100644
> > --- a/drivers/gpu/drm/amd/display/Kconfig
> > +++ b/drivers/gpu/drm/amd/display/Kconfig
> > @@ -6,7 +6,8 @@ config DRM_AMD_DC
> > bool "AMD DC - Enable new display engine"
> > default y
> > select SND_HDA_COMPONENT if SND_HDA_CORE
> > -   select DRM_AMD_DC_DCN if (X86 || PPC64)
> > +   # !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
> > +   select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && KERNEL_MODE_NEON && 
> > !CC_IS_CLANG))
> > help
> >   Choose this option if you want to use the new display engine
> >   support for AMDGPU. This adds required support for Vega and
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > index ab0c6d191038..1743ca0a3641 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > @@ -31,6 +31,8 @@
> >   #elif defined(CONFIG_PPC64)
> >   #include 
> >   #include 
> > +#elif defined(CONFIG_ARM64)
> > +#include 
> >   #endif
> >   /**
> > @@ -99,6 +101,8 @@ void dc_fpu_begin(const char *function_name, const int 
> > line)
> > preempt_disable();
> > enable_kernel_fp();
> > }
> > +#elif defined(CONFIG_ARM64)
> > +   kernel_neon_begin();
> >   #endif
> > }
> > @@ -136,6 +140,8 @@ void dc_fpu_end(const char *function_name, const int 
> > line)
> > disable_kernel_fp();
> > preempt_enable();
> > }
> > +#elif defined(CONFIG_ARM64)
> > +   kernel_neon_end();
> >   #endif
> > }
> > diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
> > b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > index d0c6cf61c676..d4e93bed1c8e 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > @@ -33,6 +33,10 @@ ifdef CONFIG_PPC64
> >   dml_ccflags := -mhard-float -maltivec
> >   endif
> > +ifdef CONFIG_ARM64
> > +dml_rcflags := -mgeneral-regs-only
> > +endif
> > +
> >   ifdef CONFIG_CC_IS_GCC
> >   ifeq ($(call cc-ifversion, -lt, 0701, y), y)
> >   IS_OLD_GCC = 1
> > @@ -55,8 +59,6 @@ frame_warn_flag := -Wframe-larger-than=2048
> >   endif
> >   CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags)
> > -
> > -ifdef CONFIG_DRM_AMD_DC_DCN
> >   CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
> >   CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_ccflags)
> >   CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags)
> > @@ -88,7 +90,6 @@ CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calcs.o := 
> > $(dml_ccflags)
> >   CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_auto.o := $(dml_ccflags)
> >   CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_math.o := $(dml_ccflags) 
> > -Wno-tautological-compare
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_rcflags)
> > -CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn2x/dcn2x.o := $(dml_rcflags)
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := 
> > $(dml_rcflags)
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_rq_dlg_calc_20.o := 
> > $(dml_rcflags)
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20v2.o := 
> > $(dml_rcflags)
> > @@ -105,7 +106,18 @@ 
> > CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn32/display_mode_vba_util_32.o := 
> > $(dml_rcf
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn301/dcn301_fpu.o := $(dml_rcflags)
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_rcflags)
> >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dsc/rc_calc_fpu.o  := $(dml_rcflags)
> > -endif
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_rcflags)
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_rcflags)
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn314/display_mode_vba_314.o := 
> > $(dml_rcflags)
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn314/display_rq_dlg_calc_314.o := 
> > $(dml_rcflags)
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn314/dcn314_fpu.o := $(dml_rcflags)
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn30/dcn30_fpu.o := $(dml_rcflags)
> > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn32/dcn

Re: [PATCH v3 1/1] drm/amd/display: add DCN support for ARM64

2022-10-31 Thread Nathan Chancellor

Hi Rodrigo and Ao,

On Fri, Oct 28, 2022 at 08:48:26AM -0700, Nathan Chancellor wrote:
> On Fri, Oct 28, 2022 at 11:35:32AM -0400, Rodrigo Siqueira Jordao wrote:
> > 
> > 
> > On 2022-10-28 11:09, Nathan Chancellor wrote:
> > > Hi Ao,
> > > 
> > > On Thu, Oct 27, 2022 at 09:52:29PM +0200, Ao Zhong wrote:
> > > > After moving all FPU code to the DML folder, we can enable DCN support
> > > > for the ARM64 platform. Remove the -mgeneral-regs-only CFLAG from the
> > > > code in the DML folder that needs to use hardware FPU, and add a control
> > > > mechanism for ARM Neon.
> > > > 
> > > > Signed-off-by: Ao Zhong 
> > > > ---
> > > >   drivers/gpu/drm/amd/display/Kconfig   |  2 +-
> > > >   .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c|  6 ++
> > > >   drivers/gpu/drm/amd/display/dc/dml/Makefile   | 20 +++
> > > >   3 files changed, 23 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> > > > b/drivers/gpu/drm/amd/display/Kconfig
> > > > index 0142affcdaa3..a7f1c4e51719 100644
> > > > --- a/drivers/gpu/drm/amd/display/Kconfig
> > > > +++ b/drivers/gpu/drm/amd/display/Kconfig
> > > > @@ -6,7 +6,7 @@ config DRM_AMD_DC
> > > > bool "AMD DC - Enable new display engine"
> > > > default y
> > > > select SND_HDA_COMPONENT if SND_HDA_CORE
> > > > -   select DRM_AMD_DC_DCN if (X86 || PPC64)
> > > > +   select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && 
> > > > KERNEL_MODE_NEON))
> > > > help
> > > >   Choose this option if you want to use the new display engine
> > > >   support for AMDGPU. This adds required support for Vega and
> > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
> > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > > > index ab0c6d191038..1743ca0a3641 100644
> > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > > > @@ -31,6 +31,8 @@
> > > >   #elif defined(CONFIG_PPC64)
> > > >   #include 
> > > >   #include 
> > > > +#elif defined(CONFIG_ARM64)
> > > > +#include 
> > > >   #endif
> > > >   /**
> > > > @@ -99,6 +101,8 @@ void dc_fpu_begin(const char *function_name, const 
> > > > int line)
> > > > preempt_disable();
> > > > enable_kernel_fp();
> > > > }
> > > > +#elif defined(CONFIG_ARM64)
> > > > +   kernel_neon_begin();
> > > >   #endif
> > > > }
> > > > @@ -136,6 +140,8 @@ void dc_fpu_end(const char *function_name, const 
> > > > int line)
> > > > disable_kernel_fp();
> > > > preempt_enable();
> > > > }
> > > > +#elif defined(CONFIG_ARM64)
> > > > +   kernel_neon_end();
> > > >   #endif
> > > > }
> > > > diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
> > > > b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > > index d0c6cf61c676..d4e93bed1c8e 100644
> > > > --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > > +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > > @@ -33,6 +33,10 @@ ifdef CONFIG_PPC64
> > > >   dml_ccflags := -mhard-float -maltivec
> > > >   endif
> > > > +ifdef CONFIG_ARM64
> > > > +dml_rcflags := -mgeneral-regs-only
> > > > +endif
> > > > +
> > > >   ifdef CONFIG_CC_IS_GCC
> > > >   ifeq ($(call cc-ifversion, -lt, 0701, y), y)
> > > >   IS_OLD_GCC = 1
> > > > @@ -55,8 +59,6 @@ frame_warn_flag := -Wframe-larger-than=2048
> > > >   endif
> > > >   CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags)
> > > > -
> > > > -ifdef CONFIG_DRM_AMD_DC_DCN
> > > >   CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
> > > >   CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_ccflags)
> > > >   CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags)
> > > > @@ -88,7 +90,6 @@ CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn

Re: [PATCH v3 1/1] drm/amd/display: add DCN support for ARM64

2022-10-28 Thread Nathan Chancellor

Hi Rodrigo,

On Fri, Oct 28, 2022 at 11:35:32AM -0400, Rodrigo Siqueira Jordao wrote:
> 
> 
> On 2022-10-28 11:09, Nathan Chancellor wrote:
> > Hi Ao,
> > 
> > On Thu, Oct 27, 2022 at 09:52:29PM +0200, Ao Zhong wrote:
> > > After moving all FPU code to the DML folder, we can enable DCN support
> > > for the ARM64 platform. Remove the -mgeneral-regs-only CFLAG from the
> > > code in the DML folder that needs to use hardware FPU, and add a control
> > > mechanism for ARM Neon.
> > > 
> > > Signed-off-by: Ao Zhong 
> > > ---
> > >   drivers/gpu/drm/amd/display/Kconfig   |  2 +-
> > >   .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c|  6 ++
> > >   drivers/gpu/drm/amd/display/dc/dml/Makefile   | 20 +++
> > >   3 files changed, 23 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> > > b/drivers/gpu/drm/amd/display/Kconfig
> > > index 0142affcdaa3..a7f1c4e51719 100644
> > > --- a/drivers/gpu/drm/amd/display/Kconfig
> > > +++ b/drivers/gpu/drm/amd/display/Kconfig
> > > @@ -6,7 +6,7 @@ config DRM_AMD_DC
> > >   bool "AMD DC - Enable new display engine"
> > >   default y
> > >   select SND_HDA_COMPONENT if SND_HDA_CORE
> > > - select DRM_AMD_DC_DCN if (X86 || PPC64)
> > > + select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && KERNEL_MODE_NEON))
> > >   help
> > > Choose this option if you want to use the new display engine
> > > support for AMDGPU. This adds required support for Vega and
> > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
> > > b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > > index ab0c6d191038..1743ca0a3641 100644
> > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> > > @@ -31,6 +31,8 @@
> > >   #elif defined(CONFIG_PPC64)
> > >   #include 
> > >   #include 
> > > +#elif defined(CONFIG_ARM64)
> > > +#include 
> > >   #endif
> > >   /**
> > > @@ -99,6 +101,8 @@ void dc_fpu_begin(const char *function_name, const int 
> > > line)
> > >   preempt_disable();
> > >   enable_kernel_fp();
> > >   }
> > > +#elif defined(CONFIG_ARM64)
> > > + kernel_neon_begin();
> > >   #endif
> > >   }
> > > @@ -136,6 +140,8 @@ void dc_fpu_end(const char *function_name, const int 
> > > line)
> > >   disable_kernel_fp();
> > >   preempt_enable();
> > >   }
> > > +#elif defined(CONFIG_ARM64)
> > > + kernel_neon_end();
> > >   #endif
> > >   }
> > > diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
> > > b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > index d0c6cf61c676..d4e93bed1c8e 100644
> > > --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > @@ -33,6 +33,10 @@ ifdef CONFIG_PPC64
> > >   dml_ccflags := -mhard-float -maltivec
> > >   endif
> > > +ifdef CONFIG_ARM64
> > > +dml_rcflags := -mgeneral-regs-only
> > > +endif
> > > +
> > >   ifdef CONFIG_CC_IS_GCC
> > >   ifeq ($(call cc-ifversion, -lt, 0701, y), y)
> > >   IS_OLD_GCC = 1
> > > @@ -55,8 +59,6 @@ frame_warn_flag := -Wframe-larger-than=2048
> > >   endif
> > >   CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags)
> > > -
> > > -ifdef CONFIG_DRM_AMD_DC_DCN
> > >   CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
> > >   CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_ccflags)
> > >   CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags)
> > > @@ -88,7 +90,6 @@ CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calcs.o := 
> > > $(dml_ccflags)
> > >   CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_auto.o := $(dml_ccflags)
> > >   CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_math.o := $(dml_ccflags) 
> > > -Wno-tautological-compare
> > >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_rcflags)
> > > -CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn2x/dcn2x.o := $(dml_rcflags)
> > >   CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := 
> >

Re: [PATCH v3 1/1] drm/amd/display: add DCN support for ARM64

2022-10-28 Thread Nathan Chancellor

Hi Ao,

On Thu, Oct 27, 2022 at 09:52:29PM +0200, Ao Zhong wrote:
> After moving all FPU code to the DML folder, we can enable DCN support
> for the ARM64 platform. Remove the -mgeneral-regs-only CFLAG from the
> code in the DML folder that needs to use hardware FPU, and add a control
> mechanism for ARM Neon.
> 
> Signed-off-by: Ao Zhong 
> ---
>  drivers/gpu/drm/amd/display/Kconfig   |  2 +-
>  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c|  6 ++
>  drivers/gpu/drm/amd/display/dc/dml/Makefile   | 20 +++
>  3 files changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> b/drivers/gpu/drm/amd/display/Kconfig
> index 0142affcdaa3..a7f1c4e51719 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -6,7 +6,7 @@ config DRM_AMD_DC
>   bool "AMD DC - Enable new display engine"
>   default y
>   select SND_HDA_COMPONENT if SND_HDA_CORE
> - select DRM_AMD_DC_DCN if (X86 || PPC64)
> + select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && KERNEL_MODE_NEON))
>   help
> Choose this option if you want to use the new display engine
> support for AMDGPU. This adds required support for Vega and
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> index ab0c6d191038..1743ca0a3641 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
> @@ -31,6 +31,8 @@
>  #elif defined(CONFIG_PPC64)
>  #include 
>  #include 
> +#elif defined(CONFIG_ARM64)
> +#include 
>  #endif
>  
>  /**
> @@ -99,6 +101,8 @@ void dc_fpu_begin(const char *function_name, const int 
> line)
>   preempt_disable();
>   enable_kernel_fp();
>   }
> +#elif defined(CONFIG_ARM64)
> + kernel_neon_begin();
>  #endif
>   }
>  
> @@ -136,6 +140,8 @@ void dc_fpu_end(const char *function_name, const int line)
>   disable_kernel_fp();
>   preempt_enable();
>   }
> +#elif defined(CONFIG_ARM64)
> + kernel_neon_end();
>  #endif
>   }
>  
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> index d0c6cf61c676..d4e93bed1c8e 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
> @@ -33,6 +33,10 @@ ifdef CONFIG_PPC64
>  dml_ccflags := -mhard-float -maltivec
>  endif
>  
> +ifdef CONFIG_ARM64
> +dml_rcflags := -mgeneral-regs-only
> +endif
> +
>  ifdef CONFIG_CC_IS_GCC
>  ifeq ($(call cc-ifversion, -lt, 0701, y), y)
>  IS_OLD_GCC = 1
> @@ -55,8 +59,6 @@ frame_warn_flag := -Wframe-larger-than=2048
>  endif
>  
>  CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags)
> -
> -ifdef CONFIG_DRM_AMD_DC_DCN
>  CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
>  CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_ccflags)
>  CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags)
> @@ -88,7 +90,6 @@ CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calcs.o := 
> $(dml_ccflags)
>  CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_auto.o := $(dml_ccflags)
>  CFLAGS_$(AMDDALPATH)/dc/dml/calcs/dcn_calc_math.o := $(dml_ccflags) 
> -Wno-tautological-compare
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_rcflags)
> -CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn2x/dcn2x.o := $(dml_rcflags)
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := 
> $(dml_rcflags)
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_rq_dlg_calc_20.o := 
> $(dml_rcflags)
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20v2.o := 
> $(dml_rcflags)
> @@ -105,7 +106,18 @@ 
> CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn32/display_mode_vba_util_32.o := 
> $(dml_rcf
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn301/dcn301_fpu.o := $(dml_rcflags)
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_rcflags)
>  CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dsc/rc_calc_fpu.o  := $(dml_rcflags)
> -endif
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn314/display_mode_vba_314.o := 
> $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn314/display_rq_dlg_calc_314.o := 
> $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn314/dcn314_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn30/dcn30_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn32/dcn32_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn321/dcn321_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn31/dcn31_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn302/dcn302_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dml/dcn303/dcn303_fpu.o := $(dml_rcflags)
> +CFLAGS_REMOVE_

Re: [PATCH RESEND 1/1] drm/amd/display: add DCN support for ARM64

2022-10-27 Thread Nathan Chancellor

Hi Rodrigo,

On Thu, Oct 27, 2022 at 10:29:33AM -0400, Rodrigo Siqueira wrote:
> Nathan/Stephen,
> 
> Maybe I'm wrong, but I think you have access to some sort of CI that tests
> multiple builds with different compilers and configs, right? Is it possible
> to check this patch + amd-staging-drm-next in the CI to help us to
> anticipate any compilation issue if we merge this change?

Yup, I have a build framework that I have developed for my
ClangBuiltLinux work that I was planning on putting this through to see
if there are any new warnings that show up in this code since it is
going to be built on a new architecture. I will report back on Ao's v2
(or v3 if it is available since Arnd had some comments on v2) when that
is done.

> Or should we merge it and wait until it gets merged on the mainline? In case
> of a problem, we can easily revert a small patch like this, right?

Sure, although if we can catch issues beforehand, that would be nice :)

Cheers,
Nathan

[PATCH] drm/amdgpu: Fix uninitialized warning in mmhub_v2_0_get_clockgating()

2022-10-24 Thread Nathan Chancellor

Clang warns:

  drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c:686:3: error: variable 'data' is uninitialized when used here [-Werror,-Wuninitialized]
                  data |= MM_ATC_L2_MISC_CG__ENABLE_MASK;
                  ^~~~
  drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c:674:10: note: initialize the variable 'data' to silence this warning
          int data, data1;
                  ^
                   = 0
  1 error generated.

This clearly should have just been a regular '=', as there was no prior
assignment.

Fixes: 7a4fad619819 ("drm/amdgpu: Remove ATC L2 access for MMHUB 2.1.x")
Link: https://github.com/ClangBuiltLinux/linux/issues/1748
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
index 5ec6d17fed09..998b5d17b271 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
@@ -683,7 +683,7 @@ static void mmhub_v2_0_get_clockgating(struct amdgpu_device *adev, u64 *flags)
 		/* There is no ATCL2 in MMHUB for 2.1.x. Keep the status
 		 * based on DAGB
 		 */
-		data |= MM_ATC_L2_MISC_CG__ENABLE_MASK;
+		data = MM_ATC_L2_MISC_CG__ENABLE_MASK;
 		data1 = RREG32_SOC15(MMHUB, 0, mmDAGB0_CNTL_MISC2_Sienna_Cichlid);
 		break;
 	default:

base-commit: fb5e487f910e1105019b883e8ed25e36e4bfd657
-- 
2.38.1
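
As an illustration of the warning fixed above, here is a minimal userspace sketch of the two shapes. The mask value, the function name, and the use of uint32_t are hypothetical stand-ins chosen for the example; they are not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MM_ATC_L2_MISC_CG__ENABLE_MASK. */
#define EXAMPLE_ENABLE_MASK 0x00040000u

/*
 * Buggy shape, left as a comment so the file still builds with -Werror:
 *
 *     uint32_t data;
 *     data |= EXAMPLE_ENABLE_MASK;    // reads 'data' before any write
 *
 * The compound assignment reads the old value of 'data', which was
 * never set, so clang reports -Wuninitialized.
 */

/* Fixed shape: a plain assignment gives 'data' a well-defined value. */
static uint32_t example_clockgating_data(void)
{
	uint32_t data;

	data = EXAMPLE_ENABLE_MASK;
	return data;
}

/* Small self-check that callers may run from their own main(). */
static void example_self_check(void)
{
	assert(example_clockgating_data() == EXAMPLE_ENABLE_MASK);
}
```

Initializing the variable to zero at its declaration, as the compiler note suggests, would also silence the warning. The patch uses a plain assignment instead, which matches the observation that there was no prior assignment to OR into.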

[PATCH] drm/amdkfd: Fix type of reset_type parameter in hqd_destroy() callback

2022-10-17 Thread Nathan Chancellor

When booting a kernel compiled with CONFIG_CFI_CLANG on a machine with
an RX 6700 XT, there is a CFI failure in kfd_destroy_mqd_cp():

  [   12.894543] CFI failure at kfd_destroy_mqd_cp+0x2a/0x40 [amdgpu] (target: hqd_destroy_v10_3+0x0/0x260 [amdgpu]; expected type: 0x8594d794)

Clang's kernel Control Flow Integrity (kCFI) makes sure that all
indirect call targets have a type that exactly matches the function
pointer prototype. In this case, the hqd_destroy() prototype declares
its third parameter, reset_type, as 'uint32_t', but every implementation
of this callback declares that parameter as 'enum kfd_preempt_type'.

Update the function pointer prototype to match reality so that there is
no more CFI violation.

Link: https://github.com/ClangBuiltLinux/linux/issues/1738
Signed-off-by: Nathan Chancellor 
---

No Fixes tag, as I could not pin down exactly when this started. I
suspect it is

Fixes: 70539bd79500 ("drm/amd: Update MEC HQD loading code for KFD")

but I did not want to add that without a second look. Feel free to add
it during patch application if it makes sense.

 drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index e85364dff4e0..5cb3e8634739 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -262,8 +262,9 @@ struct kfd2kgd_calls {
uint32_t queue_id);
 
int (*hqd_destroy)(struct amdgpu_device *adev, void *mqd,
-   uint32_t reset_type, unsigned int timeout,
-   uint32_t pipe_id, uint32_t queue_id);
+   enum kfd_preempt_type reset_type,
+   unsigned int timeout, uint32_t pipe_id,
+   uint32_t queue_id);
 
bool (*hqd_sdma_is_occupied)(struct amdgpu_device *adev, void *mqd);
 

base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
-- 
2.38.0
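
The prototype fix above can be pictured with a small userspace sketch in which the callback pointer and its implementation agree on the enum type. Every name below (the enum, its values, the struct, and the functions) is hypothetical and trimmed down for illustration; only the idea of matching the pointer prototype to the implementations comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for 'enum kfd_preempt_type'. */
enum example_preempt_type {
	EXAMPLE_PREEMPT_DRAIN,
	EXAMPLE_PREEMPT_RESET,
};

struct example_calls {
	/*
	 * The pointer prototype names the enum, exactly as the
	 * implementation below does. Declaring this parameter as
	 * uint32_t while implementations take the enum is the kind of
	 * mismatch the patch removes.
	 */
	int (*hqd_destroy)(void *mqd, enum example_preempt_type reset_type,
			   unsigned int timeout);
};

/* Implementation whose signature matches the pointer type exactly. */
static int example_hqd_destroy(void *mqd,
			       enum example_preempt_type reset_type,
			       unsigned int timeout)
{
	(void)mqd;
	(void)timeout;
	return reset_type == EXAMPLE_PREEMPT_RESET ? 1 : 0;
}

static const struct example_calls example_ops = {
	.hqd_destroy = example_hqd_destroy,
};
```

Calling through example_ops.hqd_destroy is then an indirect call whose target type and pointer type are identical, which is the condition the commit message says kCFI enforces.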

[PATCH] drm/amd/display: Fix build breakage with CONFIG_DEBUG_FS=n

2022-10-14 Thread Nathan Chancellor

After commit 8799c0be89eb ("drm/amd/display: Fix vblank refcount in vrr
transition"), a build with CONFIG_DEBUG_FS=n is broken due to a
misplaced brace, along the lines of:

  In file included from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_trace.h:39,
                   from drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:41:
  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: At top level:
  ./include/drm/drm_atomic.h:864:9: error: expected identifier or ‘(’ before ‘for’
    864 |         for ((__i) = 0;                                                 \
        |         ^~~
  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8317:9: note: in expansion of macro ‘for_each_new_crtc_in_state’
   8317 |         for_each_new_crtc_in_state(state, crtc, new_crtc_state, j)
        |         ^~

Move the brace within the #ifdef so that the file can be built with or
without CONFIG_DEBUG_FS.

Fixes: 8799c0be89eb ("drm/amd/display: Fix vblank refcount in vrr transition")
Signed-off-by: Nathan Chancellor 
---

I have sent this to Linus in case he wants to take this directly since
this is a pretty obvious fix, as opposed to going through the regular
channels.

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f6a9e8fdd87d..c053cb79cd06 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8310,8 +8310,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
crtc, dm_new_crtc_state, cur_crc_src))
DRM_DEBUG_DRIVER("Failed to configure 
crc source");
}
-#endif
}
+#endif
}
 
for_each_new_crtc_in_state(state, crtc, new_crtc_state, j)

base-commit: 9c9155a3509a2ebdb06d77c7a621e9685c802eac
-- 
2.38.0
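
A minimal sketch of the brace placement issue described above, assuming a hypothetical EXAMPLE_DEBUG_FS macro in place of CONFIG_DEBUG_FS and a made-up helper function. The point it shows is that a brace closing a block opened inside the #ifdef has to stay inside it, so that braces balance in both configurations.

```c
#include <assert.h>

/*
 * Build with or without -DEXAMPLE_DEBUG_FS (hypothetical switch).
 * Both configurations compile because every brace opened inside the
 * conditional region is also closed inside it.
 */
static int example_count_configured(const int *enabled, int n)
{
	int configured = 0;
	int i;

	(void)enabled;	/* unused when EXAMPLE_DEBUG_FS is not defined */

	for (i = 0; i < n; i++) {
#ifdef EXAMPLE_DEBUG_FS
		if (enabled[i]) {
			configured++;
		}	/* closes the 'if' opened inside the #ifdef */
#endif
	}		/* closes the 'for', present in both builds */

	return configured;
}
```

If the #endif were moved up one line, above the brace that closes the 'if', a build without the macro would see one closing brace too many. The function would end early and the following statements would land at file scope, which is consistent with the "At top level" error in the log above.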

Re: [PATCH 1/2] drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-20 Thread Nathan Chancellor

On Tue, Sep 20, 2022 at 12:06:46PM -0400, Alex Deucher wrote:
> Applied the series.  Thanks!

Great, thank you so much! Hopefully these could also be applied to the
6.0 branch so that this error can be resolved there as well. No worries
on timeline if that was already the plan but I just want to keep -Werror
on for arm64 and x86_64 allmodconfig for this release.

Cheers,
Nathan

> On Sat, Sep 17, 2022 at 8:38 AM Maíra Canal  wrote:
> >
> > Hi Nathan,
> >
> > On 9/16/22 18:06, Nathan Chancellor wrote:
> > > Most of the arguments are identical between the two call sites and they
> > > can be accessed through the 'struct vba_vars_st' pointer. This reduces
> > > the total amount of stack space that
> > > dml314_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
> > > LLVM 16 (2216 -> 1976), helping clear up the following clang warning:
> > >
> > >   
> > > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6:
> > >  error: stack frame size (2216) exceeds limit (2048) in 
> > > 'dml314_ModeSupportAndSystemConfigurationFull' 
> > > [-Werror,-Wframe-larger-than]
> > >   void dml314_ModeSupportAndSystemConfigurationFull(struct 
> > > display_mode_lib *mode_lib)
> > >    ^
> > >   1 error generated.
> > >
> > > Link: https://github.com/ClangBuiltLinux/linux/issues/1710
> > > Reported-by: "kernelci.org bot" 
> > > Signed-off-by: Nathan Chancellor 
> >
> > I have built-tested the whole series with clang 14.0.5 (Fedora
> > 14.0.5-1.fc36), using:
> >
> > $ make -kj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper allmodconfig
> > drivers/gpu/drm/amd/amdgpu/
> >
> > Another great patch to the DML! Like Tom, I would also like to see this
> > expanded to all display_mode_vba files, but so far this is great to
> > unbreak the build.
> >
> > To the whole series:
> >
> > Tested-by: Maíra Canal 
> >
> > Best Regards,
> > - Maíra Canal
> >
> > > ---
> > >
> > > This is just commit ab2ac59c32db ("drm/amd/display: Reduce number of
> > > arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()")
> > > applied to dml314.
> > >
> > >  .../dc/dml/dcn314/display_mode_vba_314.c  | 248 --
> > >  1 file changed, 52 insertions(+), 196 deletions(-)
> > >
> > > diff --git 
> > > a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
> > > b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
> > > index 2829f179f982..32ceb72f7a14 100644
> > > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
> > > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
> > > @@ -325,64 +325,28 @@ static void 
> > > CalculateVupdateAndDynamicMetadataParameters(
> > >  static void CalculateWatermarksAndDRAMSpeedChangeSupport(
> > >   struct display_mode_lib *mode_lib,
> > >   unsigned int PrefetchMode,
> > > - unsigned int NumberOfActivePlanes,
> > > - unsigned int MaxLineBufferLines,
> > > - unsigned int LineBufferSize,
> > > - unsigned int WritebackInterfaceBufferSize,
> > >   double DCFCLK,
> > >   double ReturnBW,
> > > - bool SynchronizedVBlank,
> > > - unsigned int dpte_group_bytes[],
> > > - unsigned int MetaChunkSize,
> > >   double UrgentLatency,
> > >   double ExtraLatency,
> > > - double WritebackLatency,
> > > - double WritebackChunkSize,
> > >   double SOCCLK,
> > > - double DRAMClockChangeLatency,
> > > - double SRExitTime,
> > > - double SREnterPlusExitTime,
> > > - double SRExitZ8Time,
> > > - double SREnterPlusExitZ8Time,
> > >   double DCFCLKDeepSleep,
> > >   unsigned int DETBufferSizeY[],
> > >   unsigned int DETBufferSizeC[],
> > >   unsigned int SwathHeightY[],
> > >   unsigned int SwathHeightC[],
> > > - unsigned int LBBitPerPixel[],
> > >   double SwathWidthY[],
> > >   double SwathWidthC[],
> > > - double HRatio[],
> > > - double HRatioChroma[],
> >

Re: mainline build failure (new) for x86_64 allmodconfig with clang

2022-09-17 Thread Nathan Chancellor

Hi Sudip,

On Sat, Sep 17, 2022 at 11:55:05AM +0100, Sudip Mukherjee (Codethink) wrote:
> Hi All,
> 
> The latest mainline kernel branch fails to build for x86_64 allmodconfig
> with clang. The errors are:
> 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6:
>  error: stack frame size (2184) exceeds limit (2048) in 
> 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
> *mode_lib)
>  ^
> 1 error generated.
> 
> 
> Note: This is a new error seen on top of a335366bad13 ("Merge tag 
> 'gpio-fixes-for-v6.0-rc6' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux").
> Previous reported clang build error is now fixed, thanks to Nathan.
> 
> And, it appears Nathan has already sent a fix for this:
> https://github.com/intel-lab-lkp/linux/commit/4ecc45d7585ae2e05d622879ad97e13a7d8c595b
> https://github.com/intel-lab-lkp/linux/commit/819976a950b497d7f10cd9a198a94c26a9005b30

I did not realize this was a mainline issue too :( it seems that
commit af2f2a256e04 ("drm/amd/display: Enable dlg and vba compilation
for dcn314") is needed to see this and it was only in -next for three
releases (20220914 to 20220916), which I missed checking as closely as I
normally do due to Plumbers wrapping up and traveling.

The series is on the mailing lists at
https://lore.kernel.org/20220916210658.3412450-1-nat...@kernel.org/,
which is basically just 's/31/314/g' on the dml31 fixes because the code
is identical. Hopefully those two patches can be picked up in the same
manner as the other ones so that x86_64 allmodconfig does not ship
broken in 6.0 and thank you to the AMD folks for moving on those
already!

Cheers,
Nathan
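
The dml31 and dml314 patches referenced above share one idea: stop passing values individually when the callee can read them through a context pointer it already receives. The following is a rough userspace sketch of that idea. The struct is a hypothetical, heavily trimmed stand-in for 'struct vba_vars_st', and the helper name is invented; only the field names and the HTotal / PixelClock expression are taken from the diffs in this thread.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for 'struct vba_vars_st'. */
struct example_vars {
	unsigned int NumberOfActivePlanes;
	double HTotal[4];
	double PixelClock[4];
	double LineTime[4];
};

/*
 * Before (sketched as a comment): each value is its own argument, so
 * every call site has to set up all of them.
 *
 *     example_calc_line_time(n, HTotal, PixelClock, LineTime);
 *
 * After: only the context pointer is passed and the callee reads the
 * fields it needs through it.
 */
static void example_calc_line_time(struct example_vars *v)
{
	unsigned int k;

	for (k = 0; k < v->NumberOfActivePlanes; ++k)
		v->LineTime[k] = v->HTotal[k] / v->PixelClock[k];
}
```

How much stack a change like this saves depends on the compiler and the real function. The byte counts quoted in the patches apply to the kernel code, not to this sketch.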

Re: [PATCH 1/2] drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-16 Thread Nathan Chancellor

On Fri, Sep 16, 2022 at 03:04:53PM -0700, Tom Rix wrote:
> 
> On 9/16/22 2:06 PM, Nathan Chancellor wrote:
> > Most of the arguments are identical between the two call sites and they
> > can be accessed through the 'struct vba_vars_st' pointer. This reduces
> > the total amount of stack space that
> > dml314_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
> > LLVM 16 (2216 -> 1976), helping clear up the following clang warning:
> > 
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6:
> >  error: stack frame size (2216) exceeds limit (2048) in 
> > 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> >void dml314_ModeSupportAndSystemConfigurationFull(struct 
> > display_mode_lib *mode_lib)
> > ^
> >1 error generated.
> > 
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1710
> > Reported-by: "kernelci.org bot" 
> > Signed-off-by: Nathan Chancellor 
> 
> Nathan,
> 
> I like this change but I don't think it goes far enough.
> 
> There are many similar functions in this file and there are other
> display_mode_vba_*.c files that pass too many vba_vars_st elements.
> 
> I think most/all of the static functions should be refactored to pass
> vba_vars_st * or vba_vars_st **
> 
> fwiw, I found the calling function 
> DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation,
> hilariously long :)
> 
> I'll do the change if you want to pass this to me, I promise not to add to
> the above function name.

Right, there is definitely more that could be done here; I just picked
the couple of functions that would appear to make the most impact, as I
am only concerned with keeping the code warning free with clang so that
-Werror does not break us. I think it makes sense to take this series to
fix the warnings right now (especially since this patch has technically
already been accepted, as it was applied to dcn31) then follow up with
refactoring, which I am more than happy to let you do if you so desire
:)

Thank you for the input as always!

Cheers,
Nathan

> > ---
> > 
> > This is just commit ab2ac59c32db ("drm/amd/display: Reduce number of
> > arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()")
> > applied to dml314.
> > 
> >   .../dc/dml/dcn314/display_mode_vba_314.c  | 248 --
> >   1 file changed, 52 insertions(+), 196 deletions(-)
> > 
> > diff --git 
> > a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
> > b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
> > index 2829f179f982..32ceb72f7a14 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
> > @@ -325,64 +325,28 @@ static void 
> > CalculateVupdateAndDynamicMetadataParameters(
> >   static void CalculateWatermarksAndDRAMSpeedChangeSupport(
> > struct display_mode_lib *mode_lib,
> > unsigned int PrefetchMode,
> > -   unsigned int NumberOfActivePlanes,
> > -   unsigned int MaxLineBufferLines,
> > -   unsigned int LineBufferSize,
> > -   unsigned int WritebackInterfaceBufferSize,
> > double DCFCLK,
> > double ReturnBW,
> > -   bool SynchronizedVBlank,
> > -   unsigned int dpte_group_bytes[],
> > -   unsigned int MetaChunkSize,
> > double UrgentLatency,
> > double ExtraLatency,
> > -   double WritebackLatency,
> > -   double WritebackChunkSize,
> > double SOCCLK,
> > -   double DRAMClockChangeLatency,
> > -   double SRExitTime,
> > -   double SREnterPlusExitTime,
> > -   double SRExitZ8Time,
> > -   double SREnterPlusExitZ8Time,
> > double DCFCLKDeepSleep,
> > unsigned int DETBufferSizeY[],
> > unsigned int DETBufferSizeC[],
> > unsigned int SwathHeightY[],
> > unsigned int SwathHeightC[],
> > -   unsigned int LBBitPerPixel[],
> > double SwathWidthY[],
> > double SwathWidthC[],
> > -   double HRatio[],
> > -   double HRatioChroma[],
> > -   unsigned int vtaps[],
> > -   unsigned int VTAPsChroma[],
> > -   double VRatio[],
> > -   double VRatioChroma[],
> > -   unsigned int HTotal[],
> >

[PATCH 2/2] drm/amd/display: Reduce number of arguments of dml314's CalculateFlipSchedule()

2022-09-16 Thread Nathan Chancellor

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml314_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6: error: stack frame size (2216) exceeds limit (2048) in 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/1710
Reported-by: "kernelci.org bot" 
Signed-off-by: Nathan Chancellor 
---

This is just commit 1dbec5b4b0ef ("drm/amd/display: Reduce number of
arguments of dml31's CalculateFlipSchedule()") applied to dml314.

 .../dc/dml/dcn314/display_mode_vba_314.c  | 172 +-
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
index 32ceb72f7a14..e4dfa714207a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
@@ -265,33 +265,13 @@ static void CalculateRowBandwidth(
 
 static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+   unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
-   unsigned int GPUVMMaxPageTableLevels,
-   bool HostVMEnable,
-   unsigned int HostVMMaxNonCachedPageTableLevels,
-   bool GPUVMEnable,
-   double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
-   double DPTEBytesPerRow,
-   double BandwidthAvailableForImmediateFlip,
-   unsigned int TotImmediateFlipBytes,
-   enum source_format_class SourcePixelFormat,
-   double LineTime,
-   double VRatio,
-   double VRatioChroma,
-   double Tno_bw,
-   bool DCCEnable,
-   unsigned int dpte_row_height,
-   unsigned int meta_row_height,
-   unsigned int dpte_row_height_chroma,
-   unsigned int meta_row_height_chroma,
-   double *DestinationLinesToRequestVMInImmediateFlip,
-   double *DestinationLinesToRequestRowInImmediateFlip,
-   double *final_flip_bw,
-   bool *ImmediateFlipSupportedForPipe);
+   double DPTEBytesPerRow);
 static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2892,33 +2872,13 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+   k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
-   v->GPUVMMaxPageTableLevels,
-   v->HostVMEnable,
-   
v->HostVMMaxNonCachedPageTableLevels,
-   v->GPUVMEnable,
-   v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
-   v->PixelPTEBytesPerRow[k],
-   
v->BandwidthAvailableForImmediateFlip,
-   v->TotImmediateFlipBytes,
-   v->SourcePixelFormat[k],
-   v->HTotal[k] / v->PixelClock[k],
-   v->VRatio[k],
-   v->VRatioChroma[k],
-   v->Tno_bw[k],
-   v->DCCEnable[k],
-   v->dpte_row_height[k],
-   v->meta_row_height[k],
-

[PATCH 1/2] drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-16 Thread Nathan Chancellor

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml314_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6: error: stack frame size (2216) exceeds limit (2048) in 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/1710
Reported-by: "kernelci.org bot" 
Signed-off-by: Nathan Chancellor 
---

This is just commit ab2ac59c32db ("drm/amd/display: Reduce number of
arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()")
applied to dml314.

 .../dc/dml/dcn314/display_mode_vba_314.c  | 248 --
 1 file changed, 52 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
index 2829f179f982..32ceb72f7a14 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
@@ -325,64 +325,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
 static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
-   unsigned int NumberOfActivePlanes,
-   unsigned int MaxLineBufferLines,
-   unsigned int LineBufferSize,
-   unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
-   bool SynchronizedVBlank,
-   unsigned int dpte_group_bytes[],
-   unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
-   double WritebackLatency,
-   double WritebackChunkSize,
double SOCCLK,
-   double DRAMClockChangeLatency,
-   double SRExitTime,
-   double SREnterPlusExitTime,
-   double SRExitZ8Time,
-   double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
-   unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
-   double HRatio[],
-   double HRatioChroma[],
-   unsigned int vtaps[],
-   unsigned int VTAPsChroma[],
-   double VRatio[],
-   double VRatioChroma[],
-   unsigned int HTotal[],
-   double PixelClock[],
-   unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
-   double DSTXAfterScaler[],
-   double DSTYAfterScaler[],
-   bool WritebackEnable[],
-   enum source_format_class WritebackPixelFormat[],
-   double WritebackDestinationWidth[],
-   double WritebackDestinationHeight[],
-   double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
unsigned int CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
-   double *UrgentWatermark,
-   double *WritebackUrgentWatermark,
-   double *DRAMClockChangeWatermark,
-   double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
-   double *Z8StutterEnterPlusExitWatermark,
-   double *MinActiveDRAMClockChangeLatencySupported);
+   double *Z8StutterEnterPlusExitWatermark);
 
 static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3041,64 +3005,28 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
-   v->NumberOfActivePlanes,
-   v->MaxLineBufferLines,
-   v->LineBufferSize,
-   v->WritebackInterfaceBufferSize,

Re: [PATCH 0/5] drm/amd/display: Reduce stack usage for clang

2022-09-12 Thread Nathan Chancellor

Hi Rodrigo,

On Mon, Sep 12, 2022 at 05:50:31PM -0400, Rodrigo Siqueira Jordao wrote:
> 
> 
> On 2022-08-30 16:34, Nathan Chancellor wrote:
> > Hi all,
> > 
> > This series aims to address the following warnings, which are visible
> > when building x86_64 allmodconfig with clang after commit 3876a8b5e241
> > ("drm/amd/display: Enable building new display engine with KCOV
> > enabled").
> > 
> >  
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6:
> >  error: stack frame size (2200) exceeds limit (2048) in 
> > 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> >  void dml30_ModeSupportAndSystemConfigurationFull(struct 
> > display_mode_lib *mode_lib)
> >  ^
> >  1 error generated.
> > 
> >  
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6:
> >  error: stack frame size (2216) exceeds limit (2048) in 
> > 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> >  void dml31_ModeSupportAndSystemConfigurationFull(struct 
> > display_mode_lib *mode_lib)
> >  ^
> >  1 error generated.
> > 
> >  
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6:
> >  error: stack frame size (2152) exceeds limit (2048) in 
> > 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> >  void dml32_ModeSupportAndSystemConfigurationFull(struct 
> > display_mode_lib *mode_lib)
> >  ^
> >  1 error generated.
> > 
> > This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
> > some inconsistent indentings"). These warnings are fatal for
> > allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
> > patches cherry-picked to a branch targeting mainline to allow our builds
> > to go back to green. However, since this series is not exactly trivial
> > in size, I can understand not wanting to apply these to mainline during
> > the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
> > patch raising the value of -Wframe-larger-than for these files that can
> > be cherry-picked to 6.0/mainline then add a revert of that change as the
> > last patch in the stack so everything goes back to normal for -next/6.1.
> > I am open to other options though!
> > 
> > I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
> > x86_64. It has seen no runtime testing, as my only test system with AMD
> > graphics is a Renoir one, which as far as I understand it uses DCN 2.1.
> > 
> > Nathan Chancellor (5):
> >   drm/amd/display: Reduce number of arguments of
> >     dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
> >   drm/amd/display: Reduce number of arguments of
> >     dml32_CalculatePrefetchSchedule()
> >   drm/amd/display: Reduce number of arguments of dml31's
> >     CalculateWatermarksAndDRAMSpeedChangeSupport()
> >   drm/amd/display: Reduce number of arguments of dml31's
> >     CalculateFlipSchedule()
> >   drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
> >     usage
> > 
> >   .../dc/dml/dcn30/display_mode_vba_30.c|   2 +-
> >   .../dc/dml/dcn31/display_mode_vba_31.c| 420 +-
> >   .../dc/dml/dcn32/display_mode_vba_32.c| 236 +++---
> >   .../dc/dml/dcn32/display_mode_vba_util_32.c   | 323 ++
> >   .../dc/dml/dcn32/display_mode_vba_util_32.h   |  51 +--
> >   5 files changed, 318 insertions(+), 714 deletions(-)
> > 
> > 
> > base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154
> 
> Hi Nathan,
> 
> First of all, thanks a lot for your patchset!
> 
> Sorry for the delay; it took me more time than I expected to review and run
> a couple of tests in this patchset (most of them were IGT). Anyway, I'm good
> with this change; this series is:
> 
> Reviewed-by: Rodrigo Siqueira 
> 
> And I applied it to amd-staging-drm-next.
> 
> We will run some extra tests this week; if we find some issues, I'll debug
> them.
> 
> Also, thanks, Maíra, for checking this patch as well.

No worries on the delay, the series is not exactly the smallest one I
have ever sent :) While the changes were mostly mechanical, I could have
definitely messed something up and I appreciate you taking the time to
review it and run it through some tests. Please let me know if I can be
of further assistance on that front.

If you have any thoughts on the blurb I had in the cover letter around
how to handle the warnings this series resolves with regard to
mainline, I would love to hear them.

Cheers,
Nathan

Re: [PATCH v2 3/5] Makefile.compiler: replace cc-ifversion with compiler-specific macros

2022-08-31 Thread Nathan Chancellor

On Wed, Aug 31, 2022 at 11:44:06AM -0700, Nick Desaulniers wrote:
> cc-ifversion is GCC specific. Replace it with compiler specific
> variants. Update the users of cc-ifversion to use these new macros.
> Provide a helper for checking compiler versions for GCC and Clang
> simultaneously, that will be used in a follow up patch.
> 
> Cc: Jonathan Corbet 
> Cc: linux-...@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Link: https://github.com/ClangBuiltLinux/linux/issues/350
> Link: https://lore.kernel.org/llvm/CAGG=3QWSAUakO42kubrCap8fp-gm1ERJJAYXTnP1iHk_wrH=b...@mail.gmail.com/
> Suggested-by: Bill Wendling 
> Signed-off-by: Nick Desaulniers 

These are so much nicer. I find the name kind of awkward but the only
thing I could come up with that sounded better was 'gcc-is-at-least' or
'clang-is-at-least' and I don't really feel like painting sheds today,
it's hot outside :)

Reviewed-by: Nathan Chancellor 

Some comments below.

> ---
> Changes v1 -> v2:
> * New patch.
> 
>  Documentation/kbuild/makefiles.rst          | 44 +++--
>  Makefile                                    |  4 +-
>  drivers/gpu/drm/amd/display/dc/dml/Makefile | 12 ++
>  scripts/Makefile.compiler                   | 15 +--
>  4 files changed, 49 insertions(+), 26 deletions(-)
> 
> diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
> index 11a296e52d68..e46f5b45c422 100644
> --- a/Documentation/kbuild/makefiles.rst
> +++ b/Documentation/kbuild/makefiles.rst
> @@ -682,22 +682,42 @@ more details, with real examples.
>   In the above example, -Wno-unused-but-set-variable will be added to
>   KBUILD_CFLAGS only if gcc really accepts it.
>  
> -cc-ifversion
> - cc-ifversion tests the version of $(CC) and equals the fourth parameter
> - if version expression is true, or the fifth (if given) if the version
> - expression is false.
> +gcc-min-version
> + gcc-min-version tests if the value of $(CONFIG_GCC_VERSION) is greater than
> + or equal to the provided value and evaluates to y if so.
>  
>   Example::
>  
> - #fs/reiserfs/Makefile
> - ccflags-y := $(call cc-ifversion, -lt, 0402, -O1)
> + cflags-$(call gcc-min-version, 70100) := -foo
>  
> - In this example, ccflags-y will be assigned the value -O1 if the
> - $(CC) version is less than 4.2.
> - cc-ifversion takes all the shell operators:
> - -eq, -ne, -lt, -le, -gt, and -ge
> - The third parameter may be a text as in this example, but it may also
> - be an expanded variable or a macro.
> + In this example, cflags-y will be assigned the value -foo if $(CC) is gcc and
> + $(CONFIG_GCC_VERSION) is >= 7.1.
> +
> +clang-min-version
> + clang-min-version tests if the value of $(CONFIG_CLANG_VERSION) is greater
> + than or equal to the provided value and evaluates to y if so.
> +
> + Example::
> +
> + cflags-$(call clang-min-version, 11) := -foo
> +
> + In this example, cflags-y will be assigned the value -foo if $(CC) is clang
> + and $(CONFIG_CLANG_VERSION) is >= 11.0.0.
> +
> +cc-min-version
> + cc-min-version tests if the value of $(CONFIG_GCC_VERSION) is greater
> + than or equal to the first value provided, or if the value of
> + $(CONFIG_CLANG_VERSION) is greater than or equal to the second value
> + provided, and evaluates
> + to y if so.
> +
> + Example::
> +
> + cflags-$(call cc-min-version, 70100, 11) := -foo
> +
> + In this example, cflags-y will be assigned the value -foo if $(CC) is gcc and
> + $(CONFIG_GCC_VERSION) is >= 7.1, or if $(CC) is clang and
> + $(CONFIG_CLANG_VERSION) is >= 11.0.0.
>  
>  cc-cross-prefix
>   cc-cross-prefix is used to check if there exists a $(CC) in path with
> diff --git a/Makefile b/Makefile
> index 952d354069a4..caa39ecb1136 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -972,7 +972,7 @@ ifdef CONFIG_CC_IS_GCC
>  KBUILD_CFLAGS += -Wno-maybe-uninitialized
>  endif
>  
> -ifdef CONFIG_CC_IS_GCC
> +ifeq ($(call gcc-min-version, 90100),y)
>  # The allocators already balk at large sizes, so silence the compiler
>  # warnings for bounds checks involving those possible values. While
>  # -Wno-alloc-size-larger-than would normally be used here, earlier versions
> @@ -984,7 +984,7 @@ ifdef CONFIG_CC_IS_GCC
>  # ignored, continuing to default to PTRDIFF_MAX. So, left with no other
>  # choice, we must perform a versioned check to disable this warning.
>  # https://lore.kernel.org/lk
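
For readers unfamiliar with the numbers in the quoted documentation hunk, here
is a small sketch of the comparison the new helpers describe. It assumes the
usual kernel encoding of major * 10000 + minor * 100 + patchlevel, which is
consistent with the 70100 <-> 7.1 pairing above. The function names are
hypothetical and only illustrate the "evaluates to y if so" behaviour; how the
clang helper maps "11" to 11.0.0 is not visible in the quoted hunk, so that
part is left out.

```c
/*
 * Sketch only: encode_version() and min_version() are hypothetical names,
 * not part of Kbuild. They mirror the ">= provided value" test that the
 * quoted documentation describes for gcc-min-version.
 */

/* Assumed encoding: major * 10000 + minor * 100 + patchlevel (7.1.0 -> 70100). */
static unsigned int encode_version(unsigned int major, unsigned int minor,
				   unsigned int patch)
{
	return major * 10000 + minor * 100 + patch;
}

/* Returns 1 when 'have' is at least 'min', 0 otherwise. */
static int min_version(unsigned int have, unsigned int min)
{
	return have >= min;
}
```

With this encoding, GCC 12.2.0 passes a 70100 check while GCC 4.2.0 does not.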

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-30 Thread Nathan Chancellor

On Fri, Aug 26, 2022 at 10:31:34AM -0400, Alex Deucher wrote:
> On Thu, Aug 25, 2022 at 6:34 PM Nathan Chancellor  wrote:
> >
> > Hi AMD folks,
> >
> > Top posting because it might not have been obvious but I was looking for
> > your feedback on this message (which can be viewed on lore.kernel.org if
> > you do not have the original [1]) so that we can try to get this fixed
> > in some way for 6.0/6.1. If my approach is not welcome, please consider
> > suggesting another one or looking to see if this is something you all
> > could look into.
> 
> The patch looks good to me.  I was hoping Harry or Rodrigo could
> comment more since they are more familiar with this code and trying to
> keep it in sync with what we get from the hardware teams.

Thanks a lot for the input! That patch was broken but I have polished it
and a few other patches up and sent them along for review:

https://lore.kernel.org/20220830203409.3491379-1-nat...@kernel.org/

I did not CC everyone from this thread but it is on lore if others want
to comment on it. Hopefully we can get this all sorted out for 6.0
final.

Cheers,
Nathan

> > [1]: https://lore.kernel.org/Yv5h0rb3AgTZLVJv@dev-arch.thelio-3990X/
> >
> > Cheers,
> > Nathan
> >
> > On Thu, Aug 18, 2022 at 08:59:14AM -0700, Nathan Chancellor wrote:
> > > Hi Arnd,
> > >
> > > Doubling back around to this now since I think this is the only thing
> > > breaking x86_64 allmodconfig with clang 11 through 15.
> > >
> > > On Fri, Aug 05, 2022 at 09:32:13PM +0200, Arnd Bergmann wrote:
> > > > On Fri, Aug 5, 2022 at 8:02 PM Nathan Chancellor wrote:
> > > > > On Fri, Aug 05, 2022 at 06:16:45PM +0200, Arnd Bergmann wrote:
> > > > > > On Fri, Aug 5, 2022 at 5:32 PM Harry Wentland wrote:
> > > > > > While splitting out sub-functions can help reduce the maximum stack
> > > > > > usage, it seems that in this case it makes the actual problem worse:
> > > > > > I see 2168 bytes for the combined
> > > > > > dml32_ModeSupportAndSystemConfigurationFull(), but marking
> > > > > > mode_support_configuration() as noinline gives me 1992 bytes
> > > > > > for the outer function plus 384 bytes for the inner one. So it does
> > > > > > avoid the warning (barely), but not the problem that the warning tries
> > > > > > to point out.
> > > > >
> > > > > I haven't had a chance to take a look at splitting things up yet, would
> > > > > you recommend a different approach?
> > > >
> > > > Splitting up large functions can help when you have large local variables
> > > > that are used in different parts of the function, and the split gets the
> > > > compiler to reuse stack locations.
> > > >
> > > > I think in this particular function, the problem isn't actually local variables
> > > > but either pushing variables on the stack for argument passing,
> > > > or something that causes the compiler to run out of registers so it
> > > > has to spill registers to the stack.
> > > >
> > > > In either case, one has to actually look at the generated output
> > > > and then try to rearrange the codes so this does not happen.
> > > >
> > > > One thing to try would be to condense a function call like
> > > >
> > > > 
> > > > dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
> > > >
> > > > &v->dummy_vars.dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport,
> > > > mode_lib->vba.USRRetrainingRequiredFinal,
> > > > mode_lib->vba.UsesMALLForPStateChange,
> > > >
> > > > mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
> > > > mode_lib->vba.NumberOfActiveSurfaces,
> > > > mode_lib->vba.MaxLineBufferLines,
> > > > mode_lib->vba.LineBufferSizeFinal,
> > > > mode_lib->vba.WritebackInterfaceBufferSize,
> > > > mode_lib->vba.DCFCLK,
> > > > mode_lib->vba.ReturnBW,
> > > > mode_lib->vba.SynchronizeTiming

[PATCH 5/5] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage

2022-08-30 Thread Nathan Chancellor

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index b7fa003ffe06..6991a68061ef 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -6497,7 +6497,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
struct vba_vars_st *v,
int MaxPrefetchMode,
-- 
2.37.2
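
As a rough illustration of the technique in the patch above, the following
standalone sketch keeps a helper with a large local buffer out of its caller's
stack frame. It is not the driver code: sum_scratch() and outer() are
hypothetical, and noinline_for_stack is defined locally as a stand-in so the
file builds in userspace with GCC or clang (in the kernel it is an alias for
noinline).

```c
#include <stddef.h>

/* Local stand-in for the kernel macro; in-tree code gets it from the headers. */
#define noinline_for_stack __attribute__((__noinline__))

/*
 * Hypothetical helper with a 1 KiB scratch buffer. Because it is never
 * inlined, the buffer lives in this function's own frame rather than being
 * merged into the frame of every caller.
 */
static noinline_for_stack unsigned int sum_scratch(unsigned int seed)
{
	volatile unsigned char scratch[1024];
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < sizeof(scratch); i++)
		scratch[i] = (unsigned char)(seed + i);
	for (i = 0; i < sizeof(scratch); i++)
		sum += scratch[i];

	return sum;
}

/* The caller's frame only needs room for its own locals and the call. */
unsigned int outer(unsigned int seed)
{
	return sum_scratch(seed) + sum_scratch(seed + 1);
}
```

As the commit message notes, this trades a larger combined stack depth across
the two functions for keeping each individual frame under the per-function
limit. Building with -Wframe-larger-than=<n> is one way to check the effect on
each frame.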

[PATCH 4/5] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()

2022-08-30 Thread Nathan Chancellor

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Signed-off-by: Nathan Chancellor 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 172 +-
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index 21c74ee1deec..60ee936e6436 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -251,33 +251,13 @@ static void CalculateRowBandwidth(
 
 static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+   unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
-   unsigned int GPUVMMaxPageTableLevels,
-   bool HostVMEnable,
-   unsigned int HostVMMaxNonCachedPageTableLevels,
-   bool GPUVMEnable,
-   double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
-   double DPTEBytesPerRow,
-   double BandwidthAvailableForImmediateFlip,
-   unsigned int TotImmediateFlipBytes,
-   enum source_format_class SourcePixelFormat,
-   double LineTime,
-   double VRatio,
-   double VRatioChroma,
-   double Tno_bw,
-   bool DCCEnable,
-   unsigned int dpte_row_height,
-   unsigned int meta_row_height,
-   unsigned int dpte_row_height_chroma,
-   unsigned int meta_row_height_chroma,
-   double *DestinationLinesToRequestVMInImmediateFlip,
-   double *DestinationLinesToRequestRowInImmediateFlip,
-   double *final_flip_bw,
-   bool *ImmediateFlipSupportedForPipe);
+   double DPTEBytesPerRow);
 static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2868,33 +2848,13 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+   k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
-   v->GPUVMMaxPageTableLevels,
-   v->HostVMEnable,
-   
v->HostVMMaxNonCachedPageTableLevels,
-   v->GPUVMEnable,
-   v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
-   v->PixelPTEBytesPerRow[k],
-   
v->BandwidthAvailableForImmediateFlip,
-   v->TotImmediateFlipBytes,
-   v->SourcePixelFormat[k],
-   v->HTotal[k] / v->PixelClock[k],
-   v->VRatio[k],
-   v->VRatioChroma[k],
-   v->Tno_bw[k],
-   v->DCCEnable[k],
-   v->dpte_row_height[k],
-   v->meta_row_height[k],
-   v->dpte_row_height_chroma[k],
-   v->meta_row_height_

[PATCH 3/5] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-08-30 Thread Nathan Chancellor

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Signed-off-by: Nathan Chancellor 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 248 --
 1 file changed, 52 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index d63b4209b14c..21c74ee1deec 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -311,64 +311,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
 static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
-   unsigned int NumberOfActivePlanes,
-   unsigned int MaxLineBufferLines,
-   unsigned int LineBufferSize,
-   unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
-   bool SynchronizedVBlank,
-   unsigned int dpte_group_bytes[],
-   unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
-   double WritebackLatency,
-   double WritebackChunkSize,
double SOCCLK,
-   double DRAMClockChangeLatency,
-   double SRExitTime,
-   double SREnterPlusExitTime,
-   double SRExitZ8Time,
-   double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
-   unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
-   double HRatio[],
-   double HRatioChroma[],
-   unsigned int vtaps[],
-   unsigned int VTAPsChroma[],
-   double VRatio[],
-   double VRatioChroma[],
-   unsigned int HTotal[],
-   double PixelClock[],
-   unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
-   double DSTXAfterScaler[],
-   double DSTYAfterScaler[],
-   bool WritebackEnable[],
-   enum source_format_class WritebackPixelFormat[],
-   double WritebackDestinationWidth[],
-   double WritebackDestinationHeight[],
-   double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
int unsigned CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
-   double *UrgentWatermark,
-   double *WritebackUrgentWatermark,
-   double *DRAMClockChangeWatermark,
-   double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
-   double *Z8StutterEnterPlusExitWatermark,
-   double *MinActiveDRAMClockChangeLatencySupported);
+   double *Z8StutterEnterPlusExitWatermark);
 
 static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3017,64 +2981,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
-   v->NumberOfActivePlanes,
-   v->MaxLineBufferLines,
-   v->LineBufferSize,
-   v->WritebackInterfaceBufferSize,
v->DCFCLK,
v->ReturnBW,
-   v->SynchronizedVBlank,
-   v->dpte_grou

[PATCH 2/5] drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule()

2022-08-30 Thread Nathan Chancellor

Several of the arguments are identical between the two call sites and
they can be accessed through the 'struct vba_vars_st' pointer. This
reduces the total amount of stack space that
dml32_ModeSupportAndSystemConfigurationFull() uses by 208 bytes with
LLVM 16 (1936 -> 1728), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Additionally, while modifying the arguments to
dml32_CalculatePrefetchSchedule(), use 'v' consistently, instead of 'v'
mixed with 'mode_lib->vba'.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Signed-off-by: Nathan Chancellor 
---
 .../dc/dml/dcn32/display_mode_vba_32.c| 118 +++---
 .../dc/dml/dcn32/display_mode_vba_util_32.c   |  75 +--
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  18 +--
 3 files changed, 78 insertions(+), 133 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index 7da90fba95fc..58c6cc58583a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -755,30 +755,18 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman

v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe.BytePerPixelY
 = v->BytePerPixelY[k];

v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe.BytePerPixelC
 = v->BytePerPixelC[k];

v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe.ProgressiveToInterlaceUnitInOPP
 = mode_lib->vba.ProgressiveToInterlaceUnitInOPP;
-   v->ErrorResult[k] = 
dml32_CalculatePrefetchSchedule(v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.HostVMInefficiencyFactor,
-   
&v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe,
 v->DSCDelay[k],
-   mode_lib->vba.DPPCLKDelaySubtotal + 
mode_lib->vba.DPPCLKDelayCNVCFormater,
-   mode_lib->vba.DPPCLKDelaySCL,
-   mode_lib->vba.DPPCLKDelaySCLLBOnly,
-   mode_lib->vba.DPPCLKDelayCNVCCursor,
-   mode_lib->vba.DISPCLKDelaySubtotal,
-   (unsigned int) (v->SwathWidthY[k] / 
mode_lib->vba.HRatio[k]),
-   mode_lib->vba.OutputFormat[k],
-   mode_lib->vba.MaxInterDCNTileRepeaters,
+   v->ErrorResult[k] = dml32_CalculatePrefetchSchedule(
+   v,
+   k,
+   
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.HostVMInefficiencyFactor,
+   
&v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe,
+   v->DSCDelay[k],
+   (unsigned int) (v->SwathWidthY[k] / 
v->HRatio[k]),
dml_min(v->VStartupLines, 
v->MaxVStartupLines[k]),
v->MaxVStartupLines[k],
-   mode_lib->vba.GPUVMMaxPageTableLevels,
-   mode_lib->vba.GPUVMEnable,
-   mode_lib->vba.HostVMEnable,
-   
mode_lib->vba.HostVMMaxNonCachedPageTableLevels,
-   mode_lib->vba.HostVMMinPageSize,
-   mode_lib->vba.DynamicMetadataEnable[k],
-   mode_lib->vba.DynamicMetadataVMEnabled,
-   
mode_lib->vba.DynamicMetadataLinesBeforeActiveRequired[k],
-   
mode_lib->vba.DynamicMetadataTransmittedBytes[k],
v->UrgentLatency,
v->Urgen

[PATCH 1/5] drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()

2022-08-30 Thread Nathan Chancellor

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer created at the
top of dml32_ModeSupportAndSystemConfigurationFull(). This reduces the
total amount of stack space that
dml32_ModeSupportAndSystemConfigurationFull() uses by 216 bytes with
LLVM 16 (2152 -> 1936), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Additionally, while modifying the arguments to
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(), use 'v'
consistently, instead of 'v' mixed with 'mode_lib->vba'.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Signed-off-by: Nathan Chancellor 
---
 .../dc/dml/dcn32/display_mode_vba_32.c| 118 ++---
 .../dc/dml/dcn32/display_mode_vba_util_32.c   | 248 --
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  33 +--
 3 files changed, 140 insertions(+), 259 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index d8014bfbc3fe..7da90fba95fc 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -1163,58 +1163,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman

v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters.SMNLatency
 = mode_lib->vba.SMNLatency;
 
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
-   mode_lib->vba.USRRetrainingRequiredFinal,
-   mode_lib->vba.UsesMALLForPStateChange,
-   
mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
-   mode_lib->vba.NumberOfActiveSurfaces,
-   mode_lib->vba.MaxLineBufferLines,
-   mode_lib->vba.LineBufferSizeFinal,
-   mode_lib->vba.WritebackInterfaceBufferSize,
-   mode_lib->vba.DCFCLK,
-   mode_lib->vba.ReturnBW,
-   mode_lib->vba.SynchronizeTimingsFinal,
-   
mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
-   mode_lib->vba.DRRDisplay,
-   v->dpte_group_bytes,
-   v->meta_row_height,
-   v->meta_row_height_chroma,
+   v,
+   v->PrefetchModePerState[v->VoltageLevel][v->maxMpcComb],
+   v->DCFCLK,
+   v->ReturnBW,

v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters,
-   mode_lib->vba.WritebackChunkSize,
-   mode_lib->vba.SOCCLK,
+   v->SOCCLK,
v->DCFCLKDeepSleep,
-   mode_lib->vba.DETBufferSizeY,
-   mode_lib->vba.DETBufferSizeC,
-   mode_lib->vba.SwathHeightY,
-   mode_lib->vba.SwathHeightC,
-   mode_lib->vba.LBBitPerPixel,
+   v->DETBufferSizeY,
+   v->DETBufferSizeC,
+   v->SwathHeightY,
+   v->SwathHeightC,
v->SwathWidthY,
v->SwathWidthC,
-   mode_lib->vba.HRatio,
-   mode_lib->vba.HRatioChroma,
-   mode_lib->vba.vtaps,
-   mode_lib->vba.VTAPsChroma,
-   mode_lib->vba.VRatio,
-   mode_lib->vba.VRatioChroma,
-   mode_lib->vba.HTotal,
-   mode_lib->vba.VTotal,
-   mode_lib->vba.VActive,
-   mode_lib->vba.PixelClock,
-   mode_lib->vba.BlendingAndTiming,
-   mode_lib->vba.DPPPerPlane,
+   v->DPPPerPlane,
v->BytePerPixelDETY,
v->BytePerPixelDETC,
v->DSTXAfterScaler,
v->DSTYAfterScaler,
-   mode_lib->vba.

[PATCH 0/5] drm/amd/display: Reduce stack usage for clang

2022-08-30 Thread Nathan Chancellor

Hi all,

This series aims to address the following warnings, which are visible
when building x86_64 allmodconfig with clang after commit 3876a8b5e241
("drm/amd/display: Enable building new display engine with KCOV
enabled").

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
some inconsistent indentings"). These warnings are fatal for
allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
patches cherry-picked to a branch targeting mainline to allow our builds
to go back to green. However, since this series is not exactly trivial
in size, I can understand not wanting to apply these to mainline during
the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
patch raising the value of -Wframe-larger-than for these files that can
be cherry-picked to 6.0/mainline then add a revert of that change as the
last patch in the stack so everything goes back to normal for -next/6.1.
I am open to other options though!

I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
x86_64. It has seen no runtime testing, as my only test system with AMD
graphics is a Renoir one, which as far as I understand it uses DCN 2.1.

Nathan Chancellor (5):
  drm/amd/display: Reduce number of arguments of
    dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of
    dml32_CalculatePrefetchSchedule()
  drm/amd/display: Reduce number of arguments of dml31's
    CalculateWatermarksAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of dml31's
    CalculateFlipSchedule()
  drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
    usage

 .../dc/dml/dcn30/display_mode_vba_30.c|   2 +-
 .../dc/dml/dcn31/display_mode_vba_31.c| 420 +-
 .../dc/dml/dcn32/display_mode_vba_32.c| 236 +++---
 .../dc/dml/dcn32/display_mode_vba_util_32.c   | 323 ++
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  51 +--
 5 files changed, 318 insertions(+), 714 deletions(-)


base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154
-- 
2.37.2
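
The argument-reduction patches in this series follow one pattern, which can be
sketched in standalone C roughly as below. This is a simplified illustration,
not the driver code: struct vba_sketch and both functions are hypothetical
stand-ins, with field names borrowed from the call sites quoted in the
patches. The idea is that values already stored in the shared state structure
are read through one pointer plus a plane index instead of each being passed
as its own argument. On x86_64, arguments beyond the first six integer and
eight floating-point ones are passed on the stack, so very long argument lists
tend to add to the caller's frame.

```c
/* Hypothetical, heavily reduced stand-in for 'struct vba_vars_st'. */
struct vba_sketch {
	double HTotal[4];
	double PixelClock[4];
	double VRatio[4];
};

/* Before: the caller loads every value and passes each one separately. */
static double flip_time_many_args(double HTotal, double PixelClock,
				  double VRatio)
{
	return HTotal / PixelClock * VRatio;
}

/*
 * After: the callee takes the state pointer and the plane index 'k' and
 * reads the same values itself, so the call site passes two arguments.
 */
static double flip_time_via_struct(const struct vba_sketch *v, unsigned int k)
{
	return v->HTotal[k] / v->PixelClock[k] * v->VRatio[k];
}
```

Both forms compute the same result; only the number of values marshalled at
the call site changes.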

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-25 Thread Nathan Chancellor

Hi AMD folks,

Top posting because it might not have been obvious but I was looking for
your feedback on this message (which can be viewed on lore.kernel.org if
you do not have the original [1]) so that we can try to get this fixed
in some way for 6.0/6.1. If my approach is not welcome, please consider
suggesting another one or looking to see if this is something you all
could look into.

[1]: https://lore.kernel.org/Yv5h0rb3AgTZLVJv@dev-arch.thelio-3990X/

Cheers,
Nathan

On Thu, Aug 18, 2022 at 08:59:14AM -0700, Nathan Chancellor wrote:
> Hi Arnd,
> 
> Doubling back around to this now since I think this is the only thing
> breaking x86_64 allmodconfig with clang 11 through 15.
> 
> On Fri, Aug 05, 2022 at 09:32:13PM +0200, Arnd Bergmann wrote:
> > On Fri, Aug 5, 2022 at 8:02 PM Nathan Chancellor  wrote:
> > > On Fri, Aug 05, 2022 at 06:16:45PM +0200, Arnd Bergmann wrote:
> > > > On Fri, Aug 5, 2022 at 5:32 PM Harry Wentland wrote:
> > > > While splitting out sub-functions can help reduce the maximum stack
> > > > usage, it seems that in this case it makes the actual problem worse:
> > > > I see 2168 bytes for the combined
> > > > dml32_ModeSupportAndSystemConfigurationFull(), but marking
> > > > mode_support_configuration() as noinline gives me 1992 bytes
> > > > for the outer function plus 384 bytes for the inner one. So it does
> > > > avoid the warning (barely), but not the problem that the warning tries
> > > > to point out.
> > >
> > > I haven't had a chance to take a look at splitting things up yet, would
> > > you recommend a different approach?
> > 
> > Splitting up large functions can help when you have large local variables
> > that are used in different parts of the function, and the split gets the
> > compiler to reuse stack locations.
> > 
> > > I think in this particular function, the problem isn't actually local
> > > variables but either pushing variables on the stack for argument passing,
> > > or something that causes the compiler to run out of registers so it
> > > has to spill registers to the stack.
> > > 
> > > In either case, one has to actually look at the generated output
> > > and then try to rearrange the code so this does not happen.
> > 
> > One thing to try would be to condense a function call like
> > 
> > dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
> > 
> > &v->dummy_vars.dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport,
> > mode_lib->vba.USRRetrainingRequiredFinal,
> > mode_lib->vba.UsesMALLForPStateChange,
> > 
> > mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
> > mode_lib->vba.NumberOfActiveSurfaces,
> > mode_lib->vba.MaxLineBufferLines,
> > mode_lib->vba.LineBufferSizeFinal,
> > mode_lib->vba.WritebackInterfaceBufferSize,
> > mode_lib->vba.DCFCLK,
> > mode_lib->vba.ReturnBW,
> > mode_lib->vba.SynchronizeTimingsFinal,
> > 
> > mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
> > mode_lib->vba.DRRDisplay,
> > v->dpte_group_bytes,
> > v->meta_row_height,
> > v->meta_row_height_chroma,
> > 
> > v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters,
> > mode_lib->vba.WritebackChunkSize,
> > mode_lib->vba.SOCCLK,
> > v->DCFCLKDeepSleep,
> > mode_lib->vba.DETBufferSizeY,
> > mode_lib->vba.DETBufferSizeC,
> > mode_lib->vba.SwathHeightY,
> > mode_lib->vba.SwathHeightC,
> > mode_lib->vba.LBBitPerPixel,
> > v->SwathWidthY,
> > v->SwathWidthC,
> > mode_lib->vba.HRatio,
> > mode_lib->vba.HRatioChroma,
> > mode_lib->vba.vtaps,
> > mode_lib->vba.VTAPsChroma,
> > mode_lib->vba.VRatio,
> > mode_lib->vba.VRatioChroma,
> >

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-18 Thread Nathan Chancellor

Hi Arnd,

Doubling back around to this now since I think this is the only thing
breaking x86_64 allmodconfig with clang 11 through 15.

On Fri, Aug 05, 2022 at 09:32:13PM +0200, Arnd Bergmann wrote:
> On Fri, Aug 5, 2022 at 8:02 PM Nathan Chancellor  wrote:
> > On Fri, Aug 05, 2022 at 06:16:45PM +0200, Arnd Bergmann wrote:
> > > > On Fri, Aug 5, 2022 at 5:32 PM Harry Wentland wrote:
> > > While splitting out sub-functions can help reduce the maximum stack
> > > usage, it seems that in this case it makes the actual problem worse:
> > > I see 2168 bytes for the combined
> > > dml32_ModeSupportAndSystemConfigurationFull(), but marking
> > > mode_support_configuration() as noinline gives me 1992 bytes
> > > for the outer function plus 384 bytes for the inner one. So it does
> > > avoid the warning (barely), but not the problem that the warning tries
> > > to point out.
> >
> > I haven't had a chance to take a look at splitting things up yet, would
> > you recommend a different approach?
> 
> Splitting up large functions can help when you have large local variables
> that are used in different parts of the function, and the split gets the
> compiler to reuse stack locations.
> 
> I think in this particular function, the problem isn't actually local
> variables but either pushing variables on the stack for argument passing,
> or something that causes the compiler to run out of registers so it
> has to spill registers to the stack.
> 
> In either case, one has to actually look at the generated output
> and then try to rearrange the code so this does not happen.
> 
> One thing to try would be to condense a function call like
> 
> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
> 
> &v->dummy_vars.dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport,
> mode_lib->vba.USRRetrainingRequiredFinal,
> mode_lib->vba.UsesMALLForPStateChange,
> 
> mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
> mode_lib->vba.NumberOfActiveSurfaces,
> mode_lib->vba.MaxLineBufferLines,
> mode_lib->vba.LineBufferSizeFinal,
> mode_lib->vba.WritebackInterfaceBufferSize,
> mode_lib->vba.DCFCLK,
> mode_lib->vba.ReturnBW,
> mode_lib->vba.SynchronizeTimingsFinal,
> 
> mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
> mode_lib->vba.DRRDisplay,
> v->dpte_group_bytes,
> v->meta_row_height,
> v->meta_row_height_chroma,
> 
> v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters,
> mode_lib->vba.WritebackChunkSize,
> mode_lib->vba.SOCCLK,
> v->DCFCLKDeepSleep,
> mode_lib->vba.DETBufferSizeY,
> mode_lib->vba.DETBufferSizeC,
> mode_lib->vba.SwathHeightY,
> mode_lib->vba.SwathHeightC,
> mode_lib->vba.LBBitPerPixel,
> v->SwathWidthY,
> v->SwathWidthC,
> mode_lib->vba.HRatio,
> mode_lib->vba.HRatioChroma,
> mode_lib->vba.vtaps,
> mode_lib->vba.VTAPsChroma,
> mode_lib->vba.VRatio,
> mode_lib->vba.VRatioChroma,
> mode_lib->vba.HTotal,
> mode_lib->vba.VTotal,
> mode_lib->vba.VActive,
> mode_lib->vba.PixelClock,
> mode_lib->vba.BlendingAndTiming,
>  /* more arguments */);
> 
> into calling conventions that take a pointer to 'mode_lib->vba' and another
> one to 'v', so these are no longer passed on the stack individually.

So I took a whack at reducing this function's number of parameters and
ended up with the attached patch. I basically just removed any
parameters that were identical between the two call sites and access them
through the vba pointer, as you suggested.
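
A minimal sketch of that idea, not the attached patch itself: the struct,
its fields, and both function names below are hypothetical stand-ins, and
the real dml32 function takes far more arguments. It only shows how values
that both call sites pass identically can be read through one pointer
instead of being passed one by one.

```c
/* Hypothetical stand-in for the large 'vba' state structure. */
struct vba_state {
	double dcfclk;
	double return_bw;
	double socclk;
	unsigned int num_active_surfaces;
	unsigned int max_line_buffer_lines;
};

/* Before: every value is its own argument, so long lists spill to the stack. */
static double watermark_many_args(double dcfclk, double return_bw,
				  double socclk,
				  unsigned int num_active_surfaces,
				  unsigned int max_line_buffer_lines)
{
	return (dcfclk + return_bw + socclk) * num_active_surfaces /
	       max_line_buffer_lines;
}

/* After: a single pointer is passed and the callee reads what it needs. */
static double watermark_by_pointer(const struct vba_state *vba)
{
	return (vba->dcfclk + vba->return_bw + vba->socclk) *
	       vba->num_active_surfaces / vba->max_line_buffer_lines;
}
```

Both versions compute the same value; only the calling convention changes,
which is what matters for how much the caller has to place on the stack.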

AMD folks, is this an acceptable approach? It didn't take a trivial
amount of time so I want to make sure this is okay before I do it to
more fu

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-05 Thread Nathan Chancellor

On Fri, Aug 05, 2022 at 06:16:45PM +0200, Arnd Bergmann wrote:
> On Fri, Aug 5, 2022 at 5:32 PM Harry Wentland  wrote:
> > > I do notice that these files build with a non-configurable
> > > > -Wframe-larger-than value:
> > >
> > > $ rg frame_warn_flag drivers/gpu/drm/amd/display/dc/dml/Makefile
> > > 54:frame_warn_flag := -Wframe-larger-than=2048
> >
> > Tbh, I was looking at the history and I can't find a good reason this
> > was added. It should be safe to drop this. I would much rather use
> > the CONFIG_FRAME_WARN value than override it.
> >
> > AFAIK most builds use 2048 by default anyways.
> 
> I'm fairly sure this was done for 32-bit builds, which default to a lower
> warning limit of 1024 bytes and would otherwise run into this
> problem when 64-bit platforms don't. With the default warning limit,
> clang warns even more about an i386 build:
> 
> display/dc/dml/dcn20/display_rq_dlg_calc_20.c:1549:6: error: stack
> frame size (1324) exceeds limit (1024) in 'dml20_rq_dlg_get_dlg_reg'
> display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:1550:6: error: stack
> frame size (1324) exceeds limit (1024) in 'dml20v2_rq_dlg_get_dlg_reg'
> display/dc/dml/dcn30/display_rq_dlg_calc_30.c:1742:6: error: stack
> frame size (1484) exceeds limit (1024) in 'dml30_rq_dlg_get_dlg_reg'
> display/dc/dml/dcn31/display_rq_dlg_calc_31.c:1571:6: error: stack
> frame size (1548) exceeds limit (1024) in 'dml31_rq_dlg_get_dlg_reg'
> display/dc/dml/dcn21/display_rq_dlg_calc_21.c:1657:6: error: stack
> frame size (1388) exceeds limit (1024) in 'dml21_rq_dlg_get_dlg_reg'
> display/dc/dml/dcn32/display_rq_dlg_calc_32.c:206:6: error: stack
> frame size (1276) exceeds limit (1024) in 'dml32_rq_dlg_get_dlg_reg'
> display/dc/dml/dcn31/display_mode_vba_31.c:2049:13: error: stack frame
> size (1468) exceeds limit (1024) in
> 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation'
> display/dc/dml/dcn20/display_mode_vba_20v2.c:1145:13: error: stack
> frame size (1228) exceeds limit (1024) in
> 'dml20v2_DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation'
> display/dc/dml/dcn20/display_mode_vba_20.c:1085:13: error: stack frame
> size (1340) exceeds limit (1024) in
> 'dml20_DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation'
> display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame
> size (1996) exceeds limit (1024) in
> 'dml31_ModeSupportAndSystemConfigurationFull'
> display/dc/dml/dcn21/display_mode_vba_21.c:1466:13: error: stack frame
> size (1308) exceeds limit (1024) in
> 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation'
> display/dc/dml/dcn20/display_mode_vba_20v2.c:3393:6: error: stack
> frame size (1356) exceeds limit (1024) in
> 'dml20v2_ModeSupportAndSystemConfigurationFull'
> display/dc/dml/dcn20/display_mode_vba_20.c:3286:6: error: stack frame
> size (1468) exceeds limit (1024) in
> 'dml20_ModeSupportAndSystemConfigurationFull'
> display/dc/dml/dcn21/display_mode_vba_21.c:3518:6: error: stack frame
> size (1228) exceeds limit (1024) in
> 'dml21_ModeSupportAndSystemConfigurationFull'
> display/dc/dml/dcn30/display_mode_vba_30.c:1906:13: error: stack frame
> size (1436) exceeds limit (1024) in
> 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation'
> display/dc/dml/dcn30/display_mode_vba_30.c:3596:6: error: stack frame
> size (2092) exceeds limit (1024) in
> 'dml30_ModeSupportAndSystemConfigurationFull'
> > > I do note that commit 1b54a0121dba ("drm/amd/display: Reduce stack size
> > > in the mode support function") did have a workaround for GCC. It appears
> > > clang will still inline mode_support_configuration(). If I mark it as
> > > 'noinline', the warning disappears in that file.
> >
> > That'd be the best quick fix. I guess if we split out functions to fix
> > stack usage we should mark them as 'noinline' in the future to avoid
> > > aggressive compiler optimizations.
> 
> While splitting out sub-functions can help reduce the maximum stack
> usage, it seems that in this case it makes the actual problem worse:
> I see 2168 bytes for the combined
> dml32_ModeSupportAndSystemConfigurationFull(), but marking
> mode_support_configuration() as noinline gives me 1992 bytes
> for the outer function plus 384 bytes for the inner one. So it does
> avoid the warning (barely), but not the problem that the warning tries
> to point out.

I haven't had a chance to take a look at splitting things up yet, would
you recommend a different approach?

Cheers,
Nathan

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-04 Thread Nathan Chancellor

On Thu, Aug 04, 2022 at 02:59:01PM -0700, Linus Torvalds wrote:
> On Thu, Aug 4, 2022 at 1:43 PM Nathan Chancellor  wrote:
> >
> > I do note that commit 1b54a0121dba ("drm/amd/display: Reduce stack size
> > in the mode support function") did have a workaround for GCC. It appears
> > clang will still inline mode_support_configuration(). If I mark it as
> > 'noinline', the warning disappears in that file.
> 
> That sounds like probably the best option for now. Gcc does not inline
> that function (at least for allmodconfig builds in my testing), so if
> that makes clang match what gcc does, it seems a reasonable thing to
> do.

Sounds good. That solution only takes care of the warning in
display_mode_vba_32.c. I will try and come up with something similar for
the other two files tomorrow, unless the AMD folks beat me to it, since
they will know the driver better than I will ;)
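
For illustration only, a small userspace sketch of what marking a helper
'noinline' does; the function names and buffer size are made up and have
nothing to do with the driver. It uses the GCC/Clang attribute spelling
directly, whereas kernel code would normally use the 'noinline' macro.

```c
#include <stddef.h>
#include <string.h>

/*
 * Helper with a large local buffer. Because it is never inlined, the
 * buffer lives in this function's own stack frame rather than being
 * merged into the caller's frame.
 */
static __attribute__((noinline)) unsigned int checksum_block(unsigned char seed)
{
	unsigned char scratch[512];
	unsigned int sum = 0;
	size_t i;

	memset(scratch, seed, sizeof(scratch));
	for (i = 0; i < sizeof(scratch); i++)
		sum += scratch[i];

	return sum;
}

/* The caller's frame no longer has to account for 'scratch'. */
static unsigned int outer(void)
{
	return checksum_block(1) + checksum_block(2);
}
```

This moves stack usage out of the frame that the warning measures; it does
not by itself reduce how deep the stack gets while the helper is running.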

Cheers,
Nathan

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-04 Thread Nathan Chancellor

On Thu, Aug 04, 2022 at 09:24:41PM +0200, Arnd Bergmann wrote:
> On Thu, Aug 4, 2022 at 8:52 PM Linus Torvalds
>  wrote:
> >
> > > On Thu, Aug 4, 2022 at 11:37 AM Sudip Mukherjee (Codethink) wrote:
> > >
> > > > git bisect points to 3876a8b5e241 ("drm/amd/display: Enable building new
> > > > display engine with KCOV enabled").
> >
> > Ahh. So that was presumably why it was disabled before - because it
> > presumably does disgusting things that make KCOV generate even bigger
> > stack frames than it already has.
> >
> > Those functions do seem to have fairly big stack footprints already (I
> > didn't try to look into why, I assume it's partly due to aggressive
> > inlining, and probably some automatic structures on stack). But gcc
> > doesn't seem to make it all that much worse with KCOV (and my clang
> > build doesn't enable KCOV).
> >
> > So it's presumably some KCOV-vs-clang thing. Nathan?

Looks like Arnd beat me to it :)

> The dependency was originally added to avoid a link failure in 9d1d02ff3678
> ("drm/amd/display: Don't build DCN1 when kcov is enabled") after I reported
> the problem in
> https://lists.freedesktop.org/archives/dri-devel/2018-August/186131.html
> 
> The commit from the bisection just turns off KCOV for the entire directory
> to avoid the link failure, so it's not actually a problem with KCOV vs clang,
> but I think a problem with clang vs badly written code that was obscured
> in allmodconfig builds prior to this.

Right, I do think the sanitizers make things worse here too, as those get
enabled with allmodconfig. I ran some really quick tests with allmodconfig and
a few instrumentation options flipped on/off:

allmodconfig (CONFIG_KASAN=y, CONFIG_KCSAN=n, CONFIG_KCOV=y, and CONFIG_UBSAN=y):

warning: stack frame size (2216) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2184) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2176) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]

allmodconfig + CONFIG_KASAN=n:

warning: stack frame size (2112) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]

allmodconfig + CONFIG_KCOV=n:

warning: stack frame size (2216) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2184) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2176) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]

allmodconfig + CONFIG_UBSAN=n:

warning: stack frame size (2584) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2680) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2352) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]

allmodconfig + CONFIG_KASAN=n + CONFIG_KCSAN=y + CONFIG_UBSAN=n:

warning: stack frame size (2504) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2600) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
warning: stack frame size (2264) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]

allmodconfig + CONFIG_KASAN=n + CONFIG_KCSAN=n + CONFIG_UBSAN=n:

warning: stack frame size (2072) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]

There might be other debugging configurations that make this worse too,
as I don't see those warnings on my distribution configuration.

> The dml30_ModeSupportAndSystemConfigurationFull() function exercises
> a few paths in the compiler that are otherwise rare. One thing it does is to
> pass up to 60 arguments to other functions, and it heavily uses float and
> double variables. Both of these make it rather fragile when it comes to
> unusual compiler options, so the files keep coming up whenever a new
> instrumentation feature gets added. There is probably some other flag
> in allmodconfig that we can disable to improve this again, but I have not
> checked this time.

I do notice that these files build with a non-configurable
-Wframe-larger-than value:

$ rg frame_warn_flag drivers/gpu/drm/amd/display/dc/dml/Makefile
54:frame_warn_flag := -Wframe-larger-than=2048
70:CFLAGS_$(AMDDALPATH)/dc/dml/dcn30/display_mode_vba_30.o := $(dml_ccflags) $(frame_warn_flag)
72:CFLAGS_$(AMDDALPATH)/dc/dml/dcn31/display_mode_vba_31.o := $(dml_ccflags) $(frame_warn_flag)
76:CFLAGS_$(AMDDALPATH)/dc/dml/dcn32/display_mode_vba_32.o := $(dml_ccflags) $(frame_warn_flag)

I suppose that could just be bumped as a quick workaround? Two of tho

Re: [PATCH 02/40] drm/amd/display: Add SubVP required code

2022-07-06 Thread Nathan Chancellor

On Wed, Jul 06, 2022 at 03:38:57PM -0400, Alex Deucher wrote:
> On Wed, Jul 6, 2022 at 1:58 PM Nathan Chancellor  wrote:
> >
> > On Thu, Jun 30, 2022 at 03:12:44PM -0400, Rodrigo Siqueira wrote:
> > > From: Alvin Lee 
> > >
> > > This commit enables the SubVP feature. To achieve that, we need to:
> > >
> > > - Don't force p-state disallow on SubVP (can't block dummy p-state)
> > > - Send calculated watermark to DMCUB for SubVP
> > > - Adjust CAB mode message to PMFW
> > > - Add a proper locking sequence for SubVP
> > > - Various fixes to SubVP static analysis and determining SubVP config
> > > - Currently SubVP not supported with pipe split so merge all pipes
> > >   before setting up SubVp
> > >
> > > Reviewed-by: Jun Lei 
> > > Acked-by: Rodrigo Siqueira 
> > > Acked-by: Alan Liu 
> > > Signed-off-by: Alvin Lee 
> >
> > This patch is now in linux-next as commit 85f4bc0c333c
> > ("drm/amd/display: Add SubVP required code"), where it causes build
> > failures when building for arm64 with both Clang and GCC (see bisect log
> > below).
> >
> > Clang shows errors during modpost:
> >
> > ERROR: modpost: "__floatunsidf" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "__divdf3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "fma" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "__adddf3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "__fixdfsi" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "__muldf3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "__floatsidf" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> > ERROR: modpost: "__fixunsdfsi" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> >
> 
> I think the attached patch may fix this.

Indeed, I tested both arm64 and riscv:

Tested-by: Nathan Chancellor 

> > GCC shows errors along the lines of:
> >
> > In function 'populate_subvp_cmd_pipe_info',
> > inlined from 'dc_dmub_setup_subvp_dmub_command' at 
> > /home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:675:5:
> > /home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:603:91:
> >  error: '-mgeneral-regs-only' is incompatible with the use of 
> > floating-point types
> >   603 | 
> > (((double)dc->caps.subvp_prefetch_end_to_mall_start_us / 100) *
> >   |  
> > ~^
> >   604 | (phantom_timing->pix_clk_100hz * 100) + 
> > phantom_timing->h_total - 1) /
> >   | ~
> > /home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:604:63:
> >  error: '-mgeneral-regs-only' is incompatible with the use of 
> > floating-point types
> >   603 | 
> > (((double)dc->caps.subvp_prefetch_end_to_mall_start_us / 100) *
> >   |  
> > ~~
> >   604 | (phantom_timing->pix_clk_100hz * 100) + 
> > phantom_timing->h_total - 1) /
> >   | 
> > ~~^
> > /home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:602:72:
> >  error: '-mgeneral-regs-only' is incompatible with the use of 
> > floating-point types
> >   602 | 
> > pipe_data->pipe_config.subvp_data.prefetch_to_mall_start_lines =
> >   | 
> > ~~~^
> >   603 | 
> > (((double)dc->caps.subvp_prefetch_end_to_mall_start_us / 100) *
> >   | 
> > ~~~
> >   604 | (phantom_timing->pix_clk_100hz * 100) + 
> > phantom_timing->h_total - 1) /
> >   | 
> > ~~
> >   605 | (double)phantom_timing->h_total;
> >   |

Re: [PATCH 02/40] drm/amd/display: Add SubVP required code

2022-07-06 Thread Nathan Chancellor

On Thu, Jun 30, 2022 at 03:12:44PM -0400, Rodrigo Siqueira wrote:
> From: Alvin Lee 
> 
> This commit enables the SubVP feature. To achieve that, we need to:
> 
> - Don't force p-state disallow on SubVP (can't block dummy p-state)
> - Send calculated watermark to DMCUB for SubVP
> - Adjust CAB mode message to PMFW
> - Add a proper locking sequence for SubVP
> - Various fixes to SubVP static analysis and determining SubVP config
> - Currently SubVP not supported with pipe split so merge all pipes
>   before setting up SubVp
> 
> Reviewed-by: Jun Lei 
> Acked-by: Rodrigo Siqueira 
> Acked-by: Alan Liu 
> Signed-off-by: Alvin Lee 

This patch is now in linux-next as commit 85f4bc0c333c
("drm/amd/display: Add SubVP required code"), where it causes build
failures when building for arm64 with both Clang and GCC (see bisect log
below).

Clang shows errors during modpost:

ERROR: modpost: "__floatunsidf" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__divdf3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "fma" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__adddf3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__fixdfsi" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__muldf3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__floatsidf" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: modpost: "__fixunsdfsi" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

GCC shows errors along the lines of:

In function 'populate_subvp_cmd_pipe_info',
inlined from 'dc_dmub_setup_subvp_dmub_command' at 
/home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:675:5:
/home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:603:91:
 error: '-mgeneral-regs-only' is incompatible with the use of floating-point 
types
  603 | 
(((double)dc->caps.subvp_prefetch_end_to_mall_start_us / 100) *
  |  
~^
  604 | (phantom_timing->pix_clk_100hz * 100) + 
phantom_timing->h_total - 1) /
  | ~
/home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:604:63:
 error: '-mgeneral-regs-only' is incompatible with the use of floating-point 
types
  603 | 
(((double)dc->caps.subvp_prefetch_end_to_mall_start_us / 100) *
  |  
~~
  604 | (phantom_timing->pix_clk_100hz * 100) + 
phantom_timing->h_total - 1) /
  | 
~~^
/home/nathan/cbl/src/linux-next/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:602:72:
 error: '-mgeneral-regs-only' is incompatible with the use of floating-point 
types
  602 | pipe_data->pipe_config.subvp_data.prefetch_to_mall_start_lines =
  | ~~~^
  603 | 
(((double)dc->caps.subvp_prefetch_end_to_mall_start_us / 100) *
  | 
~~~
  604 | (phantom_timing->pix_clk_100hz * 100) + 
phantom_timing->h_total - 1) /
  | 
~~
  605 | (double)phantom_timing->h_total;
  | ~~~
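
Both sets of errors come from 'double' arithmetic in code that is built
without floating-point support. As a hedged illustration of one way such a
calculation can be expressed with integers only (this is an assumption for
the sake of example, not necessarily what any fix for this driver does),
the helper below rounds a delay up to whole display lines. The function
name, parameter names, and units are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a delay in microseconds into a number of
 * display lines, rounded up, without using floating-point types.
 *
 * pix_clk_100hz is the pixel clock in units of 100 Hz, so
 *   pixels = delay_us * 1e-6 * (pix_clk_100hz * 100)
 *          = delay_us * pix_clk_100hz / 10000
 */
static uint32_t delay_us_to_lines(uint32_t delay_us, uint32_t pix_clk_100hz,
				  uint32_t h_total)
{
	uint64_t pixels;

	assert(h_total != 0);

	/* Widen before multiplying so the product cannot overflow 32 bits. */
	pixels = (uint64_t)delay_us * pix_clk_100hz / 10000;

	/* Integer ceiling division: round up to whole lines. */
	return (uint32_t)((pixels + h_total - 1) / h_total);
}
```

The intermediate truncation means the result can differ slightly from a
floating-point version, so a real conversion would need to be checked
against the hardware requirements.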

I initially reproduced this with Fedora's configuration [1] but it
appears that allmodconfig should show it as well. Our CI also shows
problems for ARCH=riscv allmodconfig [2].

I am happy to test patches as necessary.

[1]: 
https://src.fedoraproject.org/rpms/kernel/raw/rawhide/f/kernel-aarch64-fedora.config
[2]: https://builds.tuxbuild.com/2BZS5HPSuDdoMFw6mxjG2ZmT441/build.log

Cheers,
Nathan

# bad: [088b9c375534d905a4d337c78db3b3bfbb52c4a0] Add linux-next specific files for 20220706
# good: [e35e5b6f695d241ffb1d223207da58a1fbcdff4b] Merge tag 'xsa-5.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect start '088b9c375534d905a4d337c78db3b3bfbb52c4a0' 'e35e5b6f695d241ffb1d223207da58a1fbcdff4b'
# good: [1a4255ede07a967e57115b54da5bd4b571d22a8c] Merge branch 'for-linux-next' of git://anongit.freedesktop.org/drm/drm-misc
git bisect good 1a4255ede07a967e57115b54da5bd4b571d22a8c
# bad: [756b44529e2ab179e4dd6f6358b5c351e1bbe5d3] Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
git bisect bad 756b44529e2ab179e4dd6f6358b5c351e1bbe5d3
# bad: [f26873a2fc786251765db3e0ced8e1424b862059] next-20220705/sound-asoc
git bisect bad f26873a2fc786251765db3e0ced8e1424b862

[PATCH] drm/amd/display: Fix indentation in dcn32_get_vco_frequency_from_reg()

2022-06-23 Thread Nathan Chancellor

Clang warns:

  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c:549:4: warning: misleading indentation; statement is not part of the previous 'else' [-Wmisleading-indentation]
  pll_req = dc_fixpt_from_int(pll_req_reg & clk_mgr->clk_mgr_mask->FbMult_int);
  ^
  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c:542:3: note: previous statement is here
  else
  ^
  1 warning generated.

Indent this statement to the left, as it was clearly intended to be
called unconditionally, which will fix the warning.
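
A reduced sketch of the pattern, with made-up names rather than the driver
code: a statement after an unbraced if/else always runs, so indenting it
one level deeper than the 'else' only makes it look conditional.

```c
/* Hypothetical reduction of the pattern; names are illustrative only. */
static unsigned int read_pll(int use_alt, unsigned int reg, unsigned int mask)
{
	unsigned int base;
	unsigned int value;

	if (use_alt)
		base = 1;
	else
		base = 2;

	/*
	 * This runs on both paths. Writing it one tab deeper, lined up with
	 * 'base = 2;', would not change behaviour but is what triggers
	 * -Wmisleading-indentation.
	 */
	value = reg & mask;

	return base + value;
}
```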

Link: https://github.com/ClangBuiltLinux/linux/issues/1655
Fixes: 3e838f7ccf64 ("drm/amd/display: Get VCO frequency from registers")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
index 113f93b3d3b2..4e8059f20007 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
@@ -546,7 +546,7 @@ static uint32_t dcn32_get_vco_frequency_from_reg(struct 
clk_mgr_internal *clk_mg
 * this works because the int part is on the right edge of the 
register
 * and the frac part is on the left edge
 */
-   pll_req = dc_fixpt_from_int(pll_req_reg & 
clk_mgr->clk_mgr_mask->FbMult_int);
+   pll_req = dc_fixpt_from_int(pll_req_reg & 
clk_mgr->clk_mgr_mask->FbMult_int);
pll_req.value |= pll_req_reg & 
clk_mgr->clk_mgr_mask->FbMult_frac;
 
/* multiply by REFCLK period */

base-commit: fdf249c70a36e2daa7ddf1252cf3b71faed87abc
-- 
2.36.1

Re: [PATCH 16/31] drm/amd/display: refactor function transmitter_to_phy_id

2022-06-17 Thread Nathan Chancellor

Hi Rodrigo,

On Fri, Jun 17, 2022 at 03:34:57PM -0400, Rodrigo Siqueira wrote:
> From: Nicholas Choi 
> 
> [Why & How]
> Since we only need transmitter value in function transmitter_to_phy_id().
> Replace argument struct dc_link with enum transmitter.
> 
> Reviewed-by: Chao-kai Wang 
> Acked-by: Alan Liu 
> Reviewed-by: Nicholas Kazlauskas 
> Signed-off-by: Nathan Chancellor 
> Signed-off-by: Alex Deucher 

How did I end up in the signoff chain for a patch I have never seen up
until this point? That should definitely be cleaned up.

Additionally, this commit message doesn't really seem to line up with
the change. It says that "struct dc_link" is being replaced with "enum
transmitter", when it is really the reverse, and that only the
transmitter value is needed, which is already the case, right? I guess
this is so that you can use DC_ERROR(), which requires a dc_ctx
variable? It is not immediately obvious from the commit message so that
should be clarified as well.

Cheers,
Nathan

> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index 43b55bc6e2db..58882d42eff5 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -3185,8 +3185,11 @@ bool dc_link_get_psr_state(const struct dc_link *link, 
> enum dc_psr_state *state)
>  }
>  
>  static inline enum physical_phy_id
> -transmitter_to_phy_id(enum transmitter transmitter_value)
> +transmitter_to_phy_id(struct dc_link *link)
>  {
> + struct dc_context *dc_ctx = link->ctx;
> + enum transmitter transmitter_value = link->link_enc->transmitter;
> +
>   switch (transmitter_value) {
>   case TRANSMITTER_UNIPHY_A:
>   return PHYLD_0;
> @@ -3213,8 +3216,7 @@ transmitter_to_phy_id(enum transmitter 
> transmitter_value)
>   case TRANSMITTER_UNKNOWN:
>   return PHYLD_UNKNOWN;
>   default:
> - WARN_ONCE(1, "Unknown transmitter value %d\n",
> -   transmitter_value);
> + DC_ERROR("Unknown transmitter value %d\n", transmitter_value);
>   return PHYLD_UNKNOWN;
>   }
>  }
> @@ -3331,7 +,7 @@ bool dc_link_setup_psr(struct dc_link *link,
>   psr_context->phyType = PHY_TYPE_UNIPHY;
>   /*PhyId is associated with the transmitter id*/
>   psr_context->smuPhyId =
> - transmitter_to_phy_id(link->link_enc->transmitter);
> + transmitter_to_phy_id(link);
>  
>   psr_context->crtcTimingVerticalTotal = stream->timing.v_total;
>   psr_context->vsync_rate_hz = div64_u64(div64_u64((stream->
> -- 
> 2.25.1
> 
>

Re: linux-next: Tree for Jun 15 (drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c)

2022-06-15 Thread Nathan Chancellor

On Wed, Jun 15, 2022 at 04:45:16PM -0400, Alex Deucher wrote:
> On Wed, Jun 15, 2022 at 4:24 PM Nathan Chancellor  wrote:
> >
> > On Wed, Jun 15, 2022 at 03:28:52PM -0400, Alex Deucher wrote:
> > > On Wed, Jun 15, 2022 at 3:01 PM Randy Dunlap  
> > > wrote:
> > > >
> > > >
> > > >
> > > > On 6/14/22 23:01, Stephen Rothwell wrote:
> > > > > Hi all,
> > > > >
> > > > > Changes since 20220614:
> > > > >
> > > >
> > > > on i386:
> > > > # CONFIG_DEBUG_FS is not set
> > > >
> > > >
> > > > ../drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In 
> > > > function ‘amdgpu_dm_crtc_late_register’:
> > > > ../drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:6599:2: 
> > > > error: implicit declaration of function ‘crtc_debugfs_init’; did you 
> > > > mean ‘amdgpu_debugfs_init’? [-Werror=implicit-function-declaration]
> > > >   crtc_debugfs_init(crtc);
> > > >   ^
> > > >   amdgpu_debugfs_init
> > > >
> > > >
> > > > Full randconfig file is attached.
> > >
> > > I tried building with your config and I can't repro this.  As Harry
> > > noted, that function and the whole secure display feature depend on
> > > debugfs.  It should never be built without CONFIG_DEBUG_FS.  See
> > > drivers/gpu/drm/amd/display/Kconfig:
> > >
> > > > config DRM_AMD_SECURE_DISPLAY
> > > > bool "Enable secure display support"
> > > > default n
> > > > depends on DEBUG_FS
> > > > depends on DRM_AMD_DC_DCN
> > > > help
> > > > Choose this option if you want to
> > > > support secure display
> > > >
> > > > This option enables the calculation
> > > > of crc of specific region via debugfs.
> > > > Cooperate with specific DMCU FW.
> > >
> > > amdgpu_dm_crtc_late_register is guarded by
> > > CONFIG_DRM_AMD_SECURE_DISPLAY.  It's not clear to me how we could hit
> > > this.
> >
> > I think the problem is that you are not looking at the right tree.
> >
> > The kernel test robot reported [1] [2] this error is caused by commit
> > 4cd79f614b50 ("drm/amd/display: Move connector debugfs to drm"), which
> > is in the drm-misc tree on the drm-misc-next branch. That change removes
> > the #ifdef around amdgpu_dm_crtc_late_register(), meaning that
> > crtc_debugfs_init() can be called without CONFIG_DRM_AMD_SECURE_DISPLAY
> > and CONFIG_DEBUG_FS.
> >
> >   $ git show -s --format='%h ("%s")'
> >   abf0ba5a34ea ("drm/bridge: it6505: Add missing CRYPTO_HASH dependency")
> >
> >   $ make -skj"$(nproc)" ARCH=x86_64 mrproper defconfig
> >
> >   $ scripts/config -d BLK_DEV_IO_TRACE -d DEBUG_FS -e DRM_AMDGPU
> >
> >   $ make -skj"$(nproc)" ARCH=x86_64 olddefconfig 
> > drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o
> >   drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function 
> > ‘amdgpu_dm_crtc_late_register’:
> >   drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:6622:9: 
> > error: implicit declaration of function ‘crtc_debugfs_init’; did you mean 
> > ‘amdgpu_debugfs_init’? [-Werror=implicit-function-declaration]
> >6622 | crtc_debugfs_init(crtc);
> > | ^
> > | amdgpu_debugfs_init
> >   cc1: all warnings being treated as errors
> >
> > Contrast that with the current top of your tree:
> >
> >   $ git show -s --format='%h ("%s")'
> >   c435f61d0eb3 ("drm/amd/display: Drop unnecessary guard from DC resource")
> >
> >   $ make -skj"$(nproc)" ARCH=x86_64 mrproper defconfig
> >
> >   $ scripts/config -d BLK_DEV_IO_TRACE -d DEBUG_FS -e DRM_AMDGPU
> >
> > >   $ make -skj"$(nproc)" ARCH=x86_64 olddefconfig drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o
> >
> >   $ echo $?
> >   0
> >
> > Randy's patch [3] seems like it should resolve the issue just fine but
> > it needs to be applied to drm-misc-next, not the amdgpu tree.
> 
> Thanks for tracking this down.  I think something like the attached
> patch is cleaner since the whole thing is only vali

Re: linux-next: Tree for Jun 15 (drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c)

2022-06-15 Thread Nathan Chancellor

On Wed, Jun 15, 2022 at 03:28:52PM -0400, Alex Deucher wrote:
> On Wed, Jun 15, 2022 at 3:01 PM Randy Dunlap  wrote:
> >
> >
> >
> > On 6/14/22 23:01, Stephen Rothwell wrote:
> > > Hi all,
> > >
> > > Changes since 20220614:
> > >
> >
> > on i386:
> > # CONFIG_DEBUG_FS is not set
> >
> >
> > ../drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function ‘amdgpu_dm_crtc_late_register’:
> > ../drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:6599:2: error: implicit declaration of function ‘crtc_debugfs_init’; did you mean ‘amdgpu_debugfs_init’? [-Werror=implicit-function-declaration]
> >   crtc_debugfs_init(crtc);
> >   ^
> >   amdgpu_debugfs_init
> >
> >
> > Full randconfig file is attached.
> 
> I tried building with your config and I can't repro this.  As Harry
> noted, that function and the whole secure display feature depend on
> debugfs.  It should never be built without CONFIG_DEBUG_FS.  See
> drivers/gpu/drm/amd/display/Kconfig:
> 
> > config DRM_AMD_SECURE_DISPLAY
> > bool "Enable secure display support"
> > default n
> > depends on DEBUG_FS
> > depends on DRM_AMD_DC_DCN
> > help
> > Choose this option if you want to
> > support secure display
> >
> > This option enables the calculation
> > of crc of specific region via debugfs.
> > Cooperate with specific DMCU FW.
> 
> amdgpu_dm_crtc_late_register is guarded by
> CONFIG_DRM_AMD_SECURE_DISPLAY.  It's not clear to me how we could hit
> this.

I think the problem is that you are not looking at the right tree.

The kernel test robot reported [1] [2] this error is caused by commit
4cd79f614b50 ("drm/amd/display: Move connector debugfs to drm"), which
is in the drm-misc tree on the drm-misc-next branch. That change removes
the #ifdef around amdgpu_dm_crtc_late_register(), meaning that
crtc_debugfs_init() can be called without CONFIG_DRM_AMD_SECURE_DISPLAY
and CONFIG_DEBUG_FS.

  $ git show -s --format='%h ("%s")'
  abf0ba5a34ea ("drm/bridge: it6505: Add missing CRYPTO_HASH dependency")

  $ make -skj"$(nproc)" ARCH=x86_64 mrproper defconfig

  $ scripts/config -d BLK_DEV_IO_TRACE -d DEBUG_FS -e DRM_AMDGPU

  $ make -skj"$(nproc)" ARCH=x86_64 olddefconfig drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o
  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function ‘amdgpu_dm_crtc_late_register’:
  drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:6622:9: error: implicit declaration of function ‘crtc_debugfs_init’; did you mean ‘amdgpu_debugfs_init’? [-Werror=implicit-function-declaration]
   6622 |         crtc_debugfs_init(crtc);
        |         ^
        |         amdgpu_debugfs_init
  cc1: all warnings being treated as errors

Contrast that with the current top of your tree:

  $ git show -s --format='%h ("%s")'
  c435f61d0eb3 ("drm/amd/display: Drop unnecessary guard from DC resource")

  $ make -skj"$(nproc)" ARCH=x86_64 mrproper defconfig

  $ scripts/config -d BLK_DEV_IO_TRACE -d DEBUG_FS -e DRM_AMDGPU

  $ make -skj"$(nproc)" ARCH=x86_64 olddefconfig drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o

  $ echo $?
  0

Randy's patch [3] seems like it should resolve the issue just fine but
it needs to be applied to drm-misc-next, not the amdgpu tree.
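
To illustrate the failure mode described above, here is a minimal userspace
sketch of a common kernel pattern: pairing the real function with a no-op stub
so that an unguarded caller still builds when the feature is configured out.
SKETCH_DEBUG_FS, struct sketch_crtc, and both functions are hypothetical
stand-ins, and this is not necessarily how Randy's patch resolves the issue.

```c
#include <stdio.h>

/* Hypothetical stand-in for a CRTC object; not the real struct drm_crtc. */
struct sketch_crtc {
	int index;
	int debugfs_entries;
};

#ifdef SKETCH_DEBUG_FS
/* Feature enabled: the real implementation is available. */
static void sketch_crtc_debugfs_init(struct sketch_crtc *crtc)
{
	crtc->debugfs_entries++;
	printf("debugfs entries created for crtc %d\n", crtc->index);
}
#else
/* Feature disabled: an empty stub keeps unguarded callers building. */
static inline void sketch_crtc_debugfs_init(struct sketch_crtc *crtc)
{
	(void)crtc;
}
#endif

/* The caller needs no #ifdef because one of the two definitions always exists. */
static int sketch_crtc_late_register(struct sketch_crtc *crtc)
{
	sketch_crtc_debugfs_init(crtc);
	return 0;
}
```

Without either the stub or a guard around the call site, the caller refers to
a function that was never declared, which is the implicit declaration error
shown in the build log.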

[1]: https://lore.kernel.org/202205241843.8ewkesia-...@intel.com/
[2]: https://lore.kernel.org/202205240207.kmdlusrc-...@intel.com/
[3]: https://lore.kernel.org/20220614155726.26211-1-rdun...@infradead.org/

Cheers,
Nathan

Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-13 Thread Nathan Chancellor

Hi Richard,

On Tue, Apr 12, 2022 at 04:50:00PM -0500, Richard Gong wrote:
> Active State Power Management (ASPM) feature is enabled since kernel 5.14.
> There are some AMD GFX cards (such as WX3200 and RX640) that won't work
> with ASPM-enabled Intel Alder Lake based systems. Using these GFX cards as
> video/display output, Intel Alder Lake based systems will hang during
> suspend/resume.
> 
> The issue was initially reported on one system (Dell Precision 3660 with
> BIOS version 0.14.81), but was later confirmed to affect at least 4 Alder
> Lake based systems.
> 
> Add extra check to disable ASPM on Intel Alder Lake based systems.
> 
> Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
> Reported-by: kernel test robot 
> Signed-off-by: Richard Gong 
> ---
> v4: s/CONFIG_X86_64/CONFIG_X86
> enhanced check logic
> v3: s/intel_core_asom_chk/aspm_support_quirk_check
> correct build error with W=1 option
> v2: correct commit description
> move the check from chip family to problematic platform
> ---
>  drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> index 039b90cdc3bc..b33e0a9bee65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> @@ -81,6 +81,10 @@
>  #include "mxgpu_vi.h"
>  #include "amdgpu_dm.h"
>  
> +#if IS_ENABLED(CONFIG_X86)
> +#include 
> +#endif
> +
>  #define ixPCIE_LC_L1_PM_SUBSTATE 0x100100C6
>  #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK 0x0001L
>  #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK 0x0002L
> @@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
>   WREG32_PCIE(ixPCIE_LC_CNTL, data);
>  }
>  
> +static bool aspm_support_quirk_check(void)
> +{
> + if (IS_ENABLED(CONFIG_X86)) {
> + struct cpuinfo_x86 *c = &cpu_data(0);
> +
> + return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
> + }

I have not seen this reported by a bot, sorry if it is a duplicate. This
breaks non-x86 builds (arm64 allmodconfig for example):

drivers/gpu/drm/amd/amdgpu/vi.c:1144:28: error: implicit declaration of function 'cpu_data' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        struct cpuinfo_x86 *c = &cpu_data(0);
                                 ^
drivers/gpu/drm/amd/amdgpu/vi.c:1144:27: error: cannot take the address of an rvalue of type 'int'
        struct cpuinfo_x86 *c = &cpu_data(0);
                                ^~~~
drivers/gpu/drm/amd/amdgpu/vi.c:1146:13: error: incomplete definition of type 'struct cpuinfo_x86'
        return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
                 ~^
drivers/gpu/drm/amd/amdgpu/vi.c:1144:10: note: forward declaration of 'struct cpuinfo_x86'
        struct cpuinfo_x86 *c = &cpu_data(0);
               ^
drivers/gpu/drm/amd/amdgpu/vi.c:1146:28: error: incomplete definition of type 'struct cpuinfo_x86'
        return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
                                ~^
drivers/gpu/drm/amd/amdgpu/vi.c:1144:10: note: forward declaration of 'struct cpuinfo_x86'
        struct cpuinfo_x86 *c = &cpu_data(0);
               ^
drivers/gpu/drm/amd/amdgpu/vi.c:1146:43: error: use of undeclared identifier 'INTEL_FAM6_ALDERLAKE'
        return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
                                                ^
5 errors generated.

'struct cpuinfo_x86' is only defined for CONFIG_X86, so this section
needs to be guarded with the preprocessor, which is how it was done in v2.
Please go back to that.
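
As a rough illustration of why a runtime-style "if (IS_ENABLED(...))" is not
enough here: the compiler still has to parse and type-check the body of a dead
branch, so types and constants that only exist on x86 must be hidden by the
preprocessor. The sketch below is userspace C; SKETCH_X86, struct
sketch_cpuinfo, and the model constant are hypothetical stand-ins for
CONFIG_X86, struct cpuinfo_x86, and INTEL_FAM6_ALDERLAKE.

```c
#include <stdbool.h>

/*
 * SKETCH_X86 is a hypothetical stand-in for CONFIG_X86. The struct, the
 * variable, and the model constant below exist only when it is defined,
 * mirroring how 'struct cpuinfo_x86' exists only on x86 builds.
 */
#ifdef SKETCH_X86
struct sketch_cpuinfo {
	int family;
	int model;
};
#define SKETCH_MODEL_ALDERLAKE 0x97 /* illustrative value for this sketch */
static struct sketch_cpuinfo sketch_boot_cpu = { 6, SKETCH_MODEL_ALDERLAKE };
#endif

/* Returns true when ASPM may stay enabled. */
static bool sketch_aspm_support_quirk_check(void)
{
#ifdef SKETCH_X86
	/* Only compiled where the type and constant are actually defined. */
	struct sketch_cpuinfo *c = &sketch_boot_cpu;

	return !(c->family == 6 && c->model == SKETCH_MODEL_ALDERLAKE);
#else
	/* Other architectures never see the x86-only identifiers. */
	return true;
#endif
}
```

Built without SKETCH_X86, the function compiles and returns true; built with
it, the x86-only check is compiled in.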

Cheers,
Nathan

[PATCH 5.4 2/2] drm/amdkfd: Fix -Wstrict-prototypes from amdgpu_amdkfd_gfx_10_0_get_functions()

2022-04-11 Thread Nathan Chancellor

This patch is for linux-5.4.y only, it has no equivalent change
upstream.

When building x86_64 allmodconfig with tip of tree clang, there is an
instance of -Wstrict-prototypes:

  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c:168:59: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
  struct kfd2kgd_calls *amdgpu_amdkfd_gfx_10_0_get_functions()
                                                            ^
                                                             void
  1 error generated.

amdgpu_amdkfd_gfx_10_0_get_functions() is prototyped properly in
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h but its definition in
amdgpu_amdkfd_gfx_v10.c does not have the argument types specified,
which causes the warning. GCC does not warn because it permits an
old-style definition if the prototype has the argument types.

This code was eliminated by commit e392c887df97 ("drm/amdkfd: Use array
to probe kfd2kgd_calls"), which was a part of a larger series that does
not look very suitable for stable. Just fix this one location, as it was
the only instance of this new warning across a variety of builds.

Fixes: 6bdadb207224 ("drm/amdgpu: Add navi10 kfd support for amdgpu (v3)")
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index ce30d4e8bf25..f7c4337c1ffe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -165,7 +165,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_tile_config = amdgpu_amdkfd_get_tile_config,
 };
 
-struct kfd2kgd_calls *amdgpu_amdkfd_gfx_10_0_get_functions()
+struct kfd2kgd_calls *amdgpu_amdkfd_gfx_10_0_get_functions(void)
 {
return (struct kfd2kgd_calls *)&kfd2kgd;
 }
-- 
2.35.1
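
For readers unfamiliar with the warning fixed above, the following is a small
sketch of the difference between an empty parameter list and "(void)". All
names are hypothetical; struct sketch_calls merely stands in for
struct kfd2kgd_calls.

```c
#include <stddef.h>

/* Hypothetical callback table standing in for struct kfd2kgd_calls. */
struct sketch_calls {
	int (*get_tile_config)(void);
};

static int sketch_get_tile_config(void)
{
	return 0;
}

static const struct sketch_calls sketch_table = {
	.get_tile_config = sketch_get_tile_config,
};

/* Prototype, as it would appear in a header: (void) means "no arguments". */
const struct sketch_calls *sketch_get_functions(void);

/*
 * Definition. Writing "sketch_get_functions()" here instead would, before
 * C23, define the function without a prototype, which is what
 * -Wstrict-prototypes flags.
 */
const struct sketch_calls *sketch_get_functions(void)
{
	return &sketch_table;
}
```

Keeping "(void)" on both the header declaration and the definition means the
two always agree, regardless of which compiler or language standard is used.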

[PATCH 5.4 1/2] drm/amdkfd: add missing void argument to function kgd2kfd_init

2022-04-11 Thread Nathan Chancellor

From: Colin Ian King 

commit 63617d8b125ed9f674133dd000b6df58d6b2965a upstream.

Function kgd2kfd_init is missing a void argument, add it
to clean up the non-ANSI function declaration.

Acked-by: Randy Dunlap 
Signed-off-by: Colin Ian King 
Signed-off-by: Alex Deucher 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 986ff52d5750..f4b7f7e6c40e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -82,7 +82,7 @@ static void kfd_exit(void)
kfd_chardev_exit();
 }
 
-int kgd2kfd_init()
+int kgd2kfd_init(void)
 {
return kfd_init();
 }
-- 
2.35.1

[PATCH 5.4 0/2] Fix two instances of -Wstrict-prototypes in drm/amd

2022-04-11 Thread Nathan Chancellor

Hi everyone,

These two patches resolve two instances of -Wstrict-prototypes with
newer versions of clang that are present in 5.4. The main Makefile makes
this a hard error.

The first patch is upstream commit 63617d8b125e ("drm/amdkfd: add
missing void argument to function kgd2kfd_init"), which showed up in
5.5.

The second patch has no upstream equivalent, as the code in question was
removed in commit e392c887df97 ("drm/amdkfd: Use array to probe
kfd2kgd_calls") upstream, which is part of a larger series that did not
look reasonable for stable. I opted to just fix the warning in the same
manner as the prior patch, which is less risky and accomplishes the same
end result of no warning.

Colin Ian King (1):
  drm/amdkfd: add missing void argument to function kgd2kfd_init

Nathan Chancellor (1):
  drm/amdkfd: Fix -Wstrict-prototypes from
amdgpu_amdkfd_gfx_10_0_get_functions()

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)


base-commit: 2845ff3fd34499603249676495c524a35e795b45
-- 
2.35.1

Re: [PATCH] drm/amd/display: fix 64 bit divide in freesync code

2022-04-08 Thread Nathan Chancellor

On Fri, Apr 08, 2022 at 05:04:55PM -0400, Alex Deucher wrote:
> Use div_u64() rather than a 64 bit divide.
> 
> Fixes: 3fe5739db48843 ("drm/amd/display: Add flip interval workaround")
> Reported-by: kernel test robot 
> Signed-off-by: Alex Deucher 
> Cc: Angus Wang 
> Cc: Anthony Koo 
> Cc: Aric Cyr 
> Cc: Nathan Chancellor 

This resolves the build failure for me.

Reviewed-by: Nathan Chancellor 

> ---
>  drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
> index 0130f1879116..d2d76ce56f89 100644
> --- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
> +++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
> @@ -1239,7 +1239,7 @@ void mod_freesync_handle_v_update(struct mod_freesync *mod_freesync,
>   if (in_out_vrr->supported == false)
>   return;
>  
> - cur_timestamp_in_us = dm_get_timestamp(core_freesync->dc->ctx)/10;
> + cur_timestamp_in_us = div_u64(dm_get_timestamp(core_freesync->dc->ctx), 10);
>  
>   in_out_vrr->flip_interval.vsyncs_between_flip++;
>   in_out_vrr->flip_interval.v_update_timestamp_in_us = cur_timestamp_in_us;
> -- 
> 2.35.1
>

Re: [PATCH] drm/amd/display: fix 64 bit divide in freesync code

2022-04-07 Thread Nathan Chancellor

On Thu, Apr 07, 2022 at 03:50:29PM -0400, Alex Deucher wrote:
> Use do_div() rather than a 64 bit divide.
> 
> Fixes: 3fe5739db48843 ("drm/amd/display: Add flip interval workaround")
> Reported-by: kernel test robot 
> Signed-off-by: Alex Deucher 
> Cc: Angus Wang 
> Cc: Anthony Koo 
> Cc: Aric Cyr 
> ---
>  drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
> index 0130f1879116..70f06fa8cc2b 100644
> --- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
> +++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
> @@ -1230,6 +1230,7 @@ void mod_freesync_handle_v_update(struct mod_freesync *mod_freesync,
>  {
>   struct core_freesync *core_freesync = NULL;
>   unsigned int cur_timestamp_in_us;
> + unsigned long long tmp;
>  
>   if ((mod_freesync == NULL) || (stream == NULL) || (in_out_vrr == NULL))
>   return;
> @@ -1239,7 +1240,9 @@ void mod_freesync_handle_v_update(struct mod_freesync *mod_freesync,
>   if (in_out_vrr->supported == false)
>   return;
>  
> - cur_timestamp_in_us = dm_get_timestamp(core_freesync->dc->ctx)/10;
> + tmp = dm_get_timestamp(core_freesync->dc->ctx);
> + do_div(tmp, 10);
> + cur_timestamp_in_us = tmp;

Any reason not to use

cur_timestamp_in_us = div_u64(dm_get_timestamp(core_freesync->dc->ctx), 10)

and save a variable?
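
For context, a plain 64-bit '/' is avoided in kernel code because on 32-bit
architectures the compiler emits a call to a libgcc helper that the kernel
does not link against. The sketch below contrasts the calling conventions of
the two kernel helpers discussed here using userspace stand-ins; the
sketch_* names are hypothetical and only imitate do_div() (divide in place,
return the remainder) and div_u64() (return the quotient).

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's do_div(): divides *n in place and
 * returns the remainder. (The real do_div() is a macro taking n by name.)
 */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/* Userspace stand-in for the kernel's div_u64(): returns the quotient. */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* v1 style: needs a temporary because the division happens in place. */
static unsigned int sketch_timestamp_v1(uint64_t raw)
{
	uint64_t tmp = raw;

	sketch_do_div(&tmp, 10);
	return (unsigned int)tmp;
}

/* Suggested style: the quotient is returned, so no temporary is needed. */
static unsigned int sketch_timestamp_v2(uint64_t raw)
{
	return (unsigned int)sketch_div_u64(raw, 10);
}
```

Both variants compute the same value; the second simply reads as a single
expression, which is the point of the question above.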

>   in_out_vrr->flip_interval.vsyncs_between_flip++;
>   in_out_vrr->flip_interval.v_update_timestamp_in_us = cur_timestamp_in_us;
> -- 
> 2.35.1
> 
>

Re: [PATCH v2] drm/amkfd: bail out early if no get_atc_vmid_pasid_mapping_info

2022-03-09 Thread Nathan Chancellor

On Wed, Mar 09, 2022 at 10:22:42AM +0800, Yifan Zhang wrote:
> it makes no sense to continue with an undefined vmid.
> 
> Fixes: d21bcfc01eb1 (drm/amdkfd: judge get_atc_vmid_pasid_mapping_info before call)
> 
> Signed-off-by: Yifan Zhang 
> Reported-by: Nathan Chancellor 

Thank you for the quick fix!

Reviewed-by: Nathan Chancellor 

> ---
>  .../drm/amd/amdkfd/kfd_device_queue_manager.c | 21 +++
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 77364afdc606..acf4f7975850 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -500,21 +500,24 @@ static int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process
>  
>   pr_debug("Killing all process wavefronts\n");
>  
> + if (!dev->kfd2kgd->get_atc_vmid_pasid_mapping_info) {
> + pr_err("no vmid pasid mapping supported \n");
> + return -EOPNOTSUPP;
> + }
> +
>   /* Scan all registers in the range ATC_VMID8_PASID_MAPPING ..
>* ATC_VMID15_PASID_MAPPING
>* to check which VMID the current process is mapped to.
>*/
>  
> - if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_info) {
> - for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
> - status = dev->kfd2kgd->get_atc_vmid_pasid_mapping_info
> - (dev->adev, vmid, &queried_pasid);
> + for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
> + status = dev->kfd2kgd->get_atc_vmid_pasid_mapping_info
> + (dev->adev, vmid, &queried_pasid);
>  
> - if (status && queried_pasid == p->pasid) {
> - pr_debug("Killing wave fronts of vmid %d and pasid 0x%x\n",
> - vmid, p->pasid);
> - break;
> - }
> + if (status && queried_pasid == p->pasid) {
> + pr_debug("Killing wave fronts of vmid %d and pasid 0x%x\n",
> + vmid, p->pasid);
> + break;
>   }
>   }
>  
> -- 
> 2.25.1
>

Re: [PATCH 1/2] drm/amdkfd: judge get_atc_vmid_pasid_mapping_info before call

2022-03-08 Thread Nathan Chancellor

On Thu, Mar 03, 2022 at 04:05:13PM +0800, Yifan Zhang wrote:
> Fix the NULL pointer issue:
> 
> [ 3076.255609] BUG: kernel NULL pointer dereference, address: 
> [ 3076.255624] #PF: supervisor instruction fetch in kernel mode
> [ 3076.255637] #PF: error_code(0x0010) - not-present page
> [ 3076.255649] PGD 0 P4D 0
> [ 3076.255660] Oops: 0010 [#1] SMP NOPTI
> [ 3076.255669] CPU: 20 PID: 2415 Comm: kfdtest Tainted: GW  OE 
> 5.11.0-41-generic #45~20.04.1-Ubuntu
> [ 3076.255691] Hardware name: AMD Splinter/Splinter-RPL, BIOS VS2326337N.FD 
> 02/07/2022
> [ 3076.255706] RIP: 0010:0x0
> [ 3076.255718] Code: Unable to access opcode bytes at RIP 0xffd6.
> [ 3076.255732] RSP: 0018:b64283e3fc10 EFLAGS: 00010297
> [ 3076.255744] RAX:  RBX: 0008 RCX: 
> 0027
> [ 3076.255759] RDX: b64283e3fc1e RSI: 0008 RDI: 
> 8c7a87f6
> [ 3076.255776] RBP: b64283e3fc78 R08: 8c7d88518ac0 R09: 
> b64283e3fa60
> [ 3076.255791] R10: 0001 R11: 0001 R12: 
> 000f
> [ 3076.255805] R13: 8c7bdcea5800 R14: 8c7a9f3f3000 R15: 
> 8c7a8696bc00
> [ 3076.255820] FS:  () GS:8c7d8850() 
> knlGS:
> [ 3076.255839] CS:  0010 DS:  ES:  CR0: 80050033
> [ 3076.255851] CR2: ffd6 CR3: 000109e3c000 CR4: 
> 00750ee0
> [ 3076.255866] PKRU: 5554
> [ 3076.255873] Call Trace:
> [ 3076.255884]  dbgdev_wave_reset_wavefronts+0x72/0x160 [amdgpu]
> [ 3076.256025]  process_termination_cpsch.cold+0x26/0x2f [amdgpu]
> [ 3076.256182]  ? ktime_get_mono_fast_ns+0x4e/0xa0
> [ 3076.256196]  kfd_process_dequeue_from_all_devices+0x49/0x70 [amdgpu]
> [ 3076.256328]  kfd_process_notifier_release+0x187/0x2b0 [amdgpu]
> [ 3076.256451]  ? mn_itree_inv_end+0xdc/0x110
> [ 3076.256463]  __mmu_notifier_release+0x74/0x1f0
> [ 3076.256474]  exit_mmap+0x170/0x200
> [ 3076.256484]  ? __handle_mm_fault+0x677/0x920
> [ 3076.256496]  ? _cond_resched+0x19/0x30
> [ 3076.256507]  mmput+0x5d/0x130
> [ 3076.256518]  do_exit+0x332/0xaf0
> [ 3076.256526]  ? handle_mm_fault+0xd7/0x2b0
> [ 3076.256537]  do_group_exit+0x43/0xa0
> [ 3076.256548]  __x64_sys_exit_group+0x18/0x20
> [ 3076.256559]  do_syscall_64+0x38/0x90
> [ 3076.256569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Signed-off-by: Yifan Zhang 
> ---
>  .../drm/amd/amdkfd/kfd_device_queue_manager.c  | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 1cd2ea536bd0..77364afdc606 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -505,14 +505,16 @@ static int dbgdev_wave_reset_wavefronts(struct kfd_dev 
> *dev, struct kfd_process
>* to check which VMID the current process is mapped to.
>*/
>  
> - for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
> - status = dev->kfd2kgd->get_atc_vmid_pasid_mapping_info
> - (dev->adev, vmid, &queried_pasid);
> -
> - if (status && queried_pasid == p->pasid) {
> - pr_debug("Killing wave fronts of vmid %d and pasid 0x%x\n",
> - vmid, p->pasid);
> - break;
> + if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_info) {
> + for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
> + status = dev->kfd2kgd->get_atc_vmid_pasid_mapping_info
> + (dev->adev, vmid, &queried_pasid);
> +
> + if (status && queried_pasid == p->pasid) {
> + pr_debug("Killing wave fronts of vmid %d and pasid 0x%x\n",
> + vmid, p->pasid);
> + break;
> + }
>   }
>   }
>  
> -- 
> 2.25.1
> 
> 

Apologies if this has been reported and fixed already, I have not seen
it if it has.

This patch as commit c8b0507f40de ("drm/amdkfd: judge
get_atc_vmid_pasid_mapping_info before call") in -next causes the
following clang warning, which appears to be legitimate.

  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:508:6: error: variable 'vmid' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
          if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_info) {
              ^
  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:521:6: note: uninitialized use occurs here
          if (vmid > last_vmid_to_scan) {
              ^~~~
  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:508:2: note: remove the 'if' if its condition is always true
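
The warning above arises because the loop that assigns 'vmid' is skipped when
the callback pointer is NULL, yet 'vmid' is still read afterwards. One way to
avoid that, in the spirit of the v2 patch shown earlier in this archive, is to
bail out before the loop. The sketch below is plain userspace C; every
sketch_* name is hypothetical, and the error codes are chosen for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type standing in for get_atc_vmid_pasid_mapping_info. */
typedef bool (*sketch_mapping_fn)(unsigned int vmid, unsigned int *pasid);

/* Example callback: pretends VMID 10 is mapped to PASID 0x42. */
static bool sketch_mapping(unsigned int vmid, unsigned int *pasid)
{
	*pasid = (vmid == 10) ? 0x42 : 0;
	return vmid == 10;
}

/*
 * Returns the matching VMID, -ENOENT if none matches, or -EOPNOTSUPP when
 * no callback is provided. Because the NULL check returns early, 'vmid' is
 * always initialized by the loop before it is read afterwards.
 */
static int sketch_find_vmid(sketch_mapping_fn fn, unsigned int first,
			    unsigned int last, unsigned int wanted_pasid)
{
	unsigned int vmid, pasid;

	if (!fn)
		return -EOPNOTSUPP;

	for (vmid = first; vmid <= last; vmid++) {
		if (fn(vmid, &pasid) && pasid == wanted_pasid)
			break;
	}

	if (vmid > last)
		return -ENOENT;

	return (int)vmid;
}
```

With the early return in place, every path that reaches the "vmid > last"
comparison has gone through the loop initializer, so there is nothing left for
-Wsometimes-uninitialized to report.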

[PATCH] drm/amdkfd: Use proper enum in pm_unmap_queues_v9()

2022-02-17 Thread Nathan Chancellor

Clang warns:

  drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_packet_manager_v9.c:267:3:
  error: implicit conversion from enumeration type 'enum
  mes_map_queues_extended_engine_sel_enum' to different enumeration type
  'enum mes_unmap_queues_extended_engine_sel_enum'
  [-Werror,-Wenum-conversion]
  extended_engine_sel__mes_map_queues__sdma0_to_7_sel :
  ^~~
  1 error generated.

Use 'extended_engine_sel__mes_unmap_queues__sdma0_to_7_sel' to eliminate
the warning, which is the same numeric value of the proper type.

Fixes: 009e9a158505 ("drm/amdkfd: navi2x requires extended engines to map and unmap sdma queues")
Link: https://github.com/ClangBuiltLinux/linux/issues/1596
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
index 806a03566a24..18250845a989 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
@@ -264,7 +264,7 @@ static int pm_unmap_queues_v9(struct packet_manager *pm, uint32_t *buffer,
sizeof(struct pm4_mes_unmap_queues));
 
packet->bitfields2.extended_engine_sel = pm_use_ext_eng(pm->dqm->dev) ?
-   extended_engine_sel__mes_map_queues__sdma0_to_7_sel :
+   extended_engine_sel__mes_unmap_queues__sdma0_to_7_sel :
extended_engine_sel__mes_unmap_queues__legacy_engine_sel;
 
packet->bitfields2.engine_sel =

base-commit: 3c30cf91b5ecc7272b3d2942ae0505dd8320b81c
-- 
2.35.1
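
The idea behind the fix above can be sketched in isolation: two enumeration
types may share numeric values, but assigning a member of one to a field of
the other is what -Wenum-conversion reports. The enums, struct, and values
below are hypothetical and only mirror the shape of the driver code.

```c
/* Two hypothetical enums whose members share numeric values, as in the patch. */
enum sketch_map_engine_sel {
	sketch_map__legacy_engine_sel = 0,
	sketch_map__sdma0_to_7_sel = 1,
};

enum sketch_unmap_engine_sel {
	sketch_unmap__legacy_engine_sel = 0,
	sketch_unmap__sdma0_to_7_sel = 1,
};

struct sketch_unmap_packet {
	enum sketch_unmap_engine_sel extended_engine_sel;
};

/*
 * Both arms of the conditional use the 'unmap' enum, so the assignment stays
 * within one enumeration type. Using sketch_map__sdma0_to_7_sel in the first
 * arm would store the same number but mix two enum types, which is the kind
 * of code clang's -Wenum-conversion reports.
 */
static void sketch_fill_unmap(struct sketch_unmap_packet *pkt, int use_ext_eng)
{
	pkt->extended_engine_sel = use_ext_eng ?
		sketch_unmap__sdma0_to_7_sel :
		sketch_unmap__legacy_engine_sel;
}
```

Since the stored number is identical either way, the change is a type
correctness fix rather than a behavioral one.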

Re: [PATCH] drm/amd/pm: fix enabled features retrieving on Renoir and Cyan Skillfish

2022-02-09 Thread Nathan Chancellor

On Thu, Feb 10, 2022 at 09:47:00AM +0800, Evan Quan wrote:
> For Cyan Skillfish and Renoir, there is no interface provided by PMFW
> to retrieve the enabled features. So, we assume all features are enabled.
> 
> Fixes: 7ade3ca9cdb5 ("drm/amd/pm: correct the usage for 'supported' member of smu_feature structure")
> 
> Signed-off-by: Evan Quan 
> Change-Id: I1231f146405a229a11aa7ac608c8c932d3c90ee4

Tested-by: Nathan Chancellor 
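
The approach in the quoted patch, treating an all-ones mask as "assume every
feature is enabled", can be sketched in userspace as follows. The sketch_*
functions and the has_query_interface flag are hypothetical; they only imitate
the shape of smu_cmn_get_enabled_mask() and smu_cmn_feature_is_enabled().

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for smu_cmn_get_enabled_mask(): when the firmware
 * offers no query interface, report every feature as enabled by setting all
 * 64 bits, mirroring the memset(..., 0xff, ...) in the quoted patch.
 */
static int sketch_get_enabled_mask(int has_query_interface, uint64_t *mask)
{
	if (!has_query_interface) {
		memset(mask, 0xff, sizeof(*mask));
		return 0;
	}

	/* Illustrative value only: pretend features 0 and 3 are enabled. */
	*mask = (UINT64_C(1) << 0) | (UINT64_C(1) << 3);
	return 0;
}

/* Returns 1 if the feature is enabled, 0 otherwise. */
static int sketch_feature_is_enabled(int has_query_interface, unsigned int bit)
{
	uint64_t enabled;

	if (sketch_get_enabled_mask(has_query_interface, &enabled))
		return 0;

	/* All-ones is the sentinel for "assume everything is enabled". */
	if (enabled == UINT64_MAX)
		return 1;

	return (int)((enabled >> bit) & 1);
}
```

The sentinel check has to come before any per-feature lookup, which is why the
quoted patch moves the mask retrieval to the top of the function.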

> --
> v1->v2:
>   - add back the logic for supporting those ASICs which have
> no feature_map available
> v2->v3:
>   - update the check for smu_cmn_feature_is_enabled to use a more
> generic way instead of asic type
> 
> Change-Id: I7dfa453ffc086f5364848f7f32decd57a5a5b0e6
> ---
>  .../amd/pm/swsmu/smu11/cyan_skillfish_ppt.c   |  1 +
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 27 ++-
>  drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |  2 +-
>  3 files changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
> index 2b38a9154dd4..b3a0f3fb3e65 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
> @@ -562,6 +562,7 @@ static const struct pptable_funcs 
> cyan_skillfish_ppt_funcs = {
>   .fini_smc_tables = smu_v11_0_fini_smc_tables,
>   .read_sensor = cyan_skillfish_read_sensor,
>   .print_clk_levels = cyan_skillfish_print_clk_levels,
> + .get_enabled_mask = smu_cmn_get_enabled_mask,
>   .is_dpm_running = cyan_skillfish_is_dpm_running,
>   .get_gpu_metrics = cyan_skillfish_get_gpu_metrics,
>   .od_edit_dpm_table = cyan_skillfish_od_edit_dpm_table,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index 2a6b752a6996..4c12abcd995d 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -500,7 +500,17 @@ int smu_cmn_feature_is_enabled(struct smu_context *smu,
>   uint64_t enabled_features;
>   int feature_id;
>  
> - if (smu->is_apu && adev->family < AMDGPU_FAMILY_VGH)
> + if (smu_cmn_get_enabled_mask(smu, &enabled_features)) {
> + dev_err(adev->dev, "Failed to retrieve enabled ppfeatures!\n");
> + return 0;
> + }
> +
> + /*
> +  * For Renoir and Cyan Skillfish, they are assumed to have all features
> +  * enabled. Also considering they have no feature_map available, the
> +  * check here can avoid unwanted feature_map check below.
> +  */
> + if (enabled_features == ULLONG_MAX)
>   return 1;
>  
>   feature_id = smu_cmn_to_asic_specific_index(smu,
> @@ -509,11 +519,6 @@ int smu_cmn_feature_is_enabled(struct smu_context *smu,
>   if (feature_id < 0)
>   return 0;
>  
> - if (smu_cmn_get_enabled_mask(smu, &enabled_features)) {
> - dev_err(adev->dev, "Failed to retrieve enabled ppfeatures!\n");
> - return 0;
> - }
> -
>   return test_bit(feature_id, (unsigned long *)&enabled_features);
>  }
>  
> @@ -559,7 +564,7 @@ int smu_cmn_get_enabled_mask(struct smu_context *smu,
>   feature_mask_high = &((uint32_t *)feature_mask)[1];
>  
>   switch (adev->ip_versions[MP1_HWIP][0]) {
> - case IP_VERSION(11, 0, 8):
> + /* For Vangogh and Yellow Carp */
>   case IP_VERSION(11, 5, 0):
>   case IP_VERSION(13, 0, 1):
>   case IP_VERSION(13, 0, 3):
> @@ -575,8 +580,16 @@ int smu_cmn_get_enabled_mask(struct smu_context *smu,
> 1,
> feature_mask_high);
>   break;
> + /*
> +  * For Cyan Skillfish and Renoir, there is no interface provided by PMFW
> +  * to retrieve the enabled features. So, we assume all features are 
> enabled.
> +  * TODO: add other APU ASICs which suffer from the same issue here
> +  */
> + case IP_VERSION(11, 0, 8):
>   case IP_VERSION(12, 0, 0):
>   case IP_VERSION(12, 0, 1):
> + memset(feature_mask, 0xff, sizeof(*feature_mask));
> + break;
>   /* other dGPU ASICs */
>   default:
>   ret = smu_cmn_send_smc_msg(smu,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_internal.h 
> b/drivers/gpu/drm/amd/pm/swsmu/smu_internal.h
> index 530be44e00ec..15bcf72b8e56 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_internal.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_internal.h
> @@ -55,7 +55,

Re: [PATCH V3 4/7] drm/amd/pm: correct the usage for 'supported' member of smu_feature structure

2022-02-08 Thread Nathan Chancellor

Hi Evan,

On Fri, Jan 28, 2022 at 03:04:52PM +0800, Evan Quan wrote:
> The supported features should be retrieved just after the
> EnableAllDpmFeatures message completes. The check (whether some dpm
> feature is supported) is only needed when we decide to enable or
> disable it.
> 
> Signed-off-by: Evan Quan 
> Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 11 +++
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 
>  .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c|  3 ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  3 ---
>  drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
>  7 files changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index ae48cc5aa567..803068cb5079 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -1057,8 +1057,10 @@ static int smu_get_thermal_temperature_range(struct 
> smu_context *smu)
>  
>  static int smu_smc_hw_setup(struct smu_context *smu)
>  {
> + struct smu_feature *feature = &smu->smu_feature;
>   struct amdgpu_device *adev = smu->adev;
>   uint32_t pcie_gen = 0, pcie_width = 0;
> + uint64_t features_supported;
>   int ret = 0;
>  
>   if (adev->in_suspend && smu_is_dpm_running(smu)) {
> @@ -1138,6 +1140,15 @@ static int smu_smc_hw_setup(struct smu_context *smu)
>   return ret;
>   }
>  
> + ret = smu_feature_get_enabled_mask(smu, &features_supported);
> + if (ret) {
> + dev_err(adev->dev, "Failed to retrieve supported dpm 
> features!\n");
> + return ret;
> + }
> + bitmap_copy(feature->supported,
> + (unsigned long *)&features_supported,
> + feature->feature_num);
> +
>   if (!smu_is_dpm_running(smu))
>   dev_info(adev->dev, "dpm has been disabled\n");
>  
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 84cbde3f913d..f55ead5f9aba 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -1624,8 +1624,8 @@ static int navi10_display_config_changed(struct 
> smu_context *smu)
>   int ret = 0;
>  
>   if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
> - smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> - smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
> + smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> + smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
>   ret = smu_cmn_send_smc_msg_with_param(smu, 
> SMU_MSG_NumOfDisplays,
> 
> smu->display_config->num_display,
> NULL);
> @@ -1860,13 +1860,13 @@ static int navi10_notify_smc_display_config(struct 
> smu_context *smu)
>   min_clocks.dcef_clock_in_sr = 
> smu->display_config->min_dcef_deep_sleep_set_clk;
>   min_clocks.memory_clock = smu->display_config->min_mem_set_clock;
>  
> - if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
> + if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
>   clock_req.clock_type = amd_pp_dcef_clock;
>   clock_req.clock_freq_in_khz = min_clocks.dcef_clock * 10;
>  
>   ret = smu_v11_0_display_clock_voltage_request(smu, &clock_req);
>   if (!ret) {
> - if (smu_cmn_feature_is_supported(smu, 
> SMU_FEATURE_DS_DCEFCLK_BIT)) {
> + if (smu_cmn_feature_is_enabled(smu, 
> SMU_FEATURE_DS_DCEFCLK_BIT)) {
>   ret = smu_cmn_send_smc_msg_with_param(smu,
> 
> SMU_MSG_SetMinDeepSleepDcefclk,
> 
> min_clocks.dcef_clock_in_sr/100,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index b6759f8b5167..804e1c98238d 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -1280,8 +1280,8 @@ static int sienna_cichlid_display_config_changed(struct 
> smu_context *smu)
>   int ret = 0;
>  
>   if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
> - smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> - smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
> + smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_

[PATCH] drm/amd: Return NULL instead of false in dcn201_acquire_idle_pipe_for_layer()

2021-09-30 Thread Nathan Chancellor

Clang warns:

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_resource.c:1017:10: error: expression which evaluates to zero treated as a null pointer constant of type 'struct pipe_ctx *' [-Werror,-Wnon-literal-null-conversion]
return false;
   ^
1 error generated.

Use NULL instead of false since the function is returning a pointer
rather than a boolean.

Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
Link: https://github.com/ClangBuiltLinux/linux/issues/1470
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
index aec276e1db65..8523a048e6f6 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
@@ -1014,7 +1014,7 @@ static struct pipe_ctx *dcn201_acquire_idle_pipe_for_layer(
ASSERT(0);
 
if (!idle_pipe)
-   return false;
+   return NULL;
 
idle_pipe->stream = head_pipe->stream;
idle_pipe->stream_res.tg = head_pipe->stream_res.tg;

base-commit: b47b99e30cca8906753c83205e8c6179045dd725
-- 
2.33.0.591.gddb1055343
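
For readers skimming the archive, the pattern behind this fix can be reproduced outside the kernel. The following is a minimal sketch, not the driver code: struct pipe_ctx is reduced to a hypothetical one-field stand-in, find_idle_pipe() is an invented helper, and treating stream_id == 0 as "idle" is an assumption made only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's struct pipe_ctx. */
struct pipe_ctx {
	int stream_id;	/* assumption for this sketch: 0 means "idle" */
};

/*
 * Return the first idle pipe, or NULL if there is none.
 * The function returns a pointer, so the "nothing found" value is NULL.
 */
static struct pipe_ctx *find_idle_pipe(struct pipe_ctx *pipes, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (pipes[i].stream_id == 0)
			return &pipes[i];
	}

	/*
	 * Writing "return false;" here is the construct clang flags with
	 * -Wnon-literal-null-conversion; NULL states the intent directly.
	 */
	return NULL;
}
```

Callers keep testing the result with "if (!pipe)", so switching from false to NULL should not change behavior; it only makes the return type and the returned value agree.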

[PATCH] drm/amd: Initialize remove_mpcc in dcn201_update_mpcc()

2021-09-30 Thread Nathan Chancellor

Clang warns:

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:505:6: error: variable 'remove_mpcc' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (mpc->funcs->get_mpcc_for_dpp_from_secondary)
^~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:509:6: note: uninitialized use occurs here
if (remove_mpcc != NULL && mpc->funcs->remove_mpcc_from_secondary)
^~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:505:2: note: remove the 'if' if its condition is always true
if (mpc->funcs->get_mpcc_for_dpp_from_secondary)
^~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:442:26: note: initialize the variable 'remove_mpcc' to silence this warning
struct mpcc *remove_mpcc;
^
 = NULL
1 error generated.

The code already handles remove_mpcc being NULL just fine so initialize
it to NULL at the beginning of the function so it is never used
uninitialized.

Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
Link: https://github.com/ClangBuiltLinux/linux/issues/1469
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
index ceaaeeb8f2de..cfd09b3f705e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
@@ -439,7 +439,7 @@ void dcn201_update_mpcc(struct dc *dc, struct pipe_ctx *pipe_ctx)
bool per_pixel_alpha = pipe_ctx->plane_state->per_pixel_alpha && pipe_ctx->bottom_pipe;
int mpcc_id, dpp_id;
struct mpcc *new_mpcc;
-   struct mpcc *remove_mpcc;
+   struct mpcc *remove_mpcc = NULL;
struct mpc *mpc = dc->res_pool->mpc;
struct mpc_tree *mpc_tree_params = &(pipe_ctx->stream_res.opp->mpc_tree_params);
 

base-commit: 30fc33064c846df29888c3c61e30a064aad3a342
-- 
2.33.0.591.gddb1055343
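
The control flow clang complains about can be sketched in a few lines of standalone C. Everything below is hypothetical scaffolding rather than the driver's code: struct mpcc and struct mpc_funcs are cut-down stand-ins, and the sample_* hooks exist only so the sketch can be exercised.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the driver's types. */
struct mpcc {
	int id;
};

struct mpc_funcs {
	struct mpcc *(*get_mpcc)(int dpp_id);	/* optional hook, may be NULL */
	void (*remove_mpcc)(struct mpcc *mpcc);	/* optional hook, may be NULL */
};

/* Sample hooks, present only to make the sketch testable. */
static struct mpcc sample_mpcc = { 7 };
static int removed_id = -1;

static struct mpcc *sample_get(int dpp_id)
{
	(void)dpp_id;
	return &sample_mpcc;
}

static void sample_remove(struct mpcc *mpcc)
{
	removed_id = mpcc->id;
}

/* Returns 1 if an mpcc was removed, 0 otherwise. */
static int update_mpcc_sketch(const struct mpc_funcs *funcs, int dpp_id)
{
	/*
	 * Without "= NULL", the second 'if' would read an indeterminate
	 * value whenever the first hook is absent.
	 */
	struct mpcc *remove_mpcc = NULL;

	if (funcs->get_mpcc)
		remove_mpcc = funcs->get_mpcc(dpp_id);

	if (remove_mpcc != NULL && funcs->remove_mpcc) {
		funcs->remove_mpcc(remove_mpcc);
		return 1;
	}

	return 0;
}
```

Because the later check already tests for NULL, initializing the pointer makes the "hook not provided" case take the same path as "hook returned nothing", which is what the commit message relies on.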

[PATCH] drm/amd: Guard IS_OLD_GCC assignment with CONFIG_CC_IS_GCC

2021-09-30 Thread Nathan Chancellor

cc-ifversion only works for GCC, as clang pretends to be GCC 4.2.1 for
glibc compatibility, which means IS_OLD_GCC will get set and unsupported
flags will be passed to clang when building certain code within the DCN
files:

clang-14: error: unknown argument: '-mpreferred-stack-boundary=4'
make[5]: *** [scripts/Makefile.build:277: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_resource.o] Error 1

Guard the call to cc-ifversion with CONFIG_CC_IS_GCC so that everything
continues to work properly. See commit 00db297106e8 ("drm/amdgpu: fix stack
alignment ABI mismatch for GCC 7.1+") for more context.

Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
Link: https://github.com/ClangBuiltLinux/linux/issues/1468
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dcn201/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
index d98d69705117..96cbd4ccd344 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
@@ -14,9 +14,11 @@ ifdef CONFIG_PPC64
 CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o := -mhard-float -maltivec
 endif
 
+ifdef CONFIG_CC_IS_GCC
 ifeq ($(call cc-ifversion, -lt, 0701, y), y)
 IS_OLD_GCC = 1
 endif
+endif
 
 ifdef CONFIG_X86
 ifdef IS_OLD_GCC

base-commit: b47b99e30cca8906753c83205e8c6179045dd725
-- 
2.33.0.591.gddb1055343
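
The same pitfall shows up at the C preprocessor level, which may make the reasoning easier to follow than the Makefile alone. clang defines __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ as 4, 2 and 1, so a bare version comparison classifies it as an old GCC. The sketch below is an analogy only, assuming nothing about how cc-ifversion is implemented; both function names are invented for illustration.

```c
/* Returns 1 if major.minor is older than 7.1, the threshold used in the patch. */
static int version_is_older_than_7_1(int major, int minor)
{
	return major < 7 || (major == 7 && minor < 1);
}

/*
 * Returns 1 only for a genuine GCC older than 7.1.
 * The !defined(__clang__) test plays the role of the CONFIG_CC_IS_GCC
 * guard: without it, clang's advertised 4.2 would count as "old GCC".
 */
static int is_old_gcc(void)
{
#if defined(__GNUC__) && !defined(__clang__)
	return version_is_older_than_7_1(__GNUC__, __GNUC_MINOR__);
#else
	return 0;
#endif
}
```

Checking the compiler's identity before comparing versions keeps GCC-only flags such as -mpreferred-stack-boundary=4 away from clang, which is the effect the ifdef in the patch is after.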
