To: [email protected]
Hello,
I am providing a detailed root cause analysis for this bug based on
direct inspection of the 550.163.01 source on a Debian trixie system.
== ROOT CAUSE ==
The problem is in conftest.sh, in the vm_area_struct_has_const_vm_flags
detection block (line 6529).
The conftest detects NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS by compiling:
    #include <linux/mm_types.h>
    int conftest_vm_area_struct_has_const_vm_flags(void) {
        return offsetof(struct vm_area_struct, __vm_flags);
    }
In kernel 6.19, __vm_flags was removed from the vm_area_struct union,
so this compile test fails and NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS
is left undefined.
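The probe works because offsetof() does not compile when the named
member is absent. A minimal user-space sketch of that idea follows. It
uses a hypothetical mock_vma struct whose layout is an assumption made
for illustration only; it is not the kernel's vm_area_struct.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct; not the kernel layout. */
struct mock_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long __vm_flags;   /* delete this member to mimic kernel 6.19 */
};

/* Same shape as the conftest probe: compiles only if __vm_flags exists. */
size_t mock_probe_has_vm_flags(void)
{
    return offsetof(struct mock_vma, __vm_flags);
}
```

Deleting the __vm_flags member from mock_vma makes the function fail to
compile, which is the condition under which the real check leaves the
macro undefined.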
With that macro undefined, nv-mm.h evaluates nv_vm_flags_set() and
nv_vm_flags_clear() as follows:
    static inline void nv_vm_flags_set(struct vm_area_struct *vma,
                                       vm_flags_t flags)
    {
    #if !NV_CAN_CALL_VMA_START_WRITE
        nv_vma_start_write(vma);
        ACCESS_PRIVATE(vma, __vm_flags) |= flags; /* FAILS on 6.19 */
    #elif defined(NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS)
        vm_flags_set(vma, flags); /* correct for 6.19 */
    #else
        vma->vm_flags |= flags;
    #endif
    }
NV_CAN_CALL_VMA_START_WRITE is defined in nv-mm.h based on whether
NV_IS_EXPORT_SYMBOL_GPL___vma_start_write is set. On Debian kernels
__vma_start_write is GPL-only, so NV_IS_EXPORT_SYMBOL_GPL___vma_start_write
is never set, NV_CAN_CALL_VMA_START_WRITE is always 0, and the first
branch is always taken. That branch references the __vm_flags field,
which no longer exists in 6.19, so the build fails.
The correct branch for kernel 6.19 is the second one (vm_flags_set),
but it is never reached because NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS
is undefined.
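The branch selection can be reproduced in user space with the same
#if/#elif/#else chain. This is only a sketch: the macro values are
taken from the analysis above, and mock_selected_branch() is a
hypothetical helper that returns a label instead of touching a real
vm_area_struct.

```c
/* Value assumed from the analysis above for a Debian 6.19 build. */
#define NV_CAN_CALL_VMA_START_WRITE 0
/* NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS is deliberately left undefined. */

/* Hypothetical helper: returns a label for the branch the preprocessor keeps. */
const char *mock_selected_branch(void)
{
#if !NV_CAN_CALL_VMA_START_WRITE
    return "ACCESS_PRIVATE(vma, __vm_flags)";
#elif defined(NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS)
    return "vm_flags_set()";
#else
    return "vma->vm_flags";
#endif
}
```

With these values the function returns the ACCESS_PRIVATE label, i.e.
the branch that no longer compiles against 6.19 headers.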
== FIX ==
Add a second compile test in the same case block in conftest.sh,
immediately after the existing one and before the ;; terminator.
The second test checks for vm_flags_set(). If it compiles, it defines
NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS to 1, overriding the previous
#undef. This is safe because append_conftest() writes sequentially
to stdout and the C preprocessor uses the last definition.
--- a/conftest.sh
+++ b/conftest.sh
@@ -6542,6 +6542,13 @@ vm_area_struct_has_const_vm_flags)
}"
compile_check_conftest "$CODE" \
"NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS" "" "types"
+ CODE="
+ #include <linux/mm_types.h>
+ #include <linux/mm.h>
+ void conftest_vm_flags_set_exists(struct vm_area_struct *vma) {
+ vm_flags_set(vma, (vm_flags_t)0);
+ }"
+ compile_check_conftest "$CODE" \
+ "NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS" "1" "types"
;;
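The override relied on above (an #undef followed later by a #define of
the same name) can be checked in isolation. The sketch below hand-writes
the two lines that the generated header is assumed to contain on 6.19,
based on the description above; mock_has_const_vm_flags() is a
hypothetical helper. It covers only the sequence where the first check
fails and the second succeeds.

```c
/* Assumed output of the two checks on 6.19, in emission order. */
#undef NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS      /* first check failed */
#define NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS 1   /* second check succeeded */

/* Hypothetical helper: reports whether the macro is visible afterwards. */
int mock_has_const_vm_flags(void)
{
#if defined(NV_VM_AREA_STRUCT_HAS_CONST_VM_FLAGS)
    return 1;
#else
    return 0;
#endif
}
```

Because the #define comes after the #undef, the macro is defined for all
code that follows, and the helper returns 1.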
== AFFECTED ==
nvidia-kernel-dkms 550.163.01-4 on Linux 6.19.6+deb14-amd64.
Linux <= 6.18 unaffected.
== NOTE ==
The 550 branch is required for Maxwell (GTX 900) and Pascal (GTX 1000)
GPUs, which are not supported by the 575+ driver.
Regards