[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2024-02-29 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-mtk/5.15.0-1030.34
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-mtk' to
'verification-done-jammy-linux-mtk'. If the problem still exists, change
the tag 'verification-needed-jammy-linux-mtk' to
'verification-failed-jammy-linux-mtk'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-mtk-v2 
verification-needed-jammy-linux-mtk

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Released

Bug description:
  [Impact]
  During VM live migration, a NULL pointer may be dereferenced,
  which can lead to invalid memory accesses and an unstable environment.

  [Fix]
  The call trace is as follows:

  kernel: BUG: kernel NULL pointer dereference, address: 0008
  kernel: #PF: supervisor write access in kernel mode
  kernel: #PF: error_code(0x0002) - not-present page
  kernel: PGD 0 P4D 0 
  kernel: Oops: 0002 [#1] SMP NOPTI
  kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G  IOE 5.15.0-53-generic #59~20.04.1-Ubuntu
  kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
  kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
  kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
  kernel: RSP: 0018:b580320278a8 EFLAGS: 00010246
  kernel: RAX:  RBX: a0fe29e94c38 RCX: 0027
  kernel: RDX:  RSI: 0004 RDI: b5801e24ba58
  kernel: RBP: b58032027930 R08:  R09: 0004
  kernel: R10: 0001 R11:  R12: 0003
  kernel: R13: 0004 R14:  R15: b5801e235000
  kernel: FS:  7f1553fff700() GS:a20eff78() knlGS:
  kernel: CS:  0010 DS:  ES:  CR0: 80050033
  kernel: CR2: 0008 CR3: 00e7f7544004 CR4: 007726e0
  kernel: PKRU: 5554
  kernel: Call Trace:
  kernel:  
  kernel:  ? __switch_to_xtra+0x109/0x510
  kernel:  zap_gfn_range+0x218/0x360 [kvm]
  kernel:  ? __smp_call_single_queue+0x59/0x90
  kernel:  ? alloc_cpumask_var_node+0x1/0x30
  kernel:  ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
  kernel:  kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
  kernel:  kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
  --
  kernel: RAX: ffda RBX: 4020ae46 RCX: 7f15aa26e3ab
  kernel: RDX: 7f1553ffe050 RSI: 4020ae46 RDI: 002f
  kernel: RBP: 5602a885a410 R08: 5602a82ad000 R09: 7f154c087470
  kernel: R10:  R11: 0246 R12: 7f1553ffe050
  kernel: R13: 7f1553ffe160 R14:  R15: 0080
  kernel:  

  The error occurred randomly in several of the customer's production environments, all with the same call trace.
  Therefore, the likelihood of other processes contaminating memory is low.
  Analyzing the call trace with the help of debug symbols lets us pinpoint where the error occurs.

  root@focal:~/ddeb# eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko __handle_changed_spte+0x3a9
  0x00068109
  __list_del inlined at /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 in __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
  __list_del_entry
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
  list_del
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
  tdp_mmu_unlink_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
  handle_removed_tdp_mmu_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
  __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3

  The error occurred when the kernel attempted to delete an entry from a list.
  This issue may be timing-related and has proven hard to reproduce consistently, making it difficult for us to pinpoint the root cause.
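
  As an illustration only (not kernel code), the faulting address in the
  oops is consistent with __list_del() storing through a NULL next
  pointer. In the Code line, the bytes 48 8b 43 08 and 48 8b 13 load prev
  and next from the entry in RBX, and the marked instruction
  <48> 89 42 08 is a store to 8(%rdx), i.e. next->prev = prev. The RDX
  value is blank in this archived copy; it would have to be zero for the
  store to hit the address 8 reported in CR2. The sketch below assumes
  the x86-64 layout of struct list_head (two 8-byte pointers) and models
  it with ctypes; the names ListHead and list_del_fault_address are made
  up for this note.

```python
import ctypes


class ListHead(ctypes.Structure):
    # Assumed model of the kernel's struct list_head on x86-64:
    # two 8-byte pointers, `next` first, then `prev`.
    _fields_ = [("next", ctypes.c_uint64), ("prev", ctypes.c_uint64)]


def list_del_fault_address(next_ptr: int) -> int:
    """Address written by the `next->prev = prev` store in __list_del()."""
    return next_ptr + ListHead.prev.offset


# With a NULL `next`, the store lands at offset 8 from address zero.
print(hex(list_del_fault_address(0)))  # prints 0x8
```

  This only explains where the write went; it says nothing about why the
  entry's next pointer was NULL in the first place.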
  It's worth noting that the current kernel has replaced the 

[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-12-04 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-azure-fips/5.15.0-1053.61+fips1 kernel in -proposed solves the
problem. Please test the kernel and update this bug with the results. If
the problem is solved, change the tag
'verification-needed-jammy-linux-azure-fips' to
'verification-done-jammy-linux-azure-fips'. If the problem still exists,
change the tag 'verification-needed-jammy-linux-azure-fips' to
'verification-failed-jammy-linux-azure-fips'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-azure-fips-v2 
verification-needed-jammy-linux-azure-fips


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-11-15 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-nvidia-tegra-5.15/5.15.0-1019.19~20.04.1 kernel in -proposed
solves the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-focal-linux-nvidia-tegra-5.15' to
'verification-done-focal-linux-nvidia-tegra-5.15'. If the problem still
exists, change the tag
'verification-needed-focal-linux-nvidia-tegra-5.15' to
'verification-failed-focal-linux-nvidia-tegra-5.15'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-focal-linux-nvidia-tegra-5.15-v2 
verification-needed-focal-linux-nvidia-tegra-5.15


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-11-10 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-nvidia-tegra-igx/5.15.0-1006.6 kernel in -proposed solves the
problem. Please test the kernel and update this bug with the results. If
the problem is solved, change the tag
'verification-needed-jammy-linux-nvidia-tegra-igx' to
'verification-done-jammy-linux-nvidia-tegra-igx'. If the problem still
exists, change the tag
'verification-needed-jammy-linux-nvidia-tegra-igx' to
'verification-failed-jammy-linux-nvidia-tegra-igx'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-nvidia-tegra-igx-v2 
verification-needed-jammy-linux-nvidia-tegra-igx


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-11-09 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-xilinx-zynqmp/5.15.0-1025.29 kernel in -proposed solves the
problem. Please test the kernel and update this bug with the results. If
the problem is solved, change the tag
'verification-needed-jammy-linux-xilinx-zynqmp' to
'verification-done-jammy-linux-xilinx-zynqmp'. If the problem still
exists, change the tag 'verification-needed-jammy-linux-xilinx-zynqmp'
to 'verification-failed-jammy-linux-xilinx-zynqmp'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-xilinx-zynqmp-v2 
verification-needed-jammy-linux-xilinx-zynqmp


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-11-08 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-gcp-tcpx/5.15.0-1002.2
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-focal-linux-gcp-tcpx' to
'verification-done-focal-linux-gcp-tcpx'. If the problem still exists,
change the tag 'verification-needed-focal-linux-gcp-tcpx' to
'verification-failed-focal-linux-gcp-tcpx'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-focal-linux-gcp-tcpx-v2 
verification-needed-focal-linux-gcp-tcpx


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-11-01 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-nvidia-tegra/5.15.0-1019.19 kernel in -proposed solves the
problem. Please test the kernel and update this bug with the results. If
the problem is solved, change the tag
'verification-needed-jammy-linux-nvidia-tegra' to
'verification-done-jammy-linux-nvidia-tegra'. If the problem still
exists, change the tag 'verification-needed-jammy-linux-nvidia-tegra' to
'verification-failed-jammy-linux-nvidia-tegra'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-nvidia-tegra-v2 
verification-needed-jammy-linux-nvidia-tegra

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Released

Bug description:
  [Impact]
  During VM live migration, there is a potential risk of dereferencing a NULL 
pointer,
  which can lead to memory access issues and result in an unstable environment.

  [Fix]
  The call trace is as follows:

  kernel: BUG: kernel NULL pointer dereference, address: 0008
  kernel: #PF: supervisor write access in kernel mode
  kernel: #PF: error_code(0x0002) - not-present page
  kernel: PGD 0 P4D 0 
  kernel: Oops: 0002 [#1] SMP NOPTI
  kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G  IOE 
5.15.0-53-generic #59~20.04.1-Ubuntu
  kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
  kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
  kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 
0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 
89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
  kernel: RSP: 0018:b580320278a8 EFLAGS: 00010246
  kernel: RAX:  RBX: a0fe29e94c38 RCX: 0027
  kernel: RDX:  RSI: 0004 RDI: b5801e24ba58
  kernel: RBP: b58032027930 R08:  R09: 0004
  kernel: R10: 0001 R11:  R12: 0003
  kernel: R13: 0004 R14:  R15: b5801e235000
  kernel: FS:  7f1553fff700() GS:a20eff78() 
knlGS:
  kernel: CS:  0010 DS:  ES:  CR0: 80050033
  kernel: CR2: 0008 CR3: 00e7f7544004 CR4: 007726e0
  kernel: PKRU: 5554
  kernel: Call Trace:
  kernel:  
  kernel:  ? __switch_to_xtra+0x109/0x510
  kernel:  zap_gfn_range+0x218/0x360 [kvm]
  kernel:  ? __smp_call_single_queue+0x59/0x90
  kernel:  ? alloc_cpumask_var_node+0x1/0x30
  kernel:  ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
  kernel:  kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
  kernel:  kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
  --
  kernel: RAX: ffda RBX: 4020ae46 RCX: 7f15aa26e3ab
  kernel: RDX: 7f1553ffe050 RSI: 4020ae46 RDI: 002f
  kernel: RBP: 5602a885a410 R08: 5602a82ad000 R09: 7f154c087470
  kernel: R10:  R11: 0246 R12: 7f1553ffe050
  kernel: R13: 7f1553ffe160 R14:  R15: 0080
  kernel:  

  The error occurred randomly in different production environments of the 
customer, all with the same call trace.
  Therefore, the likelihood of other processes contaminating memory is low.
  After analyzing the call trace with the help of debug symbols, we can 
pinpoint the source of the error.

  root@focal:~/ddeb# eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko __handle_changed_spte+0x3a9
  0x00068109
  __list_del inlined at /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 in __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
  __list_del_entry
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
  list_del
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
  tdp_mmu_unlink_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
  handle_removed_tdp_mmu_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
  __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3

  The error occurred when the kernel attempted to delete an entry from a linked list (list_del() called from tdp_mmu_unlink_page()).
  The issue appears to be timing-related and is hard to reproduce consistently, which makes the root cause difficult to pinpoint.
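
To illustrate why a list deletion can fault at address 0008, here is a minimal userspace sketch. It assumes the usual two-pointer layout of struct list_head on a 64-bit build, where prev sits at offset 8. The faulting instruction bytes in the Code: line, <48> 89 42 08, decode to a 64-bit store to offset 8 from RDX, which matches the "next->prev = prev" step of __list_del() when next is NULL. The helper names first_store_address() and list_del_checked() are hypothetical and are not part of the kernel or of the fix for this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head (include/linux/list.h). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * Address that the first store of __list_del(), "next->prev = prev",
 * would write to. Computed arithmetically; nothing is dereferenced.
 * Hypothetical helper for illustration only.
 */
static uintptr_t first_store_address(const struct list_head *entry)
{
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/*
 * Hypothetical guarded unlink: refuses to touch an entry whose
 * neighbours are NULL instead of faulting like the oops above.
 * Returns 0 on success, -1 if the entry is not linked.
 */
static int list_del_checked(struct list_head *entry)
{
    assert(entry != NULL);
    if (entry->next == NULL || entry->prev == NULL)
        return -1;
    entry->next->prev = entry->prev;  /* the store that faulted in the trace */
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
    return 0;
}
```

With a zero-filled entry, first_store_address() yields 8 on a 64-bit build, the same value as CR2 in the trace. That is consistent with the entry's next pointer being NULL at the time of the list_del(), for example because the entry was never linked or was cleared concurrently. The sketch does not establish which of those happened.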
  

[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-
bluefield/5.15.0-1029.31 kernel in -proposed solves the problem. Please
test the kernel and update this bug with the results. If the problem is
solved, change the tag 'verification-needed-jammy-linux-bluefield' to
'verification-done-jammy-linux-bluefield'. If the problem still exists,
change the tag 'verification-needed-jammy-linux-bluefield' to
'verification-failed-jammy-linux-bluefield'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-bluefield-v2 
verification-needed-jammy-linux-bluefield

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-gkeop/5.15.0-1032.38
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-gkeop' to 'verification-done-jammy-
linux-gkeop'. If the problem still exists, change the tag 'verification-
needed-jammy-linux-gkeop' to 'verification-failed-jammy-linux-gkeop'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-gkeop-v2 
verification-needed-jammy-linux-gkeop

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.15.0-88.98

---
linux (5.15.0-88.98) jammy; urgency=medium

  * jammy/linux: 5.15.0-88.98 -proposed tracker (LP: #2038055)

  * CVE-2023-4244
- netfilter: nf_tables: don't skip expired elements during walk
- netfilter: nf_tables: adapt set backend to use GC transaction API
- netfilter: nft_set_hash: mark set element as dead when deleting from packet
  path
- netfilter: nf_tables: GC transaction API to avoid race with control plane
- netfilter: nf_tables: remove busy mark and gc batch API
- netfilter: nf_tables: don't fail inserts if duplicate has expired
- netfilter: nf_tables: fix kdoc warnings after gc rework
- netfilter: nf_tables: fix GC transaction races with netns and netlink event
  exit path
- netfilter: nf_tables: GC transaction race with netns dismantle
- netfilter: nf_tables: GC transaction race with abort path
- netfilter: nf_tables: use correct lock to protect gc_list
- netfilter: nf_tables: defer gc run if previous batch is still pending
- netfilter: nft_dynset: disallow object maps
- netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction

  * CVE-2023-42756
- netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP

  * CVE-2023-4623
- net/sched: sch_hfsc: Ensure inner classes have fsc curve

  * PCI BARs larger than 128GB are disabled (LP: #2037403)
- PCI: Support BAR sizes up to 8TB

  * Fix unstable audio at low levels on Thinkpad P1G4 (LP: #2037077)
- ALSA: hda/realtek - ALC287 I2S speaker platform support

  * Check for changes relevant for security certifications (LP: #1945989)
- [Packaging] Add a new fips-checks script

  * Jammy update: v5.15.126 upstream stable release (LP: #2037593)
- io_uring: gate iowait schedule on having pending requests
- perf: Fix function pointer case
- net/mlx5: Free irqs only on shutdown callback
- arm64: errata: Add workaround for TSB flush failures
- arm64: errata: Add detection for TRBE write to out-of-range
- [Config] updateconfigs for ARM64_ERRATUM_ and
  ARM64_WORKAROUND_TSB_FLUSH_FAILURE
- iommu/arm-smmu-v3: Work around MMU-600 erratum 1076982
- iommu/arm-smmu-v3: Document MMU-700 erratum 2812531
- iommu/arm-smmu-v3: Add explicit feature for nesting
- iommu/arm-smmu-v3: Document nesting-related errata
- arm64: dts: imx8mn-var-som: add missing pull-up for onboard PHY reset pinmux
- word-at-a-time: use the same return type for has_zero regardless of
  endianness
- KVM: s390: fix sthyi error handling
- wifi: cfg80211: Fix return value in scan logic
- net/mlx5: DR, fix memory leak in mlx5dr_cmd_create_reformat_ctx
- net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer()
- bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing
- rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length
- net: dsa: fix value check in bcm_sf2_sw_probe()
- perf test uprobe_from_different_cu: Skip if there is no gcc
- net: sched: cls_u32: Fix match key mis-addressing
- mISDN: hfcpci: Fix potential deadlock on &hc->lock
- qed: Fix kernel-doc warnings
- qed: Fix scheduling in a tasklet while getting stats
- net: annotate data-races around sk->sk_max_pacing_rate
- net: add missing READ_ONCE(sk->sk_rcvlowat) annotation
- net: add missing READ_ONCE(sk->sk_sndbuf) annotation
- net: add missing READ_ONCE(sk->sk_rcvbuf) annotation
- net: add missing data-race annotations around sk->sk_peek_off
- net: add missing data-race annotation for sk_ll_usec
- net/sched: taprio: Limit TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME to INT_MAX.
- bpf, cpumap: Handle skb as well when clean up ptr_ring
- bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire
- net: ll_temac: Switch to use dev_err_probe() helper
- net: ll_temac: fix error checking of irq_of_parse_and_map()
- net: korina: handle clk prepare error in korina_probe()
- net: netsec: Ignore 'phy-mode' on SynQuacer in DT mode
- net: dcb: choose correct policy to parse DCB_ATTR_BCN
- s390/qeth: Don't call dev_close/dev_open (DOWN/UP)
- ip6mr: Fix skb_under_panic in ip6mr_cache_report()
- vxlan: Fix nexthop hash size
- net/mlx5: fs_core: Make find_closest_ft more generic
- net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
- prestera: fix fallback to previous version on same major version
- tcp_metrics: fix addr_same() helper
- tcp_metrics: annotate data-races around tm->tcpm_stamp
- tcp_metrics: annotate data-races around tm->tcpm_lock
- tcp_metrics: annotate data-races around tm->tcpm_vals[]
- tcp_metrics: annotate data-races around tm->tcpm_net
- tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
- scsi: zfcp: Defer fc_rport blocking until after ADISC response
- scsi: storvsc: Limit max_sectors for 

[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-intel-iot-
realtime/5.15.0-1042.44 kernel in -proposed solves the problem. Please
test the kernel and update this bug with the results. If the problem is
solved, change the tag 'verification-needed-jammy-linux-intel-iot-
realtime' to 'verification-done-jammy-linux-intel-iot-realtime'. If the
problem still exists, change the tag 'verification-needed-jammy-linux-
intel-iot-realtime' to 'verification-failed-jammy-linux-intel-iot-
realtime'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-intel-iot-realtime-v2 
verification-needed-jammy-linux-intel-iot-realtime

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-raspi/5.15.0-1042.45
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-raspi' to 'verification-done-jammy-
linux-raspi'. If the problem still exists, change the tag 'verification-
needed-jammy-linux-raspi' to 'verification-failed-jammy-linux-raspi'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-intel-iotg-v2 
verification-needed-jammy-linux-intel-iotg

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-intel-
iotg/5.15.0-1044.50 kernel in -proposed solves the problem. Please test
the kernel and update this bug with the results. If the problem is
solved, change the tag 'verification-needed-jammy-linux-intel-iotg' to
'verification-done-jammy-linux-intel-iotg'. If the problem still exists,
change the tag 'verification-needed-jammy-linux-intel-iotg' to
'verification-failed-jammy-linux-intel-iotg'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Released

Bug description:
  [Impact]
  During VM live migration, there is a potential risk of dereferencing a NULL pointer,
  which can lead to memory access issues and result in an unstable environment.

  [Fix]
  The call trace is as follows:

  kernel: BUG: kernel NULL pointer dereference, address: 0008
  kernel: #PF: supervisor write access in kernel mode
  kernel: #PF: error_code(0x0002) - not-present page
  kernel: PGD 0 P4D 0 
  kernel: Oops: 0002 [#1] SMP NOPTI
  kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G  IOE 5.15.0-53-generic #59~20.04.1-Ubuntu
  kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
  kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
  kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
  kernel: RSP: 0018:b580320278a8 EFLAGS: 00010246
  kernel: RAX:  RBX: a0fe29e94c38 RCX: 0027
  kernel: RDX:  RSI: 0004 RDI: b5801e24ba58
  kernel: RBP: b58032027930 R08:  R09: 0004
  kernel: R10: 0001 R11:  R12: 0003
  kernel: R13: 0004 R14:  R15: b5801e235000
  kernel: FS:  7f1553fff700() GS:a20eff78() knlGS:
  kernel: CS:  0010 DS:  ES:  CR0: 80050033
  kernel: CR2: 0008 CR3: 00e7f7544004 CR4: 007726e0
  kernel: PKRU: 5554
  kernel: Call Trace:
  kernel:  
  kernel:  ? __switch_to_xtra+0x109/0x510
  kernel:  zap_gfn_range+0x218/0x360 [kvm]
  kernel:  ? __smp_call_single_queue+0x59/0x90
  kernel:  ? alloc_cpumask_var_node+0x1/0x30
  kernel:  ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
  kernel:  kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
  kernel:  kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
  --
  kernel: RAX: ffda RBX: 4020ae46 RCX: 7f15aa26e3ab
  kernel: RDX: 7f1553ffe050 RSI: 4020ae46 RDI: 002f
  kernel: RBP: 5602a885a410 R08: 5602a82ad000 R09: 7f154c087470
  kernel: R10:  R11: 0246 R12: 7f1553ffe050
  kernel: R13: 7f1553ffe160 R14:  R15: 0080
  kernel:  

  The error occurred at random in several of the customer's production environments, always with the same call trace.
  Because the trace is identical each time, memory corruption by an unrelated process is unlikely.
  Resolving the faulting address against the debug symbols pinpoints the source of the error.

  root@focal:~/ddeb# eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko __handle_changed_spte+0x3a9
  0x00068109
  __list_del inlined at /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 in __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
  __list_del_entry
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
  list_del
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
  tdp_mmu_unlink_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
  handle_removed_tdp_mmu_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
  __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3

  The error occurred when the kernel attempted to delete an entry from a linked list (list_del() called from tdp_mmu_unlink_page()).
  The issue appears to be timing-related and is hard to reproduce consistently, which makes the root cause difficult to pinpoint.
  It's worth noting that the current kernel has replaced the list_head with atomic_t, as indicated by the following 

[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-30 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-raspi/5.15.0-1042.45
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-raspi' to 'verification-done-jammy-
linux-raspi'. If the problem still exists, change the tag 'verification-
needed-jammy-linux-raspi' to 'verification-failed-jammy-linux-raspi'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-raspi-v2 
verification-needed-jammy-linux-raspi

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Released

Bug description:
  [Impact]
  During VM live migration, there is a potential risk of dereferencing a NULL pointer,
  which can lead to memory access issues and result in an unstable environment.

  [Fix]
  The call trace is as follows:

  kernel: BUG: kernel NULL pointer dereference, address: 0008
  kernel: #PF: supervisor write access in kernel mode
  kernel: #PF: error_code(0x0002) - not-present page
  kernel: PGD 0 P4D 0 
  kernel: Oops: 0002 [#1] SMP NOPTI
  kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G  IOE 5.15.0-53-generic #59~20.04.1-Ubuntu
  kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
  kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
  kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
  kernel: RSP: 0018:b580320278a8 EFLAGS: 00010246
  kernel: RAX:  RBX: a0fe29e94c38 RCX: 0027
  kernel: RDX:  RSI: 0004 RDI: b5801e24ba58
  kernel: RBP: b58032027930 R08:  R09: 0004
  kernel: R10: 0001 R11:  R12: 0003
  kernel: R13: 0004 R14:  R15: b5801e235000
  kernel: FS:  7f1553fff700() GS:a20eff78() knlGS:
  kernel: CS:  0010 DS:  ES:  CR0: 80050033
  kernel: CR2: 0008 CR3: 00e7f7544004 CR4: 007726e0
  kernel: PKRU: 5554
  kernel: Call Trace:
  kernel:  
  kernel:  ? __switch_to_xtra+0x109/0x510
  kernel:  zap_gfn_range+0x218/0x360 [kvm]
  kernel:  ? __smp_call_single_queue+0x59/0x90
  kernel:  ? alloc_cpumask_var_node+0x1/0x30
  kernel:  ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
  kernel:  kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
  kernel:  kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
  --
  kernel: RAX: ffda RBX: 4020ae46 RCX: 7f15aa26e3ab
  kernel: RDX: 7f1553ffe050 RSI: 4020ae46 RDI: 002f
  kernel: RBP: 5602a885a410 R08: 5602a82ad000 R09: 7f154c087470
  kernel: R10:  R11: 0246 R12: 7f1553ffe050
  kernel: R13: 7f1553ffe160 R14:  R15: 0080
  kernel:  

  The error occurred randomly in different production environments of the 
customer, all with the same call trace.
  Therefore, the likelihood of other processes contaminating memory is low.
  After analyzing the call trace with the help of debug symbols, we can 
pinpoint the source of the error.

  root@focal:~/ddeb# eu-addr2line -ifae 
./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko 
__handle_changed_spte+0x3a9
  0x00068109
  __list_del inlined at 
/build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 
in __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
  __list_del_entry
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
  list_del
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
  tdp_mmu_unlink_page
  
/build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
  handle_removed_tdp_mmu_page
  
/build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
  __handle_changed_spte
  
/build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3

  The error occurred when the kernel attempted to delete an entry from a list.
  This issue may potentially be related to timing and has proven challenging to 
reproduce consistently, making it difficult for us to pinpoint the cause.
  It's worth noting that the current kernel has 

[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-05 Thread Chengen Du
The kernel (5.15.0-88.98) has been tested without any issues.

** Tags removed: verification-needed-jammy-linux
** Tags added: verification-done-jammy-linux

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Committed


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-10-05 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux/5.15.0-88.98 kernel in
-proposed solves the problem. Please test the kernel and update this bug
with the results. If the problem is solved, change the tag
'verification-needed-jammy-linux' to 'verification-done-jammy-linux'. If
the problem still exists, change the tag
'verification-needed-jammy-linux' to 'verification-failed-jammy-linux'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-v2 verification-needed-jammy-linux

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Committed


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-09-20 Thread Roxana Nicolescu
** Changed in: linux (Ubuntu Jammy)
   Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  Fix Committed


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-09-12 Thread Chengen Du
** Changed in: linux (Ubuntu)
   Status: Invalid => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  In Progress

Bug description:
  [Impact]
  During VM live migration, there is a potential risk of dereferencing a NULL pointer,
  which can lead to memory access issues and result in an unstable environment.

  [Fix]
  The call trace is as follows:

  kernel: BUG: kernel NULL pointer dereference, address: 0008
  kernel: #PF: supervisor write access in kernel mode
  kernel: #PF: error_code(0x0002) - not-present page
  kernel: PGD 0 P4D 0 
  kernel: Oops: 0002 [#1] SMP NOPTI
  kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G  IOE 5.15.0-53-generic #59~20.04.1-Ubuntu
  kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
  kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
  kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
  kernel: RSP: 0018:b580320278a8 EFLAGS: 00010246
  kernel: RAX:  RBX: a0fe29e94c38 RCX: 0027
  kernel: RDX:  RSI: 0004 RDI: b5801e24ba58
  kernel: RBP: b58032027930 R08:  R09: 0004
  kernel: R10: 0001 R11:  R12: 0003
  kernel: R13: 0004 R14:  R15: b5801e235000
  kernel: FS:  7f1553fff700() GS:a20eff78() knlGS:
  kernel: CS:  0010 DS:  ES:  CR0: 80050033
  kernel: CR2: 0008 CR3: 00e7f7544004 CR4: 007726e0
  kernel: PKRU: 5554
  kernel: Call Trace:
  kernel:  
  kernel:  ? __switch_to_xtra+0x109/0x510
  kernel:  zap_gfn_range+0x218/0x360 [kvm]
  kernel:  ? __smp_call_single_queue+0x59/0x90
  kernel:  ? alloc_cpumask_var_node+0x1/0x30
  kernel:  ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
  kernel:  kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
  kernel:  kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
  --
  kernel: RAX: ffda RBX: 4020ae46 RCX: 7f15aa26e3ab
  kernel: RDX: 7f1553ffe050 RSI: 4020ae46 RDI: 002f
  kernel: RBP: 5602a885a410 R08: 5602a82ad000 R09: 7f154c087470
  kernel: R10:  R11: 0246 R12: 7f1553ffe050
  kernel: R13: 7f1553ffe160 R14:  R15: 0080
  kernel:  

  The error occurred randomly in different production environments of the customer, all with the same call trace.
  Therefore, the likelihood of other processes contaminating memory is low.
  After analyzing the call trace with the help of debug symbols, we can pinpoint the source of the error.

  root@focal:~/ddeb# eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko __handle_changed_spte+0x3a9
  0x00068109
  __list_del inlined at /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 in __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
  __list_del_entry
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
  list_del
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
  tdp_mmu_unlink_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
  handle_removed_tdp_mmu_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
  __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3

  The error occurred when the kernel attempted to delete an entry from a list.
  This issue may be timing-related and has proven difficult to reproduce consistently, which makes it hard to pinpoint the root cause.
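
  The list deletion named in the trace can be illustrated with a small
  user-space sketch. It is not the kernel source: the struct only mirrors the
  two-pointer layout of struct list_head, and list_del_sketch,
  first_store_address and unlink_restores_empty_list are hypothetical names
  introduced here. In the Code: line, the bytes around the fault marker decode
  as "mov 0x8(%rbx),%rax; mov (%rbx),%rdx; <mov %rax,0x8(%rdx)>;
  mov %rdx,(%rax)", which is the pair of stores below. A write to next->prev
  with next == NULL lands at offset 8 on x86-64, matching the reported fault
  address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

/* Simplified unlink: the two stores that a list deletion performs. */
static void list_del_sketch(struct list_head *entry)
{
	struct list_head *prev = entry->prev;
	struct list_head *next = entry->next;

	next->prev = prev;	/* first store: faults if next == NULL */
	prev->next = next;	/* second store: never reached in the oops */
}

/* Address the first store would touch for a given value of entry->next. */
static uintptr_t first_store_address(const struct list_head *next)
{
	return (uintptr_t)next + offsetof(struct list_head, prev);
}

/* Hypothetical helper: unlink the only node of a list and check the head. */
static int unlink_restores_empty_list(void)
{
	struct list_head head, node;

	head.next = head.prev = &node;
	node.next = node.prev = &head;
	list_del_sketch(&node);
	return head.next == &head && head.prev == &head;
}
```

  On a well-formed list the unlink succeeds. The fault only appears when the
  entry's next pointer is already NULL, for example if the entry was never
  linked or was cleared by a racing path, which is an assumption here rather
  than a confirmed root cause.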
  It's worth noting that the current kernel has replaced the list_head with atomic_t, as indicated by the following commit.

  d25ceb926436 KVM: x86/mmu: Track the number of TDP MMU pages, but not
  the actual pages

  While this patch doesn't modify the triggering logic, it replaces the problematic list manipulation with an atomic counter update.
  If the underlying issue persists, it should no longer result in any memory access problems.
  We also asked the customer to set up a test environment and simulate a workload similar to the production environment.
  The patch worked well and did not introduce any adverse effects.
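
  The idea behind the commit, counting pages instead of linking them, can be
  sketched in user space as follows. This is only an illustration under
  stated assumptions: C11 atomics stand in for the kernel's atomic type, and
  the struct and function names (mmu_state_sketch, account_page, and so on)
  are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for per-VM MMU state: only a count is kept. */
struct mmu_state_sketch {
	atomic_long nr_pages;	/* number of pages, not the pages themselves */
};

static void mmu_state_init(struct mmu_state_sketch *s)
{
	atomic_init(&s->nr_pages, 0);
}

/* Called where a page would previously have been added to the list. */
static void account_page(struct mmu_state_sketch *s)
{
	atomic_fetch_add(&s->nr_pages, 1);
}

/* Called where list_del() used to run: no list pointers are dereferenced. */
static void unaccount_page(struct mmu_state_sketch *s)
{
	atomic_fetch_sub(&s->nr_pages, 1);
}

static long pages_in_use(struct mmu_state_sketch *s)
{
	return atomic_load(&s->nr_pages);
}
```

  Because removal reduces to a single atomic decrement, an unexpected or
  repeated removal can at worst skew the count. It cannot write through a
  NULL next pointer the way the list unlink did.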

  [Test Plan]
  Reproducing the issue has proven to be challenging.
  Simulating heavy live 

[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-09-12 Thread Stefan Bader
** Changed in: linux (Ubuntu)
   Status: Incomplete => Invalid

** Changed in: linux (Ubuntu Jammy)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Jammy)
   Importance: Medium => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  In Progress


[Kernel-packages] [Bug 2035166] Re: NULL Pointer Dereference During KVM MMU Page Invalidation

2023-09-12 Thread Chengen Du
** Changed in: linux (Ubuntu Jammy)
   Status: Incomplete => In Progress

** Changed in: linux (Ubuntu)
   Status: Incomplete => New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2035166

Title:
  NULL Pointer Dereference During KVM MMU Page Invalidation

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Jammy:
  In Progress

Bug description:
  [Impact]
  During VM live migration, there is a potential risk of dereferencing a NULL pointer,
  which can lead to memory access issues and result in an unstable environment.

  [Fix]
  The call trace is as follows:

  kernel: BUG: kernel NULL pointer dereference, address: 0008
  kernel: #PF: supervisor write access in kernel mode
  kernel: #PF: error_code(0x0002) - not-present page
  kernel: PGD 0 P4D 0 
  kernel: Oops: 0002 [#1] SMP NOPTI
  kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G  IOE 5.15.0-53-generic #59~20.04.1-Ubuntu
  kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
  kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
  kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
  kernel: RSP: 0018:b580320278a8 EFLAGS: 00010246
  kernel: RAX:  RBX: a0fe29e94c38 RCX: 0027
  kernel: RDX:  RSI: 0004 RDI: b5801e24ba58
  kernel: RBP: b58032027930 R08:  R09: 0004
  kernel: R10: 0001 R11:  R12: 0003
  kernel: R13: 0004 R14:  R15: b5801e235000
  kernel: FS:  7f1553fff700() GS:a20eff78() knlGS:
  kernel: CS:  0010 DS:  ES:  CR0: 80050033
  kernel: CR2: 0008 CR3: 00e7f7544004 CR4: 007726e0
  kernel: PKRU: 5554
  kernel: Call Trace:
  kernel:  
  kernel:  ? __switch_to_xtra+0x109/0x510
  kernel:  zap_gfn_range+0x218/0x360 [kvm]
  kernel:  ? __smp_call_single_queue+0x59/0x90
  kernel:  ? alloc_cpumask_var_node+0x1/0x30
  kernel:  ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
  kernel:  kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
  kernel:  kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
  --
  kernel: RAX: ffda RBX: 4020ae46 RCX: 7f15aa26e3ab
  kernel: RDX: 7f1553ffe050 RSI: 4020ae46 RDI: 002f
  kernel: RBP: 5602a885a410 R08: 5602a82ad000 R09: 7f154c087470
  kernel: R10:  R11: 0246 R12: 7f1553ffe050
  kernel: R13: 7f1553ffe160 R14:  R15: 0080
  kernel:  

  The error occurred randomly in different production environments of the customer, all with the same call trace.
  Therefore, the likelihood of other processes contaminating memory is low.
  After analyzing the call trace with the help of debug symbols, we can pinpoint the source of the error.

  root@focal:~/ddeb# eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko __handle_changed_spte+0x3a9
  0x00068109
  __list_del inlined at /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 in __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
  __list_del_entry
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
  list_del
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
  tdp_mmu_unlink_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
  handle_removed_tdp_mmu_page
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
  __handle_changed_spte
  /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3

  The error occurred when the kernel attempted to delete an entry from a list.
  This issue may be timing-related and has proven difficult to reproduce consistently, which makes it hard to pinpoint the root cause.
  It's worth noting that the current kernel has replaced the list_head with atomic_t, as indicated by the following commit.

  d25ceb926436 KVM: x86/mmu: Track the number of TDP MMU pages, but not
  the actual pages

  While this patch doesn't modify the triggering logic, it replaces the problematic list manipulation with an atomic counter update.
  If the underlying issue persists, it should no longer result in any memory access problems.
  We also asked the customer to set up a test environment and simulate a workload similar to the production environment.
  The patch worked well and did not introduce any adverse effects.

  [Test Plan]