Public bug reported:

[Impact]
The Trusty 3.13 kernel oopses because the ixgbe driver does not check whether
the requested VF exists before using it.

[Description]
In the current Trusty kernel, enabling or disabling spoofchk on a nonexistent
VF through the ixgbe driver causes a kernel oops. ixgbe_ndo_set_vf_spoofchk()
dereferences the per-VF data without first checking that the VF index is
valid. Upstream commit 600a507ddcb99 ("ixgbe: check for vfs outside of
sriov_num_vfs before dereference") fixes this, but it needs to be
cherry-picked into 3.13 for Trusty.

Upstream commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=600a507ddcb99

$ git describe --contains 600a507ddcb99
Ubuntu-lts-3.19.0-7.7_14.04.1~1051^2~26^2

$ rmadison linux-generic
=> linux-generic | 3.13.0.24.28   | trusty           | amd64, ...
=> linux-generic | 3.13.0.165.175 | trusty-security  | amd64, ...
=> linux-generic | 3.13.0.165.175 | trusty-updates   | amd64, ...
   linux-generic | 4.4.0.21.22    | xenial           | amd64, ...
   linux-generic | 4.15.0.20.23   | bionic           | amd64, ...
   linux-generic | 4.18.0.10.11   | cosmic           | amd64, ...
   linux-generic | 4.19.0.12.13   | disco            | amd64, ...

[Fix]
The fix is to check that the requested VF exists before the driver
dereferences it. Upstream commit 600a507ddcb99 introduced this check, and it
cherry-picks cleanly onto the latest Trusty kernel.
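
For illustration only, the kind of check described above can be sketched in
userspace C. This is not the verbatim upstream patch: the struct and function
names (mock_vf_data, mock_adapter, mock_set_vf_spoofchk) are hypothetical
stand-ins, and only the idea of rejecting an out-of-range VF index before
indexing the per-VF array is taken from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-VF state. */
struct mock_vf_data {
    bool spoofchk_enabled;
};

/* Hypothetical stand-in for the adapter private data. */
struct mock_adapter {
    unsigned int num_vfs;        /* number of VFs currently enabled */
    struct mock_vf_data *vfinfo; /* NULL when no VFs are enabled */
};

/*
 * Reject any VF index outside [0, num_vfs) before touching vfinfo[].
 * Without this check, vf == -1 would index one element before the
 * start of the array.
 */
int mock_set_vf_spoofchk(struct mock_adapter *adapter, int vf, bool setting)
{
    assert(adapter != NULL);

    if (vf < 0 || (unsigned int)vf >= adapter->num_vfs)
        return -EINVAL;

    adapter->vfinfo[vf].spoofchk_enabled = setting;
    return 0;
}
```

With a check of this shape, a request for VF -1 on an adapter with no VFs
returns -EINVAL to the caller instead of writing through an invalid pointer.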

[Test Case]
1) Deploy a Trusty system with an ixgbe adapter and the latest kernel from trusty-updates:
# lsb_release  -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.5 LTS
Release:        14.04
Codename:       trusty
# uname -r
3.13.0-165-generic
# lspci -v -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
        Subsystem: Hewlett-Packard Company 561FLR-T 2-port 10Gb Ethernet Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at 92e00000 (32-bit, prefetchable) [size=2M]
        Memory at 93004000 (32-bit, prefetchable) [size=16K]
        [virtual] Expansion ROM at 93080000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1d0] Access Control Services
        Kernel driver in use: ixgbe

2) Attempt to disable spoofchk with VF -1:
# ip link set dev eth4 vf -1 spoofchk off
Killed
# dmesg
[  241.066440] BUG: unable to handle kernel paging request at fffffffffffffffa
[  241.066880] IP: [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[  241.067331] PGD 2c13067 PUD 2c15067 PMD 0
[  241.067591] Oops: 0002 [#1] SMP
[  241.067793] Modules linked in: ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm hpilo lpc_ich ioatdma ipmi_si shpchp acpi_power_meter mac_hid nls_iso8859_1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ixgbe tg3 dca ptp pps_core hpsa nvme mdio wmi
[  241.070462] CPU: 43 PID: 2214 Comm: ip Not tainted 3.13.0-165-generic #215-Ubuntu
[  241.070908] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[  241.071302] task: ffff880035c4c800 ti: ffff8810275c8000 task.ti: ffff8810275c8000
[  241.071751] RIP: 0010:[<ffffffffa014775c>]  [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[  241.072349] RSP: 0018:ffff8810275c9858  EFLAGS: 00010283
[  241.072663] RAX: 0000000000000000 RBX: ffff881022360000 RCX: 00000000ffffffff
[  241.073090] RDX: 0000000000000000 RSI: 00000000000081fc RDI: ffff881022360000
[  241.073522] RBP: ffff8810275c9858 R08: fffffffffffffffb R09: ffffffffffffffa8
[  241.073953] R10: 00000000ffffffa1 R11: 0000000000000246 R12: ffff8810275c9950
[  241.074381] R13: 0000000000000000 R14: ffffffffa01511c0 R15: 00000000ffffffea
[  241.074814] FS:  00007f3188ce6740(0000) GS:ffff88203f3e0000(0000) knlGS:0000000000000000
[  241.075299] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  241.075642] CR2: fffffffffffffffa CR3: 000000202662a000 CR4: 0000000000160770
[  241.076365] Stack:
[  241.076470]  ffff8810275c98f0 ffffffff8164f6fb ffff8810275c9888 ffff882000000010
[  241.076938]  ffffffffa01511c0 ffff8820268b0c24 0000000000000000 0000000000000000
[  241.077403]  0000000000000000 0000000000000000 ffff8820268b0c28 0000000000000000
[  241.077865] Call Trace:
[  241.078037]  [<ffffffff8164f6fb>] do_setlink+0x87b/0x9a0
[  241.078352]  [<ffffffff8139c4c6>] ? nla_parse+0xb6/0x120
[  241.078669]  [<ffffffff8164fd1a>] rtnl_newlink+0x3ba/0x620
[  241.078994]  [<ffffffff81161923>] ? __alloc_pages_nodemask+0x1a3/0xb90
[  241.079393]  [<ffffffff812e5f7e>] ? security_capable+0x1e/0x20
[  241.079733]  [<ffffffff81077e29>] ? ns_capable+0x29/0x50
[  241.080027]  [<ffffffff8164c718>] rtnetlink_rcv_msg+0x98/0x250
[  241.080381]  [<ffffffff811afc88>] ? __kmalloc_node_track_caller+0x58/0x2b0
[  241.080802]  [<ffffffff8162c98e>] ? __alloc_skb+0x7e/0x2b0
[  241.081121]  [<ffffffff8164c680>] ? rtnetlink_rcv+0x30/0x30
[  241.081464]  [<ffffffff8166b0ab>] netlink_rcv_skb+0xab/0xc0
[  241.081792]  [<ffffffff8164c678>] rtnetlink_rcv+0x28/0x30
[  241.082118]  [<ffffffff8166a7a0>] netlink_unicast+0xe0/0x1b0
[  241.082455]  [<ffffffff8166ab7e>] netlink_sendmsg+0x30e/0x680
[  241.082801]  [<ffffffff81623251>] sock_sendmsg+0x91/0xc0
[  241.099175]  [<ffffffff811bd526>] ? __mem_cgroup_commit_charge+0x156/0x3d0
[  241.115633]  [<ffffffff81622f2e>] ? move_addr_to_kernel.part.14+0x1e/0x60
[  241.132150]  [<ffffffff81623d81>] ? move_addr_to_kernel+0x21/0x30
[  241.148222]  [<ffffffff81623659>] ___sys_sendmsg+0x389/0x3a0
[  241.163907]  [<ffffffff81621ecf>] ? sock_destroy_inode+0x2f/0x40
[  241.179322]  [<ffffffff817487d4>] ? __do_page_fault+0x214/0x570
[  241.194220]  [<ffffffff811dffdd>] ? dput+0xad/0x190
[  241.209158]  [<ffffffff811e94e4>] ? mntput+0x24/0x40
[  241.223576]  [<ffffffff811cab81>] ? __fput+0x181/0x260
[  241.237701]  [<ffffffff81624462>] __sys_sendmsg+0x42/0x80
[  241.251846]  [<ffffffff816244b2>] SyS_sendmsg+0x12/0x20
[  241.265579]  [<ffffffff8174d3bc>] system_call_fastpath+0x26/0x2b
[  241.278978] Code: 8d 0c 06 83 e1 07 29 c1 48 63 c6 c1 fe 03 4c 8d 04 80 8d 34 b5 00 82 00 00 4e 8d 0c 40 48 8b 87 90 85 00 00 48 63 f6 49 c1 e1 03 <42> 88 54 08 52 48 89 f0 48 03 87 80 16 00 00 8b 00 41 ba 01 00
[  241.306211] RIP  [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[  241.319310]  RSP <ffff8810275c9858>
[  241.332404] CR2: fffffffffffffffa
[  241.345011] ---[ end trace a45b72690a7e13be ]---
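
The faulting address in CR2 can be reproduced from the register dump, which
helps confirm that the oops comes from indexing the per-VF array with -1. The
sketch below rests on values inferred from the oops, not read from the driver
source: R09 = ffffffffffffffa8 (-88) suggests an 88-byte per-VF element, the
faulting store at <42> 88 54 08 52 appears to use a 0x52 displacement, and
RAX = 0 suggests the array base pointer was NULL. The names VF_ELEM_SIZE,
SPOOFCHK_OFF, spoofchk_store_addr and check_against_oops are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Inferred from the oops registers and Code: bytes; treat as assumptions. */
#define VF_ELEM_SIZE 88u   /* apparent size of one per-VF element */
#define SPOOFCHK_OFF 0x52u /* apparent offset of the byte being written */

/* Address the store would target for a given array base and VF index. */
uint64_t spoofchk_store_addr(uint64_t vfinfo_base, int vf)
{
    /* Sign-extend vf first, as the driver code does before scaling it. */
    uint64_t index = (uint64_t)(int64_t)vf;

    return vfinfo_base + index * VF_ELEM_SIZE + SPOOFCHK_OFF;
}

/* A NULL base and vf == -1 give 0 - 88 + 0x52 = -6, the CR2 value above. */
void check_against_oops(void)
{
    assert(spoofchk_store_addr(0, -1) == UINT64_C(0xfffffffffffffffa));
}
```

Under these assumptions the computed address matches CR2, and the "Oops: 0002"
error code (a kernel-mode write to a non-present page) is consistent with a
store to that address.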

[Regression Potential]
The regression potential is low, since the fix is a simple check that the VF
exists before any operation is performed on it. Other functions in the ixgbe
driver already implement this check; only the spoofchk function is missing
it. The patch was also tested on an affected system and confirmed to resolve
the kernel oops without further problems.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Fix Released

** Attachment added: "Full dmesg output until kernel oops"
   https://bugs.launchpad.net/bugs/1815501/+attachment/5237733/+files/dmesg.out

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

** Changed in: linux (Ubuntu)
       Status: Incomplete => New

** Changed in: linux (Ubuntu)
       Status: New => Fix Released

** Changed in: linux (Ubuntu)
     Assignee: Heitor R. Alves de Siqueira (halves) => (unassigned)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1815501

Title:
  ixgbe: Kernel Oops when attempting to disable spoofchk in a non-existing VF

Status in linux package in Ubuntu:
  Fix Released
