Re: [PATCH] bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw

2018-12-24 Thread Ivan Mironov
4.20 release is affected too.

On Sun, 2018-12-23 at 20:29 +0500, Ivan Mironov wrote:
> This happened when I tried to boot a normal Fedora 29 system with the latest
> available kernel (from Fedora rawhide, plus some unrelated custom
> patches):
> 
>   BUG: unable to handle kernel NULL pointer dereference at 
> 
>   PGD 0 P4D 0
>   Oops: 0010 [#1] SMP PTI
>   CPU: 6 PID: 1422 Comm: libvirtd Tainted: G  I   4.20.0-0.rc7.git3.hpsa2.1.fc29.x86_64 #1
>   Hardware name: HP ProLiant BL460c G6, BIOS I24 05/21/2018
>   RIP: 0010:  (null)
>   Code: Bad RIP value.
>   RSP: 0018:a47ccdc9fbe0 EFLAGS: 00010246
>   RAX:  RBX: 03e8 RCX: a47ccdc9fbf8
>   RDX: a47ccdc9fc00 RSI: 97d9ee7b01f8 RDI: 97d9f0150b80
>   RBP: 97d9f0150b80 R08:  R09: 
>   R10:  R11:  R12: 0003
>   R13: 97d9ef1e53e8 R14: 0009 R15: 97d9f0ac6730
>   FS:  7f4d224ef700() GS:97d9fa20() knlGS:
>   CS:  0010 DS:  ES:  CR0: 80050033
>   CR2: ffd6 CR3: 0011ece52006 CR4: 000206e0
>   Call Trace:
>    ? bnx2x_chip_cleanup+0x195/0x610 [bnx2x]
>    ? bnx2x_nic_unload+0x1e2/0x8f0 [bnx2x]
>    ? bnx2x_reload_if_running+0x24/0x40 [bnx2x]
>    ? bnx2x_set_features+0x79/0xa0 [bnx2x]
>    ? __netdev_update_features+0x244/0x9e0
>    ? netlink_broadcast_filtered+0x136/0x4b0
>    ? netdev_update_features+0x22/0x60
>    ? dev_disable_lro+0x1c/0xe0
>    ? devinet_sysctl_forward+0x1c6/0x211
>    ? proc_sys_call_handler+0xab/0x100
>    ? __vfs_write+0x36/0x1a0
>    ? rcu_read_lock_sched_held+0x79/0x80
>    ? rcu_sync_lockdep_assert+0x2e/0x60
>    ? __sb_start_write+0x14c/0x1b0
>    ? vfs_write+0x159/0x1c0
>    ? vfs_write+0xba/0x1c0
>    ? ksys_write+0x52/0xc0
>    ? do_syscall_64+0x60/0x1f0
>    ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> After some investigation I figured out that recently added cleanup code
> tries to call a VLAN filtering de-initialization function which exists only
> for newer hardware. The corresponding function pointer is not
> initialized (== 0) for older hardware, namely these chips:
> 
>   #define CHIP_NUM_57710  0x164e
>   #define CHIP_NUM_57711  0x164f
>   #define CHIP_NUM_57711E 0x1650
> 
> And I have one of those in my test system:
> 
>   02:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]
>   02:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]
> 
> Function bnx2x_init_vlan_mac_fp_objs() from
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h decides whether to
> initialize relevant pointers in bnx2x_sp_objs.vlan_obj or not.
> 
> This regression was introduced after v4.20-rc7.
> 
> Fixes: 04f05230c5c13 ("bnx2x: Remove configured vlans as part of unload sequence.")
> Signed-off-by: Ivan Mironov 
> ---
>  .../net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 22 +++++++++++++++-------
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index b164f705709d..0e37c2484ac2 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -8504,15 +8504,23 @@ int bnx2x_set_vlan_one(struct bnx2x *bp, u16 vlan,
>  static int bnx2x_del_all_vlans(struct bnx2x *bp)
>  {
>  	struct bnx2x_vlan_mac_obj *vlan_obj = &bp->sp_objs[0].vlan_obj;
> -	unsigned long ramrod_flags = 0, vlan_flags = 0;
>  	struct bnx2x_vlan_entry *vlan;
> -	int rc;
>  
> -	__set_bit(RAMROD_COMP_WAIT, &ramrod_flags);
> -	__set_bit(BNX2X_VLAN, &vlan_flags);
> -	rc = vlan_obj->delete_all(bp, vlan_obj, &vlan_flags, &ramrod_flags);
> -	if (rc)
> -		return rc;
> +	/* The whole *vlan_obj structure may be not initialized if VLAN
> +	 * filtering offload is not supported by hardware. Currently this is
> +	 * true for all hardware covered by CHIP_IS_E1x().
> +	 */
> +	if (vlan_obj->delete_all) {
> +		unsigned long ramrod_flags = 0, vlan_flags = 0;
> +		int rc;
> +
> +		__set_bit(RAMROD_COMP_WAIT, &ramrod_flags);
> +		__set_bit(BNX2X_VLAN, &vlan_flags);
> +		rc = vlan_obj->delete_all(bp, vlan_obj, &vlan_flags,
> +					  &ramrod_flags);
> +		if (rc)
> +			return rc;
> +	}
>  
>  	/* Mark that hw forgot all entries */
>  	list_for_each_entry(vlan, &bp->vlan_reg, link)



[PATCH] bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw

2018-12-23 Thread Ivan Mironov
This happened when I tried to boot a normal Fedora 29 system with the latest
available kernel (from Fedora rawhide, plus some unrelated custom
patches):

BUG: unable to handle kernel NULL pointer dereference at 

PGD 0 P4D 0
Oops: 0010 [#1] SMP PTI
CPU: 6 PID: 1422 Comm: libvirtd Tainted: G  I   4.20.0-0.rc7.git3.hpsa2.1.fc29.x86_64 #1
Hardware name: HP ProLiant BL460c G6, BIOS I24 05/21/2018
RIP: 0010:  (null)
Code: Bad RIP value.
RSP: 0018:a47ccdc9fbe0 EFLAGS: 00010246
RAX:  RBX: 03e8 RCX: a47ccdc9fbf8
RDX: a47ccdc9fc00 RSI: 97d9ee7b01f8 RDI: 97d9f0150b80
RBP: 97d9f0150b80 R08:  R09: 
R10:  R11:  R12: 0003
R13: 97d9ef1e53e8 R14: 0009 R15: 97d9f0ac6730
FS:  7f4d224ef700() GS:97d9fa20() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ffd6 CR3: 0011ece52006 CR4: 000206e0
Call Trace:
 ? bnx2x_chip_cleanup+0x195/0x610 [bnx2x]
 ? bnx2x_nic_unload+0x1e2/0x8f0 [bnx2x]
 ? bnx2x_reload_if_running+0x24/0x40 [bnx2x]
 ? bnx2x_set_features+0x79/0xa0 [bnx2x]
 ? __netdev_update_features+0x244/0x9e0
 ? netlink_broadcast_filtered+0x136/0x4b0
 ? netdev_update_features+0x22/0x60
 ? dev_disable_lro+0x1c/0xe0
 ? devinet_sysctl_forward+0x1c6/0x211
 ? proc_sys_call_handler+0xab/0x100
 ? __vfs_write+0x36/0x1a0
 ? rcu_read_lock_sched_held+0x79/0x80
 ? rcu_sync_lockdep_assert+0x2e/0x60
 ? __sb_start_write+0x14c/0x1b0
 ? vfs_write+0x159/0x1c0
 ? vfs_write+0xba/0x1c0
 ? ksys_write+0x52/0xc0
 ? do_syscall_64+0x60/0x1f0
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe

After some investigation I figured out that recently added cleanup code
tries to call a VLAN filtering de-initialization function which exists only
for newer hardware. The corresponding function pointer is not
initialized (== 0) for older hardware, namely these chips:

#define CHIP_NUM_57710  0x164e
#define CHIP_NUM_57711  0x164f
#define CHIP_NUM_57711E 0x1650

And I have one of those in my test system:

02:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]
02:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]

Function bnx2x_init_vlan_mac_fp_objs() from
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h decides whether to
initialize relevant pointers in bnx2x_sp_objs.vlan_obj or not.
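
The failure mode and the guard can be reproduced outside the driver. The sketch below is a simplified userspace model, not the bnx2x code: struct vlan_obj_sketch, sketch_init(), sketch_delete_all() and sketch_del_all_vlans() are hypothetical names invented for illustration. It only shows that a callback left NULL by a chip-dependent init helper has to be checked before it is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct bnx2x_vlan_mac_obj. */
struct vlan_obj_sketch {
	int (*delete_all)(struct vlan_obj_sketch *o);
	int calls;	/* counts how often the callback actually ran */
};

/* Hypothetical callback; the real one talks to the hardware. */
int sketch_delete_all(struct vlan_obj_sketch *o)
{
	o->calls++;
	return 0;
}

/* Models the chip-dependent init: older (E1x) chips get no callback. */
void sketch_init(struct vlan_obj_sketch *o, bool is_e1x)
{
	o->calls = 0;
	o->delete_all = is_e1x ? NULL : sketch_delete_all;
}

/* Guarded cleanup: skip the callback when it was never set. */
int sketch_del_all_vlans(struct vlan_obj_sketch *o)
{
	if (o->delete_all) {
		int rc = o->delete_all(o);

		if (rc)
			return rc;
	}

	return 0;
}
```

Without the if (o->delete_all) check, the old-hardware case would jump to address 0, which matches the "RIP: 0010: (null)" line in the oops above.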

This regression was introduced after v4.20-rc7.

Fixes: 04f05230c5c13 ("bnx2x: Remove configured vlans as part of unload sequence.")
Signed-off-by: Ivan Mironov 
---
 .../net/ethernet/broadcom/bnx2x/bnx2x_main.c  | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index b164f705709d..0e37c2484ac2 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -8504,15 +8504,23 @@ int bnx2x_set_vlan_one(struct bnx2x *bp, u16 vlan,
 static int bnx2x_del_all_vlans(struct bnx2x *bp)
 {
 	struct bnx2x_vlan_mac_obj *vlan_obj = &bp->sp_objs[0].vlan_obj;
-	unsigned long ramrod_flags = 0, vlan_flags = 0;
 	struct bnx2x_vlan_entry *vlan;
-	int rc;
 
-	__set_bit(RAMROD_COMP_WAIT, &ramrod_flags);
-	__set_bit(BNX2X_VLAN, &vlan_flags);
-	rc = vlan_obj->delete_all(bp, vlan_obj, &vlan_flags, &ramrod_flags);
-	if (rc)
-		return rc;
+	/* The whole *vlan_obj structure may be not initialized if VLAN
+	 * filtering offload is not supported by hardware. Currently this is
+	 * true for all hardware covered by CHIP_IS_E1x().
+	 */
+	if (vlan_obj->delete_all) {
+		unsigned long ramrod_flags = 0, vlan_flags = 0;
+		int rc;
+
+		__set_bit(RAMROD_COMP_WAIT, &ramrod_flags);
+		__set_bit(BNX2X_VLAN, &vlan_flags);
+		rc = vlan_obj->delete_all(bp, vlan_obj, &vlan_flags,
+					  &ramrod_flags);
+		if (rc)
+			return rc;
+	}
 
 	/* Mark that hw forgot all entries */
 	list_for_each_entry(vlan, &bp->vlan_reg, link)
-- 
2.20.1