Re: [PATCH V2] scsi: iscsi_tcp: Fix use-after-free when do get_host_param

2021-03-22 Thread Wu Bo

On 2021/3/22 4:21, michael.chris...@oracle.com wrote:

On 3/21/21 1:47 AM, Wu Bo wrote:

From: Wu Bo 

iscsid (cpu1): On logout, the iSCSI session is destroyed, but
iscsi_sw_tcp_session_destroy() does not set tcp_sw_host->session to NULL
before releasing the session.

iscsiadm (cpu2): iscsi_sw_tcp_host_get_param() accesses
tcp_sw_host->session to get host parameters; the pointer is not NULL,
but it points to memory that has already been freed.

Add ihost->lock and a kref to protect the session between getting host
parameters and destroying the iSCSI session.


Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi 
function")
Signed-off-by: Wu Bo 
Signed-off-by: WenChao Hao 
---
  drivers/scsi/iscsi_tcp.c | 23 +--
  1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 93ce990..579aa80 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -783,22 +783,32 @@ static int iscsi_sw_tcp_host_get_param

Re: [PATCH] pci: fix memory leak when virtio pci hotplug

2021-03-21 Thread Wu Bo

On 2021/3/21 23:29, Zhiqiang Liu wrote:

From: Feilong Lin 

While repeatedly hot-plugging virtio-driven PCI devices
in a virtual machine, we found a leak in the kmalloc-4k
cache, which we confirmed to be pci_dev structures. The
cause is a missing pci_dev_put() after pci_get_slot() in
enable_slot() of acpiphp_glue.c.
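The rule the fix enforces is the usual get/put reference-counting contract: every successful pci_get_slot() returns a counted reference that must be balanced by pci_dev_put(). A minimal userspace illustration of that contract (the `dev_get`/`dev_put` names are hypothetical stand-ins, not the PCI core API):

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted object mimicking struct pci_dev's refcount. */
struct dev {
    int refcount;
};

/* Like pci_get_slot(): a successful lookup returns a new reference. */
static struct dev *dev_get(struct dev *d)
{
    if (!d)
        return NULL;
    d->refcount++;
    return d;
}

/* Like pci_dev_put(): drop one reference; the last put frees it.
 * Returns 1 when the object was actually freed. */
static int dev_put(struct dev *d)
{
    if (--d->refcount == 0) {
        free(d);
        return 1;
    }
    return 0;
}
```

Skipping the put after a successful get, as enable_slot() did, leaves the count permanently elevated, so the object can never reach zero and be freed.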

Signed-off-by: Feilong Lin 
Reviewed-by: Zhiqiang Liu 
---
  drivers/pci/hotplug/acpiphp_glue.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c 
b/drivers/pci/hotplug/acpiphp_glue.c
index 3365c93abf0e..f031302ad401 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -533,6 +533,7 @@ static void enable_slot(struct acpiphp_slot *slot, bool 
bridge)
slot->flags &= ~SLOT_ENABLED;
continue;
}
+   pci_dev_put(dev);
}
  }


Reviewed-by: Wu Bo 


[PATCH V2] scsi: iscsi_tcp: Fix use-after-free when do get_host_param

2021-03-20 Thread Wu Bo
From: Wu Bo 

iscsid (cpu1): On logout, the iSCSI session is destroyed, but
iscsi_sw_tcp_session_destroy() does not set tcp_sw_host->session to NULL
before releasing the session.

iscsiadm (cpu2): iscsi_sw_tcp_host_get_param() accesses
tcp_sw_host->session to get host parameters; the pointer is not NULL,
but it points to memory that has already been freed.

Add ihost->lock and a kref to protect the session between getting host
parameters and destroying the iSCSI session.
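The protection scheme the commit message describes can be sketched in plain userspace C. This is a simplified single-threaded stand-in, not the driver code: the toy `lock_t` plays the role of ihost->lock and the bare integer refcount plays the role of the kref; the struct and function names only mirror the driver for readability.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int lock_t;                 /* 0 = free, 1 = held */
static void lock(lock_t *l)   { assert(*l == 0); *l = 1; }
static void unlock(lock_t *l) { *l = 0; }

struct session {
    int refcount;                   /* plays the role of the kref */
    char addr[32];                  /* parameter the reader wants */
};

struct host {
    lock_t lock;                    /* plays the role of ihost->lock */
    struct session *session;        /* tcp_sw_host->session */
};

static void session_put(struct session *s)
{
    if (--s->refcount == 0)
        free(s);                    /* last reference frees the session */
}

/* Reader: validate the pointer under the lock, then pin it with a
 * reference so teardown cannot free it mid-copy. */
static int host_get_param(struct host *h, char *buf, size_t len)
{
    struct session *s;

    lock(&h->lock);
    s = h->session;
    if (!s) {
        unlock(&h->lock);
        return -1;                  /* session already torn down */
    }
    s->refcount++;
    unlock(&h->lock);

    strncpy(buf, s->addr, len);
    session_put(s);
    return 0;
}

/* Teardown: clear the cached pointer under the same lock, then drop
 * the initial reference; the free happens only once readers are done. */
static void host_destroy_session(struct host *h)
{
    struct session *s;

    lock(&h->lock);
    s = h->session;
    h->session = NULL;
    unlock(&h->lock);
    if (s)
        session_put(s);
}
```

The key design point: the pointer is only ever read or cleared under the lock, and the reference taken under the lock keeps the object alive after the lock is dropped, which closes the window the KASAN report below falls into.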

[29844.848044] sd 2:0:0:1: [sdj] Synchronizing SCSI cache
[29844.923745] scsi 2:0:0:1: alua: Detached
[29844.927840] 
==
[29844.927861] BUG: KASAN: use-after-free in 
iscsi_sw_tcp_host_get_param+0xf4/0x218 [iscsi_tcp]
[29844.927864] Read of size 8 at addr 80002c0b8f68 by task iscsiadm/523945
[29844.927871] CPU: 1 PID: 523945 Comm: iscsiadm Kdump: loaded Not tainted 
4.19.90.kasan.aarch64
[29844.927873] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[29844.927875] Call trace:
[29844.927884]  dump_backtrace+0x0/0x270
[29844.927886]  show_stack+0x24/0x30
[29844.927895]  dump_stack+0xc4/0x120
[29844.927902]  print_address_description+0x68/0x278
[29844.927904]  kasan_report+0x20c/0x338
[29844.927906]  __asan_load8+0x88/0xb0
[29844.927910]  iscsi_sw_tcp_host_get_param+0xf4/0x218 [iscsi_tcp]
[29844.927932]  show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0x84/0xa0 
[scsi_transport_iscsi]
[29844.927938]  dev_attr_show+0x48/0x90
[29844.927943]  sysfs_kf_seq_show+0x100/0x1e0
[29844.927946]  kernfs_seq_show+0x88/0xa0
[29844.927949]  seq_read+0x164/0x748
[29844.927951]  kernfs_fop_read+0x204/0x308
[29844.927956]  __vfs_read+0xd4/0x2d8
[29844.927958]  vfs_read+0xa8/0x198
[29844.927960]  ksys_read+0xd0/0x180
[29844.927962]  __arm64_sys_read+0x4c/0x60
[29844.927966]  el0_svc_common+0xa8/0x230
[29844.927969]  el0_svc_handler+0xdc/0x138
[29844.927971]  el0_svc+0x10/0x218

[29844.928063] Freed by task 53358:
[29844.928066]  __kasan_slab_free+0x120/0x228
[29844.928068]  kasan_slab_free+0x10/0x18
[29844.928069]  kfree+0x98/0x278
[29844.928083]  iscsi_session_release+0x84/0xa0 [scsi_transport_iscsi]
[29844.928085]  device_release+0x4c/0x100
[29844.928089]  kobject_put+0xc4/0x288
[29844.928091]  put_device+0x24/0x30
[29844.928105]  iscsi_free_session+0x60/0x70 [scsi_transport_iscsi]
[29844.928112]  iscsi_session_teardown+0x134/0x158 [libiscsi]
[29844.928116]  iscsi_sw_tcp_session_destroy+0x7c/0xd8 [iscsi_tcp]
[29844.928129]  iscsi_if_rx+0x1538/0x1f00 [scsi_transport_iscsi]
[29844.928131]  netlink_unicast+0x338/0x3c8
[29844.928133]  netlink_sendmsg+0x51c/0x588
[29844.928135]  sock_sendmsg+0x74/0x98
[29844.928137]  ___sys_sendmsg+0x434/0x470
[29844.928139]  __sys_sendmsg+0xd4/0x148
[29844.928141]  __arm64_sys_sendmsg+0x50/0x60
[29844.928143]  el0_svc_common+0xa8/0x230
[29844.928146]  el0_svc_handler+0xdc/0x138
[29844.928147]  el0_svc+0x10/0x218
[29844.928148]
[29844.928150] The buggy address belongs to the object at 80002c0b8880 which belongs to the cache kmalloc-2048 of size 2048
[29844.928153] The buggy address is located 1768 bytes inside of 2048-byte region [80002c0b8880, 80002c0b9080)
[29844.928154] The buggy address belongs to the page:
[29844.928158] page:7eb02e00 count:1 mapcount:0 
mapping:8000d8402600 index:0x0 compound_mapcount: 0
[29844.928902] flags: 0x7fffe008100(slab|head)
[29844.929215] raw: 07fffe008100 7e0003535e08 7e00024a9408 
8000d8402600
[29844.929217] raw:  000f000f 0001 

[29844.929219] page dumped because: kasan: bad access detected
[29844.929219]
[29844.929221] Memory state around the buggy address:
[29844.929223]  80002c0b8e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929225]  80002c0b8e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929227] >80002c0b8f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929228]   ^
[29844.929230]  80002c0b8f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929232]  80002c0b9000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929232] 
==
[29844.929234] Disabling lock debugging due to kernel taint
[29844.969534] scsi host2: iSCSI Initiator over TCP/IP

Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi 
function")
Signed-off-by: Wu Bo 
Signed-off-by: WenChao Hao 
---
 drivers/scsi/iscsi_tcp.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 93ce990..579aa80 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -783,22 +783,32 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host 
*shost,
   enum iscsi_host_param param, cha

Re: [PATCH] scsi: iscsi_tcp: Fix use-after-free when do get_host_param

2021-03-20 Thread Wu Bo

On 2021/3/20 17:08, Wu Bo wrote:

On logout, the session-destroy path does not set tcp_sw_host->session
to NULL. A concurrent get-host-parameters request then accesses
tcp_sw_host->session, but the session has already been released at
that point.


Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi 
function")
Signed-off-by: Wu Bo 
Signed-off-by: WenChao Hao 
---
  drivers/scsi/iscsi_tcp.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index dd33ce0..98d782d 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -901,10 +901,12 @@ static void iscsi_sw_tcp_session_destroy(struct 
iscsi_cls_session *cls_session)
  {
struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
struct iscsi_session *session = cls_session->dd_data;
+   struct iscsi_sw_tcp_host *tcp_sw_host = iscsi_host_priv(shost);
  
  	if (WARN_ON_ONCE(session->leadconn))

return;
  
+	tc

[PATCH] scsi: iscsi_tcp: Fix use-after-free when do get_host_param

2021-03-20 Thread Wu Bo
On logout, the session-destroy path does not set tcp_sw_host->session
to NULL. A concurrent get-host-parameters request then accesses
tcp_sw_host->session, but the session has already been released at
that point.
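The V1 fix simply clears the cached pointer before the session memory goes away, so the reader's NULL check fires instead of dereferencing freed memory. A minimal sketch of that idea (toy structures, not the driver's; note that V2 later adds a lock and refcount because a bare clear still leaves a small race window between the check and the use):

```c
#include <assert.h>
#include <stdlib.h>

struct session { int dummy; };

struct host {
    struct session *session;    /* stands in for tcp_sw_host->session */
};

/* Reader: the get-param path bails out when no session is cached. */
static int host_get_param(struct host *h)
{
    if (!h->session)
        return -1;              /* no session: nothing to report */
    return 0;                   /* would read from h->session here */
}

/* Teardown with the V1 fix: clear the cached pointer before freeing. */
static void session_destroy(struct host *h, struct session *s)
{
    h->session = NULL;          /* the one line the patch adds */
    free(s);
}
```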

[29844.848044] sd 2:0:0:1: [sdj] Synchronizing SCSI cache
[29844.923745] scsi 2:0:0:1: alua: Detached
[29844.927840] 
==
[29844.927861] BUG: KASAN: use-after-free in 
iscsi_sw_tcp_host_get_param+0xf4/0x218 [iscsi_tcp]
[29844.927864] Read of size 8 at addr 80002c0b8f68 by task iscsiadm/523945
[29844.927871] CPU: 1 PID: 523945 Comm: iscsiadm Kdump: loaded Not tainted 
4.19.90.kasan.aarch64
[29844.927873] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[29844.927875] Call trace:
[29844.927884]  dump_backtrace+0x0/0x270
[29844.927886]  show_stack+0x24/0x30
[29844.927895]  dump_stack+0xc4/0x120
[29844.927902]  print_address_description+0x68/0x278
[29844.927904]  kasan_report+0x20c/0x338
[29844.927906]  __asan_load8+0x88/0xb0
[29844.927910]  iscsi_sw_tcp_host_get_param+0xf4/0x218 [iscsi_tcp]
[29844.927932]  show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0x84/0xa0 
[scsi_transport_iscsi]
[29844.927938]  dev_attr_show+0x48/0x90
[29844.927943]  sysfs_kf_seq_show+0x100/0x1e0
[29844.927946]  kernfs_seq_show+0x88/0xa0
[29844.927949]  seq_read+0x164/0x748
[29844.927951]  kernfs_fop_read+0x204/0x308
[29844.927956]  __vfs_read+0xd4/0x2d8
[29844.927958]  vfs_read+0xa8/0x198
[29844.927960]  ksys_read+0xd0/0x180
[29844.927962]  __arm64_sys_read+0x4c/0x60
[29844.927966]  el0_svc_common+0xa8/0x230
[29844.927969]  el0_svc_handler+0xdc/0x138
[29844.927971]  el0_svc+0x10/0x218
[29844.928063] Freed by task 53358:
[29844.928066]  __kasan_slab_free+0x120/0x228
[29844.928068]  kasan_slab_free+0x10/0x18
[29844.928069]  kfree+0x98/0x278
[29844.928083]  iscsi_session_release+0x84/0xa0 [scsi_transport_iscsi]
[29844.928085]  device_release+0x4c/0x100
[29844.928089]  kobject_put+0xc4/0x288
[29844.928091]  put_device+0x24/0x30
[29844.928105]  iscsi_free_session+0x60/0x70 [scsi_transport_iscsi]
[29844.928112]  iscsi_session_teardown+0x134/0x158 [libiscsi]
[29844.928116]  iscsi_sw_tcp_session_destroy+0x7c/0xd8 [iscsi_tcp]
[29844.928129]  iscsi_if_rx+0x1538/0x1f00 [scsi_transport_iscsi]
[29844.928131]  netlink_unicast+0x338/0x3c8
[29844.928133]  netlink_sendmsg+0x51c/0x588
[29844.928135]  sock_sendmsg+0x74/0x98
[29844.928137]  ___sys_sendmsg+0x434/0x470
[29844.928139]  __sys_sendmsg+0xd4/0x148
[29844.928141]  __arm64_sys_sendmsg+0x50/0x60
[29844.928143]  el0_svc_common+0xa8/0x230
[29844.928146]  el0_svc_handler+0xdc/0x138
[29844.928147]  el0_svc+0x10/0x218
[29844.928148]
[29844.928150] The buggy address belongs to the object at 80002c0b8880 which belongs to the cache kmalloc-2048 of size 2048
[29844.928153] The buggy address is located 1768 bytes inside of 2048-byte region [80002c0b8880, 80002c0b9080)
[29844.928154] The buggy address belongs to the page:
[29844.928158] page:7eb02e00 count:1 mapcount:0 
mapping:8000d8402600 index:0x0 compound_mapcount: 0
[29844.928902] flags: 0x7fffe008100(slab|head)
[29844.929215] raw: 07fffe008100 7e0003535e08 7e00024a9408 
8000d8402600
[29844.929217] raw:  000f000f 0001 

[29844.929219] page dumped because: kasan: bad access detected
[29844.929219]
[29844.929221] Memory state around the buggy address:
[29844.929223]  80002c0b8e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929225]  80002c0b8e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929227] >80002c0b8f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929228]   ^
[29844.929230]  80002c0b8f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929232]  80002c0b9000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[29844.929232] 
==
[29844.929234] Disabling lock debugging due to kernel taint
[29844.969534] scsi host2: iSCSI Initiator over TCP/IP

Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi 
function")
Signed-off-by: Wu Bo 
Signed-off-by: WenChao Hao 
---
 drivers/scsi/iscsi_tcp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index dd33ce0..98d782d 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -901,10 +901,12 @@ static void iscsi_sw_tcp_session_destroy(struct 
iscsi_cls_session *cls_session)
 {
struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
struct iscsi_session *session = cls_session->dd_data;
+   struct iscsi_sw_tcp_host *tcp_sw_host = iscsi_host_priv(shost);
 
if (WARN_ON_ONCE(session->leadconn))
return;
 
+   tcp_sw_host->session = NULL;
iscsi_tcp_r2tpo

[RFC PATCH] ata: add lun validity check on ata_sas_queuecmd

2021-03-18 Thread Wu Bo
Hi all,

After executing rescan-scsi-bus.sh -r -m, the system added 255 more disks.

The reason is as follows:
1. The rescan-scsi-bus.sh script scans all targets.
2. REPORT LUNS fails due to errors on the sdb device (LUN 0).
3. The scan falls back to a sequential LUN scan of the target that sdb
   belongs to.
4. LUN 0 has already been scanned, so the sequential scan starts at LUN 1
   and continues up to the maximum LUN.
5. For SATA devices, libata does not check the validity of the LUN ID
   before dispatching commands to the ATA device, so LUNs 1 through MAX
   are all successfully added to the system.

trace:
__scsi_scan_target()
   -> scsi_report_lun_scan()
   -> scsi_sequential_lun_scan()
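The dispatch decision the patch below changes reduces to a small predicate: ATA devices have no real LUNs, so only LUN 0 is valid. A sketch of that check (a plain flag stands in for ata_dev_enabled()):

```c
#include <assert.h>

/* Sketch of the gate in ata_sas_queuecmd() after the patch: dispatch
 * only when the device is enabled AND the LUN is 0; anything else
 * gets DID_BAD_TARGET. */
static int should_dispatch(int dev_enabled, unsigned long long lun)
{
    return dev_enabled && lun == 0;
}
```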

Steps to reproduce as follow:

step1: lsscsi
[1:0:0:0]    disk    ATA       /dev/sda
[1:0:1:0]    disk    ATA       /dev/sdb
[1:0:2:0]    disk    ATA       /dev/sdc
[1:0:3:0]    disk    ATA       /dev/sdd
[1:0:4:0]    disk    ATA       /dev/sde

step2: echo "offline" > /sys/block/sdb/device/state
step3: rescan-scsi-bus.sh -r -m
step4: lsscsi
[1:0:0:0]    disk    ATA       /dev/sda
[1:0:1:1]    disk    ATA       /dev/sdr
[1:0:1:2]    disk    ATA       /dev/sds
[1:0:1:3]    disk    ATA       /dev/sdt
[1:0:1:4]    disk    ATA       /dev/sdu
..
[1:0:1:251]  disk    ATA       /dev/sdjh
[1:0:1:252]  disk    ATA       /dev/sdji
[1:0:1:253]  disk    ATA       /dev/sdjj
[1:0:1:254]  disk    ATA       /dev/sdjk
[1:0:1:255]  disk    ATA       /dev/sdjl
[1:0:2:0]    disk    ATA       /dev/sdc
[1:0:3:0]    disk    ATA       /dev/sdd
[1:0:4:0]    disk    ATA       /dev/sde

Signed-off-by: Wu Bo 
---
 drivers/ata/libata-sata.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index c16423e..e30a412 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -1242,7 +1242,8 @@ int ata_sas_queuecmd(struct scsi_cmnd *cmd, struct 
ata_port *ap)
 
ata_scsi_dump_cdb(ap, cmd);
 
-   if (likely(ata_dev_enabled(ap->link.device)))
+   if (likely(ata_dev_enabled(ap->link.device)) &&
+   (cmd->device->lun == 0))
rc = __ata_scsi_queuecmd(cmd, ap->link.device);
else {
cmd->result = (DID_BAD_TARGET << 16);
-- 
1.8.3.1



Re: [Openipmi-developer] [PATCH] x86: Fix MCE error handing when kdump is enabled

2020-09-23 Thread Wu Bo

On 2020/9/23 2:43, Corey Minyard wrote:

On Tue, Sep 22, 2020 at 01:29:40PM -0500, miny...@acm.org wrote:

From: Corey Minyard 

If kdump is enabled, the handling of shooting down CPUs does not use the
RESET_VECTOR irq before trying to use NMIs to shoot down the CPUs.

For normal errors that is fine.  MCEs, however, are already running in
an NMI, so sending them an NMI won't do anything.  The MCE code is set
up to receive the RESET_VECTOR because it disables CPUs, but it won't

 ^ should be "enables irqs"

work on the NMI-only case.

There is already code in place to scan for the NMI callback being ready,
simply call that from the MCE's wait_for_panic() code so it will pick up
and handle it if an NMI shootdown is requested.  This required
propagating the registers down to wait_for_panic().
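The shape of the fix is a wait loop that polls for a pending NMI-shootdown callback on every iteration instead of spinning blindly. A userspace sketch of that loop (the flag and callback are stand-ins for crash_ipi_issued and run_crash_ipi_callback(); the real code also toggles local IRQs around the poll):

```c
#include <assert.h>

static int crash_ipi_issued;    /* stand-in for the kdump shootdown flag */
static int callback_ran;        /* records that the shootdown was handled */

static void run_crash_ipi_callback_sketch(void)
{
    if (crash_ipi_issued)
        callback_ran = 1;       /* real code saves regs and halts the CPU */
}

/* Sketch of the patched wait_for_panic() loop: each iteration checks
 * whether an NMI shootdown has been requested and services it. */
static int wait_for_panic_sketch(long timeout)
{
    while (timeout-- > 0) {
        run_crash_ipi_callback_sketch();
        if (callback_ran)
            return 1;           /* shootdown picked up */
    }
    return 0;                   /* timed out; the real code panics here */
}
```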

Signed-off-by: Corey Minyard 
---
After looking at it a bit, I think this is the proper way to fix the
issue, though I'm not an expert on this code so I'm not sure.

I have not even tested this patch, I have only compiled it.  But from
what I can tell, things waiting in NMIs for a shootdown should call
run_crash_ipi_callback() in their wait loop.


Hi,

In my VM (using qemu-kvm) with kdump enabled, I used mce-inject to
inject an uncorrectable error. I hit an issue where the IPMI driver's
panic handling runs while the other CPUs sit in wait_for_panic() with
interrupts on, and IPMI interrupts interfere with the panic handling;
as a result, the IPMI panic path hangs for more than 3000 seconds.

After applying and testing this patch, the IPMI hang no longer occurs,
so it looks like the right solution to the problem.



Thanks,

Wu Bo



  arch/x86/kernel/cpu/mce/core.c | 67 ++
  1 file changed, 44 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f43a78bde670..3a842b3773b3 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -282,20 +282,35 @@ static int fake_panic;
  static atomic_t mce_fake_panicked;
  
  /* Panic in progress. Enable interrupts and wait for final IPI */

-static void wait_for_panic(void)
+static void wait_for_panic(struct pt_regs *regs)
  {
long timeout = PANIC_TIMEOUT*USEC_PER_SEC;
  
  	preempt_disable();

local_irq_enable();
-   while (timeout-- > 0)
+   while (timeout-- > 0) {
+   /*
+* We are in an NMI waiting to be stopped by the
+* handling processor.  For kdump handling, we need to
+* be monitoring crash_ipi_issued since that is what
+* is used for an NMI stop used by kdump.  But we also
+* need to have interrupts enabled some so that
+* RESET_VECTOR will interrupt us on a normal
+* shutdown.
+*/
+   local_irq_disable();
+   run_crash_ipi_callback(regs);
+   local_irq_enable();
+
udelay(1);
+   }
if (panic_timeout == 0)
panic_timeout = mca_cfg.panic_timeout;
panic("Panicing machine check CPU died");
  }
  
-static void mce_panic(const char *msg, struct mce *final, char *exp)

+static void mce_panic(const char *msg, struct mce *final, char *exp,
+ struct pt_regs *regs)
  {
int apei_err = 0;
struct llist_node *pending;
@@ -306,7 +321,7 @@ static void mce_panic(const char *msg, struct mce *final, 
char *exp)
 * Make sure only one CPU runs in machine check panic
 */
if (atomic_inc_return(&mce_panicked) > 1)
-   wait_for_panic();
+   wait_for_panic(regs);
barrier();
  
  		bust_spinlocks(1);

@@ -817,7 +832,7 @@ static atomic_t mce_callin;
  /*
   * Check if a timeout waiting for other CPUs happened.
   */
-static int mce_timed_out(u64 *t, const char *msg)
+static int mce_timed_out(u64 *t, const char *msg, struct pt_regs *regs)
  {
/*
 * The others already did panic for some reason.
@@ -827,12 +842,12 @@ static int mce_timed_out(u64 *t, const char *msg)
 */
rmb();
if (atomic_read(&mce_panicked))
-   wait_for_panic();
+   wait_for_panic(regs);
if (!mca_cfg.monarch_timeout)
goto out;
if ((s64)*t < SPINUNIT) {
if (mca_cfg.tolerant <= 1)
-   mce_panic(msg, NULL, NULL);
+   mce_panic(msg, NULL, NULL, regs);
cpu_missing = 1;
return 1;
}
@@ -866,7 +881,7 @@ static int mce_timed_out(u64 *t, const char *msg)
   * All the spin loops have timeouts; when a timeout happens a CPU
   * typically elects itself to be Monarch.
   */
-static void mce_reign(void)
+static void mce_reign

Re: [RFC PATCH V2] ipmi: ssif: Fix out of bounds in write_next_byte()

2020-09-23 Thread Wu Bo

On 2020/9/23 0:13, Corey Minyard wrote:

On Tue, Sep 22, 2020 at 08:31:44AM -0500, Corey Minyard wrote:

On Mon, Sep 21, 2020 at 10:00:08PM +0800, Wu Bo wrote:

In my virtual machine (with 4 CPUs), I used mce-inject to inject errors
into the system. After mce-inject injects an uncorrectable error,
there is a chance that the virtual machine does not reset immediately
but hangs for more than 3000 seconds, followed by an "unable to
handle kernel paging request" oops.

The analysis is as follows:
1) The MCE appears on all CPUs, so all CPUs are in NMI interrupt
context. cpu0 is the first to seize the opportunity to run the panic
routines, and the panic event should stop the other processors before
IPMI's flush_messages() runs. But cpu1, cpu2 and cpu3 are already
in NMI context, so the second NMI (the stop IPI)
will not be processed by cpu1, cpu2 and cpu3.
At this point, cpu1, cpu2 and cpu3 have not stopped.

2) cpu1, cpu2 and cpu3 wait for cpu0 to finish the panic process.
If the wait for the other CPUs times out, they call wait_for_panic(),
which enables IRQs.

3) An IPMI IRQ arrives on cpu3 while cpu0 is handling the panic,
so both may call smi_event_handler() concurrently;
the IPMI IRQ interferes with cpu0's panic processing.

   CPU0                            CPU3

|-nmi_handle do mce_panic   |-nmi_handle do_machine_check
|   |
|-panic()   |-wait_for_panic()
|   |
|-stop other cpus  NMI --> (Ignore, already in nmi interrupt)


There is a step that happens before this.  In native_stop_other_cpus()
it uses the REBOOT_VECTOR irq to stop the other CPUs before it does the
NMI.

The question is: Why isn't that working?  That's why irqs are enabled in
wait_for_panic, so this REBOOT_VECTOR will work.

Again, changing the IPMI driver to account for this is ignoring the root
problem, and the root problem will cause other issues.

You mention you are running in a VM, but you don't mention which one.
Maybe the problem is in the VM?  I can't see how this is an issue unless
you are not using native_stop_other_cpus() (using Xen?) or you have
other kernel code changes.


I looked a bit more, and I'm guessing you are using kdump.
kdump_nmi_shootdown_cpus() does not have the same handling for sending
the REBOOT_VECTOR interrupts as native_stop_other_cpus().

The kdump version of the code is not going to work with an MCE as it
only uses the NMI, and the NMI is not going to work.

I'm still not sure of the right way to fix this.  I can see two options:

* Do the REBOOT_VECTOR handling in kdump_nmi_shootdown_cpus().

* In mce_panic, detect if you are going to do a kdump and handle it
   appropriately there.

We really need to get the x86 maintainers to see the issue and fix it.

-corey


Hi, corey

Your guess is correct. Kdump is enabled in my VM (using qemu-kvm).
The panic process uses kdump_nmi_shootdown_cpus() to stop the other cpus.


Thanks,

Wu Bo.


-corey


|   |
|-notifier call(ipmi panic_event)   |<-ipmi IRQ occurs
|   |
   \|/ \|/
do smi_event_handler() do smi_event_handler()


The scenario encountered here was triggered by MCE software error
injection; it is not confirmed whether other similar scenarios exist.

So take the spinlock with trylock in the IPMI panic handler, both to
resolve the concurrency between the panic path and the IRQ path and to
prevent the panic path from deadlocking.
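The point of the trylock is that a panicking CPU must never block on a lock that an interrupted IRQ handler may already hold. A minimal sketch of the pattern (toy integer lock standing in for spin_trylock_irqsave() on smi_info->si_lock; single attempt rather than the driver's retry loop):

```c
#include <assert.h>

static int si_lock;             /* 0 = free, 1 = held */
static int messages_flushed;    /* records that the flush actually ran */

static int trylock(int *l) { if (*l) return 0; *l = 1; return 1; }
static void unlock(int *l) { *l = 0; }

/* Sketch of the patched flush_messages(): acquire the lock with a
 * trylock so the panic path bails out instead of spinning forever
 * against a handler that already holds the lock. */
static int flush_messages_sketch(void)
{
    if (!trylock(&si_lock))
        return -1;              /* lock busy: give up, no deadlock */
    messages_flushed = 1;       /* stands in for smi_event_handler() */
    unlock(&si_lock);
    return 0;
}
```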

Steps to reproduce (Have a certain probability):
1. # vim /tmp/uncorrected
CPU 1 BANK 4
STATUS uncorrected 0xc0
MCGSTATUS  EIPV MCIP
ADDR 0x1234
RIP 0xdeadbabe
RAISINGCPU 0
MCGCAP SER CMCI TES 0x6

2. # modprobe mce_inject
3. # cd /tmp
4. # mce-inject uncorrected

The logs:
[   55.086670] core: [Hardware Error]: RIP 00:<deadbabe>
[   55.086671] core: [Hardware Error]: TSC 2e11aff65eea ADDR 1234
[   55.086673] core: [Hardware Error]: PROCESSOR 0:50654 TIME 1598967234 SOCKET 
0 APIC 1 microcode 1
[   55.086674] core: [Hardware Error]: Run the above through 'mcelog --ascii'
[   55.086675] core: [Hardware Error]: Machine check: In kernel and no restart 
IP
[   55.086676] Kernel panic - not syncing: Fatal machine check
[   55.086677] kernel fault(0x5) notification starting on CPU 0
[   55.086682] kernel fault(0x5) notification finished on CPU 0
[ 4767.947960] BUG: unable to handle kernel paging request at 893e4000
[ 4767.947962] PGD 13c001067 P4D 13c001067 PUD 0
[ 4767.947965] Oops:  [#1] SMP PTI
[ 4767.947967] CPU: 0 PID: 0 Comm: swapper/0
[ 4767.947968] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.10.2-0-g5f4c7b1-20181220_00-szxrtosci1 04/01

[RFC PATCH V2] ipmi: ssif: Fix out of bounds in write_next_byte()

2020-09-21 Thread Wu Bo
ut at 95637f5d
 #12 [fe088c70] do_machine_check at 95638db4
 #13 [fe088d80] raise_exception at c05b6117 [mce_inject]
 #14 [fe088e48] mce_raise_notify at c05b6a92 [mce_inject]
 #15 [fe088e58] nmi_handle at 95621c73
 #16 [fe088eb0] default_do_nmi at 9562213e
 #17 [fe088ed0] do_nmi at 9562231c
 #18 [fe088ef0] end_repeat_nmi at 960016b4
 [exception RIP: native_safe_halt+14]
 RIP: 95e7223e  RSP: a222c06a3eb0  RFLAGS: 0246
 RAX: 95e71f30  RBX: 0003  RCX: 0001
 RDX: 0001  RSI: 0083  RDI: 
 RBP: 0003   R8: 0018cf7cd9a0   R9: 0001
 R10: 0400  R11: 03fb  R12: 
 R13:   R14:   R15: 
 ORIG_RAX:   CS: 0010  SS: 0018
 ---  ---
 #19 [a222c06a3eb0] native_safe_halt at 95e7223e
 #20 [a222c06a3eb0] default_idle at 95e71f4a
 #21 [a222c06a3ed0] do_idle at 956e959a
 #22 [a222c06a3f10] cpu_startup_entry at 956e981f
 #23 [a222c06a3f30] start_secondary at 9564e697
 #24 [a222c06a3f50] secondary_startup_64 at 956000e7

Fixes: e45361d73 ("Factor out message flushing procedure")
Signed-off-by: Feilong Lin 
Signed-off-by: Wu Bo 
---
 drivers/char/ipmi/ipmi_si_intf.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 77b8d55..44ba9b6 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -882,16 +882,19 @@ static void flush_messages(void *send_info)
 {
struct smi_info *smi_info = send_info;
enum si_sm_result result;
+   unsigned long flags = 0;
+   int time = 0;
 
-   /*
-* Currently, this function is called only in run-to-completion
-* mode.  This means we are single-threaded, no need for locks.
-*/
-   result = smi_event_handler(smi_info, 0);
-   while (result != SI_SM_IDLE) {
-   udelay(SI_SHORT_TIMEOUT_USEC);
-   result = smi_event_handler(smi_info, SI_SHORT_TIMEOUT_USEC);
-   }
+restart:
+	if (!spin_trylock_irqsave(&smi_info->si_lock, flags))
+		return;
+	result = smi_event_handler(smi_info, time);
+	spin_unlock_irqrestore(&smi_info->si_lock, flags);
+	if (result != SI_SM_IDLE) {
+		udelay(SI_SHORT_TIMEOUT_USEC);
+		time = SI_SHORT_TIMEOUT_USEC;
+		goto restart;
+	}
 }
 
 static void sender(void *send_info,
-- 
1.8.3.1



Re: [RFC PATCH] mce: don't enable IRQ in wait_for_panic()

2020-09-21 Thread Wu Bo

On 2020/9/17 18:37, Wu Bo wrote:

In my virtual machine (which has 4 CPUs), I used mce-inject to inject errors
into the system. After mce-inject injects an uncorrectable error,
there is a chance that the virtual machine is not reset immediately,
but hangs for more than 3000 seconds and then reports "unable to
handle kernel paging request".

The analysis is as follows:
1) The MCE is raised on all CPUs, so all CPUs are in NMI interrupt
context. cpu0 is the first to seize the opportunity to run the panic
routines, and the panic path should stop the other processors before
it does the ipmi flush_messages(). But cpu1, cpu2 and cpu3 are already
in NMI interrupt context, so the second NMI (the stop IPI)
will not be processed by cpu1, cpu2 and cpu3.
At this point, cpu1, cpu2 and cpu3 have not stopped.

2) cpu1, cpu2 and cpu3 are waiting for cpu0 to finish the panic process.
If the wait for the other CPUs times out, wait_for_panic() is called,
and wait_for_panic() enables IRQs.

3) An ipmi IRQ then occurs on cpu3 while cpu0 is doing the panic, so
both CPUs can call smi_event_handler() concurrently;
the ipmi IRQ interferes with cpu0's panic processing.

   CPU0CPU3

|-nmi_handle do mce_panic   |-nmi_handle do_machine_check
|   |
|-panic()   |-wait_for_panic()
|   |
|-stop other cpus  NMI --> (Ignore, already in nmi interrupt)
|   |
|-notifier call(ipmi panic_event)   |<-ipmi IRQ occurs
|   |
   \|/ \|/
do smi_event_handler() do smi_event_handler()

My understanding is that panic() should run with only one operational CPU
in the system; the other CPUs should be stopped, and if they cannot be
stopped, at least their IRQs should be disabled. On x86,
disabling IRQs during an MCE panic does not affect the stop IPI,
because the IPI is delivered via NMI. If my analysis
is not right, please correct me, thanks.

Steps to reproduce (Have a certain probability):
1. # vim /tmp/uncorrected
CPU 1 BANK 4
STATUS uncorrected 0xc0
MCGSTATUS  EIPV MCIP
ADDR 0x1234
RIP 0xdeadbabe
RAISINGCPU 0
MCGCAP SER CMCI TES 0x6
  
2. # modprobe mce_inject

3. # cd /tmp
4. # mce-inject uncorrected


Hi,

I tested 5.9-rc5 and found that the problem still exists. Is
there a good solution?


Best regards,
Wu Bo



[RFC PATCH] mce: don't enable IRQ in wait_for_panic()

2020-09-17 Thread Wu Bo
 #10 [fe088c48] wait_for_panic at 95637c6c
 #11 [fe088c58] mce_timed_out at 95637f5d
 #12 [fe088c70] do_machine_check at 95638db4
 #13 [fe088d80] raise_exception at c05b6117 [mce_inject]
 #14 [fe088e48] mce_raise_notify at c05b6a92 [mce_inject]
 #15 [fe088e58] nmi_handle at 95621c73
 #16 [fe088eb0] default_do_nmi at 9562213e
 #17 [fe088ed0] do_nmi at 9562231c
 #18 [fe088ef0] end_repeat_nmi at 960016b4
 [exception RIP: native_safe_halt+14]
 RIP: 95e7223e  RSP: a222c06a3eb0  RFLAGS: 0246
 RAX: 95e71f30  RBX: 0003  RCX: 0001
 RDX: 0001  RSI: 0083  RDI: 
 RBP: 0003   R8: 0018cf7cd9a0   R9: 0001
 R10: 0400  R11: 03fb  R12: 
 R13:   R14:   R15: 
 ORIG_RAX:   CS: 0010  SS: 0018
 ---  ---
 #19 [a222c06a3eb0] native_safe_halt at 95e7223e
 #20 [a222c06a3eb0] default_idle at 95e71f4a
 #21 [a222c06a3ed0] do_idle at 956e959a
 #22 [a222c06a3f10] cpu_startup_entry at 956e981f
 #23 [a222c06a3f30] start_secondary at 9564e697
 #24 [a222c06a3f50] secondary_startup_64 at 956000e7


Signed-off-by: Feilong Lin 
Signed-off-by: Wu Bo 
---
 arch/x86/kernel/cpu/mce/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f43a78b..738f582 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -281,13 +281,12 @@ static void print_mce(struct mce *m)
 static int fake_panic;
 static atomic_t mce_fake_panicked;
 
-/* Panic in progress. Enable interrupts and wait for final IPI */
+/* Panic in progress. Wait for final IPI */
 static void wait_for_panic(void)
 {
long timeout = PANIC_TIMEOUT*USEC_PER_SEC;
 
preempt_disable();
-   local_irq_enable();
while (timeout-- > 0)
udelay(1);
if (panic_timeout == 0)
-- 
1.8.3.1



[PATCH] blkcg: Fix memory leak in blkg_conf_prep()

2020-05-18 Thread Wu Bo
From: Wu Bo 

If a call of the function blkg_lookup_check() fails,
we should release the previously allocated block group
before jumping to the label 'fail_unlock' in the implementation of
the function blkg_conf_prep().

Suggested-by: Markus Elfring 
Signed-off-by: Wu Bo 

---
V2: omit the source code quotation from 
the change description

---
 block/blk-cgroup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 930212c..afeb769 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -682,6 +682,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct 
blkcg_policy *pol,
blkg = blkg_lookup_check(pos, pol, q);
if (IS_ERR(blkg)) {
ret = PTR_ERR(blkg);
+   blkg_free(new_blkg);
goto fail_unlock;
}
 
-- 
1.8.3.1



Re: [PATCH] blkcg: Fix memory leak in blkg_conf_prep()

2020-05-18 Thread Wu Bo

On 2020/5/15 23:08, Markus Elfring wrote:

…

new_blkg = blkg_alloc(pos, q, GFP_KERNEL);

…

I suggest to omit the source code quotation from the change description.



if calling blkg_lookup_check() failed, at the IS_ERR block,
the new_blkg should be free before goto lable fail_unlock
in blkg_conf_prep() function.


How do you think about a wording variant like the following?

   If a call of the function “blkg_lookup_check” failed,
   release the previously allocated block group before jumping
   to the target “fail_unlock” in the implementation of
   the function “blkg_conf_prep”.



Thanks for your suggestion. Omitting the source code quotation from the
description is friendlier. I will modify the description in the V2 patch.


Thanks,
Wu Bo



Would you like to add the tag “Fixes” to the commit message?

Regards,
Markus






[PATCH] blkcg: fix memory leak in blkg_conf_prep()

2020-05-15 Thread Wu Bo
blkg_conf_prep():
 
new_blkg = blkg_alloc(pos, q, GFP_KERNEL);
...
blkg = blkg_lookup_check(pos, pol, q);
if (IS_ERR(blkg)) {
ret = PTR_ERR(blkg);
goto fail_unlock;
}
...


If the call to blkg_lookup_check() fails in the IS_ERR block,
new_blkg should be freed before the goto to the fail_unlock label
in the blkg_conf_prep() function.

Signed-off-by: Wu Bo 
---
 block/blk-cgroup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 930212c..afeb769 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -682,6 +682,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct 
blkcg_policy *pol,
blkg = blkg_lookup_check(pos, pol, q);
if (IS_ERR(blkg)) {
ret = PTR_ERR(blkg);
+   blkg_free(new_blkg);
goto fail_unlock;
}
 
-- 
1.8.3.1



[RESEND PATCH V2] nvme/core: disable streams when getting stream params fails

2020-05-13 Thread Wu Bo
After enabling NVMe streams, if getting the stream parameters fails,
we should disable streams before returning an error from
the nvme_configure_directives() function.

Signed-off-by: Wu Bo 
---
 drivers/nvme/host/core.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f3c037f..29bef53 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -553,19 +553,22 @@ static int nvme_configure_directives(struct nvme_ctrl 
*ctrl)
 
ret = nvme_get_stream_params(ctrl, &s, NVME_NSID_ALL);
if (ret)
-   return ret;
+   goto out_disable_stream;
 
ctrl->nssa = le16_to_cpu(s.nssa);
if (ctrl->nssa < BLK_MAX_WRITE_HINTS - 1) {
dev_info(ctrl->device, "too few streams (%u) available\n",
ctrl->nssa);
-   nvme_disable_streams(ctrl);
-   return 0;
+   goto out_disable_stream;
}
 
ctrl->nr_streams = min_t(unsigned, ctrl->nssa, BLK_MAX_WRITE_HINTS - 1);
dev_info(ctrl->device, "Using %u streams\n", ctrl->nr_streams);
return 0;
+
+out_disable_stream:
+   nvme_disable_streams(ctrl);
+   return ret;
 }
 
 /*
-- 
1.8.3.1



[PATCH V2] nvme/core: disable streams when getting stream params fails

2020-05-13 Thread Wu Bo
After enabling NVMe streams, if getting the stream parameters fails,
we should disable streams before returning an error from
the nvme_configure_directives() function.

Signed-off-by: Wu Bo 
---
 drivers/nvme/host/core.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f3c037f..298e60c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -552,20 +552,24 @@ static int nvme_configure_directives(struct nvme_ctrl 
*ctrl)
return ret;
 
ret = nvme_get_stream_params(ctrl, &s, NVME_NSID_ALL);
-   if (ret)
-   return ret;
+   if (ret) {
+   goto out_disable_stream;
+   }
 
ctrl->nssa = le16_to_cpu(s.nssa);
if (ctrl->nssa < BLK_MAX_WRITE_HINTS - 1) {
dev_info(ctrl->device, "too few streams (%u) available\n",
ctrl->nssa);
-   nvme_disable_streams(ctrl);
-   return 0;
+   goto out_disable_stream;
}
 
ctrl->nr_streams = min_t(unsigned, ctrl->nssa, BLK_MAX_WRITE_HINTS - 1);
dev_info(ctrl->device, "Using %u streams\n", ctrl->nr_streams);
return 0;
+
+out_disable_stream:
+   nvme_disable_streams(ctrl);
+   return ret;
 }
 
 /*
-- 
1.8.3.1



Re: [PATCH] nvme/core: disable streams when getting stream params fails

2020-05-12 Thread Wu Bo

On 2020/5/13 0:06, Christoph Hellwig wrote:

On Wed, May 06, 2020 at 04:37:01PM +0800, Wu Bo wrote:

After enabling NVMe streams, if getting the stream parameters fails,
we should disable streams before returning an error from
the nvme_configure_directives() function.

Signed-off-by: Wu Bo 
---
  drivers/nvme/host/core.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f2adea9..afe1f5a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -552,8 +552,10 @@ static int nvme_configure_directives(struct nvme_ctrl 
*ctrl)
return ret;
  
  	ret = nvme_get_stream_params(ctrl, &s, NVME_NSID_ALL);

-   if (ret)
+   if (ret) {
+   nvme_disable_streams(ctrl);
return ret;
+   }


Please use a out_disable goto label to not duplicate the error
handling with the other case that needs it.



OK, I will modify it in the V2 patch.

Thanks,
Wu Bo


.






[PATCH] nvme/core: disable streams when getting stream params fails

2020-05-06 Thread Wu Bo
After enabling NVMe streams, if getting the stream parameters fails,
we should disable streams before returning an error from
the nvme_configure_directives() function.

Signed-off-by: Wu Bo 
---
 drivers/nvme/host/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f2adea9..afe1f5a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -552,8 +552,10 @@ static int nvme_configure_directives(struct nvme_ctrl 
*ctrl)
return ret;
 
ret = nvme_get_stream_params(ctrl, &s, NVME_NSID_ALL);
-   if (ret)
+   if (ret) {
+   nvme_disable_streams(ctrl);
return ret;
+   }
 
ctrl->nssa = le16_to_cpu(s.nssa);
if (ctrl->nssa < BLK_MAX_WRITE_HINTS - 1) {
-- 
1.8.3.1



[PATCH V3] fs/ceph: fix double unlock in handle_cap_export()

2020-04-29 Thread Wu Bo
If ceph_mdsc_open_export_target_session() returns an error,
we should take mutex_lock(&session->s_mutex) in the IS_ERR(tsession) block
to avoid unlocking twice, because session->s_mutex will be unlocked
at the out_unlock label.

--
v2 -> v3:
  - Rewrite the solution, adding a mutex_lock(&session->s_mutex)
to the IS_ERR(tsession) block.
  - Make the comment clearer.
v1 -> v2:
  - add spin_lock(&ci->i_ceph_lock) before the goto out_unlock label
 

Signed-off-by: Wu Bo 
---
 fs/ceph/caps.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..d27d778 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3746,6 +3746,7 @@ static void handle_cap_export(struct inode *inode, struct 
ceph_mds_caps *ex,
WARN_ON(1);
tsession = NULL;
target = -1;
+   mutex_lock(&session->s_mutex);
}
goto retry;
 
-- 
1.8.3.1



Re: [PATCH] sound:hdmi:fix without unlocked before return

2020-04-29 Thread Wu Bo

On 2020/4/29 15:27, Takashi Iwai wrote:

On Sun, 26 Apr 2020 15:17:22 +0200,
Wu Bo wrote:


Fix the following coccicheck warning:
sound/pci/hda/patch_hdmi.c:1852:2-8: preceding lock on line 1846

After adding a sanity check to pass the Klocwork check,
the spdif_mutex should be unlocked before returning true
in check_non_pcm_per_cvt().

Signed-off-by: Wu Bo 


Applied now with the correction of subject and Fixes tag as well as
Cc-to-stable tag.


thanks,

Takashi

.



Thank you. I am sorry I forgot to update the patch subject to V2 in time.

thanks,
Wu Bo






Re: [PATCH V2] fs/ceph: fix double unlock in handle_cap_export()

2020-04-29 Thread Wu Bo

On 2020/4/30 10:50, Yan, Zheng wrote:

On Wed, Apr 29, 2020 at 8:49 AM Wu Bo  wrote:


On 2020/4/28 22:48, Jeff Layton wrote:

On Tue, 2020-04-28 at 21:13 +0800, Wu Bo wrote:

If ceph_mdsc_open_export_target_session() returns an error,
we should take a lock to avoid unlocking twice,
because the lock will be released at the retry or out_unlock label.



The problem looks real, but...


--
v1 -> v2:
add spin_lock(&ci->i_ceph_lock) before the goto out_unlock label.

Signed-off-by: Wu Bo 
---
   fs/ceph/caps.c | 27 +++
   1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..414c0e2 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3731,22 +3731,25 @@ static void handle_cap_export(struct inode *inode, 
struct ceph_mds_caps *ex,

  /* open target session */
  tsession = ceph_mdsc_open_export_target_session(mdsc, target);
-if (!IS_ERR(tsession)) {
-if (mds > target) {
-mutex_lock(&session->s_mutex);
-mutex_lock_nested(&tsession->s_mutex,
-  SINGLE_DEPTH_NESTING);
-} else {
-mutex_lock(&tsession->s_mutex);
-mutex_lock_nested(&session->s_mutex,
-  SINGLE_DEPTH_NESTING);
-}
-new_cap = ceph_get_cap(mdsc, NULL);
-} else {
+if (IS_ERR(tsession)) {
  WARN_ON(1);
  tsession = NULL;
  target = -1;
+mutex_lock(&session->s_mutex);
+spin_lock(&ci->i_ceph_lock);
+goto out_unlock;


Why did you make this case goto out_unlock instead of retrying as it did
before?



If the problem occurs, target is -1; going to the retry label means
calling __get_cap_for_mds() and possibly __ceph_remove_cap(), and then
jumping to the out_unlock label. I think all of that is unnecessary,
so go to out_unlock directly instead of retrying.



__ceph_remove_cap() must be called even if opening target session
failed. I think adding a mutex_lock(&session->s_mutex) to the
IS_ERR(tsession) block should be enough.



Yes,I will send the V3 patch later.




Thanks.
Wu Bo


+}
+
+if (mds > target) {
+mutex_lock(&session->s_mutex);
+mutex_lock_nested(&tsession->s_mutex,
+SINGLE_DEPTH_NESTING);
+} else {
+mutex_lock(&tsession->s_mutex);
+mutex_lock_nested(&session->s_mutex,
+SINGLE_DEPTH_NESTING);
  }
+new_cap = ceph_get_cap(mdsc, NULL);
  goto retry;

   out_unlock:







.






[PATCH] fs/ceph: fix special error code in ceph_try_get_caps()

2020-04-28 Thread Wu Bo
There are 3 special error codes: -EAGAIN/-EFBIG/-ESTALE.
After the call to the try_get_cap_refs() function, the same
error code -EAGAIN was tested twice. So correct the second
test from -EAGAIN to -ESTALE.

Signed-off-by: Wu Bo 
---
 fs/ceph/caps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..1a8e20e 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2749,7 +2749,7 @@ int ceph_try_get_caps(struct inode *inode, int need, int 
want,
 
ret = try_get_cap_refs(inode, need, want, 0, flags, got);
/* three special error codes */
-   if (ret == -EAGAIN || ret == -EFBIG || ret == -EAGAIN)
+   if (ret == -EAGAIN || ret == -EFBIG || ret == -ESTALE)
ret = 0;
return ret;
 }
-- 
1.8.3.1



Re: [PATCH V2] fs/ceph: fix double unlock in handle_cap_export()

2020-04-28 Thread Wu Bo

On 2020/4/28 22:48, Jeff Layton wrote:

On Tue, 2020-04-28 at 21:13 +0800, Wu Bo wrote:

If ceph_mdsc_open_export_target_session() returns an error,
we should take a lock to avoid unlocking twice,
because the lock will be released at the retry or out_unlock label.



The problem looks real, but...


--
v1 -> v2:
add spin_lock(&ci->i_ceph_lock) before the goto out_unlock label.

Signed-off-by: Wu Bo 
---
  fs/ceph/caps.c | 27 +++
  1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..414c0e2 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3731,22 +3731,25 @@ static void handle_cap_export(struct inode *inode, 
struct ceph_mds_caps *ex,
  
  	/* open target session */

tsession = ceph_mdsc_open_export_target_session(mdsc, target);
-   if (!IS_ERR(tsession)) {
-   if (mds > target) {
-   mutex_lock(&session->s_mutex);
-   mutex_lock_nested(&tsession->s_mutex,
- SINGLE_DEPTH_NESTING);
-   } else {
-   mutex_lock(&tsession->s_mutex);
-   mutex_lock_nested(&session->s_mutex,
- SINGLE_DEPTH_NESTING);
-   }
-   new_cap = ceph_get_cap(mdsc, NULL);
-   } else {
+   if (IS_ERR(tsession)) {
WARN_ON(1);
tsession = NULL;
target = -1;
+   mutex_lock(&session->s_mutex);
+   spin_lock(&ci->i_ceph_lock);
+   goto out_unlock;


Why did you make this case goto out_unlock instead of retrying as it did
before?



If the problem occurs, target is -1; going to the retry label means
calling __get_cap_for_mds() and possibly __ceph_remove_cap(), and then
jumping to the out_unlock label. I think all of that is unnecessary,
so go to out_unlock directly instead of retrying.


Thanks.
Wu Bo


+   }
+
+   if (mds > target) {
+   mutex_lock(&session->s_mutex);
+   mutex_lock_nested(&tsession->s_mutex,
+   SINGLE_DEPTH_NESTING);
+   } else {
+   mutex_lock(&tsession->s_mutex);
+   mutex_lock_nested(&session->s_mutex,
+   SINGLE_DEPTH_NESTING);
}
+   new_cap = ceph_get_cap(mdsc, NULL);
goto retry;
  
  out_unlock:







Re: [PATCH] fs/ceph: fix double unlock in handle_cap_export()

2020-04-28 Thread Wu Bo

On 2020/4/28 21:41, Yan, Zheng wrote:

On Tue, Apr 28, 2020 at 8:50 PM Wu Bo  wrote:


If ceph_mdsc_open_export_target_session() returns an error,
we should take a lock to avoid unlocking twice,
because the lock will be released at the retry or out_unlock label.



at retry label, i_ceph_lock get locked. I don't see how i_ceph_lock
can get double unlock
Sorry, maybe I didn't describe it clearly enough: it is not i_ceph_lock,
but session->s_mutex.

Thanks.

Wu Bo

Signed-off-by: Wu Bo 
---
  fs/ceph/caps.c | 26 ++
  1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..b5a62a8 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3731,22 +3731,24 @@ static void handle_cap_export(struct inode *inode, 
struct ceph_mds_caps *ex,

 /* open target session */
 tsession = ceph_mdsc_open_export_target_session(mdsc, target);
-   if (!IS_ERR(tsession)) {
-   if (mds > target) {
-   mutex_lock(&session->s_mutex);
-   mutex_lock_nested(&tsession->s_mutex,
- SINGLE_DEPTH_NESTING);
-   } else {
-   mutex_lock(&tsession->s_mutex);
-   mutex_lock_nested(&session->s_mutex,
- SINGLE_DEPTH_NESTING);
-   }
-   new_cap = ceph_get_cap(mdsc, NULL);
-   } else {
+   if (IS_ERR(tsession)) {
 WARN_ON(1);
 tsession = NULL;
 target = -1;
+   mutex_lock(&session->s_mutex);
+   goto out_unlock;
+   }
+
+   if (mds > target) {
+   mutex_lock(&session->s_mutex);
+   mutex_lock_nested(&tsession->s_mutex,
+   SINGLE_DEPTH_NESTING);
+   } else {
+   mutex_lock(&tsession->s_mutex);
+   mutex_lock_nested(&session->s_mutex,
+   SINGLE_DEPTH_NESTING);
 }
+   new_cap = ceph_get_cap(mdsc, NULL);
 goto retry;

  out_unlock:
--
1.8.3.1



.






[PATCH V2] fs/ceph: fix double unlock in handle_cap_export()

2020-04-28 Thread Wu Bo
If ceph_mdsc_open_export_target_session() returns an error,
we should take a lock to avoid unlocking twice,
because the lock will be released at the retry or out_unlock label.

--
v1 -> v2:
add spin_lock(&ci->i_ceph_lock) before the goto out_unlock label.

Signed-off-by: Wu Bo 
---
 fs/ceph/caps.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..414c0e2 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3731,22 +3731,25 @@ static void handle_cap_export(struct inode *inode, 
struct ceph_mds_caps *ex,
 
/* open target session */
tsession = ceph_mdsc_open_export_target_session(mdsc, target);
-   if (!IS_ERR(tsession)) {
-   if (mds > target) {
-   mutex_lock(&session->s_mutex);
-   mutex_lock_nested(&tsession->s_mutex,
- SINGLE_DEPTH_NESTING);
-   } else {
-   mutex_lock(&tsession->s_mutex);
-   mutex_lock_nested(&session->s_mutex,
- SINGLE_DEPTH_NESTING);
-   }
-   new_cap = ceph_get_cap(mdsc, NULL);
-   } else {
+   if (IS_ERR(tsession)) {
WARN_ON(1);
tsession = NULL;
target = -1;
+   mutex_lock(&session->s_mutex);
+   spin_lock(&ci->i_ceph_lock);
+   goto out_unlock;
+   }
+
+   if (mds > target) {
+   mutex_lock(&session->s_mutex);
+   mutex_lock_nested(&tsession->s_mutex,
+   SINGLE_DEPTH_NESTING);
+   } else {
+   mutex_lock(&tsession->s_mutex);
+   mutex_lock_nested(&session->s_mutex,
+   SINGLE_DEPTH_NESTING);
}
+   new_cap = ceph_get_cap(mdsc, NULL);
goto retry;
 
 out_unlock:
-- 
1.8.3.1



[PATCH] fs/ceph: fix double unlock in handle_cap_export()

2020-04-28 Thread Wu Bo
If ceph_mdsc_open_export_target_session() returns an error,
we should take a lock to avoid unlocking twice,
because the lock will be released at the retry or out_unlock label.

Signed-off-by: Wu Bo 
---
 fs/ceph/caps.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 185db76..b5a62a8 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3731,22 +3731,24 @@ static void handle_cap_export(struct inode *inode, 
struct ceph_mds_caps *ex,
 
/* open target session */
tsession = ceph_mdsc_open_export_target_session(mdsc, target);
-   if (!IS_ERR(tsession)) {
-   if (mds > target) {
-   mutex_lock(&session->s_mutex);
-   mutex_lock_nested(&tsession->s_mutex,
- SINGLE_DEPTH_NESTING);
-   } else {
-   mutex_lock(&tsession->s_mutex);
-   mutex_lock_nested(&session->s_mutex,
- SINGLE_DEPTH_NESTING);
-   }
-   new_cap = ceph_get_cap(mdsc, NULL);
-   } else {
+   if (IS_ERR(tsession)) {
WARN_ON(1);
tsession = NULL;
target = -1;
+   mutex_lock(&session->s_mutex);
+   goto out_unlock;
+   }
+
+   if (mds > target) {
+   mutex_lock(&session->s_mutex);
+   mutex_lock_nested(&tsession->s_mutex,
+   SINGLE_DEPTH_NESTING);
+   } else {
+   mutex_lock(&tsession->s_mutex);
+   mutex_lock_nested(&session->s_mutex,
+   SINGLE_DEPTH_NESTING);
}
+   new_cap = ceph_get_cap(mdsc, NULL);
goto retry;
 
 out_unlock:
-- 
1.8.3.1