[PATCH] igb: Fix a deadlock in igb_sriov_reinit
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock acquired by rtnl_lock() is not released, which causes a deadlock. This patch adds rtnl_unlock() in error handling to fix it. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/intel/igb/igb_main.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 2f70a9b..311d1ca 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -7539,6 +7539,7 @@ static int igb_sriov_reinit(struct pci_dev *dev) if (igb_init_interrupt_scheme(adapter, true)) { dev_err(pdev-dev, Unable to allocate memory for queues\n); + rtnl_unlock(); return -ENOMEM; } -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] 3c59x: Fix resource leaks in vortex_open
When vortex_up is failed, the skb buffers allocated by __netdev_alloc_skb in vortex_open are not released, which may cause resource leaks. This bug has been submitted before. This patch modifies the error handling code to fix it. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/3com/3c59x.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c index 2d1ce3c..753887d 100644 --- a/drivers/net/ethernet/3com/3c59x.c +++ b/drivers/net/ethernet/3com/3c59x.c @@ -1763,16 +1763,9 @@ vortex_open(struct net_device *dev) vp-rx_ring[i].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb-data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE)); } if (i != RX_RING_SIZE) { - int j; pr_emerg(%s: no memory for rx ring\n, dev-name); - for (j = 0; j i; j++) { - if (vp-rx_skbuff[j]) { - dev_kfree_skb(vp-rx_skbuff[j]); - vp-rx_skbuff[j] = NULL; - } - } retval = -ENOMEM; - goto err_free_irq; + goto err_free_skb; } /* Wrap the ring. */ vp-rx_ring[i-1].next = cpu_to_le32(vp-rx_ring_dma); @@ -1782,7 +1775,13 @@ vortex_open(struct net_device *dev) if (!retval) goto out; -err_free_irq: +err_free_skb: + for (i = 0; i RX_RING_SIZE; i++) { + if (vp-rx_skbuff[i]) { + dev_kfree_skb(vp-rx_skbuff[i]); + vp-rx_skbuff[i] = NULL; + } + } free_irq(dev-irq, dev); err: if (vortex_debug 1) -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] e1000e: Modify tx/rx configurations to avoid null pointer dereferences in e1000_open
When e1000e_setup_rx_resources is failed in e1000_open, e1000e_free_tx_resources in err_setup_rx segment is executed. writel(0, tx_ring-head) statement in e1000_clean_tx_ring in e1000e_free_tx_resources will cause a null poonter dereference(crash), because tx_ring-head is only assigned in e1000_configure_tx in e1000_configure, but it is after e1000e_setup_rx_resources. This patch moves head/tail register writing to e1000_configure_tx/rx, which can fix this problem. It is inspired by igb_configure_tx_ring in the igb driver. Specially, thank Alexander Duyck for his valuable suggestion. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/intel/e1000e/netdev.c | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index 89d788d..3aee51b 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -1737,12 +1737,6 @@ static void e1000_clean_rx_ring(struct e1000_ring *rx_ring) rx_ring-next_to_clean = 0; rx_ring-next_to_use = 0; adapter-flags2 = ~FLAG2_IS_DISCARDING; - - writel(0, rx_ring-head); - if (adapter-flags2 FLAG2_PCIM2PCI_ARBITER_WA) - e1000e_update_rdt_wa(rx_ring, 0); - else - writel(0, rx_ring-tail); } static void e1000e_downshift_workaround(struct work_struct *work) @@ -2447,12 +2441,6 @@ static void e1000_clean_tx_ring(struct e1000_ring *tx_ring) tx_ring-next_to_use = 0; tx_ring-next_to_clean = 0; - - writel(0, tx_ring-head); - if (adapter-flags2 FLAG2_PCIM2PCI_ARBITER_WA) - e1000e_update_tdt_wa(tx_ring, 0); - else - writel(0, tx_ring-tail); } /** @@ -2954,6 +2942,12 @@ static void e1000_configure_tx(struct e1000_adapter *adapter) tx_ring-head = adapter-hw.hw_addr + E1000_TDH(0); tx_ring-tail = adapter-hw.hw_addr + E1000_TDT(0); + writel(0, tx_ring-head); + if (adapter-flags2 FLAG2_PCIM2PCI_ARBITER_WA) + e1000e_update_tdt_wa(tx_ring, 0); + else + writel(0, tx_ring-tail); + /* Set the Tx Interrupt Delay register */ ew32(TIDV, adapter-tx_int_delay); /* Tx irq moderation */ @@ -3275,6 +3269,12 @@ static void e1000_configure_rx(struct e1000_adapter *adapter) rx_ring-head = adapter-hw.hw_addr + E1000_RDH(0); rx_ring-tail = adapter-hw.hw_addr + E1000_RDT(0); + writel(0, rx_ring-head); + if (adapter-flags2 FLAG2_PCIM2PCI_ARBITER_WA) + e1000e_update_rdt_wa(rx_ring, 0); + else + writel(0, rx_ring-tail); + /* Enable Receive Checksum Offload for TCP and UDP */ rxcsum = er32(RXCSUM); if (adapter-netdev-features NETIF_F_RXCSUM) -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] e1000e: Modify tx/rx configurations to avoid null pointer dereferences in e1000_open
On 08/05/2015 06:43 PM, Jeff Kirsher wrote: Is your intention that this patch replace the existing patch: http://patchwork.ozlabs.org/patch/502990/ ...which is currently in my queue? Okay, please replace the previous patch. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] igb: Fix a memory leak in igb_probe
In error handling code of igb_probe, the memory adapter-shadow_vfta allocated by kcalloc in igb_sw_init is not freed. So when register_netdev or igb_init_i2c is failed, a memory leak will occur. This patch adds kfree to fix it. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/intel/igb/igb_main.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 2f70a9b..3efb757 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -2645,6 +2645,7 @@ err_eeprom: if (hw-flash_address) iounmap(hw-flash_address); err_sw_init: + kfree(adapter-shadow_vfta); igb_clear_interrupt_scheme(adapter); pci_iounmap(pdev, hw-hw_addr); err_ioremap: -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH]Fix a null dereference in e1000_open
When e1000e_setup_rx_resources is failed in e1000_open, e1000e_free_tx_resources in err_setup_rx segment is executed. writel(0, tx_ring-head) statement in e1000_clean_tx_ring in e1000e_free_tx_resources will cause a null dereference(crash), because tx_ring-head is only assigned in e1000_configure_tx in e1000_configure, but it is after e1000e_setup_rx_resources. This patch adds a check to fix it. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/intel/e1000e/netdev.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index 89d788d..838ec8e 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -2448,7 +2448,8 @@ static void e1000_clean_tx_ring(struct e1000_ring *tx_ring) tx_ring-next_to_use = 0; tx_ring-next_to_clean = 0; - writel(0, tx_ring-head); + if (tx_ring-head) + writel(0, tx_ring-head); if (adapter-flags2 FLAG2_PCIM2PCI_ARBITER_WA) e1000e_update_tdt_wa(tx_ring, 0); else -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] e100: Release skb when DMA mapping is failed in e100_xmit_prepare
When pci_dma_mapping_error in e100_xmit_prepare is failed, the skb buffer allocated by netdev_alloc_skb_ip_align in e100_rx_alloc_skb is not released, which causes a possible resource leak. This patch adds error handling code to fix it. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/intel/e100.c |5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c index d2657a4..cc90616 100644 --- a/drivers/net/ethernet/intel/e100.c +++ b/drivers/net/ethernet/intel/e100.c @@ -1770,8 +1770,11 @@ static int e100_xmit_prepare(struct nic *nic, struct cb *cb, dma_addr = pci_map_single(nic-pdev, skb-data, skb-len, PCI_DMA_TODEVICE); /* If we can't map the skb, have the upper layer try later */ - if (pci_dma_mapping_error(nic-pdev, dma_addr)) + if (pci_dma_mapping_error(nic-pdev, dma_addr)) { + dev_kfree_skb_any(skb); + skb = NULL; return -ENOMEM; + } /* * Use the last 4 bytes of the SKB payload packet as the CRC, used for -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] e100: Add a check after pci_pool_create to avoid null pointer dereference
The driver lacks the check of nic-cbs_pool after pci_pool_create in e100_probe. When this function is failed, a null pointer dereference occurs when pci_pool_alloc uses nic-cbs_pool in e100_alloc_cbs. This patch adds a check and related error handling code to fix it. Signed-off-by: Jia-Ju Bai baijiaju1...@163.com --- drivers/net/ethernet/intel/e100.c |7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c index d2657a4..767c161 100644 --- a/drivers/net/ethernet/intel/e100.c +++ b/drivers/net/ethernet/intel/e100.c @@ -2967,6 +2967,11 @@ static int e100_probe(struct pci_dev *pdev, const struct pci_device_id *ent) nic-params.cbs.max * sizeof(struct cb), sizeof(u32), 0); + if (!nic-cbs_pool) { + netif_err(nic, probe, nic-netdev, Cannot create DMA pool, aborting\n); + err = -ENOMEM; + goto err_out_pool; + } netif_info(nic, probe, nic-netdev, addr 0x%llx, irq %d, MAC addr %pM\n, (unsigned long long)pci_resource_start(pdev, use_io ? 1 : 0), @@ -2974,6 +2979,8 @@ static int e100_probe(struct pci_dev *pdev, const struct pci_device_id *ent) return 0; +err_out_pool: + unregister_netdev(netdev); err_out_free: e100_free(nic); err_out_iounmap: -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] rt2x00pci: Disable memory-write-invalidate when the driver exits
The driver calls pci_set_mwi to enable memory-write-invalidate when it is initialized, but does not call pci_clear_mwi when it is removed. Many other drivers calls pci_clear_mwi when pci_set_mwi is called, such as r8169, 8139cp and e1000. This patch adds pci_clear_mwi in error handling and removal procedure, which can fix the problem. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/rt2x00/rt2x00pci.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/rt2x00/rt2x00pci.c b/drivers/net/wireless/rt2x00/rt2x00pci.c index d93db4b..eb6dbcd 100644 --- a/drivers/net/wireless/rt2x00/rt2x00pci.c +++ b/drivers/net/wireless/rt2x00/rt2x00pci.c @@ -149,6 +149,7 @@ exit_free_device: ieee80211_free_hw(hw); exit_release_regions: + pci_clear_mwi(pci_dev); pci_release_regions(pci_dev); exit_disable_device: @@ -173,6 +174,7 @@ void rt2x00pci_remove(struct pci_dev *pci_dev) /* * Free the PCI device data. */ + pci_clear_mwi(pci_dev); pci_disable_device(pci_dev); pci_release_regions(pci_dev); } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] ehci-hcd: Cleanup memory resources when ehci_halt fails
The driver calls ehci_mem_init to allocate memory resources. But these resources are not freed when ehci_halt fails. This patch adds "ehci_mem_cleanup" in error handling code to fix this problem. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/usb/host/ehci-hcd.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c index 48c92bf..015b411 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -675,8 +675,10 @@ int ehci_setup(struct usb_hcd *hcd) return retval; retval = ehci_halt(ehci); - if (retval) + if (retval) { + ehci_mem_cleanup(ehci); return retval; + } ehci_reset(ehci); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] ehci-hcd: Disable memory-write-invalidate when the driver is removed
The driver calls pci_set_mwi to enable memory-write-invalidate when it is initialized, but does not call pci_clear_mwi when it is removed. Many other drivers calls pci_clear_mwi when pci_set_mwi is called, such as r8169, 8139cp and e1000. This patch adds a function "ehci_pci_remove" to remove the pci driver. This function calls pci_clear_mwi and usb_hcd_pci_remove, which can fix the problem. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/usb/host/ehci-pci.c |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/ehci-pci.c b/drivers/usb/host/ehci-pci.c index 2a5d2fd..3b3649d 100644 --- a/drivers/usb/host/ehci-pci.c +++ b/drivers/usb/host/ehci-pci.c @@ -377,6 +377,12 @@ static int ehci_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) return usb_hcd_pci_probe(pdev, id); } +static void ehci_pci_remove(struct pci_dev *pdev) +{ + pci_clear_mwi(pdev); + usb_hcd_pci_remove(pdev); +} + /* PCI driver selection metadata; PCI hotplugging uses this */ static const struct pci_device_id pci_ids [] = { { /* handle any USB 2.0 EHCI controller */ @@ -396,7 +402,7 @@ static struct pci_driver ehci_pci_driver = { .id_table = pci_ids, .probe =ehci_pci_probe, - .remove = usb_hcd_pci_remove, + .remove = ehci_pci_remove, .shutdown = usb_hcd_pci_shutdown, #ifdef CONFIG_PM -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] rt2x00pci: Disable memory-write-invalidate when the driver exits
On 01/05/2016 12:50 AM, Helmut Schaa wrote: On Mon, Jan 4, 2016 at 8:55 AM, Jia-Ju Bai<baijiaju1...@163.com> wrote: The driver calls pci_set_mwi to enable memory-write-invalidate when it is initialized, but does not call pci_clear_mwi when it is removed. Many other drivers calls pci_clear_mwi when pci_set_mwi is called, such as r8169, 8139cp and e1000. This patch adds pci_clear_mwi in error handling and removal procedure, which can fix the problem. Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> Looks good to me. Does this fix any actual issue? If yes it might we worth to mention it in the commit message. Helmut Lacking pci_clear_mwi may cause a resource-release omission, but this omission may not cause obvious issues. For reliability, it is better to add pci_clear_mwi in the driver. Many other drivers do so, such as r8169, 8139cp and e1000. Jia-Ju Bai -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct
On 06/21/2017 02:11 PM, Kalle Valo wrote: David Miller<da...@davemloft.net> writes: From: Jia-Ju Bai<baijiaju1...@163.com> Date: Mon, 19 Jun 2017 10:48:53 +0800 The driver may sleep under a spin lock, and the function call path is: netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock) ioremap --> may sleep To fix it, the lock is released before "ioremap", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> This style of change you are making is really starting to be a problem. You can't just drop locks like this, especially without explaining why it's ok, and why the mutual exclusion this code was trying to achieve is still going to be OK afterwards. In fact, I see zero analysis of the locking situation here, why it was needed in the first place, and why your change is OK in that context. Any locking change is delicate, and you must put the greatest of care and consideration into it. Just putting "unlock/lock" around the sleeping operation shows a very low level of consideration for the implications of the change you are making. This isn't like making whitespace fixes, sorry... We already tried to explain this to Jia-Ju during review of a wireless patch: https://patchwork.kernel.org/patch/9756585/ Jia-Ju, you should listen to feedback. If you continue submitting random patches like this makes it hard for maintainers to trust your patches anymore. Hi, I am quite sorry for my incorrect patches, and I will listen carefully to your advice. In fact, for some bugs and patches which I have reported before, I have not received the feedback of them, so I resent them a few days ago, including this patch. Sorry for my mistake again. Thanks, Jia-Ju Bai
Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct
On 2017/6/22 14:08, Dan Carpenter wrote: We should probably add a might_sleep() to ioremap() to prevent these bugs in the future. I think it is right to do this. And it will be very useful to summarize common kernel interface functions which may sleep into a list. When writing a new driver, the developer can refer to this list to reduce or avoid sleep-in-atomic bugs. This bug is eight years old. You can report it, but it's going to hard to get anyone to fix it. I sometimes ignore ancient bugs. On the other hand, netxen is fairly well supported so it doesn't hurt to try. I try to report bugs as soon as they are introduced. I report it to the author and CC the relevant list. If people don't respond to my email after a month then I complain again. regards, dan carpenter Thanks for your helpful advice. Thanks, Jia-Ju Bai
Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct
On 2017/6/21 21:40, Kalle Valo wrote: Jia-Ju Bai <baijiaju1...@163.com> writes: On 06/21/2017 02:11 PM, Kalle Valo wrote: David Miller<da...@davemloft.net> writes: From: Jia-Ju Bai<baijiaju1...@163.com> Date: Mon, 19 Jun 2017 10:48:53 +0800 The driver may sleep under a spin lock, and the function call path is: netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock) ioremap --> may sleep To fix it, the lock is released before "ioremap", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> This style of change you are making is really starting to be a problem. You can't just drop locks like this, especially without explaining why it's ok, and why the mutual exclusion this code was trying to achieve is still going to be OK afterwards. In fact, I see zero analysis of the locking situation here, why it was needed in the first place, and why your change is OK in that context. Any locking change is delicate, and you must put the greatest of care and consideration into it. Just putting "unlock/lock" around the sleeping operation shows a very low level of consideration for the implications of the change you are making. This isn't like making whitespace fixes, sorry... We already tried to explain this to Jia-Ju during review of a wireless patch: https://patchwork.kernel.org/patch/9756585/ Jia-Ju, you should listen to feedback. If you continue submitting random patches like this makes it hard for maintainers to trust your patches anymore. Hi, I am quite sorry for my incorrect patches, and I will listen carefully to your advice. In fact, for some bugs and patches which I have reported before, I have not received the feedback of them, so I resent them a few days ago, including this patch. Yeah, it is likely that some of your reports will not get any response. For that I only suggest being persistent and providing more information about the issue and suggestions how it might be possible to fix it. Also Dan Carpenter (Cced) might have some suggestions. But trying to "fix" it by just silencing the warning without proper analysis is totally the wrong approach, you do more harm than good. What tool do you use to find these issues? Is it publically available? Hi, Thanks a lot for your advice. And I am very glad to see that you may be interested in my work :) This static tool is written by myself, instead of using or improving existing tools. A reason why I write it is that I have encountered some sleep-in-atomic bugs in my driver development :( . However, due to preliminary implementation, this tool still has some limitations which can produce some false positives or negatives, and it may be not very easy to use. Thus, I am still improving this tool, checking more code and collecting results now. By the way, I apologize again for my incorrect patches of trying to "fix" the detected bugs. In fact, I am very glad to make this tool available to effectively and conveniently check more system code. After I finish the improvements and perform more evaluation, I will make it publicly available. If you have any suggestion or comment on my work, please feel free to contact me :) Thanks, Jia-Ju Bai
[PATCH] isdn: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, the function call path is: isdn_ppp_mp_receive (acquire the lock) isdn_ppp_mp_reassembly isdn_ppp_push_higher isdn_ppp_decompress isdn_ppp_ccp_reset_trans isdn_ppp_ccp_reset_alloc_state kzalloc(GFP_KERNEL) --> may sleep To fixed it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/isdn/i4l/isdn_ppp.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/i4l/isdn_ppp.c b/drivers/isdn/i4l/isdn_ppp.c index d07dd519..8aa158a 100644 --- a/drivers/isdn/i4l/isdn_ppp.c +++ b/drivers/isdn/i4l/isdn_ppp.c @@ -2364,7 +2364,7 @@ static struct ippp_ccp_reset_state *isdn_ppp_ccp_reset_alloc_state(struct ippp_s id); return NULL; } else { - rs = kzalloc(sizeof(struct ippp_ccp_reset_state), GFP_KERNEL); + rs = kzalloc(sizeof(struct ippp_ccp_reset_state), GFP_ATOMIC); if (!rs) return NULL; rs->state = CCPResetIdle; -- 1.7.9.5
[PATCH] [PATCH] qla4xxx: Fix a sleep-in-atomic bug
The driver may sleep under a write spin lock, the function call path is: qla4_82xx_wr_32 (acquire the lock) qla4_82xx_crb_win_lock schedule or cpu_relax To fixed it, the lock is released before "schedule" and "cpu_relax", and the lock is acquired again after "schedule" and "cpu_relax". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/scsi/qla4xxx/ql4_nx.c |8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c index e91abb3..1cf5f4a 100644 --- a/drivers/scsi/qla4xxx/ql4_nx.c +++ b/drivers/scsi/qla4xxx/ql4_nx.c @@ -386,7 +386,7 @@ if (rv == 1) { write_lock_irqsave(>hw_lock, flags); - qla4_82xx_crb_win_lock(ha); + qla4_82xx_crb_win_lock(ha, flags); qla4_82xx_pci_set_crbwindow_2M(ha, ); } @@ -410,7 +410,7 @@ uint32_t qla4_82xx_rd_32(struct scsi_qla_host *ha, ulong off) if (rv == 1) { write_lock_irqsave(>hw_lock, flags); - qla4_82xx_crb_win_lock(ha); + qla4_82xx_crb_win_lock(ha, flags); qla4_82xx_pci_set_crbwindow_2M(ha, ); } data = readl((void __iomem *)off); @@ -476,7 +476,7 @@ int qla4_82xx_md_wr_32(struct scsi_qla_host *ha, uint32_t off, uint32_t data) #define CRB_WIN_LOCK_TIMEOUT 1 -int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) +int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha, unsigned long flags) { int i; int done = 0, timeout = 0; @@ -491,6 +491,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) timeout++; + write_unlock_irqrestore(>hw_lock, flags); /* Yield CPU */ if (!in_interrupt()) schedule(); @@ -498,6 +499,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) for (i = 0; i < 20; i++) cpu_relax();/*This a nop instr on i386*/ } + write_lock_irqsave(>hw_lock, flags); } qla4_82xx_wr_32(ha, QLA82XX_CRB_WIN_LOCK_ID, ha->func_num); return 0; -- 1.7.9.5
[PATCH] iscsi: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, and the function call path is: iscsit_tpg_enable_portal_group (acquire the lock by spin_lock) iscsi_update_param_value kstrdup(GFP_KERNEL) --> may sleep To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/target/iscsi/iscsi_target_parameters.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c index fce6276..8768916 100644 --- a/drivers/target/iscsi/iscsi_target_parameters.c +++ b/drivers/target/iscsi/iscsi_target_parameters.c @@ -702,7 +702,7 @@ int iscsi_update_param_value(struct iscsi_param *param, char *value) { kfree(param->value); - param->value = kstrdup(value, GFP_KERNEL); + param->value = kstrdup(value, GFP_ATOMIC); if (!param->value) { pr_err("Unable to allocate memory for value.\n"); return -ENOMEM; -- 1.7.9.5
[PATCH] enic: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock in some function call paths. The 1st function call path is: enic_reset (acquire the lock by spin_lock) enic_stop enic_synchronize_irqs synchronize_irq --> may sleep The 2nd function call path is: enic_reset (acquire the lock by spin_lock) enic_dev_soft_reset enic_dev_wait schedule_timeout_uninterruptible --> may sleep The 3rd function call path is: enic_reset (acquire the lock by spin_lock) enic_open enic_request_intr enic_set_rx_cpu_rmap enic_free_rx_cpu_rmap free_irq_cpu_rmap --> may sleep To fix it, the "spin_lock" and "spin_unlock" are removed in enic_reset. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/cisco/enic/enic_main.c |2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c index 4b87bee..2a9bef8 100644 --- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -2310,7 +2310,6 @@ static void enic_reset(struct work_struct *work) rtnl_lock(); - spin_lock(>enic_api_lock); enic_stop(enic->netdev); enic_dev_soft_reset(enic); enic_reset_addr_lists(enic); @@ -2318,7 +2317,6 @@ static void enic_reset(struct work_struct *work) enic_set_rss_nic_cfg(enic); enic_dev_set_ig_vlan_rewrite_mode(enic); enic_open(enic->netdev); - spin_unlock(>enic_api_lock); call_netdevice_notifiers(NETDEV_REBOOT, enic->netdev); rtnl_unlock(); -- 1.7.9.5
[PATCH] i40e: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, and the function call path is: i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh) i40e_vsi_remove_pvid i40e_vlan_stripping_disable i40e_aq_update_vsi_params i40e_asq_send_command mutex_lock --> may sleep To fixed it, the spin lock is released before "i40e_vsi_remove_pvid", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 95c23fb..0fb38ca 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -3017,10 +3017,12 @@ int i40e_ndo_set_vf_port_vlan(struct net_device *netdev, int vf_id, VLAN_VID_MASK)); } + spin_unlock_bh(>mac_filter_hash_lock); if (vlan_id || qos) ret = i40e_vsi_add_pvid(vsi, vlanprio); else i40e_vsi_remove_pvid(vsi); + spin_lock_bh(>mac_filter_hash_lock); if (vlan_id) { dev_info(>pdev->dev, "Setting VLAN %d, QOS 0x%x on VF %d\n", -- 1.7.9.5
[PATCH] enic: Fix another sleep-in-atomic bug
The driver may sleep under a spin lock in some function call paths. The 1st function call path is: enic_tx_hang_reset (acquire the lock by spin_lock) enic_stop enic_synchronize_irqs synchronize_irq --> may sleep The 2nd function call path is: enic_tx_hang_reset (acquire the lock by spin_lock) enic_dev_wait schedule_timeout_uninterruptible --> may sleep The 3rd function call path is: enic_tx_hang_reset (acquire the lock by spin_lock) enic_open enic_request_intr enic_set_rx_cpu_rmap enic_free_rx_cpu_rmap free_irq_cpu_rmap --> may sleep To fix it, the "spin_lock" and "spin_unlock" are removed in enic_tx_hang_reset. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/cisco/enic/enic_main.c |2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c index 4b87bee..d6523e2 100644 --- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -2330,7 +2330,6 @@ static void enic_tx_hang_reset(struct work_struct *work) rtnl_lock(); - spin_lock(>enic_api_lock); enic_dev_hang_notify(enic); enic_stop(enic->netdev); enic_dev_hang_reset(enic); @@ -2339,7 +2338,6 @@ static void enic_tx_hang_reset(struct work_struct *work) enic_set_rss_nic_cfg(enic); enic_dev_set_ig_vlan_rewrite_mode(enic); enic_open(enic->netdev); - spin_unlock(>enic_api_lock); call_netdevice_notifiers(NETDEV_REBOOT, enic->netdev); rtnl_unlock(); -- 1.7.9.5
[PATCH] megaraid: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, and the function call path is: mraid_mm_attach_buf (acquire the lock by spin_lock_irqsave) pci_pool_alloc(GFP_KERNEL) --> may sleep To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/scsi/megaraid/megaraid_mm.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/megaraid/megaraid_mm.c b/drivers/scsi/megaraid/megaraid_mm.c index 4cf9ed9..c43afb8 100644 --- a/drivers/scsi/megaraid/megaraid_mm.c +++ b/drivers/scsi/megaraid/megaraid_mm.c @@ -574,7 +574,7 @@ kioc->pool_index= right_pool; kioc->free_buf = 1; - kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_KERNEL, + kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_ATOMIC, >buf_paddr); spin_unlock_irqrestore(>lock, flags); -- 1.7.9.5
[PATCH V2] qla4xxx: Fix a sleep-in-atomic bug
The driver may sleep under a write spin lock, the function call path is: qla4_82xx_wr_32 (acquire the lock) qla4_82xx_crb_win_lock schedule or cpu_relax To fix it, the lock is released before "schedule" and "cpu_relax", and the lock is acquired again after "schedule" and "cpu_relax". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/scsi/qla4xxx/ql4_glbl.h |2 +- drivers/scsi/qla4xxx/ql4_nx.c |8 +--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/qla4xxx/ql4_glbl.h b/drivers/scsi/qla4xxx/ql4_glbl.h index bce96a5..b723bef 100644 --- a/drivers/scsi/qla4xxx/ql4_glbl.h +++ b/drivers/scsi/qla4xxx/ql4_glbl.h @@ -115,7 +115,7 @@ uint8_t qla4xxx_update_local_ifcb(struct scsi_qla_host *ha, void qla4_82xx_queue_iocb(struct scsi_qla_host *ha); void qla4_82xx_complete_iocb(struct scsi_qla_host *ha); -int qla4_82xx_crb_win_lock(struct scsi_qla_host *); +int qla4_82xx_crb_win_lock(struct scsi_qla_host *, unsigned long); void qla4_82xx_crb_win_unlock(struct scsi_qla_host *); int qla4_82xx_pci_get_crb_addr_2M(struct scsi_qla_host *, ulong *); void qla4_82xx_wr_32(struct scsi_qla_host *, ulong, u32); diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c index e91abb3..1cf5f4a 100644 --- a/drivers/scsi/qla4xxx/ql4_nx.c +++ b/drivers/scsi/qla4xxx/ql4_nx.c @@ -386,7 +386,7 @@ if (rv == 1) { write_lock_irqsave(>hw_lock, flags); - qla4_82xx_crb_win_lock(ha); + qla4_82xx_crb_win_lock(ha, flags); qla4_82xx_pci_set_crbwindow_2M(ha, ); } @@ -410,7 +410,7 @@ uint32_t qla4_82xx_rd_32(struct scsi_qla_host *ha, ulong off) if (rv == 1) { write_lock_irqsave(>hw_lock, flags); - qla4_82xx_crb_win_lock(ha); + qla4_82xx_crb_win_lock(ha, flags); qla4_82xx_pci_set_crbwindow_2M(ha, ); } data = readl((void __iomem *)off); @@ -476,7 +476,7 @@ int qla4_82xx_md_wr_32(struct scsi_qla_host *ha, uint32_t off, uint32_t data) #define CRB_WIN_LOCK_TIMEOUT 1 -int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) +int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha, unsigned long flags) { int i; int done = 0, timeout = 0; @@ -491,6 +491,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) timeout++; + write_unlock_irqrestore(>hw_lock, flags); /* Yield CPU */ if (!in_interrupt()) schedule(); @@ -498,6 +499,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) for (i = 0; i < 20; i++) cpu_relax();/*This a nop instr on i386*/ } + write_lock_irqsave(>hw_lock, flags); } qla4_82xx_wr_32(ha, QLA82XX_CRB_WIN_LOCK_ID, ha->func_num); return 0; -- 1.7.9.5
[PATCH] gadget: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, and the function call path is: ffs_epfile_io (acquire the lock by spin_lock_irq) usb_ep_alloc_request(GFP_KERNEL) --> may sleep To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/usb/gadget/function/f_fs.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 47dda34..be90e25 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -1015,7 +1015,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) else ret = ep->status; goto error_mutex; - } else if (!(req = usb_ep_alloc_request(ep->ep, GFP_KERNEL))) { + } else if (!(req = usb_ep_alloc_request(ep->ep, GFP_ATOMIC))) { ret = -ENOMEM; } else { req->buf = data; -- 1.7.9.5
[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule
The driver may sleep under a spin lock, and the function call path is: cfs_wi_deschedule (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..7e25eb9 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -140,6 +140,11 @@ struct cfs_wi_sched { LASSERT(!in_interrupt()); /* because we use plain spinlock */ LASSERT(!sched->ws_stopping); + if (wi->wi_scheduled) { + LASSERT(!list_empty(>wi_list)); + LASSERT(sched->ws_nscheduled > 0); + } + LASSERT(list_empty(>wi_list)); /* * return 0 if it's running already, otherwise return 1, which @@ -151,17 +156,11 @@ struct cfs_wi_sched { rc = !(wi->wi_running); if (wi->wi_scheduled) { /* cancel pending schedules */ - LASSERT(!list_empty(>wi_list)); list_del_init(>wi_list); - - LASSERT(sched->ws_nscheduled > 0); sched->ws_nscheduled--; - wi->wi_scheduled = 0; } - LASSERT(list_empty(>wi_list)); - spin_unlock(>ws_lock); return rc; } -- 1.7.9.5
[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_schedule
The driver may sleep under a spin lock, and the function call path is: cfs_wi_schedule (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..30d28cd 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -179,12 +179,12 @@ struct cfs_wi_sched { { LASSERT(!in_interrupt()); /* because we use plain spinlock */ LASSERT(!sched->ws_stopping); + if (!wi->wi_scheduled) + LASSERT(list_empty(>wi_list)); spin_lock(>ws_lock); if (!wi->wi_scheduled) { - LASSERT(list_empty(>wi_list)); - wi->wi_scheduled = 1; sched->ws_nscheduled++; if (!wi->wi_running) { @@ -195,8 +195,8 @@ struct cfs_wi_sched { } } - LASSERT(!list_empty(>wi_list)); spin_unlock(>ws_lock); + LASSERT(!list_empty(>wi_list)); } EXPORT_SYMBOL(cfs_wi_schedule); -- 1.7.9.5
[PATCH] bcache: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, and the function call path is: journal_wait_for_write (acquire the lock by spin_lock) closure_sync schedule --> may sleep To fix it, the lock is released before "closure_sync", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/md/bcache/journal.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 1198e53..ad47c36 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -724,6 +724,7 @@ static struct journal_write *journal_wait_for_write(struct cache_set *c, btree_flush_write(c); } + spin_unlock(>journal.lock); closure_sync(); spin_lock(>journal.lock); wait = true; -- 1.7.9.5
[PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule
The driver may sleep under a spin lock, and the function call path is: cfs_wi_deschedule (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..9c530cf 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -140,6 +140,10 @@ struct cfs_wi_sched { LASSERT(!in_interrupt()); /* because we use plain spinlock */ LASSERT(!sched->ws_stopping); + if (wi->wi_scheduled) { + LASSERT(!list_empty(>wi_list)); + LASSERT(sched->ws_nscheduled > 0); + } /* * return 0 if it's running already, otherwise return 1, which @@ -151,18 +155,14 @@ struct cfs_wi_sched { rc = !(wi->wi_running); if (wi->wi_scheduled) { /* cancel pending schedules */ - LASSERT(!list_empty(>wi_list)); list_del_init(>wi_list); - - LASSERT(sched->ws_nscheduled > 0); sched->ws_nscheduled--; - wi->wi_scheduled = 0; } - LASSERT(list_empty(>wi_list)); - spin_unlock(>ws_lock); + + LASSERT(list_empty(>wi_list)); return rc; } EXPORT_SYMBOL(cfs_wi_deschedule); -- 1.7.9.5
[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_sched_destroy
The driver may sleep under a spin lock, and the function call path is: cfs_wi_sched_destroy (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..e0424f6 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -302,11 +302,12 @@ static int cfs_wi_scheduler(void *arg) return; } - LASSERT(!list_empty(>ws_list)); sched->ws_stopping = 1; spin_unlock(_wi_data.wi_glock); + LASSERT(!list_empty(>ws_list)); + i = 2; wake_up_all(>ws_waitq); -- 1.7.9.5
[PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit
The driver may sleep under a spin lock, and the function call path is: cfs_wi_exit (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..928d06d 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -111,22 +111,23 @@ struct cfs_wi_sched { { LASSERT(!in_interrupt()); /* because we use plain spinlock */ LASSERT(!sched->ws_stopping); + LASSERT(wi->wi_running); + if (wi->wi_scheduled) { + LASSERT(!list_empty(>wi_list)); + LASSERT(sched->ws_nscheduled > 0); + } spin_lock(>ws_lock); - LASSERT(wi->wi_running); if (wi->wi_scheduled) { /* cancel pending schedules */ - LASSERT(!list_empty(>wi_list)); list_del_init(>wi_list); - - LASSERT(sched->ws_nscheduled > 0); sched->ws_nscheduled--; } - LASSERT(list_empty(>wi_list)); - wi->wi_scheduled = 1; /* LBUG future schedule attempts */ spin_unlock(>ws_lock); + + LASSERT(list_empty(>wi_list)); } EXPORT_SYMBOL(cfs_wi_exit); -- 1.7.9.5
[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit
The driver may sleep under a spin lock, and the function call path is: cfs_wi_exit (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..cef25c8 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -111,20 +111,20 @@ struct cfs_wi_sched { { LASSERT(!in_interrupt()); /* because we use plain spinlock */ LASSERT(!sched->ws_stopping); + LASSERT(wi->wi_running); + if (wi->wi_scheduled) { + LASSERT(!list_empty(>wi_list)); + LASSERT(sched->ws_nscheduled > 0); + } + LASSERT(list_empty(>wi_list)); spin_lock(>ws_lock); - LASSERT(wi->wi_running); if (wi->wi_scheduled) { /* cancel pending schedules */ - LASSERT(!list_empty(>wi_list)); list_del_init(>wi_list); - - LASSERT(sched->ws_nscheduled > 0); sched->ws_nscheduled--; } - LASSERT(list_empty(>wi_list)); - wi->wi_scheduled = 1; /* LBUG future schedule attempts */ spin_unlock(>ws_lock); } -- 1.7.9.5
[PATCH] mISDN: Fix a sleep-in-atomic bug
The driver may sleep under a read spin lock, and the function call path is: send_socklist (acquire the lock by read_lock) skb_copy(GFP_KERNEL) --> may sleep To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/isdn/mISDN/stack.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/mISDN/stack.c b/drivers/isdn/mISDN/stack.c index 8b7faea..422dced 100644 --- a/drivers/isdn/mISDN/stack.c +++ b/drivers/isdn/mISDN/stack.c @@ -75,7 +75,7 @@ if (sk->sk_state != MISDN_BOUND) continue; if (!cskb) - cskb = skb_copy(skb, GFP_KERNEL); + cskb = skb_copy(skb, GFP_ATOMIC); if (!cskb) { printk(KERN_WARNING "%s no skb\n", __func__); break; -- 1.7.9.5
[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_scheduler
The driver may sleep under a spin lock, and the function call path is: cfs_wi_scheduler (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/workitem.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c index dbc2a9b..9f7832e 100644 --- a/drivers/staging/lustre/lnet/libcfs/workitem.c +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c @@ -212,9 +212,9 @@ static int cfs_wi_scheduler(void *arg) CWARN("Unable to bind %s on CPU partition %d\n", sched->ws_name, sched->ws_cpt); + LASSERT(sched->ws_starting == 1); spin_lock(_wi_data.wi_glock); - LASSERT(sched->ws_starting == 1); sched->ws_starting--; sched->ws_nthreads++; @@ -231,11 +231,14 @@ static int cfs_wi_scheduler(void *arg) nloops < CFS_WI_RESCHED) { wi = list_entry(sched->ws_runq.next, struct cfs_workitem, wi_list); + + spin_unlock(>ws_lock); LASSERT(wi->wi_scheduled && !wi->wi_running); + LASSERT(sched->ws_nscheduled > 0); + spin_lock(>ws_lock); list_del_init(>wi_list); - LASSERT(sched->ws_nscheduled > 0); sched->ws_nscheduled--; wi->wi_running = 1; @@ -254,7 +257,10 @@ static int cfs_wi_scheduler(void *arg) if (list_empty(>wi_list)) continue; + spin_unlock(>ws_lock); LASSERT(wi->wi_scheduled); + spin_lock(>ws_lock); + /* wi is rescheduled, should be on rerunq now, we * move it to runq so it can run action now */ -- 1.7.9.5
[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_percpt_lock and cfs_percpt_unlock
The driver may sleep under a spin lock, and the function call path is: cfs_percpt_lock/cfs_percpt_unlock (acquire the lock by spin_lock) LASSERT lbug_with_loc libcfs_debug_dumplog schedule and kthread_run --> may sleep To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/lustre/lnet/libcfs/libcfs_lock.c |9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c b/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c index 1967b97c..a2ce092f 100644 --- a/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c +++ b/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c @@ -113,9 +113,10 @@ struct cfs_percpt_lock * /* exclusive lock request */ for (i = 0; i < ncpt; i++) { + if (!i) + LASSERT(!pcl->pcl_locked); spin_lock(pcl->pcl_locks[i]); if (!i) { - LASSERT(!pcl->pcl_locked); /* nobody should take private lock after this * so I wouldn't starve for too long time */ @@ -141,11 +142,11 @@ struct cfs_percpt_lock * } for (i = ncpt - 1; i >= 0; i--) { - if (!i) { - LASSERT(pcl->pcl_locked); + if (!i) pcl->pcl_locked = 0; - } spin_unlock(pcl->pcl_locks[i]); + if (!i) + LASSERT(pcl->pcl_locked); } } EXPORT_SYMBOL(cfs_percpt_unlock); -- 1.7.9.5
[PATCH] gma500: Fix a sleep-in-atomic bug in psbfb_2d_submit
The driver may sleep under a spin lock, and the function call path is: psbfb_2d_submit (acquire the lock by spin_lock_irqsave) psb_2d_wait_available psb_spank msleep --> may sleep To fix it, the "msleep" is replaced with "mdelay" in psb_spank. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/gpu/drm/gma500/accel_2d.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/gma500/accel_2d.c b/drivers/gpu/drm/gma500/accel_2d.c index c51d925..7c24c8a 100644 --- a/drivers/gpu/drm/gma500/accel_2d.c +++ b/drivers/gpu/drm/gma500/accel_2d.c @@ -55,7 +55,7 @@ void psb_spank(struct drm_psb_private *dev_priv) _PSB_CS_RESET_TWOD_RESET, PSB_CR_SOFT_RESET); PSB_RSGX32(PSB_CR_SOFT_RESET); - msleep(1); + mdelay(1); PSB_WSGX32(0, PSB_CR_SOFT_RESET); wmb(); @@ -64,7 +64,7 @@ void psb_spank(struct drm_psb_private *dev_priv) wmb(); (void) PSB_RSGX32(PSB_CR_BIF_CTRL); - msleep(1); + mdelay(1); PSB_WSGX32(PSB_RSGX32(PSB_CR_BIF_CTRL) & ~_PSB_CB_CTRL_CLEAR_FAULT, PSB_CR_BIF_CTRL); (void) PSB_RSGX32(PSB_CR_BIF_CTRL); -- 1.7.9.5
[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct
The driver may sleep under a spin lock, and the function call path is: netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock) ioremap --> may sleep To fix it, the lock is released before "ioremap", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c index a996801..5ea553e 100644 --- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c +++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c @@ -1419,7 +1419,9 @@ static u32 netxen_nic_io_read_2M(struct netxen_adapter *adapter, mem_base = pci_resource_start(adapter->pdev, 0) + (start & PAGE_MASK); + spin_unlock(>ahw.mem_lock); mem_ptr = ioremap(mem_base, PAGE_SIZE); + spin_lock(>ahw.mem_lock); if (mem_ptr == NULL) { ret = -EIO; goto unlock; -- 1.7.9.5
[PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed
The driver may sleep under a spin lock, and the function call path is: b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) b43legacy_synchronize_irq synchronize_irq --> may sleep To fix it, the lock is released before b43legacy_synchronize_irq, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/broadcom/b43legacy/main.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c index f1e3dad..31ead21 100644 --- a/drivers/net/wireless/broadcom/b43legacy/main.c +++ b/drivers/net/wireless/broadcom/b43legacy/main.c @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); if (changed & BSS_CHANGED_BSSID) { + spin_unlock_irqrestore(>irq_lock, flags); b43legacy_synchronize_irq(dev); + spin_lock_irqsave(>irq_lock, flags); if (conf->bssid) memcpy(wl->bssid, conf->bssid, ETH_ALEN); -- 1.7.9.5
Re: [PATCH] bcache: Fix a sleep-in-atomic bug
On 05/31/2017 03:23 PM, Jia-Ju Bai wrote: The driver may sleep under a spin lock, and the function call path is: journal_wait_for_write (acquire the lock by spin_lock) closure_sync schedule --> may sleep To fix it, the lock is released before "closure_sync", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> --- drivers/md/bcache/journal.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 1198e53..ad47c36 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -724,6 +724,7 @@ static struct journal_write *journal_wait_for_write(struct cache_set *c, btree_flush_write(c); } + spin_unlock(>journal.lock); closure_sync(); spin_lock(>journal.lock); wait = true; Sorry, my patch is not correct, and it will cause double unlock. Please ignore my patch. Jia-Ju Bai
[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_hw_read_wx_2M and netxen_nic_hw_write_wx_2M
The driver may sleep under a write spin lock, and function call path is: netxen_nic_hw_read_wx_2M (acquire the lock by write_lock_irqsave) crb_win_lock netxen_pcie_sem_lock msleep --> may sleep netxen_nic_hw_write_wx_2M (acquire the lock by write_lock_irqsave) crb_win_lock netxen_pcie_sem_lock msleep --> may sleep To fix it, the "msleep" is replaced with "mdelay" in netxen_pcie_sem_lock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c index a996801..0a9da42 100644 --- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c +++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c @@ -329,7 +329,7 @@ static void __iomem *pci_base_offset(struct netxen_adapter *adapter, break; if (++timeout >= NETXEN_PCIE_SEM_TIMEOUT) return -EIO; - msleep(1); + mdelay(1); } if (id_reg) -- 1.7.9.5
Re: [PATCH] iscsi: Fix a sleep-in-atomic bug
On 06/01/2017 02:21 PM, Nicholas A. Bellinger wrote: Hi Jia-Ju, On Wed, 2017-05-31 at 11:26 +0800, Jia-Ju Bai wrote: The driver may sleep under a spin lock, and the function call path is: iscsit_tpg_enable_portal_group (acquire the lock by spin_lock) iscsi_update_param_value kstrdup(GFP_KERNEL) --> may sleep To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> --- drivers/target/iscsi/iscsi_target_parameters.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Btw, the use of tpg->tpg_state_lock in iscsit_tpg_enable_portal_group() while checking existing state and calling iscsi_update_param_value() is not necessary, since lio_target_tpg_enable_store() is already holding iscsit_get_tpg() -> tpg->tpg_access_lock. How about the following instead to only take tpg->tpg_state_lock when updating tpg->tpg_state instead..? diff --git a/drivers/target/iscsi/iscsi_target_tpg.c b/drivers/target/iscsi/iscsi_target_tpg.c index 2e7e08d..abaabba 100644 --- a/drivers/target/iscsi/iscsi_target_tpg.c +++ b/drivers/target/iscsi/iscsi_target_tpg.c @@ -311,11 +311,9 @@ int iscsit_tpg_enable_portal_group(struct iscsi_portal_group *tpg) struct iscsi_tiqn *tiqn = tpg->tpg_tiqn; int ret; - spin_lock(>tpg_state_lock); if (tpg->tpg_state == TPG_STATE_ACTIVE) { pr_err("iSCSI target portal group: %hu is already" " active, ignoring request.\n", tpg->tpgt); - spin_unlock(>tpg_state_lock); return -EINVAL; } /* @@ -324,10 +322,8 @@ int iscsit_tpg_enable_portal_group(struct iscsi_portal_group *tpg) * is enforced (as per default), and remove the NONE option. */ param = iscsi_find_param_from_key(AUTHMETHOD, tpg->param_list); - if (!param) { - spin_unlock(>tpg_state_lock); + if (!param) return -EINVAL; - } if (tpg->tpg_attrib.authentication) { if (!strcmp(param->value, NONE)) { @@ -341,6 +337,7 @@ int iscsit_tpg_enable_portal_group(struct iscsi_portal_group *tpg) goto err; } + spin_lock(>tpg_state_lock); tpg->tpg_state = TPG_STATE_ACTIVE; spin_unlock(>tpg_state_lock); @@ -353,7 +350,6 @@ int iscsit_tpg_enable_portal_group(struct iscsi_portal_group *tpg) return 0; err: - spin_unlock(>tpg_state_lock); return ret; } I think it is fine to me. Thanks, Jia-Ju Bai
Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_attr_interfmode_store
On 06/02/2017 12:11 AM, Jonathan Corbet wrote: On Thu, 01 Jun 2017 09:05:07 +0800 Jia-Ju Bai<baijiaju1...@163.com> wrote: I admit my patches are not well tested, and they may not well fix the bugs. I am looking forward to opinions and suggestions :) May I politely suggest that sending out untested locking changes is a dangerous thing to do? You really should not be changing the locking in a piece of kernel code without understanding very well what the lock is protecting and being able to say why your changes are safe. Without that, the risk of introducing subtle bugs is very high. It looks like you have written a useful tool that could help us to make the kernel more robust. If you are interested in my suggestion, I would recommend that you post the sleep-in-atomic scenarios that you are finding, but refrain from "fixing" them in any case where you cannot offer a strong explanation of why your fix is correct. Thanks for working to find bugs in the kernel! jon Hi, Thanks for your good and helpful advice. I am sorry for my improper patches. I will only report bugs instead of sending improper patches when I have no good solution of fixing the bugs. Thanks, Jia-Ju Bai
[PATCH] cw1200: Fix a sleep-in-atomic bug in cw1200_tx_confirm_cb and cw1200_cqm_bssloss_sm
The driver may sleep under a spin lock, and the function call path is: cw1200_tx_confirm_cb (acquire the lock by spin_lock) __cw1200_cqm_bssloss_sm cancel_work_sync --> may sleep cw1200_cqm_bssloss_sm __cw1200_cqm_bssloss_sm cancel_work_sync --> may sleep To fix it, the lock is released before cancel_work_sync, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/st/cw1200/sta.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/st/cw1200/sta.c b/drivers/net/wireless/st/cw1200/sta.c index a522248..d5f7698 100644 --- a/drivers/net/wireless/st/cw1200/sta.c +++ b/drivers/net/wireless/st/cw1200/sta.c @@ -154,7 +154,9 @@ void __cw1200_cqm_bssloss_sm(struct cw1200_common *priv, int tx = 0; priv->delayed_link_loss = 0; + spin_unlock(>bss_loss_lock); cancel_work_sync(>bss_params_work); + spin_lock(>bss_loss_lock); pr_debug("[STA] CQM BSSLOSS_SM: state: %d init %d good %d bad: %d txlock: %d uj: %d\n", priv->bss_loss_state, -- 1.7.9.5
[PATCH] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep To fix it, the lock is released before copy_from_user, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/infiniband/sw/rxe/rxe_verbs.c |9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..6fb7e1a 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -725,7 +725,7 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, { int num_sge = ibwr->num_sge; struct ib_sge *sge; - int i; + int i, err; u8 *p; init_send_wr(qp, >wr, ibwr); @@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, sge = ibwr->sg_list; for (i = 0; i < num_sge; i++, sge++) { - if (qp->is_user && copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) + spin_unlock_irq(>sq.sq_lock); + err = copy_from_user(p, (__user void *) + (uintptr_t)sge->addr, sge->length); + spin_lock_irq(>sq.sq_lock); + if (qp->is_user && err) return -EFAULT; else if (!qp->is_user) -- 1.7.9.5
Re: [PATCH] megaraid: Fix a sleep-in-atomic bug
On 05/31/2017 06:18 PM, Sumit Saxena wrote: -Original Message- From: Jia-Ju Bai [mailto:baijiaju1...@163.com] Sent: Wednesday, May 31, 2017 8:27 AM To: kashyap.de...@broadcom.com; sumit.sax...@broadcom.com; shivasharan.srikanteshw...@broadcom.com; j...@linux.vnet.ibm.com; martin.peter...@oracle.com Cc: megaraidlinux@broadcom.com; linux-s...@vger.kernel.org; linux- ker...@vger.kernel.org; Jia-Ju Bai Subject: [PATCH] megaraid: Fix a sleep-in-atomic bug The driver may sleep under a spin lock, and the function call path is: mraid_mm_attach_buf (acquire the lock by spin_lock_irqsave) pci_pool_alloc(GFP_KERNEL) --> may sleep To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> --- drivers/scsi/megaraid/megaraid_mm.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/megaraid/megaraid_mm.c b/drivers/scsi/megaraid/megaraid_mm.c index 4cf9ed9..c43afb8 100644 --- a/drivers/scsi/megaraid/megaraid_mm.c +++ b/drivers/scsi/megaraid/megaraid_mm.c @@ -574,7 +574,7 @@ kioc->pool_index = right_pool; kioc->free_buf = 1; - kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_KERNEL, + kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_ATOMIC, >buf_paddr); spin_unlock_irqrestore(>lock, flags); This is very old driver and reached EOL. Did you face any issue because of this bug or discover this through code review? Anyways patch looks good to me. Acked-by: Sumit Saxena<sumit.sax...@broadcom.com> -- 1.7.9.5 Hi, This bug is found by a static analysis tool and my code review. Jia-Ju Bai
[PATCH] rts5208: Fix a sleep-in-atomic bug in rtsx_exclusive_enter_ss
The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card sd_cleanup_work sd_stop_seq_mode sd_switch_clock sd_ddr_tuning sd_ddr_pre_tuning_tx sd_change_phase wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in sd_change_phase. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rts5208/sd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c index bdd35b6..76bd105 100644 --- a/drivers/staging/rts5208/sd.c +++ b/drivers/staging/rts5208/sd.c @@ -1057,7 +1057,7 @@ static int sd_change_phase(struct rtsx_chip *chip, u8 sample_point, u8 tune_dir) rtsx_write_register(chip, SD_DCMPS_CTL, DCMPS_CHANGE, 0); rtsx_write_register(chip, SD_VP_CTL, PHASE_CHANGE, 0); - wait_timeout(10); + mdelay(10); sd_reset_dcm(chip, tune_dir); return STATUS_FAIL; } -- 1.7.9.5
Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_attr_interfmode_store
On 06/01/2017 01:33 AM, Larry Finger wrote: On 05/31/2017 05:29 AM, Jia-Ju Bai wrote: The driver may sleep under a spin lock, and the function call path is: b43legacy_attr_interfmode_store (acquire the lock by spin_lock_irqsave) b43legacy_radio_set_interference_mitigation b43legacy_radio_interference_mitigation_disable b43legacy_calc_nrssi_slope b43legacy_synth_pu_workaround might_sleep and msleep --> may sleep Fixing it may be complex, and a possible way is to remove spin_lock_irqsave and spin_lock_irqrestore in b43legacy_attr_interfmode_store, and the code has been protected by mutex_lock and mutex_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/broadcom/b43legacy/sysfs.c |2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/wireless/broadcom/b43legacy/sysfs.c b/drivers/net/wireless/broadcom/b43legacy/sysfs.c index 2a1da15..9ede143 100644 --- a/drivers/net/wireless/broadcom/b43legacy/sysfs.c +++ b/drivers/net/wireless/broadcom/b43legacy/sysfs.c @@ -137,14 +137,12 @@ static ssize_t b43legacy_attr_interfmode_store(struct device *dev, } mutex_lock(>wl->mutex); -spin_lock_irqsave(>wl->irq_lock, flags); err = b43legacy_radio_set_interference_mitigation(wldev, mode); if (err) b43legacyerr(wldev->wl, "Interference Mitigation not " "supported by device\n"); mmiowb(); -spin_unlock_irqrestore(>wl->irq_lock, flags); mutex_unlock(>wl->mutex); return err ? err : count; Jia-Ju, Did you actually observe the attempt to sleep under the spin lock, or did you discover this using some tool? In other words, have either of your patches been tested? Larry Hi, In fact, my reported bugs are found by a static analysis tool written by me, and they are checked by my review of the driver code. I admit my patches are not well tested, and they may not well fix the bugs. I am looking forward to opinions and suggestions :) Thanks, Jia-Ju Bai
Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed
On 06/01/2017 08:07 AM, Larry Finger wrote: On 05/31/2017 10:32 AM, Michael Büsch wrote: On Wed, 31 May 2017 13:26:43 +0300 Kalle Valo <kv...@codeaurora.org> wrote: Jia-Ju Bai <baijiaju1...@163.com> writes: The driver may sleep under a spin lock, and the function call path is: b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave) b43legacy_synchronize_irq synchronize_irq --> may sleep To fix it, the lock is released before b43legacy_synchronize_irq, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/broadcom/b43legacy/main.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c index f1e3dad..31ead21 100644 --- a/drivers/net/wireless/broadcom/b43legacy/main.c +++ b/drivers/net/wireless/broadcom/b43legacy/main.c @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw, b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0); if (changed & BSS_CHANGED_BSSID) { +spin_unlock_irqrestore(>irq_lock, flags); b43legacy_synchronize_irq(dev); +spin_lock_irqsave(>irq_lock, flags); To me this looks like a fragile workaround and not a real fix. You can easily add new race conditions with releasing the lock like this. I think releasing the lock possibly is fine. It certainly is better than sleeping with a lock held. We disabled the device interrupts just before this line. However I think the synchronize_irq should be outside of the conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So two lines above) I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID is set. On the other hand b43 does not have this irq-disabling foobar anymore. So somebody must have removed it. Maybe you can find the commit that removed this stuff from b43 and port it to b43legacy? So I would vote for moving the synchronize_irq up outside of the conditional and put the unlock/lock sequence around it. And as a second patch on top of that try to remove this stuff altogether like b43 did. The patch that removed it in b43 is commit 36dbd9548e92268127b0c31b0e121e63e9207108 Author: Michael Buesch <m...@bu3sch.de> Date: Fri Sep 4 22:51:29 2009 +0200 b43: Use a threaded IRQ handler Use a threaded IRQ handler to allow locking the mutex and sleeping while executing an interrupt. This removes usage of the irq_lock spinlock, but introduces a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel hard-irq handler. Sleeping busses (SDIO) will use mutex instead. Signed-off-by: Michael Buesch <m...@bu3sch.de> Tested-by: Larry Finger <larry.fin...@lwfinger.net> Signed-off-by: John W. Linville <linvi...@tuxdriver.com> I vaguely remember this patch. Although it is roughly a 1000-line fix, I will try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can test in a PowerBook G4. I agree with Michael that this is the way to go. Both of Jia-Ju's patches should be rejected. Larry It is fine to me to fix the bug by porting this former patch. Thanks, Jia-Ju Bai
[PATCH] i40iw: Add a value assignment to avoid sleep-in-atomic bug caused by uninitialized value
The value "cqp_request->waiting" indicates whether the sleeping operation should be performed, and it is not assigned in i40iw_get_cqp_request, so the driver may sleep in interrupt handling. The function call path is: i40iw_dpc (tasklet_init indicates it handles interrupt) i40iw_process_aeq i40iw_next_iw_state i40iw_hw_modify_qp (call i40iw_get_cqp_request) i40iw_handle_cqp_op i40iw_wait_event --> may sleep To fix it, "cqp_request->waiting" is assigned in "else" branch in i40iw_get_cqp_request. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/infiniband/hw/i40iw/i40iw_utils.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c b/drivers/infiniband/hw/i40iw/i40iw_utils.c index 409a378..0f4e633 100644 --- a/drivers/infiniband/hw/i40iw/i40iw_utils.c +++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c @@ -326,6 +326,7 @@ struct i40iw_cqp_request *i40iw_get_cqp_request(struct i40iw_cqp *cqp, bool wait cqp_request->waiting = true; } else { atomic_set(_request->refcount, 1); + cqp_request->waiting = false; } return cqp_request; } -- 1.7.9.5
[PATCH] ivtv: Fix a sleep-in-atomic bug in snd_ivtv_pcm_hw_free
The driver may sleep under a spin lock, and the function call path is: snd_ivtv_pcm_hw_free (acquire the lock by spin_lock_irqsave) vfree --> may sleep To fix it, the "substream->runtime->dma_area" is passed to a temporary value, and mark it NULL when holding the lock. The memory is freed by vfree through the temporary value outside the lock holding. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/media/pci/ivtv/ivtv-alsa-pcm.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c index 807ead2..92d088c 100644 --- a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c +++ b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c @@ -262,14 +262,16 @@ static int snd_ivtv_pcm_hw_free(struct snd_pcm_substream *substream) { struct snd_ivtv_card *itvsc = snd_pcm_substream_chip(substream); unsigned long flags; + unsigned char *dma_area; spin_lock_irqsave(>slock, flags); if (substream->runtime->dma_area) { dprintk("freeing pcm capture region\n"); - vfree(substream->runtime->dma_area); + dma_area = substream->runtime->dma_area; substream->runtime->dma_area = NULL; } spin_unlock_irqrestore(>slock, flags); + vfree(dma_area); return 0; } -- 1.7.9.5
[PATCH] cx18: Fix a sleep-in-atomic bug in snd_cx18_pcm_hw_free
The driver may sleep under a spin lock, and the function call path is: snd_cx18_pcm_hw_free (acquire the lock by spin_lock_irqsave) vfree --> may sleep To fix it, the "substream->runtime->dma_area" is passed to a temporary value, and mark it NULL when holding the lock. The memory is freed by vfree through the temporary value outside the lock holding. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/media/pci/cx18/cx18-alsa-pcm.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/media/pci/cx18/cx18-alsa-pcm.c b/drivers/media/pci/cx18/cx18-alsa-pcm.c index 205a98d..ba83147 100644 --- a/drivers/media/pci/cx18/cx18-alsa-pcm.c +++ b/drivers/media/pci/cx18/cx18-alsa-pcm.c @@ -257,14 +257,16 @@ static int snd_cx18_pcm_hw_free(struct snd_pcm_substream *substream) { struct snd_cx18_card *cxsc = snd_pcm_substream_chip(substream); unsigned long flags; + unsigned char *dma_area; spin_lock_irqsave(>slock, flags); if (substream->runtime->dma_area) { dprintk("freeing pcm capture region\n"); - vfree(substream->runtime->dma_area); + dma_area = substream->runtime->dma_area; substream->runtime->dma_area = NULL; } spin_unlock_irqrestore(>slock, flags); + vfree(dma_area); return 0; } -- 1.7.9.5
[PATCH] rts5208: Fix a sleep-in-atomic bug in sd_power_off_card3v3
The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card sd_power_off_card3v3 wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in sd_power_off_card3v3. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rts5208/sd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c index bdd35b6..aa14454 100644 --- a/drivers/staging/rts5208/sd.c +++ b/drivers/staging/rts5208/sd.c @@ -5231,7 +5231,7 @@ int sd_power_off_card3v3(struct rtsx_chip *chip) return STATUS_FAIL; } - wait_timeout(50); + mdelay(50); } if (chip->asic_code) { -- 1.7.9.5
Re: [PATCH] rts5208: Fix a sleep-in-atomic bug in rtsx_exclusive_enter_ss
On 06/03/2017 04:52 PM, Greg KH wrote: On Thu, Jun 01, 2017 at 11:43:35AM +0800, Jia-Ju Bai wrote: The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card sd_cleanup_work sd_stop_seq_mode sd_switch_clock sd_ddr_tuning sd_ddr_pre_tuning_tx sd_change_phase wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in sd_change_phase. Nice work, how are you finding these bugs? What tools gives you this kind of analysis? thanks, greg k-h Hi, I am very glad to get your praise on my work :) I recently write a static analysis tool for detecting sleep-in-atomic bugs, instead of using existing tools. One reason is that I have encountered these bugs for some times when I writing drivers. I am still improving my tool and detecting other similar bugs in Linux kernel. If you have suggestions or comments on my work, please feel free to contact me :) Thanks, Jia-Ju Bai
[PATCH] rt5208: Fix a sleep-in-atomic bug in xd_copy_page
Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rts5208/xd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c index 85aba05..74d36f9 100644 --- a/drivers/staging/rts5208/xd.c +++ b/drivers/staging/rts5208/xd.c @@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 old_blk, u32 new_blk, reg = 0; rtsx_read_register(chip, XD_CTL, ); if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) { - wait_timeout(100); + mdelay(100); if (detect_card_cd(chip, XD_CARD) != STATUS_SUCCESS) { -- 1.7.9.5
[PATCH resend] rt5208: Fix a sleep-in-atomic bug in xd_copy_page
Last patch lacks code explanation, and it is included in this patch. The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card xd_cleanup_work xd_delay_write xd_finish_write xd_copy_page wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in xd_copy_page. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rts5208/xd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c index 85aba05..74d36f9 100644 --- a/drivers/staging/rts5208/xd.c +++ b/drivers/staging/rts5208/xd.c @@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 old_blk, u32 new_blk, reg = 0; rtsx_read_register(chip, XD_CTL, ); if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) { - wait_timeout(100); + mdelay(100); if (detect_card_cd(chip, XD_CARD) != STATUS_SUCCESS) { -- 1.7.9.5
[PATCH] rts5208: Fix a sleep-in-atomic bug in sd_send_cmd_get_rsp
The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card sd_cleanup_work sd_stop_seq_mode sd_switch_clock sd_ddr_tuning sd_ddr_pre_tuning_tx sd_send_cmd_get_rsp wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in sd_send_cmd_get_rsp. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rts5208/sd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c index bdd35b6..fed17ff 100644 --- a/drivers/staging/rts5208/sd.c +++ b/drivers/staging/rts5208/sd.c @@ -226,7 +226,7 @@ static int sd_send_cmd_get_rsp(struct rtsx_chip *chip, u8 cmd_idx, return STATUS_FAIL; } if (rty_cnt < SD_MAX_RETRY_COUNT) { - wait_timeout(20); + mdelay(20); rty_cnt++; goto RTY_SEND_CMD; } else { -- 1.7.9.5
[BUG] rts5208: Sleeping under a spin lock in xd_init_l2p_tbl
According to rtsx_chip.c and xd.c, the driver may sleep under a spin lock. The function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card xd_cleanup_work xd_delay_write xd_finish_write xd_set_unused_block xd_build_l2p_tbl xd_init_l2p_tbl vmalloc --> may sleep This bug is found by my static analysis tool and my code review. I hope to fix it, but I do not have a good solution. Thanks, Jia-Ju Bai
[BUG] rts5208: Sleeping under a spin lock in xd_build_l2p_tbl
According to rtsx_chip.c and xd.c, the driver may sleep under a spin lock. The function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card xd_cleanup_work xd_delay_write xd_finish_write xd_set_unused_block xd_build_l2p_tbl vmalloc --> may sleep This bug is found by my static analysis tool and my code review. I hope to fix it, but I do not have a good solution. Thanks, Jia-Ju Bai
[BUG] rts5208: Sleeping under a spin lock in free_zone
According to rtsx_chip.c and xd.c, the driver may sleep under a spin lock. The function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card xd_cleanup_work xd_delay_write xd_finish_write xd_set_unused_block free_zone vfree --> may sleep This bug is found by my static analysis tool and my code review. I hope to fix it, but I do not have a good solution. Thanks, Jia-Ju Bai
Re: [PATCH] fs: nfs: Fix a sleep-in-atomic bug in nfs_access_add_cache
On 06/05/2017 07:48 PM, Trond Myklebust wrote: On Mon, 2017-06-05 at 16:05 +0800, Jia-Ju Bai wrote: The driver may sleep under a rcu read lock, and function call path is: nfs_permission (acquire the lock by rcu_read_lock) nfs_do_access nfs_access_add_cache kmalloc(GFP_KERNEL) --> may sleep To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> --- fs/nfs/dir.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 32ccd77..7a074db 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2333,7 +2333,7 @@ static void nfs_access_add_rbtree(struct inode *inode, struct nfs_access_entry * void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set) { - struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_KERNEL); + struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_ATOMIC); if (cache == NULL) return; RB_CLEAR_NODE(>rb_node); The RCU locked codepath will not ever hit nfs_access_add_rbtree(). It returns with an error code of -ECHILD after the test of "may_block". Cheers Trond Yes, I think you are right. Thanks, Jia-Ju Bai
[PATCH V5] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep There is no flow that makes "qp->is_user" true, and copy_from_user may cause bug when a non-user pointer is used. So the lines of copy_from_user and check of "qp->is_user" are removed. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- V5: * Remove useless check of "qp->is_user". Thank Leon for pointing it out. V4: * Remove the line of copy_from_user. Thank Moni for good advice. V3: * It corrects the mistakes of remaining legacy code in V2. Thank Ram for pointing it out. V2: * The parameter "flags" is added to restore and save the irq status. Thank Leon for good advice. --- drivers/infiniband/sw/rxe/rxe_verbs.c |9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..073e667 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -740,13 +740,8 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, sge = ibwr->sg_list; for (i = 0; i < num_sge; i++, sge++) { - if (qp->is_user && copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) - return -EFAULT; - - else if (!qp->is_user) - memcpy(p, (void *)(uintptr_t)sge->addr, - sge->length); + memcpy(p, (void *)(uintptr_t)sge->addr, + sge->length); p += sge->length; } -- 1.7.9.5
Re: [PATCH] ubifs: Fix a sleep-in-atomic bug in ubifs_read_nnode
On 06/05/2017 04:25 PM, Richard Weinberger wrote: Jia-Ju Bai, Am 05.06.2017 um 05:38 schrieb Jia-Ju Bai: The driver may sleep under a spin lock, and the function call path is: ubifs_change_lp (acquire the lock by spin_lock) change_category ubifs_remove_from_cat remove_from_lpt_heap dbg_check_heap ubifs_lpt_lookup ubifs_read_nnode kzalloc(GFP_NOFS) --> may sleep To fix it, "GFP_NOFS" is replaced with "GFP_ATOMIC". So, this happens only when dbg_check_heap() is activated, right? Thanks, //richard Yes, I think so. Thanks, Jia-Ju Bai
Re: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send
On 06/05/2017 04:30 PM, Moni Shoua wrote: - if (qp->is_user&& copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) + spin_unlock_irqrestore(>sq.sq_lock, *flags); + err = copy_from_user(p, (__user void *) + (uintptr_t)sge->addr, sge->length); + spin_lock_irqsave(>sq.sq_lock, *flags); + if (qp->is_user&& err) return -EFAULT; qp-_is_user is always false in this function (flow starts from rxe_post_send_kernel) so this line is a dead code In fact, this patch seems to add a serious bug when it uses copy_from_user() from a non user pointer. Do you agree? I agree. So, it is fine to me to remove this line, as you said in the former email: Second, I think that there is no flow that leads to this function when qp->is user is true so maybe the correct action is to remove this line completely if (qp->is_user&& copy_from_user(p, (__user void *)
Re: [PATCH] fs: xfs: Fix a lock-twice and sleep-in-atomic bug in xfs_iget
On 06/05/2017 04:32 PM, Shan Hai wrote: On 2017年06月05日 16:17, Jia-Ju Bai wrote: The driver may sleep under a read rcu lock, and function call path is: xfs_iget (acquire the lock by rcu_read_lock) "goto out_error_or_again" after xfs_iget_cache_hit delay schedule_timeout_uninterruptible --> may sleep Meanwhile, the rcu_read_lock will be called twice in this situation. To fix it, the lock is released before "goto". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- fs/xfs/xfs_icache.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index f61c84f8..c2a4722 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -600,8 +600,10 @@ struct xfs_inode * if (ip) { error = xfs_iget_cache_hit(pag, ip, ino, flags, lock_flags); -if (error) +if (error) { +rcu_read_unlock(); Seems you are going to double unlock by doing this, since it is unlocked in the xfs_iget_cache_hit. Thanks Shan Hai goto out_error_or_again; +} } else { rcu_read_unlock(); XFS_STATS_INC(mp, xs_ig_missed); I think you are right. Please ignore my patch. Thanks, Jia-Ju Bai
[PATCH V4] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep There is no flow that makes "qp->is_user" true, and copy_from_user may cause bug when a non-user pointer is used. So, the line of copy_from_user is removed. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- V4: * Remove the line of copy_from_user. V3: * It corrects the mistakes of remaining legacy code in V2. (Thank Ram for pointing it out) V2: * The parameter "flags" is added to restore and save the irq status. Thank Leon for good advice. --- drivers/infiniband/sw/rxe/rxe_verbs.c |6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..7c52c7c 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -740,11 +740,7 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, sge = ibwr->sg_list; for (i = 0; i < num_sge; i++, sge++) { - if (qp->is_user && copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) - return -EFAULT; - - else if (!qp->is_user) + if (!qp->is_user) memcpy(p, (void *)(uintptr_t)sge->addr, sge->length); -- 1.7.9.5
[PATCH V2] ivtv: Fix a sleep-in-atomic bug in snd_ivtv_pcm_hw_free
The driver may sleep under a spin lock, and the function call path is: snd_ivtv_pcm_hw_free (acquire the lock by spin_lock_irqsave) vfree --> may sleep To fix it, the "substream->runtime->dma_area" is passed to a temporary value, and mark it NULL when holding the lock. The memory is freed by vfree through the temporary value outside the lock holding. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/media/pci/ivtv/ivtv-alsa-pcm.c |5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c index 807ead2..a692554 100644 --- a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c +++ b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c @@ -262,14 +262,17 @@ static int snd_ivtv_pcm_hw_free(struct snd_pcm_substream *substream) { struct snd_ivtv_card *itvsc = snd_pcm_substream_chip(substream); unsigned long flags; + unsigned char *dma_area = NULL; spin_lock_irqsave(>slock, flags); if (substream->runtime->dma_area) { dprintk("freeing pcm capture region\n"); - vfree(substream->runtime->dma_area); + dma_area = substream->runtime->dma_area; substream->runtime->dma_area = NULL; } spin_unlock_irqrestore(>slock, flags); + if (dma_area) + vfree(dma_area); return 0; } -- 1.7.9.5
[PATCH] qlcnic: Fix a sleep-in-atomic bug in qlcnic_82xx_hw_write_wx_2M and qlcnic_82xx_hw_read_wx_2M
The driver may sleep under a write spin lock, and the function call path is: qlcnic_82xx_hw_write_wx_2M (acquire the lock by write_lock_irqsave) crb_win_lock qlcnic_pcie_sem_lock usleep_range qlcnic_82xx_hw_read_wx_2M (acquire the lock by write_lock_irqsave) crb_win_lock qlcnic_pcie_sem_lock usleep_range To fix it, the usleep_range is replaced with udelay. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c index 838cc0c..7848cf0 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c @@ -341,7 +341,7 @@ static void qlcnic_write_window_reg(u32 addr, void __iomem *bar0, u32 data) } return -EIO; } - usleep_range(1000, 1500); + udelay(1200); } if (id_reg) -- 1.7.9.5
[PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep To fix it, the lock is released before copy_from_user, and the lock is acquired again after this function. The parameter "flags" is used to restore and save the irq status. Thank Leon for good advice. This patch corrects the mistakes in V2. (Thank Ram for pointing it out) Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/infiniband/sw/rxe/rxe_verbs.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..5293d15 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct rxe_send_wr *wr, static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, unsigned int mask, unsigned int length, -struct rxe_send_wqe *wqe) +struct rxe_send_wqe *wqe, unsigned long *flags) { int num_sge = ibwr->num_sge; struct ib_sge *sge; - int i; + int i, err; u8 *p; init_send_wr(qp, >wr, ibwr); @@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, sge = ibwr->sg_list; for (i = 0; i < num_sge; i++, sge++) { - if (qp->is_user && copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) + spin_unlock_irqrestore(>sq.sq_lock, *flags); + err = copy_from_user(p, (__user void *) + (uintptr_t)sge->addr, sge->length); + spin_lock_irqsave(>sq.sq_lock, *flags); + if (qp->is_user && err) return -EFAULT; else if (!qp->is_user) @@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr, send_wqe = producer_addr(sq->queue); - err = init_send_wqe(qp, ibwr, mask, length, send_wqe); + err = init_send_wqe(qp, ibwr, mask, length, send_wqe, ); if (unlikely(err)) goto err1; -- 1.7.9.5
[PATCH] cx18: Fix a sleep-in-atomic bug in snd_cx18_pcm_hw_free
The driver may sleep under a spin lock, and the function call path is: snd_cx18_pcm_hw_free (acquire the lock by spin_lock_irqsave) vfree --> may sleep To fix it, the "substream->runtime->dma_area" is passed to a temporary value, and mark it NULL when holding the lock. The memory is freed by vfree through the temporary value outside the lock holding. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/media/pci/cx18/cx18-alsa-pcm.c |5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/media/pci/cx18/cx18-alsa-pcm.c b/drivers/media/pci/cx18/cx18-alsa-pcm.c index 205a98d..8c51e4c 100644 --- a/drivers/media/pci/cx18/cx18-alsa-pcm.c +++ b/drivers/media/pci/cx18/cx18-alsa-pcm.c @@ -257,14 +257,17 @@ static int snd_cx18_pcm_hw_free(struct snd_pcm_substream *substream) { struct snd_cx18_card *cxsc = snd_pcm_substream_chip(substream); unsigned long flags; + unsigned char *dma_area = NULL; spin_lock_irqsave(>slock, flags); if (substream->runtime->dma_area) { dprintk("freeing pcm capture region\n"); - vfree(substream->runtime->dma_area); + dma_area = substream->runtime->dma_area; substream->runtime->dma_area = NULL; } spin_unlock_irqrestore(>slock, flags); + if (dma_area) + vfree(dma_area); return 0; } -- 1.7.9.5
[PATCH V2] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep To fix it, the lock is released before copy_from_user, and the lock is acquired again after this function. The parameter "flags" is used to restore and save the irq status. Thank Leon for good advice. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/infiniband/sw/rxe/rxe_verbs.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..7dcdf67 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct rxe_send_wr *wr, static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, unsigned int mask, unsigned int length, -struct rxe_send_wqe *wqe) +struct rxe_send_wqe *wqe, unsigned long *flags) { int num_sge = ibwr->num_sge; struct ib_sge *sge; - int i; + int i, err; u8 *p; init_send_wr(qp, >wr, ibwr); @@ -742,7 +742,12 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, for (i = 0; i < num_sge; i++, sge++) { if (qp->is_user && copy_from_user(p, (__user void *) (uintptr_t)sge->addr, sge->length)) - return -EFAULT; + spin_unlock_irqrestore(>sq.sq_lock, *flags); + err = copy_from_user(p, (__user void *) + (uintptr_t)sge->addr, sge->length); + spin_lock_irqsave(>sq.sq_lock, *flags); + if (qp->is_user && err) + return -EFAULT; else if (!qp->is_user) memcpy(p, (void *)(uintptr_t)sge->addr, @@ -794,7 +799,7 @@ static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr, send_wqe = producer_addr(sq->queue); - err = init_send_wqe(qp, ibwr, mask, length, send_wqe); + err = init_send_wqe(qp, ibwr, mask, length, send_wqe, ); if (unlikely(err)) goto err1; -- 1.7.9.5
[BUG] ntb: Sleep in interrupt handling
According to ntb_transport.c, the driver may sleep in interrupt handling. The function call path is: ntb_transport_rxc_db (tasklet_init indicates it handles interrupt) ntb_process_rxc ntb_async_rx ntb_async_rx_submit schedule_timeout --> may sleep This bug is found by my static analysis tool and my code review. I hope to fix it, but I do not have a good solution. Thanks, Jia-Ju Bai
[PATCH] oss: Fix a sleep-in-atomic bug in midi_outc
The driver may sleep under a spin lock, and the function call path is: midi_outc (acquire the lock by spin_lock_irqsave) oss_broken_sleep_on schedule_timeout --> may sleep To fix it, the lock is released before oss_broken_sleep_on, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- sound/oss/sequencer.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/oss/sequencer.c b/sound/oss/sequencer.c index f19da4b..3d95d752 100644 --- a/sound/oss/sequencer.c +++ b/sound/oss/sequencer.c @@ -1211,7 +1211,9 @@ static void midi_outc(int dev, unsigned char data) spin_lock_irqsave(,flags); while (n && !midi_devs[dev]->outputc(dev, data)) { + spin_unlock_irqrestore(, flags); oss_broken_sleep_on(_sleeper, HZ/25); + spin_lock_irqsave(, flags); n--; } spin_unlock_irqrestore(,flags); -- 1.7.9.5
[PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_attr_interfmode_store
The driver may sleep under a spin lock, and the function call path is: b43legacy_attr_interfmode_store (acquire the lock by spin_lock_irqsave) b43legacy_radio_set_interference_mitigation b43legacy_radio_interference_mitigation_disable b43legacy_calc_nrssi_slope b43legacy_synth_pu_workaround might_sleep and msleep --> may sleep Fixing it may be complex, and a possible way is to remove spin_lock_irqsave and spin_lock_irqrestore in b43legacy_attr_interfmode_store, and the code has been protected by mutex_lock and mutex_unlock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/broadcom/b43legacy/sysfs.c |2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/wireless/broadcom/b43legacy/sysfs.c b/drivers/net/wireless/broadcom/b43legacy/sysfs.c index 2a1da15..9ede143 100644 --- a/drivers/net/wireless/broadcom/b43legacy/sysfs.c +++ b/drivers/net/wireless/broadcom/b43legacy/sysfs.c @@ -137,14 +137,12 @@ static ssize_t b43legacy_attr_interfmode_store(struct device *dev, } mutex_lock(>wl->mutex); - spin_lock_irqsave(>wl->irq_lock, flags); err = b43legacy_radio_set_interference_mitigation(wldev, mode); if (err) b43legacyerr(wldev->wl, "Interference Mitigation not " "supported by device\n"); mmiowb(); - spin_unlock_irqrestore(>wl->irq_lock, flags); mutex_unlock(>wl->mutex); return err ? err : count; -- 1.7.9.5
[PATCH] ubifs: Fix a sleep-in-atomic bug in ubifs_read_nnode
The driver may sleep under a spin lock, and the function call path is: ubifs_change_lp (acquire the lock by spin_lock) change_category ubifs_remove_from_cat remove_from_lpt_heap dbg_check_heap ubifs_lpt_lookup ubifs_read_nnode kzalloc(GFP_NOFS) --> may sleep To fix it, "GFP_NOFS" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- fs/ubifs/lpt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ubifs/lpt.c b/fs/ubifs/lpt.c index 9a51710..4247934c 100644 --- a/fs/ubifs/lpt.c +++ b/fs/ubifs/lpt.c @@ -1205,7 +1205,7 @@ int ubifs_read_nnode(struct ubifs_info *c, struct ubifs_nnode *parent, int iip) lnum = c->lpt_lnum; offs = c->lpt_offs; } - nnode = kzalloc(sizeof(struct ubifs_nnode), GFP_NOFS); + nnode = kzalloc(sizeof(struct ubifs_nnode), GFP_ATOMIC); if (!nnode) { err = -ENOMEM; goto out; -- 1.7.9.5
[PATCH] ubifs: Fix a sleep-in-atomic bug in read_nnode
The driver may sleep under a spin lock, and the function call path is: ubifs_change_lp (acquire the lock by spin_lock) change_category ubifs_remove_from_cat remove_from_lpt_heap dbg_check_heap ubifs_lpt_lookup ubifs_get_pnode read_pnode kzalloc(GFP_NOFS) --> may sleep To fix it, "GFP_NOFS" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- fs/ubifs/lpt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ubifs/lpt.c b/fs/ubifs/lpt.c index 9a51710..fabe2fd 100644 --- a/fs/ubifs/lpt.c +++ b/fs/ubifs/lpt.c @@ -1268,7 +1268,7 @@ static int read_pnode(struct ubifs_info *c, struct ubifs_nnode *parent, int iip) branch = >nbranch[iip]; lnum = branch->lnum; offs = branch->offs; - pnode = kzalloc(sizeof(struct ubifs_pnode), GFP_NOFS); + pnode = kzalloc(sizeof(struct ubifs_pnode), GFP_ATOMIC); if (!pnode) return -ENOMEM; -- 1.7.9.5
[PATCH] cachefiles: Fix a sleep-in-atomic bug in cachefiles_printk_object
The driver may sleep under a write spin lock, and the function call path is: cachefiles_mark_object_active (acquire the lock by write_lock) cachefiles_printk_object kmalloc(GFP_NOIO) --> may sleep cachefiles_mark_object_buried (acquire the lock by write_lock) cachefiles_printk_object kmalloc(GFP_NOIO) --> may sleep To fix it, "GFP_NOIO" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- fs/cachefiles/namei.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 41df8a2..d0f76e7 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -83,7 +83,7 @@ static noinline void cachefiles_printk_object(struct cachefiles_object *object, { u8 *keybuf; - keybuf = kmalloc(CACHEFILES_KEYBUF_SIZE, GFP_NOIO); + keybuf = kmalloc(CACHEFILES_KEYBUF_SIZE, GFP_ATOMIC); if (object) __cachefiles_printk_object(object, "", keybuf); if (xobject) -- 1.7.9.5
[PATCH V2] staging: rt5208: Fix a sleep-in-atomic bug in xd_copy_page
The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card xd_cleanup_work xd_delay_write xd_finish_write xd_copy_page wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in xd_copy_page. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- V2: * Add code explanation in this patch. drivers/staging/rts5208/xd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c index 85aba05..74d36f9 100644 --- a/drivers/staging/rts5208/xd.c +++ b/drivers/staging/rts5208/xd.c @@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 old_blk, u32 new_blk, reg = 0; rtsx_read_register(chip, XD_CTL, ); if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) { - wait_timeout(100); + mdelay(100); if (detect_card_cd(chip, XD_CARD) != STATUS_SUCCESS) { -- 1.7.9.5
[PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep To fix it, the lock is released before copy_from_user, and the lock is acquired again after this function. The parameter "flags" is used to restore and save the irq status. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- V3: * It corrects the mistakes of remaining legacy code in V2. (Thank Ram for pointing it out) V2: * The parameter "flags" is added to restore and save the irq status. Thank Leon for good advice. drivers/infiniband/sw/rxe/rxe_verbs.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..5293d15 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct rxe_send_wr *wr, static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, unsigned int mask, unsigned int length, -struct rxe_send_wqe *wqe) +struct rxe_send_wqe *wqe, unsigned long *flags) { int num_sge = ibwr->num_sge; struct ib_sge *sge; - int i; + int i, err; u8 *p; init_send_wr(qp, >wr, ibwr); @@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, sge = ibwr->sg_list; for (i = 0; i < num_sge; i++, sge++) { - if (qp->is_user && copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) + spin_unlock_irqrestore(>sq.sq_lock, *flags); + err = copy_from_user(p, (__user void *) + (uintptr_t)sge->addr, sge->length); + spin_lock_irqsave(>sq.sq_lock, *flags); + if (qp->is_user && err) return -EFAULT; else if (!qp->is_user) @@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr, send_wqe = producer_addr(sq->queue); - err = init_send_wqe(qp, ibwr, mask, length, send_wqe); + err = init_send_wqe(qp, ibwr, mask, length, send_wqe, ); if (unlikely(err)) goto err1; -- 1.7.9.5
[PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep To fix it, the lock is released before copy_from_user, and the lock is acquired again after this function. The parameter "flags" is used to restore and save the irq status. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- V3: * It corrects the mistakes of remaining legacy code in V2. (Thank Ram for pointing it out) V2: * The parameter "flags" is added to restore and save the irq status. Thank Leon for good advice. --- drivers/infiniband/sw/rxe/rxe_verbs.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 83d709e..5293d15 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct rxe_send_wr *wr, static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, unsigned int mask, unsigned int length, -struct rxe_send_wqe *wqe) +struct rxe_send_wqe *wqe, unsigned long *flags) { int num_sge = ibwr->num_sge; struct ib_sge *sge; - int i; + int i, err; u8 *p; init_send_wr(qp, >wr, ibwr); @@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, sge = ibwr->sg_list; for (i = 0; i < num_sge; i++, sge++) { - if (qp->is_user && copy_from_user(p, (__user void *) - (uintptr_t)sge->addr, sge->length)) + spin_unlock_irqrestore(>sq.sq_lock, *flags); + err = copy_from_user(p, (__user void *) + (uintptr_t)sge->addr, sge->length); + spin_lock_irqsave(>sq.sq_lock, *flags); + if (qp->is_user && err) return -EFAULT; else if (!qp->is_user) @@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr, send_wqe = producer_addr(sq->queue); - err = init_send_wqe(qp, ibwr, mask, length, send_wqe); + err = init_send_wqe(qp, ibwr, mask, length, send_wqe, ); if (unlikely(err)) goto err1; -- 1.7.9.5
[PATCH V2] staging: rt5208: Fix a sleep-in-atomic bug in xd_copy_page
The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card xd_cleanup_work xd_delay_write xd_finish_write xd_copy_page wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in xd_copy_page. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- V2: * Add code explanation in this patch. --- drivers/staging/rts5208/xd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c index 85aba05..74d36f9 100644 --- a/drivers/staging/rts5208/xd.c +++ b/drivers/staging/rts5208/xd.c @@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 old_blk, u32 new_blk, reg = 0; rtsx_read_register(chip, XD_CTL, ); if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) { - wait_timeout(100); + mdelay(100); if (detect_card_cd(chip, XD_CARD) != STATUS_SUCCESS) { -- 1.7.9.5
[PATCH] fs: nfs: Fix a sleep-in-atomic bug in nfs_access_add_cache
The driver may sleep under a rcu read lock, and function call path is: nfs_permission (acquire the lock by rcu_read_lock) nfs_do_access nfs_access_add_cache kmalloc(GFP_KERNEL) --> may sleep To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- fs/nfs/dir.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 32ccd77..7a074db 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -2333,7 +2333,7 @@ static void nfs_access_add_rbtree(struct inode *inode, struct nfs_access_entry * void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set) { - struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_KERNEL); + struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_ATOMIC); if (cache == NULL) return; RB_CLEAR_NODE(>rb_node); -- 1.7.9.5
[PATCH] fs: xfs: Fix a lock-twice and sleep-in-atomic bug in xfs_iget
The driver may sleep under a read rcu lock, and function call path is: xfs_iget (acquire the lock by rcu_read_lock) "goto out_error_or_again" after xfs_iget_cache_hit delay schedule_timeout_uninterruptible --> may sleep Meanwhile, the rcu_read_lock will be called twice in this situation. To fix it, the lock is released before "goto". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- fs/xfs/xfs_icache.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index f61c84f8..c2a4722 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -600,8 +600,10 @@ struct xfs_inode * if (ip) { error = xfs_iget_cache_hit(pag, ip, ino, flags, lock_flags); - if (error) + if (error) { + rcu_read_unlock(); goto out_error_or_again; + } } else { rcu_read_unlock(); XFS_STATS_INC(mp, xs_ig_missed); -- 1.7.9.5
[PATCH] rts5208: Fix a sleep-in-atomic bug in sd_send_cmd_get_rsp
The driver may sleep under a spin lock, and the function call path is: rtsx_exclusive_enter_ss (acquire the lock by spin_lock) rtsx_enter_ss rtsx_power_off_card sd_cleanup_work sd_stop_seq_mode sd_switch_clock sd_ddr_tuning sd_ddr_pre_tuning_tx sd_send_cmd_get_rsp wait_timeout schedule_timeout --> may sleep To fix it, "wait_timeout" is replaced with mdelay in sd_send_cmd_get_rsp. Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com> --- drivers/staging/rts5208/sd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c index bdd35b6..fed17ff 100644 --- a/drivers/staging/rts5208/sd.c +++ b/drivers/staging/rts5208/sd.c @@ -226,7 +226,7 @@ static int sd_send_cmd_get_rsp(struct rtsx_chip *chip, u8 cmd_idx, return STATUS_FAIL; } if (rty_cnt< SD_MAX_RETRY_COUNT) { - wait_timeout(20); + mdelay(20); rty_cnt++; goto RTY_SEND_CMD; } else { -- 1.7.9.5
[PATCH] net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse
The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the function call path is: tipc_l2_rcv_msg (acquire the lock by rcu_read_lock) tipc_rcv tipc_sk_rcv tipc_msg_reverse pskb_expand_head(GFP_KERNEL) --> may sleep tipc_node_broadcast tipc_node_xmit_skb tipc_node_xmit tipc_sk_rcv tipc_msg_reverse pskb_expand_head(GFP_KERNEL) --> may sleep To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- net/tipc/msg.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 312ef7d..ab30876 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -508,7 +508,7 @@ bool tipc_msg_reverse(u32 own_node, struct sk_buff **skb, int err) } if (skb_cloned(_skb) && - pskb_expand_head(_skb, BUF_HEADROOM, BUF_TAILROOM, GFP_KERNEL)) + pskb_expand_head(_skb, BUF_HEADROOM, BUF_TAILROOM, GFP_ATOMIC)) goto exit; /* Now reverse the concerned fields */ -- 1.7.9.5
[PATCH] net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the function call path is: cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock) cfctrl_linkdown_req cfpkt_create cfpkt_create_pfx alloc_skb(GFP_KERNEL) --> may sleep cfserl_receive (acquire the lock by rcu_read_lock) cfpkt_split cfpkt_create_pfx alloc_skb(GFP_KERNEL) --> may sleep There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or "GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function is called under a rcu read lock, instead in interrupt. To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- net/caif/cfpkt_skbuff.c |6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/net/caif/cfpkt_skbuff.c b/net/caif/cfpkt_skbuff.c index 59ce1fc..71b6ab2 100644 --- a/net/caif/cfpkt_skbuff.c +++ b/net/caif/cfpkt_skbuff.c @@ -81,11 +81,7 @@ static struct cfpkt *cfpkt_create_pfx(u16 len, u16 pfx) { struct sk_buff *skb; - if (likely(in_interrupt())) - skb = alloc_skb(len + pfx, GFP_ATOMIC); - else - skb = alloc_skb(len + pfx, GFP_KERNEL); - + skb = alloc_skb(len + pfx, GFP_ATOMIC); if (unlikely(skb == NULL)) return NULL; -- 1.7.9.5
[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_hw_read_wx_2M and netxen_nic_hw_write_wx_2M
The driver may sleep under a write spin lock, and function call path is: netxen_nic_hw_read_wx_2M (acquire the lock by write_lock_irqsave) crb_win_lock netxen_pcie_sem_lock msleep --> may sleep netxen_nic_hw_write_wx_2M (acquire the lock by write_lock_irqsave) crb_win_lock netxen_pcie_sem_lock msleep --> may sleep To fix it, the "msleep" is replaced with "mdelay" in netxen_pcie_sem_lock. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c index a996801..0a9da42 100644 --- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c +++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c @@ -329,7 +329,7 @@ static void __iomem *pci_base_offset(struct netxen_adapter *adapter, break; if (++timeout >= NETXEN_PCIE_SEM_TIMEOUT) return -EIO; - msleep(1); + mdelay(1); } if (id_reg) -- 1.7.9.5
[PATCH] qla4xxx: Fix a sleep-in-atomic bug in qla4_82xx_crb_win_lock
The driver may sleep under a write spin lock, the function call path is: qla4_82xx_wr_32 (acquire the lock) qla4_82xx_crb_win_lock schedule or cpu_relax To fix it, the lock is released before "schedule" and "cpu_relax", and the lock is acquired again after "schedule" and "cpu_relax". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/scsi/qla4xxx/ql4_glbl.h |2 +- drivers/scsi/qla4xxx/ql4_nx.c |8 +--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/qla4xxx/ql4_glbl.h b/drivers/scsi/qla4xxx/ql4_glbl.h index bce96a5..b723bef 100644 --- a/drivers/scsi/qla4xxx/ql4_glbl.h +++ b/drivers/scsi/qla4xxx/ql4_glbl.h @@ -115,7 +115,7 @@ uint8_t qla4xxx_update_local_ifcb(struct scsi_qla_host *ha, void qla4_82xx_queue_iocb(struct scsi_qla_host *ha); void qla4_82xx_complete_iocb(struct scsi_qla_host *ha); -int qla4_82xx_crb_win_lock(struct scsi_qla_host *); +int qla4_82xx_crb_win_lock(struct scsi_qla_host *, unsigned long *); void qla4_82xx_crb_win_unlock(struct scsi_qla_host *); int qla4_82xx_pci_get_crb_addr_2M(struct scsi_qla_host *, ulong *); void qla4_82xx_wr_32(struct scsi_qla_host *, ulong, u32); diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c index e91abb3..1cf5f4a 100644 --- a/drivers/scsi/qla4xxx/ql4_nx.c +++ b/drivers/scsi/qla4xxx/ql4_nx.c @@ -386,7 +386,7 @@ if (rv == 1) { write_lock_irqsave(>hw_lock, flags); - qla4_82xx_crb_win_lock(ha); + qla4_82xx_crb_win_lock(ha, ); qla4_82xx_pci_set_crbwindow_2M(ha, ); } @@ -410,7 +410,7 @@ uint32_t qla4_82xx_rd_32(struct scsi_qla_host *ha, ulong off) if (rv == 1) { write_lock_irqsave(>hw_lock, flags); - qla4_82xx_crb_win_lock(ha); + qla4_82xx_crb_win_lock(ha, ); qla4_82xx_pci_set_crbwindow_2M(ha, ); } data = readl((void __iomem *)off); @@ -476,7 +476,7 @@ int qla4_82xx_md_wr_32(struct scsi_qla_host *ha, uint32_t off, uint32_t data) #define CRB_WIN_LOCK_TIMEOUT 1 -int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) +int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha, unsigned long *flags) { int i; int done = 0, timeout = 0; @@ -491,6 +491,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) timeout++; + write_unlock_irqrestore(>hw_lock, *flags); /* Yield CPU */ if (!in_interrupt()) schedule(); @@ -498,6 +499,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha) for (i = 0; i < 20; i++) cpu_relax();/*This a nop instr on i386*/ } + write_lock_irqsave(>hw_lock, *flags); } qla4_82xx_wr_32(ha, QLA82XX_CRB_WIN_LOCK_ID, ha->func_num); return 0; -- 1.7.9.5
[PATCH] oss: Fix a sleep-in-atomic bug in midi_outc
The driver may sleep under a spin lock, and the function call path is: midi_outc (acquire the lock by spin_lock_irqsave) oss_broken_sleep_on schedule_timeout --> may sleep To fix it, the lock is released before oss_broken_sleep_on, and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- sound/oss/sequencer.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/oss/sequencer.c b/sound/oss/sequencer.c index f19da4b..3d95d752 100644 --- a/sound/oss/sequencer.c +++ b/sound/oss/sequencer.c @@ -1211,7 +1211,9 @@ static void midi_outc(int dev, unsigned char data) spin_lock_irqsave(,flags); while (n && !midi_devs[dev]->outputc(dev, data)) { + spin_unlock_irqrestore(, flags); oss_broken_sleep_on(_sleeper, HZ/25); + spin_lock_irqsave(, flags); n--; } spin_unlock_irqrestore(,flags); -- 1.7.9.5
[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct
The driver may sleep under a spin lock, and the function call path is: netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock) ioremap --> may sleep To fix it, the lock is released before "ioremap", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c index a996801..5ea553e 100644 --- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c +++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c @@ -1419,7 +1419,9 @@ static u32 netxen_nic_io_read_2M(struct netxen_adapter *adapter, mem_base = pci_resource_start(adapter->pdev, 0) + (start & PAGE_MASK); + spin_unlock(>ahw.mem_lock); mem_ptr = ioremap(mem_base, PAGE_SIZE); + spin_lock(>ahw.mem_lock); if (mem_ptr == NULL) { ret = -EIO; goto unlock; -- 1.7.9.5
[PATCH] qed: Fix a sleep-in-interrupt bug in qed_int_sp_dpc
The driver may sleep in interrupt handling, and the function call path is: qed_int_sp_dpc (tasklet_init indicates it handles interrupt) qed_int_attentions qed_mcp_handle_events qed_mcp_handle_link_change qed_link_update qed_fill_link qed_mcp_get_media_type qed_ptt_acquire usleep_range --> may sleep To fix it, the "usleep_range" is replaced with "udelay". Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/ethernet/qlogic/qed/qed_hw.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_hw.c b/drivers/net/ethernet/qlogic/qed/qed_hw.c index a05feb3..3250cc4 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_hw.c +++ b/drivers/net/ethernet/qlogic/qed/qed_hw.c @@ -131,7 +131,7 @@ struct qed_ptt *qed_ptt_acquire(struct qed_hwfn *p_hwfn) } spin_unlock_bh(_hwfn->p_ptt_pool->lock); - usleep_range(1000, 2000); + udelay(1500); } DP_NOTICE(p_hwfn, "PTT acquire timeout - failed to allocate PTT\n"); -- 1.7.9.5
[PATCH v2 2/3] rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd
The driver may sleep under a spinlock, and the function call path is: rtw_surveydone_event_callback(acquire the spinlock) rtw_createbss_cmd kzalloc(GFP_KERNEL) --> may sleep To fix it, GFP_KERNEL is replaced with GFP_ATOMIC. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rtl8188eu/core/rtw_cmd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c b/drivers/staging/rtl8188eu/core/rtw_cmd.c index 9461bce..430b8eb 100644 --- a/drivers/staging/rtl8188eu/core/rtw_cmd.c +++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c @@ -333,7 +333,7 @@ u8 rtw_createbss_cmd(struct adapter *padapter) else RT_TRACE(_module_rtl871x_cmd_c_, _drv_info_, (" createbss for SSid:%s\n", pmlmepriv->assoc_ssid.Ssid)); - pcmd = kzalloc(sizeof(struct cmd_obj), GFP_KERNEL); + pcmd = kzalloc(sizeof(struct cmd_obj), GFP_ATOMIC); if (!pcmd) { res = _FAIL; goto exit; -- 1.7.9.5
[PATCH v2 1/3] rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd
The driver may sleep under a spinlock, and the function call path is: rtw_set_802_11_bssid(acquire the spinlock) rtw_disassoc_cmd kzalloc(GFP_KERNEL) --> may sleep To fix it, GFP_KERNEL is replaced with GFP_ATOMIC. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rtl8188eu/core/rtw_cmd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c b/drivers/staging/rtl8188eu/core/rtw_cmd.c index 9461bce..65083a7 100644 --- a/drivers/staging/rtl8188eu/core/rtw_cmd.c +++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c @@ -508,7 +508,7 @@ u8 rtw_disassoc_cmd(struct adapter *padapter, u32 deauth_timeout_ms, bool enqueu if (enqueue) { /* need enqueue, prepare cmd_obj and enqueue */ - cmdobj = kzalloc(sizeof(*cmdobj), GFP_KERNEL); + cmdobj = kzalloc(sizeof(*cmdobj), GFP_ATOMIC); if (!cmdobj) { res = _FAIL; kfree(param); -- 1.7.9.5
[PATCH v2 3/3] rtl8188eu: Fix a possible sleep-in-atomic bug in _rtw_pwr_wakeup
The driver may sleep under a spinlock, and the function call path is: rtw_set_802_11_disassociate(acquire the spinlock) _rtw_pwr_wakeup usleep_range --> may sleep To fix it, usleep_range is replaced with udelay. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rtl8188eu/core/rtw_pwrctrl.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c b/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c index f86c9ce..2913661 100644 --- a/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c +++ b/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c @@ -569,7 +569,7 @@ int _rtw_pwr_wakeup(struct adapter *padapter, u32 ips_deffer_ms, const char *cal DBG_88E("%s wait ps_processing...\n", __func__); while (pwrpriv->ps_processing && jiffies_to_msecs(jiffies - start) <= 3000) - usleep_range(1000, 3000); + udelay(1500); if (pwrpriv->ps_processing) DBG_88E("%s wait ps_processing timeout\n", __func__); else -- 1.7.9.5
[PATCH] arcmsr: Fix possible sleep-in-atomic bugs in arcmsr_queue_command
The driver may sleep under a spinlock, and the function call paths are: arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaA_stop_bgrb arcmsr_hbaA_wait_msgint_ready msleep --> may sleep arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaB_stop_bgrb arcmsr_hbaB_wait_msgint_ready msleep --> may sleep arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaC_stop_bgrb arcmsr_hbaC_wait_msgint_ready msleep --> may sleep arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaD_stop_bgrb arcmsr_hbaD_wait_msgint_ready msleep --> may sleep To fix them, msleep is replaced with mdelay. These bugs are found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/scsi/arcmsr/arcmsr_hba.c |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index af032c4..31d94bd 100644 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -347,7 +347,7 @@ static uint8_t arcmsr_hbaA_wait_msgint_ready(struct AdapterControlBlock *acb) >outbound_intstatus); return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; @@ -367,7 +367,7 @@ static uint8_t arcmsr_hbaB_wait_msgint_ready(struct AdapterControlBlock *acb) reg->drv2iop_doorbell); return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; @@ -385,7 +385,7 @@ static uint8_t arcmsr_hbaC_wait_msgint_ready(struct AdapterControlBlock *pACB) >outbound_doorbell_clear); /*clear interrupt*/ return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; @@ -403,7 +403,7 @@ static bool arcmsr_hbaD_wait_msgint_ready(struct AdapterControlBlock *pACB) reg->outbound_doorbell); return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; } -- 1.7.9.5
[PATCH] wcn36xx: Remove unnecessary rcu_read_unlock in wcn36xx_bss_info_changed
No rcu_read_lock is called, but rcu_read_unlock is still called. Thus rcu_read_unlock should be removed. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/net/wireless/ath/wcn36xx/main.c |1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index 35bd50b..b83f01d 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -812,7 +812,6 @@ static void wcn36xx_bss_info_changed(struct ieee80211_hw *hw, if (!sta) { wcn36xx_err("sta %pM is not found\n", bss_conf->bssid); - rcu_read_unlock(); goto out; } sta_priv = wcn36xx_sta_to_priv(sta); -- 1.7.9.5
[BUG] gma500: Possible sleep-in-atomic bugs in gma_resume_pci
The driver may sleep under a spinlock, and the function call paths are: gma_power_begin (acquire the spinlock) gma_resume_pci pci_set_power_state __pci_start_power_transition (drivers/pci/pci.c) msleep --> may sleep gma_power_begin (acquire the spinlock) gma_resume_pci pci_enable_device pci_enable_device_flags (drivers/pci/pci.c) do_pci_enable_device pci_set_power_state __pci_start_power_transition msleep --> may sleep A possible fix is to replace msleep with mdelay in __pci_start_power_transition in drivers/pci/pci.c. These bugs are found by my static analysis tool and my code review. Thanks, Jia-Ju Bai
[BUG] rtl8188eu: Some possible sleep-in-atomic bugs in ips_leave
CC to mailing list. On 2017/10/8 20:13, Jia-Ju Bai wrote: The driver may sleep under a spinlock when calling the function "ips_leave", which causes some possible sleep-in-atomic bugs. Here are several examples: rtw_set_802_11_disassociate (acquire the spinlock) _rtw_pwr_wakeup ips_leave mutex_lock --> may sleep rtw_set_802_11_disassociate (acquire the spinlock) _rtw_pwr_wakeup ips_leave rtw_ips_pwr_up ips_netdrv_open rtw_hal_init rtl8188eu_hal_init rtl88eu_download_fw request_firmware --> may sleep kmalloc --> may sleep rtw_set_802_11_disassociate (acquire the spinlock) _rtw_pwr_wakeup ips_leave rtw_set_key kzalloc(GFP_KERNEL) --> may sleep All these bugs are caused by that "ips_leave" calls some sleep-able functions. A possible fix is to release the spinlock before calling "ips_leave", and acquire the spinlock again after it. These bugs are found by my static analysis tool and my code review. Thanks, Jia-Ju Bai
[PATCH] staging/rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd
The driver may sleep under a spinlock, and the function call path is: rtw_set_802_11_bssid(acquire the spinlock) rtw_disassoc_cmd kzalloc(GFP_KERNEL) --> may sleep To fix it, GFP_KERNEL is replaced with GFP_ATOMIC. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com> --- drivers/staging/rtl8188eu/core/rtw_cmd.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c b/drivers/staging/rtl8188eu/core/rtw_cmd.c index 9461bce..65083a7 100644 --- a/drivers/staging/rtl8188eu/core/rtw_cmd.c +++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c @@ -508,7 +508,7 @@ u8 rtw_disassoc_cmd(struct adapter *padapter, u32 deauth_timeout_ms, bool enqueu if (enqueue) { /* need enqueue, prepare cmd_obj and enqueue */ - cmdobj = kzalloc(sizeof(*cmdobj), GFP_KERNEL); + cmdobj = kzalloc(sizeof(*cmdobj), GFP_ATOMIC); if (!cmdobj) { res = _FAIL; kfree(param); -- 1.7.9.5
[BUG] haswell: A possible sleep-in-atomic bug in hsw_irq_thread
According to sst-haswell-ipc.c, the driver may sleep under a spinlock, and the function call path is: hsw_irq_thread (acquire the spinlock) hsw_process_notification hsw_log_message mutex_lock --> may sleep This bug is found by my static analysis tool and my code review. Thanks, Jia-Ju Bai
[BUG] stmmac: A possible sleep-in-atomic bug in stmmac_suspend
According to stmmac_main.c, the driver may sleep under a spinlock, and the function call path is: stmmac_suspend (acquire the spinlock) stmmac_disable_all_queues napi_disable might_sleep --> may sleep msleep --> may sleep This bug is found by my static analysis tool and my code review. Thanks, Jia-Ju Bai
Re: [PATCH] ext2/super: Fix a possible sleep-in-atomic bug in parse_options
Thanks for your reply. I agree that extra allocation in match_number() and match_u64int() may be unnecessary. Thanks, Jia-Ju Bai On 2017/10/7 9:37, Linus Torvalds wrote: On Fri, Oct 6, 2017 at 6:20 PM, Jia-Ju Bai <baijiaju1...@163.com> wrote: To fix it, GFP_KERNEL is replaced with GFP_ATOMIC. This bug is found by my static analysis tool and my code review. I'm not saying your patch is wrong, but it's a shame that we do that extra allocation in match_number() and match_u64int(), and that we don't have anything that is just size-limited. And there really isn't anything saying that we shouldn't do the same silly thing to match_u64int(). Maybe we don't have any actual users that need it for now, but still.. Oh well. I do wonder if we shouldn't just use something like "skip leading zeroes, copy to size-limited stack location instead" because the input length really *is* limited once you skip leading zeroes (and whatever base marker we have). We might have at most a 64-bit value in octal, so 22 bytes max. But I guess just changing the two GFP_KERNEL's to GFP_ATOMIC is much simpler. Linus