[PATCH] igb: Fix a deadlock in igb_sriov_reinit

2015-08-02 Thread Jia-Ju Bai
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock 
acquired by rtnl_lock() is not released, which causes a deadlock.
This patch adds rtnl_unlock() in error handling to fix it.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/intel/igb/igb_main.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 2f70a9b..311d1ca 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7539,6 +7539,7 @@ static int igb_sriov_reinit(struct pci_dev *dev)
 
if (igb_init_interrupt_scheme(adapter, true)) {
dev_err(pdev-dev, Unable to allocate memory for queues\n);
+   rtnl_unlock();
return -ENOMEM;
}
 
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] 3c59x: Fix resource leaks in vortex_open

2015-08-02 Thread Jia-Ju Bai
When vortex_up is failed, the skb buffers allocated by __netdev_alloc_skb
in vortex_open are not released, which may cause resource leaks.
This bug has been submitted before.
This patch modifies the error handling code to fix it.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/3com/3c59x.c |   17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c 
b/drivers/net/ethernet/3com/3c59x.c
index 2d1ce3c..753887d 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -1763,16 +1763,9 @@ vortex_open(struct net_device *dev)
vp-rx_ring[i].addr = 
cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb-data, PKT_BUF_SZ, 
PCI_DMA_FROMDEVICE));
}
if (i != RX_RING_SIZE) {
-   int j;
pr_emerg(%s: no memory for rx ring\n, dev-name);
-   for (j = 0; j  i; j++) {
-   if (vp-rx_skbuff[j]) {
-   dev_kfree_skb(vp-rx_skbuff[j]);
-   vp-rx_skbuff[j] = NULL;
-   }
-   }
retval = -ENOMEM;
-   goto err_free_irq;
+   goto err_free_skb;
}
/* Wrap the ring. */
vp-rx_ring[i-1].next = cpu_to_le32(vp-rx_ring_dma);
@@ -1782,7 +1775,13 @@ vortex_open(struct net_device *dev)
if (!retval)
goto out;
 
-err_free_irq:
+err_free_skb:
+   for (i = 0; i  RX_RING_SIZE; i++) {
+   if (vp-rx_skbuff[i]) {
+   dev_kfree_skb(vp-rx_skbuff[i]);
+   vp-rx_skbuff[i] = NULL;
+   }
+   }
free_irq(dev-irq, dev);
 err:
if (vortex_debug  1)
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] e1000e: Modify tx/rx configurations to avoid null pointer dereferences in e1000_open

2015-08-05 Thread Jia-Ju Bai
When e1000e_setup_rx_resources is failed in e1000_open,
e1000e_free_tx_resources in err_setup_rx segment is executed.
writel(0, tx_ring-head) statement in e1000_clean_tx_ring
in e1000e_free_tx_resources will cause a null poonter dereference(crash),
because tx_ring-head is only assigned in e1000_configure_tx
in e1000_configure, but it is after e1000e_setup_rx_resources.

This patch moves head/tail register writing to e1000_configure_tx/rx, 
which can fix this problem. It is inspired by igb_configure_tx_ring 
in the igb driver.

Specially, thank Alexander Duyck for his valuable suggestion.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/intel/e1000e/netdev.c |   24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 89d788d..3aee51b 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1737,12 +1737,6 @@ static void e1000_clean_rx_ring(struct e1000_ring 
*rx_ring)
rx_ring-next_to_clean = 0;
rx_ring-next_to_use = 0;
adapter-flags2 = ~FLAG2_IS_DISCARDING;
-
-   writel(0, rx_ring-head);
-   if (adapter-flags2  FLAG2_PCIM2PCI_ARBITER_WA)
-   e1000e_update_rdt_wa(rx_ring, 0);
-   else
-   writel(0, rx_ring-tail);
 }
 
 static void e1000e_downshift_workaround(struct work_struct *work)
@@ -2447,12 +2441,6 @@ static void e1000_clean_tx_ring(struct e1000_ring 
*tx_ring)
 
tx_ring-next_to_use = 0;
tx_ring-next_to_clean = 0;
-
-   writel(0, tx_ring-head);
-   if (adapter-flags2  FLAG2_PCIM2PCI_ARBITER_WA)
-   e1000e_update_tdt_wa(tx_ring, 0);
-   else
-   writel(0, tx_ring-tail);
 }
 
 /**
@@ -2954,6 +2942,12 @@ static void e1000_configure_tx(struct e1000_adapter 
*adapter)
tx_ring-head = adapter-hw.hw_addr + E1000_TDH(0);
tx_ring-tail = adapter-hw.hw_addr + E1000_TDT(0);
 
+   writel(0, tx_ring-head);
+   if (adapter-flags2  FLAG2_PCIM2PCI_ARBITER_WA)
+   e1000e_update_tdt_wa(tx_ring, 0);
+   else
+   writel(0, tx_ring-tail);
+
/* Set the Tx Interrupt Delay register */
ew32(TIDV, adapter-tx_int_delay);
/* Tx irq moderation */
@@ -3275,6 +3269,12 @@ static void e1000_configure_rx(struct e1000_adapter 
*adapter)
rx_ring-head = adapter-hw.hw_addr + E1000_RDH(0);
rx_ring-tail = adapter-hw.hw_addr + E1000_RDT(0);
 
+   writel(0, rx_ring-head);
+   if (adapter-flags2  FLAG2_PCIM2PCI_ARBITER_WA)
+   e1000e_update_rdt_wa(rx_ring, 0);
+   else
+   writel(0, rx_ring-tail);
+
/* Enable Receive Checksum Offload for TCP and UDP */
rxcsum = er32(RXCSUM);
if (adapter-netdev-features  NETIF_F_RXCSUM)
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] e1000e: Modify tx/rx configurations to avoid null pointer dereferences in e1000_open

2015-08-05 Thread Jia-Ju Bai

On 08/05/2015 06:43 PM, Jeff Kirsher wrote:


Is your intention that this patch replace the existing patch:
http://patchwork.ozlabs.org/patch/502990/
...which is currently in my queue?



Okay, please replace the previous patch.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] igb: Fix a memory leak in igb_probe

2015-08-05 Thread Jia-Ju Bai
In error handling code of igb_probe, the memory adapter-shadow_vfta 
allocated by kcalloc in igb_sw_init is not freed. So when register_netdev 
or igb_init_i2c is failed, a memory leak will occur.
This patch adds kfree to fix it.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/intel/igb/igb_main.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 2f70a9b..3efb757 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2645,6 +2645,7 @@ err_eeprom:
if (hw-flash_address)
iounmap(hw-flash_address);
 err_sw_init:
+   kfree(adapter-shadow_vfta);
igb_clear_interrupt_scheme(adapter);
pci_iounmap(pdev, hw-hw_addr);
 err_ioremap:
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH]Fix a null dereference in e1000_open

2015-08-02 Thread Jia-Ju Bai
When e1000e_setup_rx_resources is failed in e1000_open,
e1000e_free_tx_resources in err_setup_rx segment is executed.
writel(0, tx_ring-head) statement in e1000_clean_tx_ring
in e1000e_free_tx_resources will cause a null dereference(crash), 
because tx_ring-head is only assigned in e1000_configure_tx
in e1000_configure, but it is after e1000e_setup_rx_resources.
This patch adds a check to fix it.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/intel/e1000e/netdev.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 89d788d..838ec8e 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -2448,7 +2448,8 @@ static void e1000_clean_tx_ring(struct e1000_ring 
*tx_ring)
tx_ring-next_to_use = 0;
tx_ring-next_to_clean = 0;
 
-   writel(0, tx_ring-head);
+   if (tx_ring-head)
+   writel(0, tx_ring-head);
if (adapter-flags2  FLAG2_PCIM2PCI_ARBITER_WA)
e1000e_update_tdt_wa(tx_ring, 0);
else
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] e100: Release skb when DMA mapping is failed in e100_xmit_prepare

2015-08-02 Thread Jia-Ju Bai
When pci_dma_mapping_error in e100_xmit_prepare is failed, the skb buffer
allocated by netdev_alloc_skb_ip_align in e100_rx_alloc_skb is not
released, which causes a possible resource leak.
This patch adds error handling code to fix it.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/intel/e100.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e100.c 
b/drivers/net/ethernet/intel/e100.c
index d2657a4..cc90616 100644
--- a/drivers/net/ethernet/intel/e100.c
+++ b/drivers/net/ethernet/intel/e100.c
@@ -1770,8 +1770,11 @@ static int e100_xmit_prepare(struct nic *nic, struct cb 
*cb,
dma_addr = pci_map_single(nic-pdev,
  skb-data, skb-len, PCI_DMA_TODEVICE);
/* If we can't map the skb, have the upper layer try later */
-   if (pci_dma_mapping_error(nic-pdev, dma_addr))
+   if (pci_dma_mapping_error(nic-pdev, dma_addr)) {
+   dev_kfree_skb_any(skb);
+   skb = NULL;
return -ENOMEM;
+   }
 
/*
 * Use the last 4 bytes of the SKB payload packet as the CRC, used for
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] e100: Add a check after pci_pool_create to avoid null pointer dereference

2015-08-02 Thread Jia-Ju Bai
The driver lacks the check of nic-cbs_pool after pci_pool_create
in e100_probe. When this function is failed, a null pointer dereference 
occurs when pci_pool_alloc uses nic-cbs_pool in e100_alloc_cbs.
This patch adds a check and related error handling code to fix it.

Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
---
 drivers/net/ethernet/intel/e100.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/e100.c 
b/drivers/net/ethernet/intel/e100.c
index d2657a4..767c161 100644
--- a/drivers/net/ethernet/intel/e100.c
+++ b/drivers/net/ethernet/intel/e100.c
@@ -2967,6 +2967,11 @@ static int e100_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
   nic-params.cbs.max * sizeof(struct cb),
   sizeof(u32),
   0);
+   if (!nic-cbs_pool) {
+   netif_err(nic, probe, nic-netdev, Cannot create DMA pool, 
aborting\n);
+   err = -ENOMEM;
+   goto err_out_pool;
+   }
netif_info(nic, probe, nic-netdev,
   addr 0x%llx, irq %d, MAC addr %pM\n,
   (unsigned long long)pci_resource_start(pdev, use_io ? 1 : 0),
@@ -2974,6 +2979,8 @@ static int e100_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
 
return 0;
 
+err_out_pool:
+   unregister_netdev(netdev);
 err_out_free:
e100_free(nic);
 err_out_iounmap:
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] rt2x00pci: Disable memory-write-invalidate when the driver exits

2016-01-03 Thread Jia-Ju Bai
The driver calls pci_set_mwi to enable memory-write-invalidate when it
is initialized, but does not call pci_clear_mwi when it is removed. Many
other drivers calls pci_clear_mwi when pci_set_mwi is called, such as
r8169, 8139cp and e1000.

This patch adds pci_clear_mwi in error handling and removal procedure, 
which can fix the problem.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/wireless/rt2x00/rt2x00pci.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/rt2x00/rt2x00pci.c 
b/drivers/net/wireless/rt2x00/rt2x00pci.c
index d93db4b..eb6dbcd 100644
--- a/drivers/net/wireless/rt2x00/rt2x00pci.c
+++ b/drivers/net/wireless/rt2x00/rt2x00pci.c
@@ -149,6 +149,7 @@ exit_free_device:
ieee80211_free_hw(hw);
 
 exit_release_regions:
+   pci_clear_mwi(pci_dev);
pci_release_regions(pci_dev);
 
 exit_disable_device:
@@ -173,6 +174,7 @@ void rt2x00pci_remove(struct pci_dev *pci_dev)
/*
 * Free the PCI device data.
 */
+   pci_clear_mwi(pci_dev);
pci_disable_device(pci_dev);
pci_release_regions(pci_dev);
 }
-- 
1.7.9.5


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] ehci-hcd: Cleanup memory resources when ehci_halt fails

2016-01-04 Thread Jia-Ju Bai
The driver calls ehci_mem_init to allocate memory resources. 
But these resources are not freed when ehci_halt fails.

This patch adds "ehci_mem_cleanup" in error handling code to fix this problem.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/usb/host/ehci-hcd.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 48c92bf..015b411 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -675,8 +675,10 @@ int ehci_setup(struct usb_hcd *hcd)
return retval;
 
retval = ehci_halt(ehci);
-   if (retval)
+   if (retval) {
+   ehci_mem_cleanup(ehci);
return retval;
+   }
 
ehci_reset(ehci);
 
-- 
1.7.9.5


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] ehci-hcd: Disable memory-write-invalidate when the driver is removed

2016-01-04 Thread Jia-Ju Bai
The driver calls pci_set_mwi to enable memory-write-invalidate when it 
is initialized, but does not call pci_clear_mwi when it is removed. Many 
other drivers calls pci_clear_mwi when pci_set_mwi is called, such as 
r8169, 8139cp and e1000.

This patch adds a function "ehci_pci_remove" to remove the pci driver.
This function calls pci_clear_mwi and usb_hcd_pci_remove, which can 
fix the problem.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/usb/host/ehci-pci.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/ehci-pci.c b/drivers/usb/host/ehci-pci.c
index 2a5d2fd..3b3649d 100644
--- a/drivers/usb/host/ehci-pci.c
+++ b/drivers/usb/host/ehci-pci.c
@@ -377,6 +377,12 @@ static int ehci_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *id)
return usb_hcd_pci_probe(pdev, id);
 }
 
+static void ehci_pci_remove(struct pci_dev *pdev)
+{
+   pci_clear_mwi(pdev);
+   usb_hcd_pci_remove(pdev);   
+}
+
 /* PCI driver selection metadata; PCI hotplugging uses this */
 static const struct pci_device_id pci_ids [] = { {
/* handle any USB 2.0 EHCI controller */
@@ -396,7 +402,7 @@ static struct pci_driver ehci_pci_driver = {
.id_table = pci_ids,
 
.probe =ehci_pci_probe,
-   .remove =   usb_hcd_pci_remove,
+   .remove =   ehci_pci_remove,
.shutdown = usb_hcd_pci_shutdown,
 
 #ifdef CONFIG_PM
-- 
1.7.9.5


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] rt2x00pci: Disable memory-write-invalidate when the driver exits

2016-01-04 Thread Jia-Ju Bai

On 01/05/2016 12:50 AM, Helmut Schaa wrote:

On Mon, Jan 4, 2016 at 8:55 AM, Jia-Ju Bai<baijiaju1...@163.com>  wrote:

The driver calls pci_set_mwi to enable memory-write-invalidate when it
is initialized, but does not call pci_clear_mwi when it is removed. Many
other drivers calls pci_clear_mwi when pci_set_mwi is called, such as
r8169, 8139cp and e1000.

This patch adds pci_clear_mwi in error handling and removal procedure,
which can fix the problem.

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>

Looks good to me.
Does this fix any actual issue?
If yes it might we worth to mention it in the commit message.
Helmut



Lacking pci_clear_mwi may cause a resource-release omission,
but this omission may not cause obvious issues.
For reliability, it is better to add pci_clear_mwi in the driver.
Many other drivers do so, such as r8169, 8139cp and e1000.


Jia-Ju Bai

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct

2017-06-21 Thread Jia-Ju Bai

On 06/21/2017 02:11 PM, Kalle Valo wrote:

David Miller<da...@davemloft.net>  writes:


From: Jia-Ju Bai<baijiaju1...@163.com>
Date: Mon, 19 Jun 2017 10:48:53 +0800


The driver may sleep under a spin lock, and the function call path is:
netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock)
   ioremap -->  may sleep

To fix it, the lock is released before "ioremap", and the lock is
acquired again after this function.

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>

This style of change you are making is really starting to be a
problem.

You can't just drop locks like this, especially without explaining
why it's ok, and why the mutual exclusion this code was trying to
achieve is still going to be OK afterwards.

In fact, I see zero analysis of the locking situation here, why
it was needed in the first place, and why your change is OK in
that context.

Any locking change is delicate, and you must put the greatest of
care and consideration into it.

Just putting "unlock/lock" around the sleeping operation shows a
very low level of consideration for the implications of the change
you are making.

This isn't like making whitespace fixes, sorry...

We already tried to explain this to Jia-Ju during review of a wireless
patch:

https://patchwork.kernel.org/patch/9756585/

Jia-Ju, you should listen to feedback. If you continue submitting random
patches like this makes it hard for maintainers to trust your patches
anymore.


Hi,

I am quite sorry for my incorrect patches, and I will listen carefully 
to your advice.
In fact, for some bugs and patches which I have reported before, I have 
not received the feedback of them, so I resent them a few days ago, 
including this patch.

Sorry for my mistake again.

Thanks,
Jia-Ju Bai



Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct

2017-06-22 Thread Jia-Ju Bai

On 2017/6/22 14:08, Dan Carpenter wrote:

We should probably add a might_sleep() to ioremap() to prevent these
bugs in the future.

I think it is right to do this.
And it will be very useful to summarize common kernel interface 
functions which may sleep into a list. When writing a new driver, the 
developer can refer to this list to reduce or avoid sleep-in-atomic bugs.




This bug is eight years old.  You can report it, but it's going to hard
to get anyone to fix it.  I sometimes ignore ancient bugs.  On the other
hand, netxen is fairly well supported so it doesn't hurt to try.

I try to report bugs as soon as they are introduced.  I report it to
the author and CC the relevant list.  If people don't respond to my
email after a month then I complain again.

regards,
dan carpenter



Thanks for your helpful advice.

Thanks,
Jia-Ju Bai



Re: [PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct

2017-06-21 Thread Jia-Ju Bai

On 2017/6/21 21:40, Kalle Valo wrote:


Jia-Ju Bai <baijiaju1...@163.com> writes:


On 06/21/2017 02:11 PM, Kalle Valo wrote:

David Miller<da...@davemloft.net>  writes:


From: Jia-Ju Bai<baijiaju1...@163.com>
Date: Mon, 19 Jun 2017 10:48:53 +0800


The driver may sleep under a spin lock, and the function call path is:
netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock)
ioremap -->  may sleep

To fix it, the lock is released before "ioremap", and the lock is
acquired again after this function.

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>

This style of change you are making is really starting to be a
problem.

You can't just drop locks like this, especially without explaining
why it's ok, and why the mutual exclusion this code was trying to
achieve is still going to be OK afterwards.

In fact, I see zero analysis of the locking situation here, why
it was needed in the first place, and why your change is OK in
that context.

Any locking change is delicate, and you must put the greatest of
care and consideration into it.

Just putting "unlock/lock" around the sleeping operation shows a
very low level of consideration for the implications of the change
you are making.

This isn't like making whitespace fixes, sorry...

We already tried to explain this to Jia-Ju during review of a wireless
patch:

https://patchwork.kernel.org/patch/9756585/

Jia-Ju, you should listen to feedback. If you continue submitting random
patches like this makes it hard for maintainers to trust your patches
anymore.


Hi,

I am quite sorry for my incorrect patches, and I will listen carefully
to your advice. In fact, for some bugs and patches which I have
reported before, I have not received the feedback of them, so I resent
them a few days ago, including this patch.

Yeah, it is likely that some of your reports will not get any response.
For that I only suggest being persistent and providing more information
about the issue and suggestions how it might be possible to fix it. Also
Dan Carpenter (Cced) might have some suggestions.

But trying to "fix" it by just silencing the warning without proper
analysis is totally the wrong approach, you do more harm than good.

What tool do you use to find these issues? Is it publically available?



Hi,

Thanks a lot for your advice. And I am very glad to see that you may be 
interested in my work :)
This static tool is written by myself, instead of using or improving 
existing tools. A reason why I write it is that I have encountered some 
sleep-in-atomic bugs in my driver development :( .
However, due to preliminary implementation, this tool still has some 
limitations which can produce some false positives or negatives, and it 
may be not very easy to use. Thus, I am still improving this tool, 
checking more code and collecting results now. By the way, I apologize 
again for my incorrect patches of trying to "fix" the detected bugs.
In fact, I am very glad to make this tool available to effectively and 
conveniently check more system code. After I finish the improvements and 
perform more evaluation, I will make it publicly available.
If you have any suggestion or comment on my work, please feel free to 
contact me :)


Thanks,
Jia-Ju Bai






[PATCH] isdn: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock, the function call path is:
isdn_ppp_mp_receive (acquire the lock)
  isdn_ppp_mp_reassembly
isdn_ppp_push_higher
  isdn_ppp_decompress
isdn_ppp_ccp_reset_trans
  isdn_ppp_ccp_reset_alloc_state
kzalloc(GFP_KERNEL) --> may sleep

To fixed it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/isdn/i4l/isdn_ppp.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/isdn/i4l/isdn_ppp.c b/drivers/isdn/i4l/isdn_ppp.c
index d07dd519..8aa158a 100644
--- a/drivers/isdn/i4l/isdn_ppp.c
+++ b/drivers/isdn/i4l/isdn_ppp.c
@@ -2364,7 +2364,7 @@ static struct ippp_ccp_reset_state 
*isdn_ppp_ccp_reset_alloc_state(struct ippp_s
   id);
return NULL;
} else {
-   rs = kzalloc(sizeof(struct ippp_ccp_reset_state), GFP_KERNEL);
+   rs = kzalloc(sizeof(struct ippp_ccp_reset_state), GFP_ATOMIC);
if (!rs)
return NULL;
rs->state = CCPResetIdle;
-- 
1.7.9.5




[PATCH] [PATCH] qla4xxx: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, the function call path is:
qla4_82xx_wr_32 (acquire the lock)
  qla4_82xx_crb_win_lock
  schedule or cpu_relax

To fixed it, the lock is released before "schedule" and "cpu_relax", 
and the lock is acquired again after "schedule" and "cpu_relax".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/scsi/qla4xxx/ql4_nx.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c
index e91abb3..1cf5f4a 100644
--- a/drivers/scsi/qla4xxx/ql4_nx.c
+++ b/drivers/scsi/qla4xxx/ql4_nx.c
@@ -386,7 +386,7 @@
 
if (rv == 1) {
write_lock_irqsave(>hw_lock, flags);
-   qla4_82xx_crb_win_lock(ha);
+   qla4_82xx_crb_win_lock(ha, flags);
qla4_82xx_pci_set_crbwindow_2M(ha, );
}
 
@@ -410,7 +410,7 @@ uint32_t qla4_82xx_rd_32(struct scsi_qla_host *ha, ulong 
off)
 
if (rv == 1) {
write_lock_irqsave(>hw_lock, flags);
-   qla4_82xx_crb_win_lock(ha);
+   qla4_82xx_crb_win_lock(ha, flags);
qla4_82xx_pci_set_crbwindow_2M(ha, );
}
data = readl((void __iomem *)off);
@@ -476,7 +476,7 @@ int qla4_82xx_md_wr_32(struct scsi_qla_host *ha, uint32_t 
off, uint32_t data)
 
 #define CRB_WIN_LOCK_TIMEOUT 1
 
-int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
+int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha, unsigned long flags)
 {
int i;
int done = 0, timeout = 0;
@@ -491,6 +491,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
 
timeout++;
 
+   write_unlock_irqrestore(>hw_lock, flags);
/* Yield CPU */
if (!in_interrupt())
schedule();
@@ -498,6 +499,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
for (i = 0; i < 20; i++)
cpu_relax();/*This a nop instr on i386*/
}
+   write_lock_irqsave(>hw_lock, flags);
}
qla4_82xx_wr_32(ha, QLA82XX_CRB_WIN_LOCK_ID, ha->func_num);
return 0;
-- 
1.7.9.5




[PATCH] iscsi: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
iscsit_tpg_enable_portal_group (acquire the lock by spin_lock)
  iscsi_update_param_value
kstrdup(GFP_KERNEL) --> may sleep

To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/target/iscsi/iscsi_target_parameters.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target_parameters.c 
b/drivers/target/iscsi/iscsi_target_parameters.c
index fce6276..8768916 100644
--- a/drivers/target/iscsi/iscsi_target_parameters.c
+++ b/drivers/target/iscsi/iscsi_target_parameters.c
@@ -702,7 +702,7 @@ int iscsi_update_param_value(struct iscsi_param *param, 
char *value)
 {
kfree(param->value);
 
-   param->value = kstrdup(value, GFP_KERNEL);
+   param->value = kstrdup(value, GFP_ATOMIC);
if (!param->value) {
pr_err("Unable to allocate memory for value.\n");
return -ENOMEM;
-- 
1.7.9.5




[PATCH] enic: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock in some function call paths.
The 1st function call path is:
enic_reset (acquire the lock by spin_lock)
  enic_stop
enic_synchronize_irqs
  synchronize_irq --> may sleep
The 2nd function call path is:
enic_reset (acquire the lock by spin_lock)
  enic_dev_soft_reset
enic_dev_wait
  schedule_timeout_uninterruptible --> may sleep
The 3rd function call path is:
enic_reset (acquire the lock by spin_lock)
  enic_open
enic_request_intr
  enic_set_rx_cpu_rmap
enic_free_rx_cpu_rmap
  free_irq_cpu_rmap --> may sleep

To fix it, the "spin_lock" and "spin_unlock" are removed in enic_reset.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/cisco/enic/enic_main.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c 
b/drivers/net/ethernet/cisco/enic/enic_main.c
index 4b87bee..2a9bef8 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2310,7 +2310,6 @@ static void enic_reset(struct work_struct *work)
 
rtnl_lock();
 
-   spin_lock(>enic_api_lock);
enic_stop(enic->netdev);
enic_dev_soft_reset(enic);
enic_reset_addr_lists(enic);
@@ -2318,7 +2317,6 @@ static void enic_reset(struct work_struct *work)
enic_set_rss_nic_cfg(enic);
enic_dev_set_ig_vlan_rewrite_mode(enic);
enic_open(enic->netdev);
-   spin_unlock(>enic_api_lock);
call_netdevice_notifiers(NETDEV_REBOOT, enic->netdev);
 
rtnl_unlock();
-- 
1.7.9.5




[PATCH] i40e: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh)
  i40e_vsi_remove_pvid
i40e_vlan_stripping_disable
  i40e_aq_update_vsi_params
i40e_asq_send_command
  mutex_lock --> may sleep

To fixed it, the spin lock is released before "i40e_vsi_remove_pvid", and 
the lock is acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 95c23fb..0fb38ca 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -3017,10 +3017,12 @@ int i40e_ndo_set_vf_port_vlan(struct net_device 
*netdev, int vf_id,
   VLAN_VID_MASK));
}
 
+   spin_unlock_bh(>mac_filter_hash_lock);
if (vlan_id || qos)
ret = i40e_vsi_add_pvid(vsi, vlanprio);
else
i40e_vsi_remove_pvid(vsi);
+   spin_lock_bh(>mac_filter_hash_lock);
 
if (vlan_id) {
dev_info(>pdev->dev, "Setting VLAN %d, QOS 0x%x on VF %d\n",
-- 
1.7.9.5




[PATCH] enic: Fix another sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock in some function call paths.
The 1st function call path is:
enic_tx_hang_reset (acquire the lock by spin_lock)
  enic_stop
enic_synchronize_irqs
  synchronize_irq --> may sleep
The 2nd function call path is:
enic_tx_hang_reset (acquire the lock by spin_lock)
  enic_dev_wait
schedule_timeout_uninterruptible --> may sleep
The 3rd function call path is:
enic_tx_hang_reset (acquire the lock by spin_lock)
  enic_open
enic_request_intr
  enic_set_rx_cpu_rmap
enic_free_rx_cpu_rmap
  free_irq_cpu_rmap --> may sleep

To fix it, the "spin_lock" and "spin_unlock" are removed 
in enic_tx_hang_reset.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/cisco/enic/enic_main.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c 
b/drivers/net/ethernet/cisco/enic/enic_main.c
index 4b87bee..d6523e2 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2330,7 +2330,6 @@ static void enic_tx_hang_reset(struct work_struct *work)
 
rtnl_lock();
 
-   spin_lock(>enic_api_lock);
enic_dev_hang_notify(enic);
enic_stop(enic->netdev);
enic_dev_hang_reset(enic);
@@ -2339,7 +2338,6 @@ static void enic_tx_hang_reset(struct work_struct *work)
enic_set_rss_nic_cfg(enic);
enic_dev_set_ig_vlan_rewrite_mode(enic);
enic_open(enic->netdev);
-   spin_unlock(>enic_api_lock);
call_netdevice_notifiers(NETDEV_REBOOT, enic->netdev);
 
rtnl_unlock();
-- 
1.7.9.5




[PATCH] megaraid: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
mraid_mm_attach_buf (acquire the lock by spin_lock_irqsave)
  pci_pool_alloc(GFP_KERNEL) --> may sleep

To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/scsi/megaraid/megaraid_mm.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_mm.c 
b/drivers/scsi/megaraid/megaraid_mm.c
index 4cf9ed9..c43afb8 100644
--- a/drivers/scsi/megaraid/megaraid_mm.c
+++ b/drivers/scsi/megaraid/megaraid_mm.c
@@ -574,7 +574,7 @@
 
kioc->pool_index= right_pool;
kioc->free_buf  = 1;
-   kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_KERNEL,
+   kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_ATOMIC,
>buf_paddr);
spin_unlock_irqrestore(>lock, flags);
 
-- 
1.7.9.5




[PATCH V2] qla4xxx: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, the function call path is:
qla4_82xx_wr_32 (acquire the lock)
  qla4_82xx_crb_win_lock
  schedule or cpu_relax

To fix it, the lock is released before "schedule" and "cpu_relax",
and the lock is acquired again after "schedule" and "cpu_relax".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/scsi/qla4xxx/ql4_glbl.h |2 +-
 drivers/scsi/qla4xxx/ql4_nx.c   |8 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qla4xxx/ql4_glbl.h b/drivers/scsi/qla4xxx/ql4_glbl.h
index bce96a5..b723bef 100644
--- a/drivers/scsi/qla4xxx/ql4_glbl.h
+++ b/drivers/scsi/qla4xxx/ql4_glbl.h
@@ -115,7 +115,7 @@ uint8_t qla4xxx_update_local_ifcb(struct scsi_qla_host *ha,
 void qla4_82xx_queue_iocb(struct scsi_qla_host *ha);
 void qla4_82xx_complete_iocb(struct scsi_qla_host *ha);
 
-int qla4_82xx_crb_win_lock(struct scsi_qla_host *);
+int qla4_82xx_crb_win_lock(struct scsi_qla_host *, unsigned long);
 void qla4_82xx_crb_win_unlock(struct scsi_qla_host *);
 int qla4_82xx_pci_get_crb_addr_2M(struct scsi_qla_host *, ulong *);
 void qla4_82xx_wr_32(struct scsi_qla_host *, ulong, u32);
diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c
index e91abb3..1cf5f4a 100644
--- a/drivers/scsi/qla4xxx/ql4_nx.c
+++ b/drivers/scsi/qla4xxx/ql4_nx.c
@@ -386,7 +386,7 @@
 
if (rv == 1) {
write_lock_irqsave(>hw_lock, flags);
-   qla4_82xx_crb_win_lock(ha);
+   qla4_82xx_crb_win_lock(ha, flags);
qla4_82xx_pci_set_crbwindow_2M(ha, );
}
 
@@ -410,7 +410,7 @@ uint32_t qla4_82xx_rd_32(struct scsi_qla_host *ha, ulong 
off)
 
if (rv == 1) {
write_lock_irqsave(>hw_lock, flags);
-   qla4_82xx_crb_win_lock(ha);
+   qla4_82xx_crb_win_lock(ha, flags);
qla4_82xx_pci_set_crbwindow_2M(ha, );
}
data = readl((void __iomem *)off);
@@ -476,7 +476,7 @@ int qla4_82xx_md_wr_32(struct scsi_qla_host *ha, uint32_t 
off, uint32_t data)
 
 #define CRB_WIN_LOCK_TIMEOUT 1
 
-int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
+int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha, unsigned long flags)
 {
int i;
int done = 0, timeout = 0;
@@ -491,6 +491,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
 
timeout++;
 
+   write_unlock_irqrestore(>hw_lock, flags);
/* Yield CPU */
if (!in_interrupt())
schedule();
@@ -498,6 +499,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
for (i = 0; i < 20; i++)
cpu_relax();/*This a nop instr on i386*/
}
+   write_lock_irqsave(>hw_lock, flags);
}
qla4_82xx_wr_32(ha, QLA82XX_CRB_WIN_LOCK_ID, ha->func_num);
return 0;
-- 
1.7.9.5




[PATCH] gadget: Fix a sleep-in-atomic bug

2017-05-30 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
ffs_epfile_io (acquire the lock by spin_lock_irq)
  usb_ep_alloc_request(GFP_KERNEL) --> may sleep

To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/usb/gadget/function/f_fs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/function/f_fs.c 
b/drivers/usb/gadget/function/f_fs.c
index 47dda34..be90e25 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1015,7 +1015,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct 
ffs_io_data *io_data)
else
ret = ep->status;
goto error_mutex;
-   } else if (!(req = usb_ep_alloc_request(ep->ep, GFP_KERNEL))) {
+   } else if (!(req = usb_ep_alloc_request(ep->ep, GFP_ATOMIC))) {
ret = -ENOMEM;
} else {
req->buf  = data;
-- 
1.7.9.5




[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_deschedule (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..7e25eb9 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -140,6 +140,11 @@ struct cfs_wi_sched {
 
LASSERT(!in_interrupt()); /* because we use plain spinlock */
LASSERT(!sched->ws_stopping);
+   if (wi->wi_scheduled) {
+   LASSERT(!list_empty(>wi_list));
+   LASSERT(sched->ws_nscheduled > 0);
+   }
+   LASSERT(list_empty(>wi_list));
 
/*
 * return 0 if it's running already, otherwise return 1, which
@@ -151,17 +156,11 @@ struct cfs_wi_sched {
rc = !(wi->wi_running);
 
if (wi->wi_scheduled) { /* cancel pending schedules */
-   LASSERT(!list_empty(>wi_list));
list_del_init(>wi_list);
-
-   LASSERT(sched->ws_nscheduled > 0);
sched->ws_nscheduled--;
-
wi->wi_scheduled = 0;
}
 
-   LASSERT(list_empty(>wi_list));
-
spin_unlock(>ws_lock);
return rc;
 }
-- 
1.7.9.5




[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_schedule

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_schedule (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..30d28cd 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -179,12 +179,12 @@ struct cfs_wi_sched {
 {
LASSERT(!in_interrupt()); /* because we use plain spinlock */
LASSERT(!sched->ws_stopping);
+   if (!wi->wi_scheduled)
+   LASSERT(list_empty(>wi_list));
 
spin_lock(>ws_lock);
 
if (!wi->wi_scheduled) {
-   LASSERT(list_empty(>wi_list));
-
wi->wi_scheduled = 1;
sched->ws_nscheduled++;
if (!wi->wi_running) {
@@ -195,8 +195,8 @@ struct cfs_wi_sched {
}
}
 
-   LASSERT(!list_empty(>wi_list));
spin_unlock(>ws_lock);
+   LASSERT(!list_empty(>wi_list));
 }
 EXPORT_SYMBOL(cfs_wi_schedule);
 
-- 
1.7.9.5




[PATCH] bcache: Fix a sleep-in-atomic bug

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
journal_wait_for_write (acquire the lock by spin_lock)
  closure_sync
schedule --> may sleep

To fix it, the lock is released before "closure_sync", and the lock is 
acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/md/bcache/journal.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 1198e53..ad47c36 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -724,6 +724,7 @@ static struct journal_write *journal_wait_for_write(struct 
cache_set *c,
btree_flush_write(c);
}
 
+   spin_unlock(>journal.lock);
closure_sync();
spin_lock(>journal.lock);
wait = true;
-- 
1.7.9.5




[PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_deschedule (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..9c530cf 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -140,6 +140,10 @@ struct cfs_wi_sched {
 
LASSERT(!in_interrupt()); /* because we use plain spinlock */
LASSERT(!sched->ws_stopping);
+   if (wi->wi_scheduled) {
+   LASSERT(!list_empty(>wi_list));
+   LASSERT(sched->ws_nscheduled > 0);
+   }
 
/*
 * return 0 if it's running already, otherwise return 1, which
@@ -151,18 +155,14 @@ struct cfs_wi_sched {
rc = !(wi->wi_running);
 
if (wi->wi_scheduled) { /* cancel pending schedules */
-   LASSERT(!list_empty(>wi_list));
list_del_init(>wi_list);
-
-   LASSERT(sched->ws_nscheduled > 0);
sched->ws_nscheduled--;
-
wi->wi_scheduled = 0;
}
 
-   LASSERT(list_empty(>wi_list));
-
spin_unlock(>ws_lock);
+
+   LASSERT(list_empty(>wi_list));
return rc;
 }
 EXPORT_SYMBOL(cfs_wi_deschedule);
-- 
1.7.9.5




[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_sched_destroy

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_sched_destroy (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..e0424f6 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -302,11 +302,12 @@ static int cfs_wi_scheduler(void *arg)
return;
}
 
-   LASSERT(!list_empty(>ws_list));
sched->ws_stopping = 1;
 
spin_unlock(_wi_data.wi_glock);
 
+   LASSERT(!list_empty(>ws_list));
+
i = 2;
wake_up_all(>ws_waitq);
 
-- 
1.7.9.5




[PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_exit (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..928d06d 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -111,22 +111,23 @@ struct cfs_wi_sched {
 {
LASSERT(!in_interrupt()); /* because we use plain spinlock */
LASSERT(!sched->ws_stopping);
+   LASSERT(wi->wi_running);
+   if (wi->wi_scheduled) {
+   LASSERT(!list_empty(>wi_list));
+   LASSERT(sched->ws_nscheduled > 0);
+   }
 
spin_lock(>ws_lock);
 
-   LASSERT(wi->wi_running);
if (wi->wi_scheduled) { /* cancel pending schedules */
-   LASSERT(!list_empty(>wi_list));
list_del_init(>wi_list);
-
-   LASSERT(sched->ws_nscheduled > 0);
sched->ws_nscheduled--;
}
 
-   LASSERT(list_empty(>wi_list));
-
wi->wi_scheduled = 1; /* LBUG future schedule attempts */
spin_unlock(>ws_lock);
+
+   LASSERT(list_empty(>wi_list));
 }
 EXPORT_SYMBOL(cfs_wi_exit);
 
-- 
1.7.9.5




[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_exit (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..cef25c8 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -111,20 +111,20 @@ struct cfs_wi_sched {
 {
LASSERT(!in_interrupt()); /* because we use plain spinlock */
LASSERT(!sched->ws_stopping);
+   LASSERT(wi->wi_running);
+   if (wi->wi_scheduled) {
+   LASSERT(!list_empty(>wi_list));
+   LASSERT(sched->ws_nscheduled > 0);
+   }
+   LASSERT(list_empty(>wi_list));
 
spin_lock(>ws_lock);
 
-   LASSERT(wi->wi_running);
if (wi->wi_scheduled) { /* cancel pending schedules */
-   LASSERT(!list_empty(>wi_list));
list_del_init(>wi_list);
-
-   LASSERT(sched->ws_nscheduled > 0);
sched->ws_nscheduled--;
}
 
-   LASSERT(list_empty(>wi_list));
-
wi->wi_scheduled = 1; /* LBUG future schedule attempts */
spin_unlock(>ws_lock);
 }
-- 
1.7.9.5




[PATCH] mISDN: Fix a sleep-in-atomic bug

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a read spin lock, and the function call path is:
send_socklist (acquire the lock by read_lock)
  skb_copy(GFP_KERNEL) --> may sleep

To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/isdn/mISDN/stack.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/isdn/mISDN/stack.c b/drivers/isdn/mISDN/stack.c
index 8b7faea..422dced 100644
--- a/drivers/isdn/mISDN/stack.c
+++ b/drivers/isdn/mISDN/stack.c
@@ -75,7 +75,7 @@
if (sk->sk_state != MISDN_BOUND)
continue;
if (!cskb)
-   cskb = skb_copy(skb, GFP_KERNEL);
+   cskb = skb_copy(skb, GFP_ATOMIC);
if (!cskb) {
printk(KERN_WARNING "%s no skb\n", __func__);
break;
-- 
1.7.9.5




[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_wi_scheduler

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_wi_scheduler (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/workitem.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index dbc2a9b..9f7832e 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -212,9 +212,9 @@ static int cfs_wi_scheduler(void *arg)
CWARN("Unable to bind %s on CPU partition %d\n",
  sched->ws_name, sched->ws_cpt);
 
+   LASSERT(sched->ws_starting == 1);
spin_lock(_wi_data.wi_glock);
 
-   LASSERT(sched->ws_starting == 1);
sched->ws_starting--;
sched->ws_nthreads++;
 
@@ -231,11 +231,14 @@ static int cfs_wi_scheduler(void *arg)
   nloops < CFS_WI_RESCHED) {
wi = list_entry(sched->ws_runq.next,
struct cfs_workitem, wi_list);
+
+   spin_unlock(>ws_lock);
LASSERT(wi->wi_scheduled && !wi->wi_running);
+   LASSERT(sched->ws_nscheduled > 0);
+   spin_lock(>ws_lock);
 
list_del_init(>wi_list);
 
-   LASSERT(sched->ws_nscheduled > 0);
sched->ws_nscheduled--;
 
wi->wi_running = 1;
@@ -254,7 +257,10 @@ static int cfs_wi_scheduler(void *arg)
if (list_empty(>wi_list))
continue;
 
+   spin_unlock(>ws_lock);
LASSERT(wi->wi_scheduled);
+   spin_lock(>ws_lock);
+
/* wi is rescheduled, should be on rerunq now, we
 * move it to runq so it can run action now
 */
-- 
1.7.9.5




[PATCH] libcfs: Fix a sleep-in-atomic bug in cfs_percpt_lock and cfs_percpt_unlock

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cfs_percpt_lock/cfs_percpt_unlock (acquire the lock by spin_lock)
  LASSERT
lbug_with_loc
  libcfs_debug_dumplog
schedule and kthread_run --> may sleep

To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/lustre/lnet/libcfs/libcfs_lock.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c 
b/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c
index 1967b97c..a2ce092f 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_lock.c
@@ -113,9 +113,10 @@ struct cfs_percpt_lock *
 
/* exclusive lock request */
for (i = 0; i < ncpt; i++) {
+   if (!i)
+   LASSERT(!pcl->pcl_locked);
spin_lock(pcl->pcl_locks[i]);
if (!i) {
-   LASSERT(!pcl->pcl_locked);
/* nobody should take private lock after this
 * so I wouldn't starve for too long time
 */
@@ -141,11 +142,11 @@ struct cfs_percpt_lock *
}
 
for (i = ncpt - 1; i >= 0; i--) {
-   if (!i) {
-   LASSERT(pcl->pcl_locked);
+   if (!i)
pcl->pcl_locked = 0;
-   }
spin_unlock(pcl->pcl_locks[i]);
+   if (!i)
+   LASSERT(pcl->pcl_locked);
}
 }
 EXPORT_SYMBOL(cfs_percpt_unlock);
-- 
1.7.9.5




[PATCH] gma500: Fix a sleep-in-atomic bug in psbfb_2d_submit

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
psbfb_2d_submit (acquire the lock by spin_lock_irqsave)
  psb_2d_wait_available
psb_spank
  msleep --> may sleep

To fix it, the "msleep" is replaced with "mdelay" in psb_spank.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/gpu/drm/gma500/accel_2d.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/gma500/accel_2d.c 
b/drivers/gpu/drm/gma500/accel_2d.c
index c51d925..7c24c8a 100644
--- a/drivers/gpu/drm/gma500/accel_2d.c
+++ b/drivers/gpu/drm/gma500/accel_2d.c
@@ -55,7 +55,7 @@ void psb_spank(struct drm_psb_private *dev_priv)
_PSB_CS_RESET_TWOD_RESET, PSB_CR_SOFT_RESET);
PSB_RSGX32(PSB_CR_SOFT_RESET);
 
-   msleep(1);
+   mdelay(1);
 
PSB_WSGX32(0, PSB_CR_SOFT_RESET);
wmb();
@@ -64,7 +64,7 @@ void psb_spank(struct drm_psb_private *dev_priv)
wmb();
(void) PSB_RSGX32(PSB_CR_BIF_CTRL);
 
-   msleep(1);
+   mdelay(1);
PSB_WSGX32(PSB_RSGX32(PSB_CR_BIF_CTRL) & ~_PSB_CB_CTRL_CLEAR_FAULT,
   PSB_CR_BIF_CTRL);
(void) PSB_RSGX32(PSB_CR_BIF_CTRL);
-- 
1.7.9.5




[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock)
  ioremap --> may sleep

To fix it, the lock is released before "ioremap", and the lock is 
acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c 
b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
index a996801..5ea553e 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
@@ -1419,7 +1419,9 @@ static u32 netxen_nic_io_read_2M(struct netxen_adapter 
*adapter,
 
mem_base = pci_resource_start(adapter->pdev, 0) +
(start & PAGE_MASK);
+   spin_unlock(>ahw.mem_lock);
mem_ptr = ioremap(mem_base, PAGE_SIZE);
+   spin_lock(>ahw.mem_lock);
if (mem_ptr == NULL) {
ret = -EIO;
goto unlock;
-- 
1.7.9.5




[PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
  b43legacy_synchronize_irq
synchronize_irq --> may sleep

To fix it, the lock is released before b43legacy_synchronize_irq, and the 
lock is acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/wireless/broadcom/b43legacy/main.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c 
b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct 
ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
 
if (changed & BSS_CHANGED_BSSID) {
+   spin_unlock_irqrestore(>irq_lock, flags);
b43legacy_synchronize_irq(dev);
+   spin_lock_irqsave(>irq_lock, flags);
 
if (conf->bssid)
memcpy(wl->bssid, conf->bssid, ETH_ALEN);
-- 
1.7.9.5




Re: [PATCH] bcache: Fix a sleep-in-atomic bug

2017-05-31 Thread Jia-Ju Bai

On 05/31/2017 03:23 PM, Jia-Ju Bai wrote:

The driver may sleep under a spin lock, and the function call path is:
journal_wait_for_write (acquire the lock by spin_lock)
   closure_sync
 schedule -->  may sleep

To fix it, the lock is released before "closure_sync", and the lock is
acquired again after this function.

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>
---
  drivers/md/bcache/journal.c |1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 1198e53..ad47c36 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -724,6 +724,7 @@ static struct journal_write *journal_wait_for_write(struct 
cache_set *c,
btree_flush_write(c);
}

+   spin_unlock(>journal.lock);
closure_sync();
spin_lock(>journal.lock);
wait = true;


Sorry, my patch is not correct, and it will cause double unlock.
Please ignore my patch.


Jia-Ju Bai



[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_hw_read_wx_2M and netxen_nic_hw_write_wx_2M

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, and function call path is:
netxen_nic_hw_read_wx_2M (acquire the lock by write_lock_irqsave)
  crb_win_lock
netxen_pcie_sem_lock
  msleep --> may sleep
netxen_nic_hw_write_wx_2M (acquire the lock by write_lock_irqsave)
  crb_win_lock
netxen_pcie_sem_lock
  msleep --> may sleep

To fix it, the "msleep" is replaced with "mdelay" in netxen_pcie_sem_lock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c 
b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
index a996801..0a9da42 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
@@ -329,7 +329,7 @@ static void __iomem *pci_base_offset(struct netxen_adapter 
*adapter,
break;
if (++timeout >= NETXEN_PCIE_SEM_TIMEOUT)
return -EIO;
-   msleep(1);
+   mdelay(1);
}
 
if (id_reg)
-- 
1.7.9.5




Re: [PATCH] iscsi: Fix a sleep-in-atomic bug

2017-06-01 Thread Jia-Ju Bai

On 06/01/2017 02:21 PM, Nicholas A. Bellinger wrote:

Hi Jia-Ju,

On Wed, 2017-05-31 at 11:26 +0800, Jia-Ju Bai wrote:

The driver may sleep under a spin lock, and the function call path is:
iscsit_tpg_enable_portal_group (acquire the lock by spin_lock)
   iscsi_update_param_value
 kstrdup(GFP_KERNEL) -->  may sleep

To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>
---
  drivers/target/iscsi/iscsi_target_parameters.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

Btw, the use of tpg->tpg_state_lock in iscsit_tpg_enable_portal_group()
while checking existing state and calling iscsi_update_param_value() is
not necessary, since lio_target_tpg_enable_store() is already holding
iscsit_get_tpg() ->  tpg->tpg_access_lock.

How about the following instead to only take tpg->tpg_state_lock when
updating tpg->tpg_state instead..?

diff --git a/drivers/target/iscsi/iscsi_target_tpg.c 
b/drivers/target/iscsi/iscsi_target_tpg.c
index 2e7e08d..abaabba 100644
--- a/drivers/target/iscsi/iscsi_target_tpg.c
+++ b/drivers/target/iscsi/iscsi_target_tpg.c
@@ -311,11 +311,9 @@ int iscsit_tpg_enable_portal_group(struct 
iscsi_portal_group *tpg)
 struct iscsi_tiqn *tiqn = tpg->tpg_tiqn;
 int ret;

-   spin_lock(>tpg_state_lock);
 if (tpg->tpg_state == TPG_STATE_ACTIVE) {
 pr_err("iSCSI target portal group: %hu is already"
 " active, ignoring request.\n", tpg->tpgt);
-   spin_unlock(>tpg_state_lock);
 return -EINVAL;
 }
 /*
@@ -324,10 +322,8 @@ int iscsit_tpg_enable_portal_group(struct 
iscsi_portal_group *tpg)
  * is enforced (as per default), and remove the NONE option.
  */
 param = iscsi_find_param_from_key(AUTHMETHOD, tpg->param_list);
-   if (!param) {
-   spin_unlock(>tpg_state_lock);
+   if (!param)
 return -EINVAL;
-   }

 if (tpg->tpg_attrib.authentication) {
 if (!strcmp(param->value, NONE)) {
@@ -341,6 +337,7 @@ int iscsit_tpg_enable_portal_group(struct 
iscsi_portal_group *tpg)
 goto err;
 }

+   spin_lock(>tpg_state_lock);
 tpg->tpg_state = TPG_STATE_ACTIVE;
 spin_unlock(>tpg_state_lock);

@@ -353,7 +350,6 @@ int iscsit_tpg_enable_portal_group(struct 
iscsi_portal_group *tpg)
 return 0;

  err:
-       spin_unlock(>tpg_state_lock);
 return ret;
  }


I think it is fine to me.

Thanks,
Jia-Ju Bai



Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_attr_interfmode_store

2017-06-01 Thread Jia-Ju Bai

On 06/02/2017 12:11 AM, Jonathan Corbet wrote:

On Thu, 01 Jun 2017 09:05:07 +0800
Jia-Ju Bai<baijiaju1...@163.com>  wrote:


I admit my patches are not well tested, and they may not well fix the bugs.
I am looking forward to opinions and suggestions :)

May I politely suggest that sending out untested locking changes is a
dangerous thing to do?  You really should not be changing the locking in a
piece of kernel code without understanding very well what the lock is
protecting and being able to say why your changes are safe.  Without that,
the risk of introducing subtle bugs is very high.

It looks like you have written a useful tool that could help us to make
the kernel more robust.  If you are interested in my suggestion, I would
recommend that you post the sleep-in-atomic scenarios that you are
finding, but refrain from "fixing" them in any case where you cannot offer
a strong explanation of why your fix is correct.

Thanks for working to find bugs in the kernel!

jon

Hi,

Thanks for your good and helpful advice. I am sorry for my improper patches.
I will only report bugs instead of sending improper patches when I have 
no good solution of fixing the bugs.


Thanks,
Jia-Ju Bai



[PATCH] cw1200: Fix a sleep-in-atomic bug in cw1200_tx_confirm_cb and cw1200_cqm_bssloss_sm

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
cw1200_tx_confirm_cb (acquire the lock by spin_lock)
  __cw1200_cqm_bssloss_sm
cancel_work_sync --> may sleep

cw1200_cqm_bssloss_sm
  __cw1200_cqm_bssloss_sm
cancel_work_sync --> may sleep

To fix it, the lock is released before cancel_work_sync, and the lock 
is acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/wireless/st/cw1200/sta.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/st/cw1200/sta.c 
b/drivers/net/wireless/st/cw1200/sta.c
index a522248..d5f7698 100644
--- a/drivers/net/wireless/st/cw1200/sta.c
+++ b/drivers/net/wireless/st/cw1200/sta.c
@@ -154,7 +154,9 @@ void __cw1200_cqm_bssloss_sm(struct cw1200_common *priv,
int tx = 0;
 
priv->delayed_link_loss = 0;
+   spin_unlock(>bss_loss_lock);
cancel_work_sync(>bss_params_work);
+   spin_lock(>bss_loss_lock);
 
pr_debug("[STA] CQM BSSLOSS_SM: state: %d init %d good %d bad: %d 
txlock: %d uj: %d\n",
 priv->bss_loss_state,
-- 
1.7.9.5




[PATCH] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

To fix it, the lock is released before copy_from_user, and the lock is
acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c |9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..6fb7e1a 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -725,7 +725,7 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 {
int num_sge = ibwr->num_sge;
struct ib_sge *sge;
-   int i;
+   int i, err;
u8 *p;
 
init_send_wr(qp, >wr, ibwr);
@@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
-   if (qp->is_user && copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
+   spin_unlock_irq(>sq.sq_lock);
+   err = copy_from_user(p, (__user void *)
+   (uintptr_t)sge->addr, sge->length);
+   spin_lock_irq(>sq.sq_lock);
+   if (qp->is_user && err)
return -EFAULT;
 
else if (!qp->is_user)
-- 
1.7.9.5




Re: [PATCH] megaraid: Fix a sleep-in-atomic bug

2017-05-31 Thread Jia-Ju Bai

On 05/31/2017 06:18 PM, Sumit Saxena wrote:

-Original Message-
From: Jia-Ju Bai [mailto:baijiaju1...@163.com]
Sent: Wednesday, May 31, 2017 8:27 AM
To: kashyap.de...@broadcom.com; sumit.sax...@broadcom.com;
shivasharan.srikanteshw...@broadcom.com; j...@linux.vnet.ibm.com;
martin.peter...@oracle.com
Cc: megaraidlinux@broadcom.com; linux-s...@vger.kernel.org; linux-
ker...@vger.kernel.org; Jia-Ju Bai
Subject: [PATCH] megaraid: Fix a sleep-in-atomic bug

The driver may sleep under a spin lock, and the function call path is:
mraid_mm_attach_buf (acquire the lock by spin_lock_irqsave)
  pci_pool_alloc(GFP_KERNEL) -->  may sleep

To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>
---
drivers/scsi/megaraid/megaraid_mm.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_mm.c
b/drivers/scsi/megaraid/megaraid_mm.c
index 4cf9ed9..c43afb8 100644
--- a/drivers/scsi/megaraid/megaraid_mm.c
+++ b/drivers/scsi/megaraid/megaraid_mm.c
@@ -574,7 +574,7 @@

kioc->pool_index = right_pool;
kioc->free_buf   = 1;
-   kioc->buf_vaddr  = pci_pool_alloc(pool->handle, GFP_KERNEL,
+   kioc->buf_vaddr  = pci_pool_alloc(pool->handle, GFP_ATOMIC,
>buf_paddr);
spin_unlock_irqrestore(>lock, flags);

This is very old driver and reached EOL. Did you face any issue because of
this bug or discover this through code review?
Anyways patch looks good to me.

Acked-by: Sumit Saxena<sumit.sax...@broadcom.com>


--
1.7.9.5


Hi,

This bug is found by a static analysis tool and my code review.

Jia-Ju Bai




[PATCH] rts5208: Fix a sleep-in-atomic bug in rtsx_exclusive_enter_ss

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  sd_cleanup_work
sd_stop_seq_mode
  sd_switch_clock
sd_ddr_tuning
  sd_ddr_pre_tuning_tx
sd_change_phase
  wait_timeout
schedule_timeout --> may sleep

To fix it, "wait_timeout" is replaced with mdelay in sd_change_phase.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rts5208/sd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c
index bdd35b6..76bd105 100644
--- a/drivers/staging/rts5208/sd.c
+++ b/drivers/staging/rts5208/sd.c
@@ -1057,7 +1057,7 @@ static int sd_change_phase(struct rtsx_chip *chip, u8 
sample_point, u8 tune_dir)
 
rtsx_write_register(chip, SD_DCMPS_CTL, DCMPS_CHANGE, 0);
rtsx_write_register(chip, SD_VP_CTL, PHASE_CHANGE, 0);
-   wait_timeout(10);
+   mdelay(10);
sd_reset_dcm(chip, tune_dir);
return STATUS_FAIL;
 }
-- 
1.7.9.5




Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_attr_interfmode_store

2017-05-31 Thread Jia-Ju Bai

On 06/01/2017 01:33 AM, Larry Finger wrote:

On 05/31/2017 05:29 AM, Jia-Ju Bai wrote:

The driver may sleep under a spin lock, and the function call path is:
b43legacy_attr_interfmode_store (acquire the lock by spin_lock_irqsave)
   b43legacy_radio_set_interference_mitigation
 b43legacy_radio_interference_mitigation_disable
   b43legacy_calc_nrssi_slope
 b43legacy_synth_pu_workaround
   might_sleep and msleep --> may sleep

Fixing it may be complex, and a possible way is to remove
spin_lock_irqsave and spin_lock_irqrestore in
b43legacy_attr_interfmode_store, and the code has been protected by
mutex_lock and mutex_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
  drivers/net/wireless/broadcom/b43legacy/sysfs.c |2 --
  1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/broadcom/b43legacy/sysfs.c 
b/drivers/net/wireless/broadcom/b43legacy/sysfs.c

index 2a1da15..9ede143 100644
--- a/drivers/net/wireless/broadcom/b43legacy/sysfs.c
+++ b/drivers/net/wireless/broadcom/b43legacy/sysfs.c
@@ -137,14 +137,12 @@ static ssize_t 
b43legacy_attr_interfmode_store(struct device *dev,

  }
mutex_lock(>wl->mutex);
-spin_lock_irqsave(>wl->irq_lock, flags);
err = b43legacy_radio_set_interference_mitigation(wldev, mode);
  if (err)
  b43legacyerr(wldev->wl, "Interference Mitigation not "
 "supported by device\n");
  mmiowb();
-spin_unlock_irqrestore(>wl->irq_lock, flags);
  mutex_unlock(>wl->mutex);
return err ? err : count;



Jia-Ju,

Did you actually observe the attempt to sleep under the spin lock, or 
did you discover this using some tool? In other words, have either of 
your patches been tested?


Larry


Hi,

In fact, my reported bugs are found by a static analysis tool written by 
me, and they are checked by my review of the driver code.

I admit my patches are not well tested, and they may not well fix the bugs.
I am looking forward to opinions and suggestions :)

Thanks,
Jia-Ju Bai



Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

2017-05-31 Thread Jia-Ju Bai

On 06/01/2017 08:07 AM, Larry Finger wrote:

On 05/31/2017 10:32 AM, Michael Büsch wrote:

On Wed, 31 May 2017 13:26:43 +0300
Kalle Valo <kv...@codeaurora.org> wrote:


Jia-Ju Bai <baijiaju1...@163.com> writes:


The driver may sleep under a spin lock, and the function call path is:
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
   b43legacy_synchronize_irq
 synchronize_irq --> may sleep

To fix it, the lock is released before b43legacy_synchronize_irq, 
and the

lock is acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
  drivers/net/wireless/broadcom/b43legacy/main.c |2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c 
b/drivers/net/wireless/broadcom/b43legacy/main.c

index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void 
b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,

  b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+spin_unlock_irqrestore(>irq_lock, flags);
  b43legacy_synchronize_irq(dev);
+spin_lock_irqsave(>irq_lock, flags);


To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.




I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
We disabled the device interrupts just before this line.

However I think the synchronize_irq should be outside of the
conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
two lines above)
I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
is set.


On the other hand b43 does not have this irq-disabling foobar anymore.
So somebody must have removed it. Maybe you can find the commit that
removed this stuff from b43 and port it to b43legacy?


So I would vote for moving the synchronize_irq up outside of the
conditional and put the unlock/lock sequence around it.
And as a second patch on top of that try to remove this stuff
altogether like b43 did.


The patch that removed it in b43 is

commit 36dbd9548e92268127b0c31b0e121e63e9207108
Author: Michael Buesch <m...@bu3sch.de>
Date:   Fri Sep 4 22:51:29 2009 +0200

b43: Use a threaded IRQ handler

Use a threaded IRQ handler to allow locking the mutex and
sleeping while executing an interrupt.
This removes usage of the irq_lock spinlock, but introduces
a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
hard-irq handler. Sleeping busses (SDIO) will use mutex instead.

Signed-off-by: Michael Buesch <m...@bu3sch.de>
Tested-by: Larry Finger <larry.fin...@lwfinger.net>
Signed-off-by: John W. Linville <linvi...@tuxdriver.com>

I vaguely remember this patch. Although it is roughly a 1000-line fix, 
I will try to port it to b43legacy. I still have an old BCM4306 PCMCIA 
card that I can test in a PowerBook G4.


I agree with Michael that this is the way to go. Both of Jia-Ju's 
patches should be rejected.


Larry




It is fine to me to fix the bug by porting this former patch.

Thanks,
Jia-Ju Bai



[PATCH] i40iw: Add a value assignment to avoid sleep-in-atomic bug caused by uninitialized value

2017-05-31 Thread Jia-Ju Bai
The value "cqp_request->waiting" indicates whether the sleeping operation 
should be performed, and it is not assigned in i40iw_get_cqp_request, so 
the driver may sleep in interrupt handling. The function call path is:

i40iw_dpc (tasklet_init indicates it handles interrupt)
  i40iw_process_aeq
i40iw_next_iw_state
  i40iw_hw_modify_qp (call i40iw_get_cqp_request)
i40iw_handle_cqp_op
  i40iw_wait_event --> may sleep

To fix it, "cqp_request->waiting" is assigned in "else" branch in
i40iw_get_cqp_request.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/infiniband/hw/i40iw/i40iw_utils.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c 
b/drivers/infiniband/hw/i40iw/i40iw_utils.c
index 409a378..0f4e633 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_utils.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c
@@ -326,6 +326,7 @@ struct i40iw_cqp_request *i40iw_get_cqp_request(struct 
i40iw_cqp *cqp, bool wait
cqp_request->waiting = true;
} else {
atomic_set(_request->refcount, 1);
+   cqp_request->waiting = false;
}
return cqp_request;
 }
-- 
1.7.9.5




[PATCH] ivtv: Fix a sleep-in-atomic bug in snd_ivtv_pcm_hw_free

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
snd_ivtv_pcm_hw_free (acquire the lock by spin_lock_irqsave)
  vfree --> may sleep

To fix it, the "substream->runtime->dma_area" is passed to a temporary
value, and mark it NULL when holding the lock. The memory is freed by
vfree through the temporary value outside the lock holding.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/media/pci/ivtv/ivtv-alsa-pcm.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c 
b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c
index 807ead2..92d088c 100644
--- a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c
+++ b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c
@@ -262,14 +262,16 @@ static int snd_ivtv_pcm_hw_free(struct snd_pcm_substream 
*substream)
 {
struct snd_ivtv_card *itvsc = snd_pcm_substream_chip(substream);
unsigned long flags;
+   unsigned char *dma_area;
 
spin_lock_irqsave(>slock, flags);
if (substream->runtime->dma_area) {
dprintk("freeing pcm capture region\n");
-   vfree(substream->runtime->dma_area);
+   dma_area = substream->runtime->dma_area;
substream->runtime->dma_area = NULL;
}
spin_unlock_irqrestore(>slock, flags);
+   vfree(dma_area);
 
return 0;
 }
-- 
1.7.9.5




[PATCH] cx18: Fix a sleep-in-atomic bug in snd_cx18_pcm_hw_free

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
snd_cx18_pcm_hw_free (acquire the lock by spin_lock_irqsave)
  vfree --> may sleep

To fix it, the "substream->runtime->dma_area" is passed to a temporary 
value, and mark it NULL when holding the lock. The memory is freed by 
vfree through the temporary value outside the lock holding.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/media/pci/cx18/cx18-alsa-pcm.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/media/pci/cx18/cx18-alsa-pcm.c 
b/drivers/media/pci/cx18/cx18-alsa-pcm.c
index 205a98d..ba83147 100644
--- a/drivers/media/pci/cx18/cx18-alsa-pcm.c
+++ b/drivers/media/pci/cx18/cx18-alsa-pcm.c
@@ -257,14 +257,16 @@ static int snd_cx18_pcm_hw_free(struct snd_pcm_substream 
*substream)
 {
struct snd_cx18_card *cxsc = snd_pcm_substream_chip(substream);
unsigned long flags;
+   unsigned char *dma_area;
 
spin_lock_irqsave(>slock, flags);
if (substream->runtime->dma_area) {
dprintk("freeing pcm capture region\n");
-   vfree(substream->runtime->dma_area);
+   dma_area = substream->runtime->dma_area;
substream->runtime->dma_area = NULL;
}
spin_unlock_irqrestore(>slock, flags);
+   vfree(dma_area);
 
return 0;
 }
-- 
1.7.9.5




[PATCH] rts5208: Fix a sleep-in-atomic bug in sd_power_off_card3v3

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  sd_power_off_card3v3
wait_timeout
  schedule_timeout --> may sleep

To fix it, "wait_timeout" is replaced with mdelay in sd_power_off_card3v3.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rts5208/sd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c
index bdd35b6..aa14454 100644
--- a/drivers/staging/rts5208/sd.c
+++ b/drivers/staging/rts5208/sd.c
@@ -5231,7 +5231,7 @@ int sd_power_off_card3v3(struct rtsx_chip *chip)
return STATUS_FAIL;
}
 
-   wait_timeout(50);
+   mdelay(50);
}
 
if (chip->asic_code) {
-- 
1.7.9.5




Re: [PATCH] rts5208: Fix a sleep-in-atomic bug in rtsx_exclusive_enter_ss

2017-06-04 Thread Jia-Ju Bai

On 06/03/2017 04:52 PM, Greg KH wrote:

On Thu, Jun 01, 2017 at 11:43:35AM +0800, Jia-Ju Bai wrote:

The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
   rtsx_enter_ss
 rtsx_power_off_card
   sd_cleanup_work
 sd_stop_seq_mode
   sd_switch_clock
 sd_ddr_tuning
   sd_ddr_pre_tuning_tx
 sd_change_phase
   wait_timeout
 schedule_timeout -->  may sleep

To fix it, "wait_timeout" is replaced with mdelay in sd_change_phase.

Nice work, how are you finding these bugs?  What tools gives you this
kind of analysis?

thanks,

greg k-h

Hi,

I am very glad to get your praise on my work :)
I recently write a static analysis tool for detecting sleep-in-atomic 
bugs, instead of using existing tools. One reason is that I have 
encountered these bugs for some times when I writing drivers.
I am still improving my tool and detecting other similar bugs in Linux 
kernel. If you have suggestions or comments on my work, please feel free 
to contact me :)


Thanks,
Jia-Ju Bai



[PATCH] rt5208: Fix a sleep-in-atomic bug in xd_copy_page

2017-06-04 Thread Jia-Ju Bai
Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rts5208/xd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c
index 85aba05..74d36f9 100644
--- a/drivers/staging/rts5208/xd.c
+++ b/drivers/staging/rts5208/xd.c
@@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 
old_blk, u32 new_blk,
reg = 0;
rtsx_read_register(chip, XD_CTL, );
if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) {
-   wait_timeout(100);
+   mdelay(100);
 
if (detect_card_cd(chip,
   XD_CARD) != STATUS_SUCCESS) {
-- 
1.7.9.5




[PATCH resend] rt5208: Fix a sleep-in-atomic bug in xd_copy_page

2017-06-04 Thread Jia-Ju Bai
Last patch lacks code explanation, and it is included in this patch.

The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  xd_cleanup_work
xd_delay_write
  xd_finish_write
xd_copy_page
  wait_timeout
schedule_timeout --> may sleep

To fix it, "wait_timeout" is replaced with mdelay in xd_copy_page.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rts5208/xd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c
index 85aba05..74d36f9 100644
--- a/drivers/staging/rts5208/xd.c
+++ b/drivers/staging/rts5208/xd.c
@@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 
old_blk, u32 new_blk,
reg = 0;
rtsx_read_register(chip, XD_CTL, );
if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) {
-   wait_timeout(100);
+   mdelay(100);
 
if (detect_card_cd(chip,
   XD_CARD) != STATUS_SUCCESS) {
-- 
1.7.9.5




[PATCH] rts5208: Fix a sleep-in-atomic bug in sd_send_cmd_get_rsp

2017-06-04 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  sd_cleanup_work
sd_stop_seq_mode
  sd_switch_clock
sd_ddr_tuning
  sd_ddr_pre_tuning_tx
sd_send_cmd_get_rsp
  wait_timeout
schedule_timeout --> may sleep

To fix it, "wait_timeout" is replaced with mdelay in sd_send_cmd_get_rsp.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rts5208/sd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c
index bdd35b6..fed17ff 100644
--- a/drivers/staging/rts5208/sd.c
+++ b/drivers/staging/rts5208/sd.c
@@ -226,7 +226,7 @@ static int sd_send_cmd_get_rsp(struct rtsx_chip *chip, u8 
cmd_idx,
return STATUS_FAIL;
}
if (rty_cnt < SD_MAX_RETRY_COUNT) {
-   wait_timeout(20);
+   mdelay(20);
rty_cnt++;
goto RTY_SEND_CMD;
} else {
-- 
1.7.9.5




[BUG] rts5208: Sleeping under a spin lock in xd_init_l2p_tbl

2017-06-04 Thread Jia-Ju Bai

According to rtsx_chip.c and xd.c, the driver may sleep under a spin lock.
The function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  xd_cleanup_work
xd_delay_write
  xd_finish_write
xd_set_unused_block
  xd_build_l2p_tbl
xd_init_l2p_tbl
  vmalloc --> may sleep

This bug is found by my static analysis tool and my code review.
I hope to fix it, but I do not have a good solution.

Thanks,
Jia-Ju Bai



[BUG] rts5208: Sleeping under a spin lock in xd_build_l2p_tbl

2017-06-04 Thread Jia-Ju Bai

According to rtsx_chip.c and xd.c, the driver may sleep under a spin lock.
The function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  xd_cleanup_work
xd_delay_write
  xd_finish_write
xd_set_unused_block
  xd_build_l2p_tbl
vmalloc --> may sleep

This bug is found by my static analysis tool and my code review.
I hope to fix it, but I do not have a good solution.

Thanks,
Jia-Ju Bai



[BUG] rts5208: Sleeping under a spin lock in free_zone

2017-06-04 Thread Jia-Ju Bai

According to rtsx_chip.c and xd.c, the driver may sleep under a spin lock.
The function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  xd_cleanup_work
xd_delay_write
  xd_finish_write
xd_set_unused_block
  free_zone
vfree --> may sleep

This bug is found by my static analysis tool and my code review.
I hope to fix it, but I do not have a good solution.

Thanks,
Jia-Ju Bai



Re: [PATCH] fs: nfs: Fix a sleep-in-atomic bug in nfs_access_add_cache

2017-06-05 Thread Jia-Ju Bai

On 06/05/2017 07:48 PM, Trond Myklebust wrote:

On Mon, 2017-06-05 at 16:05 +0800, Jia-Ju Bai wrote:

The driver may sleep under a rcu read lock, and function call path
is:
nfs_permission (acquire the lock by rcu_read_lock)
   nfs_do_access
 nfs_access_add_cache
   kmalloc(GFP_KERNEL) -->  may sleep

To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>
---
  fs/nfs/dir.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 32ccd77..7a074db 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2333,7 +2333,7 @@ static void nfs_access_add_rbtree(struct inode
*inode, struct nfs_access_entry *
  
  void nfs_access_add_cache(struct inode *inode, struct

nfs_access_entry *set)
  {
-   struct nfs_access_entry *cache = kmalloc(sizeof(*cache),
GFP_KERNEL);
+   struct nfs_access_entry *cache = kmalloc(sizeof(*cache),
GFP_ATOMIC);
if (cache == NULL)
return;
RB_CLEAR_NODE(>rb_node);

The RCU locked codepath will not ever hit nfs_access_add_rbtree(). It
returns with an error code of -ECHILD after the test of "may_block".

Cheers
   Trond

Yes, I think you are right.

Thanks,
Jia-Ju Bai



[PATCH V5] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

There is no flow that makes "qp->is_user" true, and copy_from_user may
cause bug when a non-user pointer is used. So the lines of copy_from_user
and check of "qp->is_user" are removed.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
V5:
* Remove useless check of "qp->is_user".
  Thank Leon for pointing it out.

V4:
* Remove the line of copy_from_user.
  Thank Moni for good advice.

V3:
* It corrects the mistakes of remaining legacy code in V2.
  Thank Ram for pointing it out.

V2:
* The parameter "flags" is added to restore and save the irq status.
  Thank Leon for good advice.
---
 drivers/infiniband/sw/rxe/rxe_verbs.c |9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..073e667 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -740,13 +740,8 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
-   if (qp->is_user && copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
-   return -EFAULT;
-
-   else if (!qp->is_user)
-   memcpy(p, (void *)(uintptr_t)sge->addr,
-  sge->length);
+   memcpy(p, (void *)(uintptr_t)sge->addr,
+   sge->length);
 
p += sge->length;
}
-- 
1.7.9.5




Re: [PATCH] ubifs: Fix a sleep-in-atomic bug in ubifs_read_nnode

2017-06-05 Thread Jia-Ju Bai

On 06/05/2017 04:25 PM, Richard Weinberger wrote:

Jia-Ju Bai,

Am 05.06.2017 um 05:38 schrieb Jia-Ju Bai:

The driver may sleep under a spin lock, and the function call path is:
ubifs_change_lp (acquire the lock by spin_lock)
   change_category
 ubifs_remove_from_cat
   remove_from_lpt_heap
 dbg_check_heap
   ubifs_lpt_lookup
 ubifs_read_nnode
   kzalloc(GFP_NOFS) -->  may sleep

To fix it, "GFP_NOFS" is replaced with "GFP_ATOMIC".

So, this happens only when dbg_check_heap() is activated, right?

Thanks,
//richard

Yes, I think so.

Thanks,
Jia-Ju Bai



Re: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-05 Thread Jia-Ju Bai

On 06/05/2017 04:30 PM, Moni Shoua wrote:

-   if (qp->is_user&&  copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
+   spin_unlock_irqrestore(>sq.sq_lock, *flags);
+   err = copy_from_user(p, (__user void *)
+   (uintptr_t)sge->addr, sge->length);
+   spin_lock_irqsave(>sq.sq_lock, *flags);
+   if (qp->is_user&&  err)
 return -EFAULT;

qp-_is_user is always false in this function (flow starts from
rxe_post_send_kernel) so this line is a dead code
In fact, this patch seems to add a serious bug when it uses
copy_from_user() from a non user pointer.
Do you agree?

I agree.
So, it is fine to me to remove this line, as you said in the former email:


 Second, I think that there is no flow that leads to this function
 when qp->is user is true so maybe the correct action is to remove this
 line completely
 if (qp->is_user&&  copy_from_user(p, (__user void *)






Re: [PATCH] fs: xfs: Fix a lock-twice and sleep-in-atomic bug in xfs_iget

2017-06-05 Thread Jia-Ju Bai

On 06/05/2017 04:32 PM, Shan Hai wrote:



On 2017年06月05日 16:17, Jia-Ju Bai wrote:

The driver may sleep under a read rcu lock, and function call path is:
xfs_iget (acquire the lock by rcu_read_lock)
   "goto out_error_or_again" after xfs_iget_cache_hit
 delay
   schedule_timeout_uninterruptible --> may sleep
Meanwhile, the rcu_read_lock will be called twice in this situation.

To fix it, the lock is released before "goto".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
  fs/xfs/xfs_icache.c |4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f61c84f8..c2a4722 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -600,8 +600,10 @@ struct xfs_inode *
if (ip) {
  error = xfs_iget_cache_hit(pag, ip, ino, flags, lock_flags);
-if (error)
+if (error) {
+rcu_read_unlock();


Seems you are going to double unlock by doing this, since it is 
unlocked in the xfs_iget_cache_hit.


Thanks
Shan Hai


  goto out_error_or_again;
+}
  } else {
  rcu_read_unlock();
  XFS_STATS_INC(mp, xs_ig_missed);



I think you are right. Please ignore my patch.

Thanks,
Jia-Ju Bai



[PATCH V4] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

There is no flow that makes "qp->is_user" true, and copy_from_user may 
cause bug when a non-user pointer is used. So, the line of copy_from_user
is removed.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
V4:
* Remove the line of copy_from_user.

V3:
* It corrects the mistakes of remaining legacy code in V2.
  (Thank Ram for pointing it out)

V2:
* The parameter "flags" is added to restore and save the irq status.
  Thank Leon for good advice.
---
 drivers/infiniband/sw/rxe/rxe_verbs.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..7c52c7c 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -740,11 +740,7 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
-   if (qp->is_user && copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
-   return -EFAULT;
-
-   else if (!qp->is_user)
+   if (!qp->is_user)
memcpy(p, (void *)(uintptr_t)sge->addr,
   sge->length);
 
-- 
1.7.9.5




[PATCH V2] ivtv: Fix a sleep-in-atomic bug in snd_ivtv_pcm_hw_free

2017-06-01 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
snd_ivtv_pcm_hw_free (acquire the lock by spin_lock_irqsave)
  vfree --> may sleep

To fix it, the "substream->runtime->dma_area" is passed to a temporary
value, and mark it NULL when holding the lock. The memory is freed by
vfree through the temporary value outside the lock holding.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/media/pci/ivtv/ivtv-alsa-pcm.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c 
b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c
index 807ead2..a692554 100644
--- a/drivers/media/pci/ivtv/ivtv-alsa-pcm.c
+++ b/drivers/media/pci/ivtv/ivtv-alsa-pcm.c
@@ -262,14 +262,17 @@ static int snd_ivtv_pcm_hw_free(struct snd_pcm_substream 
*substream)
 {
struct snd_ivtv_card *itvsc = snd_pcm_substream_chip(substream);
unsigned long flags;
+   unsigned char *dma_area = NULL;
 
spin_lock_irqsave(>slock, flags);
if (substream->runtime->dma_area) {
dprintk("freeing pcm capture region\n");
-   vfree(substream->runtime->dma_area);
+   dma_area = substream->runtime->dma_area;
substream->runtime->dma_area = NULL;
}
spin_unlock_irqrestore(>slock, flags);
+   if (dma_area)
+   vfree(dma_area);
 
return 0;
 }
-- 
1.7.9.5




[PATCH] qlcnic: Fix a sleep-in-atomic bug in qlcnic_82xx_hw_write_wx_2M and qlcnic_82xx_hw_read_wx_2M

2017-06-01 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, and the function 
call path is:
qlcnic_82xx_hw_write_wx_2M (acquire the lock by write_lock_irqsave)
  crb_win_lock
qlcnic_pcie_sem_lock
  usleep_range
qlcnic_82xx_hw_read_wx_2M (acquire the lock by write_lock_irqsave)
  crb_win_lock
qlcnic_pcie_sem_lock
  usleep_range

To fix it, the usleep_range is replaced with udelay.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c
index 838cc0c..7848cf0 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c
@@ -341,7 +341,7 @@ static void qlcnic_write_window_reg(u32 addr, void __iomem 
*bar0, u32 data)
}
return -EIO;
}
-   usleep_range(1000, 1500);
+   udelay(1200);
}
 
if (id_reg)
-- 
1.7.9.5




[PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-01 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

To fix it, the lock is released before copy_from_user, and the lock is
acquired again after this function. The parameter "flags" is used to
restore and save the irq status.
Thank Leon for good advice.

This patch corrects the mistakes in V2. (Thank Ram for pointing it out)

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..5293d15 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct 
rxe_send_wr *wr,
 
 static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
 unsigned int mask, unsigned int length,
-struct rxe_send_wqe *wqe)
+struct rxe_send_wqe *wqe, unsigned long *flags)
 {
int num_sge = ibwr->num_sge;
struct ib_sge *sge;
-   int i;
+   int i, err;
u8 *p;
 
init_send_wr(qp, >wr, ibwr);
@@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
-   if (qp->is_user && copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
+   spin_unlock_irqrestore(>sq.sq_lock, *flags);
+   err = copy_from_user(p, (__user void *)
+   (uintptr_t)sge->addr, sge->length);
+   spin_lock_irqsave(>sq.sq_lock, *flags);
+   if (qp->is_user && err)
return -EFAULT;
 
else if (!qp->is_user)
@@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
send_wqe = producer_addr(sq->queue);
 
-   err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
+   err = init_send_wqe(qp, ibwr, mask, length, send_wqe, );
if (unlikely(err))
goto err1;
 
-- 
1.7.9.5




[PATCH] cx18: Fix a sleep-in-atomic bug in snd_cx18_pcm_hw_free

2017-06-01 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
snd_cx18_pcm_hw_free (acquire the lock by spin_lock_irqsave)
  vfree --> may sleep

To fix it, the "substream->runtime->dma_area" is passed to a temporary
value, and mark it NULL when holding the lock. The memory is freed by
vfree through the temporary value outside the lock holding.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/media/pci/cx18/cx18-alsa-pcm.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/media/pci/cx18/cx18-alsa-pcm.c 
b/drivers/media/pci/cx18/cx18-alsa-pcm.c
index 205a98d..8c51e4c 100644
--- a/drivers/media/pci/cx18/cx18-alsa-pcm.c
+++ b/drivers/media/pci/cx18/cx18-alsa-pcm.c
@@ -257,14 +257,17 @@ static int snd_cx18_pcm_hw_free(struct snd_pcm_substream 
*substream)
 {
struct snd_cx18_card *cxsc = snd_pcm_substream_chip(substream);
unsigned long flags;
+   unsigned char *dma_area = NULL;
 
spin_lock_irqsave(>slock, flags);
if (substream->runtime->dma_area) {
dprintk("freeing pcm capture region\n");
-   vfree(substream->runtime->dma_area);
+   dma_area = substream->runtime->dma_area;
substream->runtime->dma_area = NULL;
}
spin_unlock_irqrestore(>slock, flags);
+   if (dma_area)
+   vfree(dma_area);
 
return 0;
 }
-- 
1.7.9.5




[PATCH V2] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-01 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

To fix it, the lock is released before copy_from_user, and the lock is
acquired again after this function. The parameter "flags" is used to
restore and save the irq status.
Thank Leon for good advice.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..7dcdf67 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct 
rxe_send_wr *wr,
 
 static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
 unsigned int mask, unsigned int length,
-struct rxe_send_wqe *wqe)
+struct rxe_send_wqe *wqe, unsigned long *flags)
 {
int num_sge = ibwr->num_sge;
struct ib_sge *sge;
-   int i;
+   int i, err;
u8 *p;
 
init_send_wr(qp, >wr, ibwr);
@@ -742,7 +742,12 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
for (i = 0; i < num_sge; i++, sge++) {
if (qp->is_user && copy_from_user(p, (__user void *)
(uintptr_t)sge->addr, sge->length))
-   return -EFAULT;
+   spin_unlock_irqrestore(>sq.sq_lock, *flags);
+   err = copy_from_user(p, (__user void *)
+   (uintptr_t)sge->addr, sge->length);
+   spin_lock_irqsave(>sq.sq_lock, *flags);
+   if (qp->is_user && err)
+   return -EFAULT;
 
else if (!qp->is_user)
memcpy(p, (void *)(uintptr_t)sge->addr,
@@ -794,7 +799,7 @@ static int post_one_send(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
send_wqe = producer_addr(sq->queue);
 
-   err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
+   err = init_send_wqe(qp, ibwr, mask, length, send_wqe, );
if (unlikely(err))
goto err1;
 
-- 
1.7.9.5




[BUG] ntb: Sleep in interrupt handling

2017-06-01 Thread Jia-Ju Bai

According to ntb_transport.c, the driver may sleep in interrupt handling.
The function call path is:
ntb_transport_rxc_db (tasklet_init indicates it handles interrupt)
  ntb_process_rxc
ntb_async_rx
  ntb_async_rx_submit
schedule_timeout --> may sleep

This bug is found by my static analysis tool and my code review.
I hope to fix it, but I do not have a good solution.

Thanks,
Jia-Ju Bai



[PATCH] oss: Fix a sleep-in-atomic bug in midi_outc

2017-06-01 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
midi_outc (acquire the lock by spin_lock_irqsave)
  oss_broken_sleep_on
schedule_timeout --> may sleep

To fix it, the lock is released before oss_broken_sleep_on, and the lock 
is acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 sound/oss/sequencer.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/oss/sequencer.c b/sound/oss/sequencer.c
index f19da4b..3d95d752 100644
--- a/sound/oss/sequencer.c
+++ b/sound/oss/sequencer.c
@@ -1211,7 +1211,9 @@ static void midi_outc(int dev, unsigned char data)
 
spin_lock_irqsave(,flags);
while (n && !midi_devs[dev]->outputc(dev, data)) {
+   spin_unlock_irqrestore(, flags);
oss_broken_sleep_on(_sleeper, HZ/25);
+   spin_lock_irqsave(, flags);
n--;
}
spin_unlock_irqrestore(,flags);
-- 
1.7.9.5




[PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_attr_interfmode_store

2017-05-31 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
b43legacy_attr_interfmode_store (acquire the lock by spin_lock_irqsave)
  b43legacy_radio_set_interference_mitigation
b43legacy_radio_interference_mitigation_disable
  b43legacy_calc_nrssi_slope
b43legacy_synth_pu_workaround
  might_sleep and msleep --> may sleep

Fixing it may be complex, and a possible way is to remove 
spin_lock_irqsave and spin_lock_irqrestore in 
b43legacy_attr_interfmode_store, and the code has been protected by
mutex_lock and mutex_unlock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/wireless/broadcom/b43legacy/sysfs.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/broadcom/b43legacy/sysfs.c 
b/drivers/net/wireless/broadcom/b43legacy/sysfs.c
index 2a1da15..9ede143 100644
--- a/drivers/net/wireless/broadcom/b43legacy/sysfs.c
+++ b/drivers/net/wireless/broadcom/b43legacy/sysfs.c
@@ -137,14 +137,12 @@ static ssize_t b43legacy_attr_interfmode_store(struct 
device *dev,
}
 
mutex_lock(>wl->mutex);
-   spin_lock_irqsave(>wl->irq_lock, flags);
 
err = b43legacy_radio_set_interference_mitigation(wldev, mode);
if (err)
b43legacyerr(wldev->wl, "Interference Mitigation not "
   "supported by device\n");
mmiowb();
-   spin_unlock_irqrestore(>wl->irq_lock, flags);
mutex_unlock(>wl->mutex);
 
return err ? err : count;
-- 
1.7.9.5




[PATCH] ubifs: Fix a sleep-in-atomic bug in ubifs_read_nnode

2017-06-04 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
ubifs_change_lp (acquire the lock by spin_lock)
  change_category
ubifs_remove_from_cat
  remove_from_lpt_heap
dbg_check_heap
  ubifs_lpt_lookup
ubifs_read_nnode
  kzalloc(GFP_NOFS) --> may sleep

To fix it, "GFP_NOFS" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 fs/ubifs/lpt.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ubifs/lpt.c b/fs/ubifs/lpt.c
index 9a51710..4247934c 100644
--- a/fs/ubifs/lpt.c
+++ b/fs/ubifs/lpt.c
@@ -1205,7 +1205,7 @@ int ubifs_read_nnode(struct ubifs_info *c, struct 
ubifs_nnode *parent, int iip)
lnum = c->lpt_lnum;
offs = c->lpt_offs;
}
-   nnode = kzalloc(sizeof(struct ubifs_nnode), GFP_NOFS);
+   nnode = kzalloc(sizeof(struct ubifs_nnode), GFP_ATOMIC);
if (!nnode) {
err = -ENOMEM;
goto out;
-- 
1.7.9.5




[PATCH] ubifs: Fix a sleep-in-atomic bug in read_nnode

2017-06-04 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
ubifs_change_lp (acquire the lock by spin_lock)
  change_category
ubifs_remove_from_cat
  remove_from_lpt_heap
dbg_check_heap
  ubifs_lpt_lookup
ubifs_get_pnode
  read_pnode
kzalloc(GFP_NOFS) --> may sleep

To fix it, "GFP_NOFS" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 fs/ubifs/lpt.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ubifs/lpt.c b/fs/ubifs/lpt.c
index 9a51710..fabe2fd 100644
--- a/fs/ubifs/lpt.c
+++ b/fs/ubifs/lpt.c
@@ -1268,7 +1268,7 @@ static int read_pnode(struct ubifs_info *c, struct 
ubifs_nnode *parent, int iip)
branch = >nbranch[iip];
lnum = branch->lnum;
offs = branch->offs;
-   pnode = kzalloc(sizeof(struct ubifs_pnode), GFP_NOFS);
+   pnode = kzalloc(sizeof(struct ubifs_pnode), GFP_ATOMIC);
if (!pnode)
return -ENOMEM;
 
-- 
1.7.9.5




[PATCH] cachefiles: Fix a sleep-in-atomic bug in cachefiles_printk_object

2017-06-04 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, and the function 
call path is:
cachefiles_mark_object_active (acquire the lock by write_lock)
  cachefiles_printk_object
kmalloc(GFP_NOIO) --> may sleep
cachefiles_mark_object_buried (acquire the lock by write_lock)
  cachefiles_printk_object
kmalloc(GFP_NOIO) --> may sleep

To fix it, "GFP_NOIO" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 fs/cachefiles/namei.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 41df8a2..d0f76e7 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -83,7 +83,7 @@ static noinline void cachefiles_printk_object(struct 
cachefiles_object *object,
 {
u8 *keybuf;
 
-   keybuf = kmalloc(CACHEFILES_KEYBUF_SIZE, GFP_NOIO);
+   keybuf = kmalloc(CACHEFILES_KEYBUF_SIZE, GFP_ATOMIC);
if (object)
__cachefiles_printk_object(object, "", keybuf);
if (xobject)
-- 
1.7.9.5




[PATCH V2] staging: rt5208: Fix a sleep-in-atomic bug in xd_copy_page

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  xd_cleanup_work
xd_delay_write
  xd_finish_write
xd_copy_page
  wait_timeout
schedule_timeout --> may sleep

To fix it, "wait_timeout" is replaced with mdelay in xd_copy_page.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
V2:
* Add code explanation in this patch.

 drivers/staging/rts5208/xd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c
index 85aba05..74d36f9 100644
--- a/drivers/staging/rts5208/xd.c
+++ b/drivers/staging/rts5208/xd.c
@@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 
old_blk, u32 new_blk,
reg = 0;
rtsx_read_register(chip, XD_CTL, );
if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) {
-   wait_timeout(100);
+   mdelay(100);
 
if (detect_card_cd(chip,
   XD_CARD) != STATUS_SUCCESS) {
-- 
1.7.9.5




[PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

To fix it, the lock is released before copy_from_user, and the lock is
acquired again after this function. The parameter "flags" is used to
restore and save the irq status.


Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
V3:
* It corrects the mistakes of remaining legacy code in V2. 
  (Thank Ram for pointing it out)

V2:
* The parameter "flags" is added to restore and save the irq status.
  Thank Leon for good advice.

 drivers/infiniband/sw/rxe/rxe_verbs.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..5293d15 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct 
rxe_send_wr *wr,
 
 static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
 unsigned int mask, unsigned int length,
-struct rxe_send_wqe *wqe)
+struct rxe_send_wqe *wqe, unsigned long *flags)
 {
int num_sge = ibwr->num_sge;
struct ib_sge *sge;
-   int i;
+   int i, err;
u8 *p;
 
init_send_wr(qp, >wr, ibwr);
@@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
-   if (qp->is_user && copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
+   spin_unlock_irqrestore(>sq.sq_lock, *flags);
+   err = copy_from_user(p, (__user void *)
+   (uintptr_t)sge->addr, sge->length);
+   spin_lock_irqsave(>sq.sq_lock, *flags);
+   if (qp->is_user && err)
return -EFAULT;
 
else if (!qp->is_user)
@@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
send_wqe = producer_addr(sq->queue);
 
-   err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
+   err = init_send_wqe(qp, ibwr, mask, length, send_wqe, );
if (unlikely(err))
goto err1;
 
-- 
1.7.9.5




[PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
  init_send_wqe
copy_from_user --> may sleep

To fix it, the lock is released before copy_from_user, and the lock is
acquired again after this function. The parameter "flags" is used to
restore and save the irq status.


Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
V3:
* It corrects the mistakes of remaining legacy code in V2. 
  (Thank Ram for pointing it out)

V2:
* The parameter "flags" is added to restore and save the irq status.
  Thank Leon for good advice.
---
 drivers/infiniband/sw/rxe/rxe_verbs.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c 
b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..5293d15 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct 
rxe_send_wr *wr,
 
 static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
 unsigned int mask, unsigned int length,
-struct rxe_send_wqe *wqe)
+struct rxe_send_wqe *wqe, unsigned long *flags)
 {
int num_sge = ibwr->num_sge;
struct ib_sge *sge;
-   int i;
+   int i, err;
u8 *p;
 
init_send_wr(qp, >wr, ibwr);
@@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
-   if (qp->is_user && copy_from_user(p, (__user void *)
-   (uintptr_t)sge->addr, sge->length))
+   spin_unlock_irqrestore(>sq.sq_lock, *flags);
+   err = copy_from_user(p, (__user void *)
+   (uintptr_t)sge->addr, sge->length);
+   spin_lock_irqsave(>sq.sq_lock, *flags);
+   if (qp->is_user && err)
return -EFAULT;
 
else if (!qp->is_user)
@@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct 
ib_send_wr *ibwr,
 
send_wqe = producer_addr(sq->queue);
 
-   err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
+   err = init_send_wqe(qp, ibwr, mask, length, send_wqe, );
if (unlikely(err))
goto err1;
 
-- 
1.7.9.5




[PATCH V2] staging: rt5208: Fix a sleep-in-atomic bug in xd_copy_page

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  xd_cleanup_work
xd_delay_write
  xd_finish_write
xd_copy_page
  wait_timeout
schedule_timeout --> may sleep

To fix it, "wait_timeout" is replaced with mdelay in xd_copy_page.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
V2:
* Add code explanation in this patch.
---
 drivers/staging/rts5208/xd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/xd.c b/drivers/staging/rts5208/xd.c
index 85aba05..74d36f9 100644
--- a/drivers/staging/rts5208/xd.c
+++ b/drivers/staging/rts5208/xd.c
@@ -1268,7 +1268,7 @@ static int xd_copy_page(struct rtsx_chip *chip, u32 
old_blk, u32 new_blk,
reg = 0;
rtsx_read_register(chip, XD_CTL, );
if (reg & (XD_ECC1_ERROR | XD_ECC2_ERROR)) {
-   wait_timeout(100);
+   mdelay(100);
 
if (detect_card_cd(chip,
   XD_CARD) != STATUS_SUCCESS) {
-- 
1.7.9.5




[PATCH] fs: nfs: Fix a sleep-in-atomic bug in nfs_access_add_cache

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a rcu read lock, and function call path is:
nfs_permission (acquire the lock by rcu_read_lock)
  nfs_do_access
nfs_access_add_cache
  kmalloc(GFP_KERNEL) --> may sleep

To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 fs/nfs/dir.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 32ccd77..7a074db 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2333,7 +2333,7 @@ static void nfs_access_add_rbtree(struct inode *inode, 
struct nfs_access_entry *
 
 void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set)
 {
-   struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_KERNEL);
+   struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_ATOMIC);
if (cache == NULL)
return;
RB_CLEAR_NODE(>rb_node);
-- 
1.7.9.5




[PATCH] fs: xfs: Fix a lock-twice and sleep-in-atomic bug in xfs_iget

2017-06-05 Thread Jia-Ju Bai
The driver may sleep under a read rcu lock, and function call path is:
xfs_iget (acquire the lock by rcu_read_lock)
  "goto out_error_or_again" after xfs_iget_cache_hit
delay
  schedule_timeout_uninterruptible --> may sleep
Meanwhile, the rcu_read_lock will be called twice in this situation.

To fix it, the lock is released before "goto".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 fs/xfs/xfs_icache.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f61c84f8..c2a4722 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -600,8 +600,10 @@ struct xfs_inode *
 
if (ip) {
error = xfs_iget_cache_hit(pag, ip, ino, flags, lock_flags);
-   if (error)
+   if (error) {
+   rcu_read_unlock();
goto out_error_or_again;
+   }
} else {
rcu_read_unlock();
XFS_STATS_INC(mp, xs_ig_missed);
-- 
1.7.9.5




[PATCH] rts5208: Fix a sleep-in-atomic bug in sd_send_cmd_get_rsp

2017-06-13 Thread Jia-Ju Bai

The driver may sleep under a spin lock, and the function call path is:
rtsx_exclusive_enter_ss (acquire the lock by spin_lock)
  rtsx_enter_ss
rtsx_power_off_card
  sd_cleanup_work
sd_stop_seq_mode
  sd_switch_clock
sd_ddr_tuning
  sd_ddr_pre_tuning_tx
sd_send_cmd_get_rsp
  wait_timeout
schedule_timeout -->  may sleep

To fix it, "wait_timeout" is replaced with mdelay in sd_send_cmd_get_rsp.

Signed-off-by: Jia-Ju Bai<baijiaju1...@163.com>
---
 drivers/staging/rts5208/sd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rts5208/sd.c b/drivers/staging/rts5208/sd.c
index bdd35b6..fed17ff 100644
--- a/drivers/staging/rts5208/sd.c
+++ b/drivers/staging/rts5208/sd.c
@@ -226,7 +226,7 @@ static int sd_send_cmd_get_rsp(struct rtsx_chip *chip, u8 
cmd_idx,
return STATUS_FAIL;
}
if (rty_cnt<  SD_MAX_RETRY_COUNT) {
-   wait_timeout(20);
+   mdelay(20);
rty_cnt++;
goto RTY_SEND_CMD;
} else {
-- 1.7.9.5




[PATCH] net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse

2017-06-10 Thread Jia-Ju Bai
The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
function call path is:
tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
  tipc_rcv
tipc_sk_rcv
  tipc_msg_reverse
pskb_expand_head(GFP_KERNEL) --> may sleep
tipc_node_broadcast
  tipc_node_xmit_skb
tipc_node_xmit
  tipc_sk_rcv
tipc_msg_reverse
  pskb_expand_head(GFP_KERNEL) --> may sleep

To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 net/tipc/msg.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index 312ef7d..ab30876 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -508,7 +508,7 @@ bool tipc_msg_reverse(u32 own_node,  struct sk_buff **skb, 
int err)
}
 
if (skb_cloned(_skb) &&
-   pskb_expand_head(_skb, BUF_HEADROOM, BUF_TAILROOM, GFP_KERNEL))
+   pskb_expand_head(_skb, BUF_HEADROOM, BUF_TAILROOM, GFP_ATOMIC))
goto exit;
 
/* Now reverse the concerned fields */
-- 
1.7.9.5




[PATCH] net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx

2017-06-10 Thread Jia-Ju Bai
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
  cfctrl_linkdown_req
cfpkt_create
  cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
  cfpkt_split
cfpkt_create_pfx
  alloc_skb(GFP_KERNEL) --> may sleep

There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function 
is called under a rcu read lock, instead in interrupt.

To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 net/caif/cfpkt_skbuff.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/caif/cfpkt_skbuff.c b/net/caif/cfpkt_skbuff.c
index 59ce1fc..71b6ab2 100644
--- a/net/caif/cfpkt_skbuff.c
+++ b/net/caif/cfpkt_skbuff.c
@@ -81,11 +81,7 @@ static struct cfpkt *cfpkt_create_pfx(u16 len, u16 pfx)
 {
struct sk_buff *skb;
 
-   if (likely(in_interrupt()))
-   skb = alloc_skb(len + pfx, GFP_ATOMIC);
-   else
-   skb = alloc_skb(len + pfx, GFP_KERNEL);
-
+   skb = alloc_skb(len + pfx, GFP_ATOMIC);
if (unlikely(skb == NULL))
return NULL;
 
-- 
1.7.9.5




[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_hw_read_wx_2M and netxen_nic_hw_write_wx_2M

2017-06-18 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, and function call path is:
netxen_nic_hw_read_wx_2M (acquire the lock by write_lock_irqsave)
  crb_win_lock
netxen_pcie_sem_lock
  msleep --> may sleep
netxen_nic_hw_write_wx_2M (acquire the lock by write_lock_irqsave)
  crb_win_lock
netxen_pcie_sem_lock
  msleep --> may sleep

To fix it, the "msleep" is replaced with "mdelay" in netxen_pcie_sem_lock.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c 
b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
index a996801..0a9da42 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
@@ -329,7 +329,7 @@ static void __iomem *pci_base_offset(struct netxen_adapter 
*adapter,
break;
if (++timeout >= NETXEN_PCIE_SEM_TIMEOUT)
return -EIO;
-   msleep(1);
+   mdelay(1);
}
 
if (id_reg)
-- 
1.7.9.5




[PATCH] qla4xxx: Fix a sleep-in-atomic bug in qla4_82xx_crb_win_lock

2017-06-18 Thread Jia-Ju Bai
The driver may sleep under a write spin lock, the function call path is:
qla4_82xx_wr_32 (acquire the lock)
  qla4_82xx_crb_win_lock
  schedule or cpu_relax

To fix it, the lock is released before "schedule" and "cpu_relax",
and the lock is acquired again after "schedule" and "cpu_relax".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/scsi/qla4xxx/ql4_glbl.h |2 +-
 drivers/scsi/qla4xxx/ql4_nx.c   |8 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qla4xxx/ql4_glbl.h b/drivers/scsi/qla4xxx/ql4_glbl.h
index bce96a5..b723bef 100644
--- a/drivers/scsi/qla4xxx/ql4_glbl.h
+++ b/drivers/scsi/qla4xxx/ql4_glbl.h
@@ -115,7 +115,7 @@ uint8_t qla4xxx_update_local_ifcb(struct scsi_qla_host *ha,
 void qla4_82xx_queue_iocb(struct scsi_qla_host *ha);
 void qla4_82xx_complete_iocb(struct scsi_qla_host *ha);
 
-int qla4_82xx_crb_win_lock(struct scsi_qla_host *);
+int qla4_82xx_crb_win_lock(struct scsi_qla_host *, unsigned long *);
 void qla4_82xx_crb_win_unlock(struct scsi_qla_host *);
 int qla4_82xx_pci_get_crb_addr_2M(struct scsi_qla_host *, ulong *);
 void qla4_82xx_wr_32(struct scsi_qla_host *, ulong, u32);
diff --git a/drivers/scsi/qla4xxx/ql4_nx.c b/drivers/scsi/qla4xxx/ql4_nx.c
index e91abb3..1cf5f4a 100644
--- a/drivers/scsi/qla4xxx/ql4_nx.c
+++ b/drivers/scsi/qla4xxx/ql4_nx.c
@@ -386,7 +386,7 @@
 
if (rv == 1) {
write_lock_irqsave(>hw_lock, flags);
-   qla4_82xx_crb_win_lock(ha);
+   qla4_82xx_crb_win_lock(ha, );
qla4_82xx_pci_set_crbwindow_2M(ha, );
}
 
@@ -410,7 +410,7 @@ uint32_t qla4_82xx_rd_32(struct scsi_qla_host *ha, ulong 
off)
 
if (rv == 1) {
write_lock_irqsave(>hw_lock, flags);
-   qla4_82xx_crb_win_lock(ha);
+   qla4_82xx_crb_win_lock(ha, );
qla4_82xx_pci_set_crbwindow_2M(ha, );
}
data = readl((void __iomem *)off);
@@ -476,7 +476,7 @@ int qla4_82xx_md_wr_32(struct scsi_qla_host *ha, uint32_t 
off, uint32_t data)
 
 #define CRB_WIN_LOCK_TIMEOUT 1
 
-int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
+int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha, unsigned long *flags)
 {
int i;
int done = 0, timeout = 0;
@@ -491,6 +491,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
 
timeout++;
 
+   write_unlock_irqrestore(>hw_lock, *flags);
/* Yield CPU */
if (!in_interrupt())
schedule();
@@ -498,6 +499,7 @@ int qla4_82xx_crb_win_lock(struct scsi_qla_host *ha)
for (i = 0; i < 20; i++)
cpu_relax();/*This a nop instr on i386*/
}
+   write_lock_irqsave(>hw_lock, *flags);
}
qla4_82xx_wr_32(ha, QLA82XX_CRB_WIN_LOCK_ID, ha->func_num);
return 0;
-- 
1.7.9.5




[PATCH] oss: Fix a sleep-in-atomic bug in midi_outc

2017-06-18 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
midi_outc (acquire the lock by spin_lock_irqsave)
  oss_broken_sleep_on
schedule_timeout --> may sleep

To fix it, the lock is released before oss_broken_sleep_on, and the lock 
is acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 sound/oss/sequencer.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/oss/sequencer.c b/sound/oss/sequencer.c
index f19da4b..3d95d752 100644
--- a/sound/oss/sequencer.c
+++ b/sound/oss/sequencer.c
@@ -1211,7 +1211,9 @@ static void midi_outc(int dev, unsigned char data)
 
spin_lock_irqsave(,flags);
while (n && !midi_devs[dev]->outputc(dev, data)) {
+   spin_unlock_irqrestore(, flags);
oss_broken_sleep_on(_sleeper, HZ/25);
+   spin_lock_irqsave(, flags);
n--;
}
spin_unlock_irqrestore(,flags);
-- 
1.7.9.5




[PATCH] netxen: Fix a sleep-in-atomic bug in netxen_nic_pci_mem_access_direct

2017-06-18 Thread Jia-Ju Bai
The driver may sleep under a spin lock, and the function call path is:
netxen_nic_pci_mem_access_direct (acquire the lock by spin_lock)
  ioremap --> may sleep

To fix it, the lock is released before "ioremap", and the lock is 
acquired again after this function.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c 
b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
index a996801..5ea553e 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
@@ -1419,7 +1419,9 @@ static u32 netxen_nic_io_read_2M(struct netxen_adapter 
*adapter,
 
mem_base = pci_resource_start(adapter->pdev, 0) +
(start & PAGE_MASK);
+   spin_unlock(>ahw.mem_lock);
mem_ptr = ioremap(mem_base, PAGE_SIZE);
+   spin_lock(>ahw.mem_lock);
if (mem_ptr == NULL) {
ret = -EIO;
goto unlock;
-- 
1.7.9.5




[PATCH] qed: Fix a sleep-in-interrupt bug in qed_int_sp_dpc

2017-05-31 Thread Jia-Ju Bai
The driver may sleep in interrupt handling, and the function call path is:
qed_int_sp_dpc (tasklet_init indicates it handles interrupt)
  qed_int_attentions
qed_mcp_handle_events
  qed_mcp_handle_link_change
qed_link_update
  qed_fill_link
qed_mcp_get_media_type
  qed_ptt_acquire
usleep_range --> may sleep

To fix it, the "usleep_range" is replaced with "udelay".

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/ethernet/qlogic/qed/qed_hw.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_hw.c 
b/drivers/net/ethernet/qlogic/qed/qed_hw.c
index a05feb3..3250cc4 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_hw.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_hw.c
@@ -131,7 +131,7 @@ struct qed_ptt *qed_ptt_acquire(struct qed_hwfn *p_hwfn)
}
 
spin_unlock_bh(_hwfn->p_ptt_pool->lock);
-   usleep_range(1000, 2000);
+   udelay(1500);
}
 
DP_NOTICE(p_hwfn, "PTT acquire timeout - failed to allocate PTT\n");
-- 
1.7.9.5




[PATCH v2 2/3] rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd

2017-10-08 Thread Jia-Ju Bai
The driver may sleep under a spinlock, and the function call path is:
rtw_surveydone_event_callback(acquire the spinlock)
  rtw_createbss_cmd
kzalloc(GFP_KERNEL) --> may sleep

To fix it, GFP_KERNEL is replaced with GFP_ATOMIC.
This bug is found by my static analysis tool and my code review.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rtl8188eu/core/rtw_cmd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c 
b/drivers/staging/rtl8188eu/core/rtw_cmd.c
index 9461bce..430b8eb 100644
--- a/drivers/staging/rtl8188eu/core/rtw_cmd.c
+++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c
@@ -333,7 +333,7 @@ u8 rtw_createbss_cmd(struct adapter  *padapter)
else
RT_TRACE(_module_rtl871x_cmd_c_, _drv_info_, (" createbss for 
SSid:%s\n", pmlmepriv->assoc_ssid.Ssid));
 
-   pcmd = kzalloc(sizeof(struct cmd_obj), GFP_KERNEL);
+   pcmd = kzalloc(sizeof(struct cmd_obj), GFP_ATOMIC);
if (!pcmd) {
res = _FAIL;
goto exit;
-- 
1.7.9.5




[PATCH v2 1/3] rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd

2017-10-08 Thread Jia-Ju Bai
The driver may sleep under a spinlock, and the function call path is:
rtw_set_802_11_bssid(acquire the spinlock)
  rtw_disassoc_cmd
kzalloc(GFP_KERNEL) --> may sleep

To fix it, GFP_KERNEL is replaced with GFP_ATOMIC.
This bug is found by my static analysis tool and my code review.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rtl8188eu/core/rtw_cmd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c 
b/drivers/staging/rtl8188eu/core/rtw_cmd.c
index 9461bce..65083a7 100644
--- a/drivers/staging/rtl8188eu/core/rtw_cmd.c
+++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c
@@ -508,7 +508,7 @@ u8 rtw_disassoc_cmd(struct adapter *padapter, u32 
deauth_timeout_ms, bool enqueu
 
if (enqueue) {
/* need enqueue, prepare cmd_obj and enqueue */
-   cmdobj = kzalloc(sizeof(*cmdobj), GFP_KERNEL);
+   cmdobj = kzalloc(sizeof(*cmdobj), GFP_ATOMIC);
if (!cmdobj) {
res = _FAIL;
kfree(param);
-- 
1.7.9.5




[PATCH v2 3/3] rtl8188eu: Fix a possible sleep-in-atomic bug in _rtw_pwr_wakeup

2017-10-08 Thread Jia-Ju Bai
The driver may sleep under a spinlock, and the function call path is:
rtw_set_802_11_disassociate(acquire the spinlock)
  _rtw_pwr_wakeup
usleep_range --> may sleep

To fix it, usleep_range is replaced with udelay.
This bug is found by my static analysis tool and my code review.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rtl8188eu/core/rtw_pwrctrl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c 
b/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c
index f86c9ce..2913661 100644
--- a/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c
+++ b/drivers/staging/rtl8188eu/core/rtw_pwrctrl.c
@@ -569,7 +569,7 @@ int _rtw_pwr_wakeup(struct adapter *padapter, u32 
ips_deffer_ms, const char *cal
DBG_88E("%s wait ps_processing...\n", __func__);
while (pwrpriv->ps_processing &&
   jiffies_to_msecs(jiffies - start) <= 3000)
-   usleep_range(1000, 3000);
+   udelay(1500);
if (pwrpriv->ps_processing)
DBG_88E("%s wait ps_processing timeout\n", __func__);
else
-- 
1.7.9.5




[PATCH] arcmsr: Fix possible sleep-in-atomic bugs in arcmsr_queue_command

2017-10-08 Thread Jia-Ju Bai
The driver may sleep under a spinlock, and the function call paths are:
arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaA_stop_bgrb
  arcmsr_hbaA_wait_msgint_ready
msleep --> may sleep

arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaB_stop_bgrb
  arcmsr_hbaB_wait_msgint_ready
msleep --> may sleep

arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaC_stop_bgrb
  arcmsr_hbaC_wait_msgint_ready
msleep --> may sleep

arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaD_stop_bgrb
  arcmsr_hbaD_wait_msgint_ready
msleep --> may sleep

To fix them, msleep is replaced with mdelay.
These bugs are found by my static analysis tool and my code review.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/scsi/arcmsr/arcmsr_hba.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index af032c4..31d94bd 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -347,7 +347,7 @@ static uint8_t arcmsr_hbaA_wait_msgint_ready(struct 
AdapterControlBlock *acb)
>outbound_intstatus);
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
 
return false;
@@ -367,7 +367,7 @@ static uint8_t arcmsr_hbaB_wait_msgint_ready(struct 
AdapterControlBlock *acb)
reg->drv2iop_doorbell);
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
 
return false;
@@ -385,7 +385,7 @@ static uint8_t arcmsr_hbaC_wait_msgint_ready(struct 
AdapterControlBlock *pACB)
>outbound_doorbell_clear); /*clear 
interrupt*/
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
 
return false;
@@ -403,7 +403,7 @@ static bool arcmsr_hbaD_wait_msgint_ready(struct 
AdapterControlBlock *pACB)
reg->outbound_doorbell);
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
return false;
 }
-- 
1.7.9.5




[PATCH] wcn36xx: Remove unnecessary rcu_read_unlock in wcn36xx_bss_info_changed

2017-10-08 Thread Jia-Ju Bai
No rcu_read_lock is called, but rcu_read_unlock is still called.
Thus rcu_read_unlock should be removed.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/net/wireless/ath/wcn36xx/main.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/main.c 
b/drivers/net/wireless/ath/wcn36xx/main.c
index 35bd50b..b83f01d 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -812,7 +812,6 @@ static void wcn36xx_bss_info_changed(struct ieee80211_hw 
*hw,
if (!sta) {
wcn36xx_err("sta %pM is not found\n",
  bss_conf->bssid);
-   rcu_read_unlock();
goto out;
}
sta_priv = wcn36xx_sta_to_priv(sta);
-- 
1.7.9.5




[BUG] gma500: Possible sleep-in-atomic bugs in gma_resume_pci

2017-10-08 Thread Jia-Ju Bai

The driver may sleep under a spinlock, and the function call paths are:
gma_power_begin (acquire the spinlock)
  gma_resume_pci
pci_set_power_state
  __pci_start_power_transition (drivers/pci/pci.c)
msleep --> may sleep

gma_power_begin (acquire the spinlock)
  gma_resume_pci
pci_enable_device
  pci_enable_device_flags (drivers/pci/pci.c)
do_pci_enable_device
  pci_set_power_state
__pci_start_power_transition
  msleep --> may sleep

A possible fix is to replace msleep with mdelay in 
__pci_start_power_transition in drivers/pci/pci.c.


These bugs are found by my static analysis tool and my code review.


Thanks,
Jia-Ju Bai



[BUG] rtl8188eu: Some possible sleep-in-atomic bugs in ips_leave

2017-10-08 Thread Jia-Ju Bai

CC to mailing list.

On 2017/10/8 20:13, Jia-Ju Bai wrote:
The driver may sleep under a spinlock when calling the function 
"ips_leave", which causes some possible sleep-in-atomic bugs.

Here are several examples:
rtw_set_802_11_disassociate (acquire the spinlock)
  _rtw_pwr_wakeup
ips_leave
  mutex_lock --> may sleep

rtw_set_802_11_disassociate (acquire the spinlock)
  _rtw_pwr_wakeup
ips_leave
  rtw_ips_pwr_up
ips_netdrv_open
  rtw_hal_init
rtl8188eu_hal_init
  rtl88eu_download_fw
request_firmware --> may sleep
kmalloc --> may sleep

rtw_set_802_11_disassociate (acquire the spinlock)
  _rtw_pwr_wakeup
ips_leave
  rtw_set_key
kzalloc(GFP_KERNEL) --> may sleep

All these bugs are caused by that "ips_leave" calls some sleep-able 
functions.
A possible fix is to release the spinlock before calling "ips_leave", 
and acquire the spinlock again after it.


These bugs are found by my static analysis tool and my code review.


Thanks,
Jia-Ju Bai





[PATCH] staging/rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd

2017-10-07 Thread Jia-Ju Bai
The driver may sleep under a spinlock, and the function call path is:
rtw_set_802_11_bssid(acquire the spinlock)
  rtw_disassoc_cmd
kzalloc(GFP_KERNEL) --> may sleep

To fix it, GFP_KERNEL is replaced with GFP_ATOMIC.
This bug is found by my static analysis tool and my code review.

Signed-off-by: Jia-Ju Bai <baijiaju1...@163.com>
---
 drivers/staging/rtl8188eu/core/rtw_cmd.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c 
b/drivers/staging/rtl8188eu/core/rtw_cmd.c
index 9461bce..65083a7 100644
--- a/drivers/staging/rtl8188eu/core/rtw_cmd.c
+++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c
@@ -508,7 +508,7 @@ u8 rtw_disassoc_cmd(struct adapter *padapter, u32 
deauth_timeout_ms, bool enqueu
 
if (enqueue) {
/* need enqueue, prepare cmd_obj and enqueue */
-   cmdobj = kzalloc(sizeof(*cmdobj), GFP_KERNEL);
+   cmdobj = kzalloc(sizeof(*cmdobj), GFP_ATOMIC);
if (!cmdobj) {
res = _FAIL;
kfree(param);
-- 
1.7.9.5




[BUG] haswell: A possible sleep-in-atomic bug in hsw_irq_thread

2017-10-08 Thread Jia-Ju Bai

According to sst-haswell-ipc.c, the driver may sleep under a spinlock,
and the function call path is:
hsw_irq_thread (acquire the spinlock)
  hsw_process_notification
hsw_log_message
  mutex_lock --> may sleep

This bug is found by my static analysis tool and my code review.


Thanks,
Jia-Ju Bai



[BUG] stmmac: A possible sleep-in-atomic bug in stmmac_suspend

2017-10-08 Thread Jia-Ju Bai

According to stmmac_main.c, the driver may sleep under a spinlock,
and the function call path is:
stmmac_suspend (acquire the spinlock)
  stmmac_disable_all_queues
napi_disable
  might_sleep --> may sleep
  msleep --> may sleep

This bug is found by my static analysis tool and my code review.


Thanks,
Jia-Ju Bai



Re: [PATCH] ext2/super: Fix a possible sleep-in-atomic bug in parse_options

2017-10-06 Thread Jia-Ju Bai

Thanks for your reply.
I agree that extra allocation in match_number() and match_u64int() may 
be unnecessary.


Thanks,
Jia-Ju Bai


On 2017/10/7 9:37, Linus Torvalds wrote:

On Fri, Oct 6, 2017 at 6:20 PM, Jia-Ju Bai <baijiaju1...@163.com> wrote:

To fix it, GFP_KERNEL is replaced with GFP_ATOMIC.
This bug is found by my static analysis tool and my code review.

I'm not saying your patch is wrong, but it's a shame that we do that
extra allocation in match_number() and match_u64int(), and that we
don't have anything that is just size-limited.

And there really isn't anything saying that we shouldn't do the same
silly thing to match_u64int(). Maybe we don't have any actual users
that need it for now, but still..

Oh well.

I do wonder if we shouldn't just use something like

  "skip leading zeroes, copy to size-limited stack location instead"

because the input length really *is* limited once you skip leading
zeroes (and whatever base marker we have). We might have at most a
64-bit value in octal, so 22 bytes max.

But I guess just changing the two GFP_KERNEL's to GFP_ATOMIC is much simpler.

Linus





  1   2   3   4   5   6   7   8   9   10   >