Re: WARN_ON(irqs_disabled()) in dma_free_attrs?

2018-03-04 Thread Alan Stern
On Sun, 4 Mar 2018, Fredrik Noring wrote:

> Hi Alan Stern,
> 
> > > Alan, can dma_free_coherent be delayed to a point when IRQs are enabled?
> > 
> > Yes, subject to the usual concerns about not being delayed for too 
> > long.  Also, some HCDs are highly memory-constrained.  I don't know if 
> > they use this API, but if they do then delaying a free could result in 
> > not enough memory being available when an allocation is needed.
> 
> The PS2 HCD in this case is limited to 256 KiB and memory allocation
> failures can be observed frequently in USB traces (102 subsequent failures
> in the trace below involving mass storage transactions, for example). Is
> there any smartness to the allocations?

Not as far as I know.

>  My impression is that the USB core
> loops until it gets what it wants, and then happily resumes. Does it busy
> wait?

No, it doesn't loop and it doesn't busy-wait.  It just fails if memory 
can't be allocated immediately.  Maybe the higher-level driver has a 
retry loop of some sort.

> The RT3070 wireless USB device driver, for example, has hardcoded buffer
> limits that exceed the total amount of available memory. It refuses to
> accept devices unless adjusted in the source (as in the patch below), after
> which it works nicely.
> 
> Other USB device drivers such as the one for the AR9271 wireless device
> trigger endlessly repeating kernel warnings claiming
> 
>   BOGUS urb xfer, pipe 1 != type 3
> 
> as shown in this kernel backtrace:

...

> I don't know if this is related to memory limitations or some other problem
> though.

Another problem.  This message indicates there's a bug in the ath9k
driver; it says that the driver submitted an interrupt URB for a bulk
endpoint.

Alan Stern

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4] iommu/amd: Add support for fast IOTLB flushing

2018-03-04 Thread Suravee Suthikulpanit

Ping..

Joerg, when you get a chance, would you please let me know if you have any 
other concerns for this v4.

Thanks,
Suravee

On 2/21/18 2:19 PM, Suravee Suthikulpanit wrote:

Since AMD IOMMU driver currently flushes all TLB entries
when page size is more than one, use the same interface
for both iommu_ops.flush_iotlb_all() and iommu_ops.iotlb_sync().

Cc: Joerg Roedel 
Signed-off-by: Suravee Suthikulpanit 
---
Changes from v3 (https://patchwork.kernel.org/patch/10193235)
  * Change amd_iommu_iotlb_range_add() to no-op and iotlb_sync()
to full domain flush for now since we currently flush all entries
when the page size is more than one.
  * Fine-grained invalidation will be introduced in subsequent
patch series.

  drivers/iommu/amd_iommu.c | 19 ---
  1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fed8059..6061a8d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3043,9 +3043,6 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, 
unsigned long iova,
unmap_size = iommu_unmap_page(domain, iova, page_size);
mutex_unlock(>api_lock);
  
-	domain_flush_tlb_pde(domain);

-   domain_flush_complete(domain);
-
return unmap_size;
  }
  
@@ -3163,6 +3160,19 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,

return dev_data->defer_attach;
  }
  
+static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)

+{
+   struct protection_domain *dom = to_pdomain(domain);
+
+   domain_flush_tlb_pde(dom);
+   domain_flush_complete(dom);
+}
+
+static void amd_iommu_iotlb_range_add(struct iommu_domain *domain,
+ unsigned long iova, size_t size)
+{
+}
+
  const struct iommu_ops amd_iommu_ops = {
.capable = amd_iommu_capable,
.domain_alloc = amd_iommu_domain_alloc,
@@ -3181,6 +3191,9 @@ static bool amd_iommu_is_attach_deferred(struct 
iommu_domain *domain,
.apply_resv_region = amd_iommu_apply_resv_region,
.is_attach_deferred = amd_iommu_is_attach_deferred,
.pgsize_bitmap  = AMD_IOMMU_PGSIZES,
+   .flush_iotlb_all = amd_iommu_flush_iotlb_all,
+   .iotlb_range_add = amd_iommu_iotlb_range_add,
+   .iotlb_sync = amd_iommu_flush_iotlb_all,
  };
  
  /*



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: WARN_ON(irqs_disabled()) in dma_free_attrs?

2018-03-04 Thread Fredrik Noring
Hi Alan Stern,

> > Alan, can dma_free_coherent be delayed to a point when IRQs are enabled?
> 
> Yes, subject to the usual concerns about not being delayed for too 
> long.  Also, some HCDs are highly memory-constrained.  I don't know if 
> they use this API, but if they do then delaying a free could result in 
> not enough memory being available when an allocation is needed.

The PS2 HCD in this case is limited to 256 KiB and memory allocation
failures can be observed frequently in USB traces (102 subsequent failures
in the trace below involving mass storage transactions, for example). Is
there any smartness to the allocations? My impression is that the USB core
loops until it gets what it wants, and then happily resumes. Does it busy
wait?

The RT3070 wireless USB device driver, for example, has hardcoded buffer
limits that exceed the total amount of available memory. It refuses to
accept devices unless adjusted in the source (as in the patch below), after
which it works nicely.

Other USB device drivers such as the one for the AR9271 wireless device
trigger endlessly repeating kernel warnings claiming

BOGUS urb xfer, pipe 1 != type 3

as shown in this kernel backtrace:

[ cut here ]
WARNING: CPU: 0 PID: 15 at drivers/usb/core/urb.c:471 usb_submit_urb+0x280/0x528
usb 1-1: BOGUS urb xfer, pipe 1 != type 3
Modules linked in: ath9k_htc ath9k_common ath9k_hw ath mac80211 cfg80211 usbmon
CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: GW4.15.0+ #831
Workqueue: events request_firmware_work_func
Stack : 0009 01d7 0001 81714028 81714000 8006161c 80519af4 000f
805e3b08 01d7 80519a68 81cd3ad4 81714000 10058c00 81cd3aa8 80587340
  03f8 0007  03f8  
0040 0008  81ccd288   803036a8 8053e854
0009 01d7 0001 81714028 0008 ffc7  805e
...
Call Trace:
[<2318fb9b>] show_stack+0x74/0x104
[] __warn+0x118/0x120
[<01359d10>] warn_slowpath_fmt+0x30/0x3c
[] usb_submit_urb+0x280/0x528
[<89eb7b59>] hif_usb_send+0x3b0/0x3e0 [ath9k_htc]
[<51446a9c>] ath9k_wmi_cmd+0x194/0x228 [ath9k_htc]
[<2119badc>] ath9k_regread+0x5c/0x88 [ath9k_htc]
[] ath9k_hw_wait+0xa4/0xc0 [ath9k_hw]
[<368b3c01>] ath9k_hw_set_reset_reg+0x23c/0x288 [ath9k_hw]
[<28807ea6>] ath9k_hw_init+0x3e8/0xa24 [ath9k_hw]
[] ath9k_htc_probe_device+0x38c/0xa2c [ath9k_htc]
[] ath9k_htc_hw_init+0x20/0x4c [ath9k_htc]
[<6500cd86>] ath9k_hif_usb_firmware_cb+0x780/0x854 [ath9k_htc]
[] request_firmware_work_func+0x40/0x70
[<54f8834a>] process_one_work+0x1f4/0x358
[<23b84d12>] worker_thread+0x354/0x4b8
[<583bf61b>] kthread+0x134/0x13c
[<213b5c9e>] ret_from_kernel_thread+0x14/0x1c
---[ end trace 2a97c69dce30b650 ]---

I don't know if this is related to memory limitations or some other problem
though.

Fredrik

--- a/drivers/net/wireless/ralink/rt2x00/rt2800usb.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2800usb.c
@@ -871,7 +871,7 @@ static void rt2800usb_queue_init(struct data_queue *queue)
 
switch (queue->qid) {
case QID_RX:
-   queue->limit = 128;
+   queue->limit = 8;
queue->data_size = AGGREGATION_SIZE;
queue->desc_size = RXINFO_DESC_SIZE;
queue->winfo_size = rxwi_size;
@@ -882,7 +882,7 @@ static void rt2800usb_queue_init(struct data_queue *queue)
case QID_AC_VI:
case QID_AC_BE:
case QID_AC_BK:
-   queue->limit = 16;
+   queue->limit = 8;
queue->data_size = AGGREGATION_SIZE;
queue->desc_size = TXINFO_DESC_SIZE;
queue->winfo_size = txwi_size;

...
81f58280 299082198 C Bi:1:003:1 0 13 = 55534253 ...
81f58280 299082510 S Bo:1:003:2 -150 31 = 55534243 ...
81f58280 299083200 C Bo:1:003:2 0 31 >
81f9aa00 299083298 S Bi:1:003:1 -150 65536 <
81f9a800 299093849 S Ci:1:002:0 s c0 07  1700 0004 4 <
81f9ab00 299095041 S Bi:1:003:1 -150 8192 <
81f9a800 299095229 C Ci:1:002:0 0 4 = 2d00
81f9a680 299096808 S Bi:1:003:1 -150 49152 <
81f9a680 299096838 E Bi:1:003:1 -12 0
81f9a680 299097868 S Bi:1:003:1 -150 49152 <
81f9a680 299097896 E Bi:1:003:1 -12 0
81f9a680 299098894 S Bi:1:003:1 -150 49152 <
... 97 allocation failures repeated ...
81f9a680 299151156 S Bi:1:003:1 -150 49152 <
81f9a680 299151166 E Bi:1:003:1 -12 0
81f9a680 299151179 S Bi:1:003:1 -150 49152 <
81f9a680 299151190 E Bi:1:003:1 -12 0
81f9a680 299151203 S Bi:1:003:1 -150 49152 <
81f9a680 299151216 E Bi:1:003:1 -12 0
81f9a680 299151231 S Bi:1:003:1 -150 49152 <
81f9a680 299159909 S Bi:1:003:1 -150 49152 <
81f9a680 299160814 S Bi:1:003:1 -150 49152 <
81f9a680 299161732 S Bi:1:003:1 -150 49152 <
81f9a680 299162644 S Bi:1:003:1 -150 49152 <
81f9a680 299163574 S Bi:1:003:1 -150 49152 <
81f9a680 299164490 S Bi:1:003:1 -150 49152 <
81f9aa00 299172593 C Bi:1:003:1 0 65536 = f7b07b39 ...
81f9ab00 299175259 C Bi:1:003:1 0 8192 = 26e2c254 

Re: WARN_ON(irqs_disabled()) in dma_free_attrs?

2018-03-04 Thread Alan Stern
On Sat, 3 Mar 2018, Fredrik Noring wrote:

> Christoph, Alan,
> 
> > If it is allocating / freeing this memory all the time in the hot path
> > it should really use a dma pool (see include/ilinux/dmapool.h).
> > The dma coherent APIs aren't really built for being called in the
> > hot path.
> 
> hcd_buffer_free uses a combination of dma pools and dma coherent APIs:
> 
>   ...
>   for (i = 0; i < HCD_BUFFER_POOLS; i++) {
>   if (size <= pool_max[i]) {
>   dma_pool_free(hcd->pool[i], addr, dma);
>   return;
>   }
>   }
>   dma_free_coherent(hcd->self.sysdev, size, addr, dma);
> 
> Alan, can dma_free_coherent be delayed to a point when IRQs are enabled?

Yes, subject to the usual concerns about not being delayed for too 
long.  Also, some HCDs are highly memory-constrained.  I don't know if 
they use this API, but if they do then delaying a free could result in 
not enough memory being available when an allocation is needed.

Alan Stern

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu