Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code

2019-01-14 Thread Sibren Vasse
On Mon, 14 Jan 2019 at 19:13, Christoph Hellwig wrote:
>
> Hmm, I wonder if we are not actually using swiotlb in the end,
> can you check if your dmesg contains this line or not?
>
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
This line does not appear in my dmesg.

>
> If not I guess we found a bug in swiotlb exit vs is_swiotlb_buffer,
> and you can try this patch:
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index d6361776dc5c..1fb6fd68b9c7 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -378,6 +378,8 @@ void __init swiotlb_exit(void)
>  		memblock_free_late(io_tlb_start,
>  				   PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
>  	}
> +	io_tlb_start = 0;
> +	io_tlb_end = 0;
>  	io_tlb_nslabs = 0;
>  	max_segment = 0;
>  }
With the patch applied to v5.0-rc2 I can no longer reproduce the issue.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code

2019-01-14 Thread Sibren Vasse
On Mon, 14 Jan 2019 at 19:10, Christoph Hellwig wrote:
>
> On Thu, Jan 10, 2019 at 06:52:26PM +0100, Sibren Vasse wrote:
> > On Thu, 10 Jan 2019 at 15:48, Christoph Hellwig wrote:
> > >
> > > On Thu, Jan 10, 2019 at 03:00:31PM +0100, Christian König wrote:
> > > > >>  From the trace it looks like we hit the case where swiotlb tries
> > > >> to copy back data from a bounce buffer, but hits a dangling or NULL
> > > >> pointer.  So a couple questions for the submitter:
> > > >>
> > > >>   - does the system have more than 4GB memory and thus use swiotlb?
> > > >> (check /proc/meminfo, and if something SWIOTLB appears in dmesg)
> > > >>   - does the device this happens on have a DMA mask smaller than
> > > >> the available memory, that is should swiotlb be used here to start
> > > >> with?
> > > >
> > > > > Rather unlikely. The device is an AMD GPU, so we can address memory up to
> > > > > 1TB.
> > >
> > > So we probably somehow got a false positive.
> > >
> > > For now I'd like the reporter to confirm that the dma_direct_unmap_page+0x92
> > > backtrace really is in the swiotlb code (I can't think of anything else,
> > > but I'd rather be sure).
> > I'm not sure what you want me to confirm. Could you elaborate?
>
> Please open the vmlinux file for which this happened in gdb,
> then send the output from this command
>
> l *(dma_direct_unmap_page+0x92)
>
> to this thread.
My call trace contained:
Jan 10 16:34:51  kernel:  dma_direct_unmap_page+0x7a/0x80

(gdb) list *(dma_direct_unmap_page+0x7a)
0x810fa28a is in dma_direct_unmap_page (kernel/dma/direct.c:291).
286             size_t size, enum dma_data_direction dir, unsigned long attrs)
287     {
288             phys_addr_t phys = dma_to_phys(dev, addr);
289
290             if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
291                     dma_direct_sync_single_for_cpu(dev, addr, size, dir);
292
293             if (unlikely(is_swiotlb_buffer(phys)))
294                     swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
295     }


Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code

2019-01-10 Thread Sibren Vasse
On Thu, 10 Jan 2019 at 15:48, Christoph Hellwig wrote:
>
> On Thu, Jan 10, 2019 at 03:00:31PM +0100, Christian König wrote:
> >>  From the trace it looks like we hit the case where swiotlb tries
> >> to copy back data from a bounce buffer, but hits a dangling or NULL
> >> pointer.  So a couple questions for the submitter:
> >>
> >>   - does the system have more than 4GB memory and thus use swiotlb?
> >> (check /proc/meminfo, and if something SWIOTLB appears in dmesg)
> >>   - does the device this happens on have a DMA mask smaller than
> >> the available memory, that is should swiotlb be used here to start
> >> with?
> >
> > Rather unlikely. The device is an AMD GPU, so we can address memory up to
> > 1TB.
>
> So we probably somehow got a false positive.
>
> For now I'd like the reporter to confirm that the dma_direct_unmap_page+0x92
> backtrace really is in the swiotlb code (I can't think of anything else,
> but I'd rather be sure).
I'm not sure what you want me to confirm. Could you elaborate?

>
> Second it would be great to print what the contents of io_tlb_start
> and io_tlb_end are, e.g. by doing a printk_once in is_swiotlb_buffer,
> maybe that gives a clue why we are hitting the swiotlb code here.

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7c007ed7505f..042246dbae00 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -69,6 +69,7 @@ extern phys_addr_t io_tlb_start, io_tlb_end;

 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
+	printk_once(KERN_INFO "io_tlb_start: %llu, io_tlb_end: %llu", io_tlb_start, io_tlb_end);
 	return paddr >= io_tlb_start && paddr < io_tlb_end;
 }

Result on boot:
[   11.405558] io_tlb_start: 3782983680, io_tlb_end: 3850092544

Regards,

Sibren


Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code

2019-01-10 Thread Sibren Vasse
On Thu, 10 Jan 2019 at 18:06, Konrad Rzeszutek Wilk
 wrote:
>
> On Thu, Jan 10, 2019 at 04:26:43PM +0100, Sibren Vasse wrote:
> > On Thu, 10 Jan 2019 at 14:57, Christoph Hellwig wrote:
> > >
> > > On Thu, Jan 10, 2019 at 10:59:02AM +0100, Michel Dänzer wrote:
> > > >
> > > > Hi Christoph,
> > > >
> > > >
> > > > https://bugs.freedesktop.org/109234 (please ignore comments #6-#9) was
> > > > bisected to your commit 55897af63091 "dma-direct: merge swiotlb_dma_ops
> > > > into the dma_direct code". Any ideas?
> > >
> > > From the trace it looks like we hit the case where swiotlb tries
> > > to copy back data from a bounce buffer, but hits a dangling or NULL
> > > pointer.  So a couple questions for the submitter:
> > My apologies if I misunderstand something, this subject matter is new to me.
> >
> > >
> > >  - does the system have more than 4GB memory and thus use swiotlb?
> > My system has 8GB memory. The other report on the bug tracker had 16GB.
> >
> > >    (check /proc/meminfo, and if something SWIOTLB appears in dmesg)
> > /proc/meminfo: https://ptpb.pw/4rxI
> > Can I grep dmesg for a string?
>
> Can you attach the 'dmesg'?
Dmesg attached.


dmesg
Description: Binary data


Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code

2019-01-10 Thread Sibren Vasse
On Thu, 10 Jan 2019 at 14:57, Christoph Hellwig wrote:
>
> On Thu, Jan 10, 2019 at 10:59:02AM +0100, Michel Dänzer wrote:
> >
> > Hi Christoph,
> >
> >
> > https://bugs.freedesktop.org/109234 (please ignore comments #6-#9) was
> > bisected to your commit 55897af63091 "dma-direct: merge swiotlb_dma_ops
> > into the dma_direct code". Any ideas?
>
> From the trace it looks like we hit the case where swiotlb tries
> to copy back data from a bounce buffer, but hits a dangling or NULL
> pointer.  So a couple questions for the submitter:
My apologies if I misunderstand something, this subject matter is new to me.

>
>  - does the system have more than 4GB memory and thus use swiotlb?
My system has 8GB memory. The other report on the bug tracker had 16GB.

>    (check /proc/meminfo, and if something SWIOTLB appears in dmesg)
/proc/meminfo: https://ptpb.pw/4rxI
Can I grep dmesg for a string?

>  - does the device this happens on have a DMA mask smaller than
>    the available memory, that is should swiotlb be used here to start
>    with?
It's an MSI Radeon RX 570 Gaming X 4GB. The other report was a RX 580.
lshw output: https://ptpb.pw/6s0H


Regards,

Sibren