Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
..snip..
> > > I think the main question I have is how would you like to see patches for
> > > 5.15? i.e. as patches on top of devel/for-linus-5.14 or something else?
> >
> > Yes that would be perfect. If there are any dependencies on the rc1, I
> > can rebase it on top of that.
>
> Yes, please, rebasing would be very helpful. The broader rework of
> 'io_tlb_default_mem' is going to conflict quite badly otherwise.

There is a devel/for-linus-5.15 (based on v5.14-rc1) now. Thank you!

> Cheers,
>
> Will
Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On Tue, Jul 06, 2021 at 05:57:21PM +0100, Will Deacon wrote:
> On Tue, Jul 06, 2021 at 10:46:07AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 06, 2021 at 04:05:13PM +0200, Christoph Hellwig wrote:
> > > On Tue, Jul 06, 2021 at 03:01:04PM +0100, Robin Murphy wrote:
> > > > FWIW I was pondering the question of whether to do something along those
> > > > lines or just scrap the default assignment entirely, so since I hadn't got
> > > > round to saying that I've gone ahead and hacked up the alternative
> > > > (similarly untested) for comparison :)
> > > >
> > > > TBH I'm still not sure which one I prefer...
> > >
> > > Claire did implement something like your suggestion originally, but
> > > I don't really like it as it doesn't scale for adding multiple global
> > > pools, e.g. for the 64-bit addressable one for the various encrypted
> > > secure guest schemes.
> >
> > Couple of things:
> >  - I am not pushing Claire's patchset to Linus until we have a
> >    resolution on this. I hope you all agree that is a sensible way
> >    forward as much as I hate doing that.
>
> Sure, it's a pity but we could clearly use a bit more time to get these
> just right and we've run out of time for 5.14.
>
> I think the main question I have is how would you like to see patches for
> 5.15? i.e. as patches on top of devel/for-linus-5.14 or something else?

Yes that would be perfect. If there are any dependencies on the rc1, I
can rebase it on top of that.

> Cheers,
>
> Will
Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On Tue, Jul 06, 2021 at 04:05:13PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 06, 2021 at 03:01:04PM +0100, Robin Murphy wrote:
> > FWIW I was pondering the question of whether to do something along those
> > lines or just scrap the default assignment entirely, so since I hadn't got
> > round to saying that I've gone ahead and hacked up the alternative
> > (similarly untested) for comparison :)
> >
> > TBH I'm still not sure which one I prefer...
>
> Claire did implement something like your suggestion originally, but
> I don't really like it as it doesn't scale for adding multiple global
> pools, e.g. for the 64-bit addressable one for the various encrypted
> secure guest schemes.

Couple of things:

 - I am not pushing Claire's patchset to Linus until we have a
   resolution on this. I hope you all agree that is a sensible way
   forward as much as I hate doing that.

 - I like Robin's fix as it is the simplest looking. Would love to see
   if it does fix the problem.

 - Christoph - we can always add multiple pools as the next milestone
   and just focus on this feature getting tested extensively during
   this release.

 - Would it be worth (for the future, or maybe in another tiny fix) to
   also add a printk in swiotlb when we de-allocate the buffer, so that
   when someone looks through the `dmesg` it becomes much easier to
   diagnose issues?
Re: [PATCH v14 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On Thu, Jun 24, 2021 at 11:58:57PM +0800, Claire Chang wrote:
> On Thu, Jun 24, 2021 at 11:56 PM Konrad Rzeszutek Wilk wrote:
> >
> > On Thu, Jun 24, 2021 at 10:10:51AM -0400, Qian Cai wrote:
> > >
> > > On 6/24/2021 7:48 AM, Will Deacon wrote:
> > > > Ok, diff below which attempts to tackle the offset issue I mentioned as
> > > > well. Qian Cai -- please can you try with these changes?
> > >
> > > This works fine.
> >
> > Cool. Let me squash this patch in #6 and rebase the rest of them.
> >
> > Claire, could you check the devel/for-linus-5.14 say by end of today to
> > double check that I didn't mess anything up please?
>
> I just submitted v15 here
> (https://lore.kernel.org/patchwork/cover/1451322/) in case it's
> helpful.

Oh! Nice!

> I'll double check of course. Thanks for the efforts!

I ended up using your patch #6 and #7. Please double-check.
Re: [PATCH v15 00/12] Restricted DMA
On Thu, Jun 24, 2021 at 11:55:14PM +0800, Claire Chang wrote:
> This series implements mitigations for lack of DMA access control on
> systems without an IOMMU, which could result in the DMA accessing the
> system memory at unexpected times and/or unexpected addresses, possibly
> leading to data leakage or corruption.
>
> For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> not behind an IOMMU. As PCI-e, by design, gives the device full access to
> system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> full chain of exploits; [2], [3]).
>
> To mitigate the security concerns, we introduce restricted DMA. Restricted
> DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> specially allocated region and does memory allocation from the same region.
> The feature on its own provides a basic level of protection against the DMA
> overwriting buffer contents at unexpected times. However, to protect
> against general data leakage and system memory corruption, the system needs
> to provide a way to restrict the DMA to a predefined memory region (this is
> usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]).
>
> [1a] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
> [1b] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
> [2] https://blade.tencent.com/en/advisories/qualpwn/
> [3] https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
> [4] https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132
>
> v15:
> - Apply Will's diff (https://lore.kernel.org/patchwork/patch/1448957/#1647521)
>   to fix the crash reported by Qian.
> - Add Stefano's Acked-by tag for patch 01/12 from v14

That all should now be on
https://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb.git/
devel/for-linus-5.14 (and linux-next)
Re: [PATCH v14 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On Thu, Jun 24, 2021 at 10:10:51AM -0400, Qian Cai wrote:
>
> On 6/24/2021 7:48 AM, Will Deacon wrote:
> > Ok, diff below which attempts to tackle the offset issue I mentioned as
> > well. Qian Cai -- please can you try with these changes?
>
> This works fine.

Cool. Let me squash this patch in #6 and rebase the rest of them.

Claire, could you check the devel/for-linus-5.14 say by end of today to
double check that I didn't mess anything up please?

Will,

Thank you for generating the fix! I am going to run it on x86 and Xen
to make sure all is good (granted last time I ran devel/for-linus-5.14
on that setup I didn't see any errors so I need to double check I
didn't do something silly like run a wrong kernel).

> >
> > Will
> >
> > --->8
> >
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index 175b6c113ed8..39284ff2a6cd 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -116,7 +116,9 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
> >
> >  static inline bool is_swiotlb_force_bounce(struct device *dev)
> >  {
> > -	return dev->dma_io_tlb_mem->force_bounce;
> > +	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
> > +
> > +	return mem && mem->force_bounce;
> >  }
> >
> >  void __init swiotlb_exit(void);
> >
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 44be8258e27b..0ffbaae9fba2 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -449,6 +449,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
> >  		dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
> >  	unsigned int nslots = nr_slots(alloc_size), stride;
> >  	unsigned int index, wrap, count = 0, i;
> > +	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
> >  	unsigned long flags;
> >
> >  	BUG_ON(!nslots);
> > @@ -497,7 +498,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
> >  	for (i = index; i < index + nslots; i++) {
> >  		mem->slots[i].list = 0;
> >  		mem->slots[i].alloc_size =
> > -			alloc_size - ((i - index) << IO_TLB_SHIFT);
> > +			alloc_size - (offset + ((i - index) << IO_TLB_SHIFT));
> >  	}
> >  	for (i = index - 1;
> >  	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
Re: [PATCH v14 00/12] Restricted DMA
On Sat, Jun 19, 2021 at 11:40:31AM +0800, Claire Chang wrote:
> This series implements mitigations for lack of DMA access control on
> systems without an IOMMU, which could result in the DMA accessing the
> system memory at unexpected times and/or unexpected addresses, possibly
> leading to data leakage or corruption.
>
> For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> not behind an IOMMU. As PCI-e, by design, gives the device full access to
> system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> full chain of exploits; [2], [3]).
>
> To mitigate the security concerns, we introduce restricted DMA. Restricted
> DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> specially allocated region and does memory allocation from the same region.
> The feature on its own provides a basic level of protection against the DMA
> overwriting buffer contents at unexpected times. However, to protect
> against general data leakage and system memory corruption, the system needs
> to provide a way to restrict the DMA to a predefined memory region (this is
> usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]).
>
> [1a] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
> [1b] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
> [2] https://blade.tencent.com/en/advisories/qualpwn/
> [3] https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
> [4] https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132

Heya Claire,

I put all your patches on
https://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb.git/log/?h=devel/for-linus-5.14

Please double-check that they all look ok.

Thank you!
Re: [PATCH v10 03/12] swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used
On Tue, Jun 15, 2021 at 09:27:02PM +0800, Claire Chang wrote:
> Always have the pointer to the swiotlb pool used in struct device. This
> could help simplify the code for other pools.

Applying: swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used
error: patch failed: kernel/dma/swiotlb.c:339
error: kernel/dma/swiotlb.c: patch does not apply
..

Would you be OK rebasing this against devel/for-linus-5.14 please?
(And please send it out with the Reviewed-by from Christoph.)

Thank you!

> Signed-off-by: Claire Chang
> ---
>  drivers/base/core.c    | 4 ++++
>  include/linux/device.h | 4 ++++
>  kernel/dma/swiotlb.c   | 8 ++++----
>  3 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index b8a8c96dca58..eeb2d49d3aa3 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -27,6 +27,7 @@
>  #include <...>
>  #include <...>
>  #include <...>
> +#include <linux/swiotlb.h>
>  #include <...>
>  #include <...> /* for dma_default_coherent */
>
> @@ -2846,6 +2847,9 @@ void device_initialize(struct device *dev)
>      defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
>  	dev->dma_coherent = dma_default_coherent;
>  #endif
> +#ifdef CONFIG_SWIOTLB
> +	dev->dma_io_tlb_mem = io_tlb_default_mem;
> +#endif
>  }
>  EXPORT_SYMBOL_GPL(device_initialize);
>
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 4443e12238a0..2e9a378c9100 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -432,6 +432,7 @@ struct dev_links_info {
>   * @dma_pools:	Dma pools (if dma'ble device).
>   * @dma_mem:	Internal for coherent mem override.
>   * @cma_area:	Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Pointer to the swiotlb pool used. Not for driver use.
>   * @archdata:	For arch-specific additions.
>   * @of_node:	Associated device tree node.
>   * @fwnode:	Associated device node supplied by platform firmware.
> @@ -540,6 +541,9 @@ struct device {
>  #ifdef CONFIG_DMA_CMA
>  	struct cma *cma_area;		/* contiguous memory area for dma
>  					   allocations */
> +#endif
> +#ifdef CONFIG_SWIOTLB
> +	struct io_tlb_mem *dma_io_tlb_mem;
>  #endif
>  	/* arch specific additions */
>  	struct dev_archdata archdata;
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 97c6ad50fdc2..949a6bb21343 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -339,7 +339,7 @@ void __init swiotlb_exit(void)
>  static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size,
>  			   enum dma_data_direction dir)
>  {
> -	struct io_tlb_mem *mem = io_tlb_default_mem;
> +	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>  	int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
>  	phys_addr_t orig_addr = mem->slots[index].orig_addr;
>  	size_t alloc_size = mem->slots[index].alloc_size;
> @@ -421,7 +421,7 @@ static unsigned int wrap_index(struct io_tlb_mem *mem, unsigned int index)
>  static int find_slots(struct device *dev, phys_addr_t orig_addr,
>  		      size_t alloc_size)
>  {
> -	struct io_tlb_mem *mem = io_tlb_default_mem;
> +	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>  	unsigned long boundary_mask = dma_get_seg_boundary(dev);
>  	dma_addr_t tbl_dma_addr =
>  		phys_to_dma_unencrypted(dev, mem->start) & boundary_mask;
> @@ -498,7 +498,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
>  		size_t mapping_size, size_t alloc_size,
>  		enum dma_data_direction dir, unsigned long attrs)
>  {
> -	struct io_tlb_mem *mem = io_tlb_default_mem;
> +	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>  	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
>  	unsigned int i;
>  	int index;
> @@ -549,7 +549,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
>  		size_t mapping_size, enum dma_data_direction dir,
>  		unsigned long attrs)
>  {
> -	struct io_tlb_mem *mem = io_tlb_default_mem;
> +	struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem;
>  	unsigned long flags;
>  	unsigned int offset = swiotlb_align_offset(hwdev, tlb_addr);
>  	int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
> --
> 2.32.0.272.g935e593368-goog
>
Re: [PATCH v7 01/15] swiotlb: Refactor swiotlb init functions
> > do the set_memory_decrypted()+memset(). Is this okay or should
> > swiotlb_init_io_tlb_mem() add an additional argument to do this
> > conditionally?
>
> I'm actually not sure if this is okay. If not, will add an additional
> argument for it.

Any observations discovered? (Want to make sure my memory-cache has the
correct semantics for set_memory_decrypted in mind.)

> > --
> > Florian
Re: [PATCH v7 05/15] swiotlb: Add a new get_io_tlb_mem getter
On Tue, May 18, 2021 at 02:51:52PM +0800, Claire Chang wrote:
> Still keep this function because directly using dev->dma_io_tlb_mem
> will cause issues for memory allocation for existing devices. The pool
> can't support atomic coherent allocation so we need to distinguish the
> per device pool and the default pool in swiotlb_alloc.

This above should really be rolled into the commit. You can prefix it
by "The reason it was done this way was because directly using .."
Re: [PATCH v7 04/15] swiotlb: Add restricted DMA pool initialization
On Tue, May 18, 2021 at 02:48:35PM +0800, Claire Chang wrote:
> I didn't move this to a separate file because I feel it might be
> confusing for swiotlb_alloc/free (and need more functions to be
> non-static).
> Maybe instead of moving to a separate file, we can try to come up with
> a better naming?

I think you are referring to rmem_swiotlb_setup? Which is ARM specific
and inside the generic code?

Christoph wants to unify it in all the code so there is one single
source, but the "you separate arch code out from generic" saying makes
me want to move it out. I agree that if you move it out from generic
to arch-specific, we have to expose more of the swiotlb functions,
which would undo Christoph's cleanup.

How about this - let's leave it as is now, and when there are more
use-cases we can revisit it, and then, if needed, move the code?
Re: i915 and swiotlb_max_segment
On Mon, May 10, 2021 at 05:25:25PM +0200, Christoph Hellwig wrote:
> Hi all,
>
> swiotlb_max_segment is a rather strange "API" exported by swiotlb.c,
> and i915 is the only (remaining) user.
>
> swiotlb_max_segment returns 0 if swiotlb is not in use, 1 if
> SWIOTLB_FORCE is set or swiotlb-zen is set, and the swiotlb segment
> size when swiotlb is otherwise enabled.
>
> i915 then uses it to:
>
>  a) decide on the max order in i915_gem_object_get_pages_internal
>  b) decide on a max segment size in i915_sg_segment_size
>
> for a) it really seems i915 should switch to dma_alloc_noncoherent
> or dma_alloc_noncontiguous ASAP instead of using alloc_page and
> streaming DMA mappings. Any chance I could trick one of the i915
> maintainers into doing just that given that the callchain is not
> exactly trivial?
>
> For b) I'm not sure swiotlb and i915 really agree on the meaning
> of the value. swiotlb_max_segment basically returns the entire
> size of the swiotlb buffer, while i915 seems to use it to limit
> the size of each scatterlist entry. It seems like dma_max_mapping_size
> might be the best value to use here.

Yes. The background behind that was SWIOTLB would fail because, well,
the size of the sg was too large. And some way to limit it to a max
size was needed - the dma_max_mapping_size "should" be just fine.

> Once that is fixed I'd like to kill off swiotlb_max_segment as it is
> a horribly confusing API.
Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays
On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
> > So one thing that has been on my mind for a while: I'd really like
> > to kill the separate dma ops in Xen swiotlb. If we compare xen-swiotlb
> > to swiotlb the main difference seems to be:
> >
> > - additional reasons to bounce I/O vs the plain DMA capable
> > - the possibility to do a hypercall on arm/arm64
> > - an extra translation layer before doing the phys_to_dma and vice
> >   versa
> > - a special memory allocator
> >
> > I wonder if inbetween a few jump labels or other no overhead enablement
> > options and possibly better use of the dma_range_map we could kill
> > off most of swiotlb-xen instead of maintaining all this code duplication?
>
> So I looked at this a bit more.
>
> For x86 with XENFEAT_auto_translated_physmap (how common is that?)

Juergen, Boris, please correct me if I am wrong, but
XENFEAT_auto_translated_physmap only works for PVH guests?

> pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.
>
> xen_arch_need_swiotlb always returns true for x86, and
> range_straddles_page_boundary should never be true for the
> XENFEAT_auto_translated_physmap case.

Correct. The kernel should have no clue of what the real MFNs are
for PFNs.

> So as far as I can tell the mapping fast path for the
> XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.
>
> That leaves us with the next more complicated case, x86 or fully cache
> coherent arm{,64} without XENFEAT_auto_translated_physmap. In that case
> we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
> lookup, which could be done using alternatives or jump labels.
> I think if that is done right we should also be able to let that cover
> the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but
> in that worst case that would need another alternative / jump label.
>
> For non-coherent arm{,64} we'd also need to use alternatives or jump
> labels for the cache maintenance ops, but that isn't a hard problem
> either.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH RFC v1 2/6] swiotlb: convert variables to arrays
On Thu, Feb 04, 2021 at 11:49:23AM +0000, Robin Murphy wrote:
> On 2021-02-04 07:29, Christoph Hellwig wrote:
> > On Wed, Feb 03, 2021 at 03:37:05PM -0800, Dongli Zhang wrote:
> > > This patch converts several swiotlb related variables to arrays, in
> > > order to maintain stat/status for different swiotlb buffers. Here are
> > > variables involved:
> > >
> > > - io_tlb_start and io_tlb_end
> > > - io_tlb_nslabs and io_tlb_used
> > > - io_tlb_list
> > > - io_tlb_index
> > > - max_segment
> > > - io_tlb_orig_addr
> > > - no_iotlb_memory
> > >
> > > There is no functional change and this is to prepare to enable 64-bit
> > > swiotlb.
> >
> > Claire Chang (on Cc) already posted a patch like this a month ago,
> > which looks much better because it actually uses a struct instead
> > of all the random variables.
>
> Indeed, I skimmed the cover letter and immediately thought that this whole
> thing is just the restricted DMA pool concept [1] again, only from a
> slightly different angle.

Kind of. Let me lay out how some of these pieces are right now:

 +----------------+      +-----------------------+
 |                |      |                       |
 | a) Xen-SWIOTLB |      | b) SWIOTLB (for !Xen) |
 |                |      |                       |
 +--------+-------+      +-----------+-----------+
          \                         /
           \                       /
            +---------------------+
            |                     |
            | c) SWIOTLB generic  |
            |                     |
            +---------------------+

Dongli's patches modify the SWIOTLB generic c), and Xen-SWIOTLB a)
parts.

Also see the IOMMU_INIT logic which lays this out in a bit more depth
(for example how to enable SWIOTLB on AMD boxes, or IBM with Calgary
IOMMU, etc - see iommu_table.h).

Furthermore it lays the groundwork to allocate AMD SEV SWIOTLB buffers
later after boot (so that you can stitch different pools together).
All the bits are kind of inside of the SWIOTLB code. And also it
changes the Xen-SWIOTLB to do something similar.

The mempool did it similarly by taking the internal parts (aka the
various io_tlb) of SWIOTLB and exposing them out and having other code:

 +----------------+      +-----------------------+
 |                |      |                       |
 | a) Xen-SWIOTLB |      | b) SWIOTLB (for !Xen) |
 |                |      |                       |
 +--------+-------+      +-----------+-----------+
          \                         /
           \                       /
            +---------------------+      +------------------+
            |                     |      | Device tree      |
            | c) SWIOTLB generic  |<-----+ enabling SWIOTLB |
            |    mempool          |      |                  |
            +---------------------+      +------------------+

What I was suggesting to Claire was to follow the Xen model, that is
do something like this:

 +----------------+   +-----------------------+   +----------------+
 |                |   |                       |   |                |
 | a) Xen-SWIOTLB |   | b) SWIOTLB (for !Xen) |   | e) DT-SWIOTLB  |
 |                |   |                       |   |                |
 +--------+-------+   +-----------+-----------+   +-------+--------+
          \                       |                       /
           \                      |                      /
            +-------------------------------------------+
            |                                           |
            | c) SWIOTLB generic                        |
            |                                           |
            +-------------------------------------------+

so using the SWIOTLB generic parts, and then bolting the device-tree
logic on top, along with the mempool logic.

But Christoph has an interesting suggestion, which is to squash all
the existing code (a, b, c) together and pepper it with various
jump-tables. So:

 +--------------------------+
 | SWIOTLB:                 |
 |                          |
 | a) SWIOTLB (for non-Xen) |
 | b) Xen-SWIOTLB           |
 | c) DT-SWIOTLB            |
 |                          |
 +--------------------------+

with all the various bits (M2P/P2M for Xen, mempool for ARM, and
normal allocation for BM) in one big file.
Is: virtio_gpu_object_shmem_init issues? Was:Re: upstream boot error: general protection fault in swiotlb_map
On Thu, Aug 06, 2020 at 03:46:23AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    47ec5303 Merge git://git.kernel.org/pub/scm/linux/kernel/g..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16fe1dea90
> kernel config:  https://syzkaller.appspot.com/x/.config?x=7c06047f622c5724
> dashboard link: https://syzkaller.appspot.com/bug?extid=3f86afd0b1e4bf1cb64c
> compiler:       gcc (GCC) 10.1.0-syz 20200507
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3f86afd0b1e4bf1cb...@syzkaller.appspotmail.com
>
> ceph: loaded (mds proto 32)
> NET: Registered protocol family 38
> async_tx: api initialized (async)
> Key type asymmetric registered
> Asymmetric key parser 'x509' registered
> Asymmetric key parser 'pkcs8' registered
> Key type pkcs7_test registered
> Asymmetric key parser 'tpm_parser' registered
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 243)
> io scheduler mq-deadline registered
> io scheduler kyber registered
> io scheduler bfq registered
> hgafb: HGA card not detected.
> hgafb: probe of hgafb.0 failed with error -22
> usbcore: registered new interface driver udlfb
> uvesafb: failed to execute /sbin/v86d
> uvesafb: make sure that the v86d helper is installed and executable
> uvesafb: Getting VBE info block failed (eax=0x4f00, err=-2)
> uvesafb: vbe_init() failed with -22
> uvesafb: probe of uvesafb.0 failed with error -22
> vga16fb: mapped to 0x8aac772d
> Console: switching to colour frame buffer device 80x30
> fb0: VGA16 VGA frame buffer device
> input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
> ACPI: Power Button [PWRF]
> ioatdma: Intel(R) QuickData Technology Driver 5.00
> PCI Interrupt Link [GSIF] enabled at IRQ 21
> PCI Interrupt Link [GSIG] enabled at IRQ 22
> PCI Interrupt Link [GSIH] enabled at IRQ 23
> N_HDLC line discipline registered with maxframe=4096
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
> Cyclades driver 2.6
> Initializing Nozomi driver 2.1d
> RocketPort device driver module, version 2.09, 12-June-2003
> No rocketport ports found; unloading driver
> Non-volatile memory driver v1.3
> Linux agpgart interface v0.103
> [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 0
> [drm] Initialized vkms 1.0.0 20180514 for vkms on minor 1
> usbcore: registered new interface driver udl
> [drm] pci: virtio-vga detected at :00:01.0
> fb0: switching to virtiodrmfb from VGA16 VGA
> Console: switching to colour VGA+ 80x25
> virtio-pci :00:01.0: vgaarb: deactivate vga console
> Console: switching to colour dummy device 80x25
> [drm] features: -virgl +edid
> [drm] number of scanouts: 1
> [drm] number of cap sets: 0
> [drm] Initialized virtio_gpu 0.1.0 0 for virtio0 on minor 2
> general protection fault, probably for non-canonical address
> 0xdc00: [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x-0x0007]
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-syzkaller #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:swiotlb_map+0x5ac/0x700 kernel/dma/swiotlb.c:683
> Code: 28 04 00 00 48 c1 ea 03 80 3c 02 00 0f 85 4d 01 00 00 4c 8b a5 18 04 00
> 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 1e
> 01 00 00 48 8d 7d 50 4d 8b 24 24 48 b8 00 00
> RSP: :c934f3e0 EFLAGS: 00010246
> RAX: dc00 RBX:  RCX: 8162cc1d
> RDX:  RSI: 8162cc98 RDI: 88802971a470
> RBP: 88802971a048 R08: 0001 R09: 8c5dba77
> R10:  R11:  R12: 
> R13: 7ac0 R14: dc00 R15: 1000
> FS: () GS:88802ce0() knlGS:
> CS: 0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 09a8d000 CR4: 00350ef0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  dma_direct_map_page include/linux/dma-direct.h:170 [inline]
>  dma_direct_map_sg+0x3bb/0x670 kernel/dma/direct.c:368
>  dma_map_sg_attrs+0xd0/0x160 kernel/dma/mapping.c:183
>  drm_gem_shmem_get_pages_sgt drivers/gpu/drm/drm_gem_shmem_helper.c:700 [inline]
>  drm_gem_shmem_get_pages_sgt+0x1fc/0x310 drivers/gpu/drm/drm_gem_shmem_helper.c:679
>  virtio_gpu_object_shmem_init drivers/gpu/drm/virtio/virtgpu_object.c:153 [inline]
>  virtio_gpu_object_create+0x2fd/0xa70 drivers/gpu/drm/virtio/virtgpu_object.c:232
>  virtio_gpu_gem_create drivers/gpu/drm/virtio/virtgpu_gem.c:45 [inline]
>  virtio_gpu_mode_dumb_create+0x298/0x530 drivers/gpu/drm/virtio/virtgpu_gem.c:85
>
Re: [PATCH] drm: add func to better detect wether swiotlb is needed
.snip..
> > -u64 drm_get_max_iomem(void)
> > +bool drm_need_swiotlb(int dma_bits)
> > {
> >  	struct resource *tmp;
> >  	resource_size_t max_iomem = 0;
> >
> > +	/*
> > +	 * Xen paravirtual hosts require swiotlb regardless of requested dma
> > +	 * transfer size.
> > +	 *
> > +	 * NOTE: Really, what it requires is use of the dma_alloc_coherent
> > +	 *       allocator used in ttm_dma_populate() instead of
> > +	 *       ttm_populate_and_map_pages(), which bounce buffers so much in
> > +	 *       Xen it leads to swiotlb buffer exhaustion.
> > +	 */
> > +	if (xen_pv_domain())
>
> I've not been following all of the ins and outs of the discussion on this so
> apologies if I'm missing some context, but...
>
> This looks like the wrong test to me. I think it should be:
>
> 	if ( xen_swiotlb )

Ah, that could be as well.
Re: [Xen-devel] Upstream Dom0 DRM problems regarding swiotlb
On Fri, Feb 15, 2019 at 11:07:22AM -0500, Michael Labriola wrote:
> On Fri, Feb 15, 2019 at 12:57 AM Juergen Gross wrote:
> >
> > On 14/02/2019 18:57, Christoph Hellwig wrote:
> > > On Thu, Feb 14, 2019 at 07:03:38AM +0100, Juergen Gross wrote:
> > > > > The thing which is different between Xen PV guests and most others (all
> > > > > others(?), now that Lguest and UML have been dropped) is that what Linux
> > > > > thinks of as PFN $N isn't necessarily adjacent to PFN $N+1 in system
> > > > > physical address space.
> > > > >
> > > > > Therefore, code which has a buffer spanning a page boundary can't just
> > > > > convert a pointer to the buffer into a physical address, and hand that
> > > > > address to a device. You generally end up with either memory corruption
> > > > > (DMA hitting the wrong page allocated to the guest), or an IOMMU fault
> > > > > (DMA hitting a page which isn't allocated to the guest).
> > >
> > > The Linux DMA API allows for dma_map_page / dma_map_single calls to
> > > span 4k boundaries. If Xen doesn't support that it will have to bounce
> > > buffer for that case (and get horrible performance).
> > >
> > > But the latter text seems to agree with that. So what is the actual
> > > problem that started this discussion?
> >
> > See https://lists.xen.org/archives/html/xen-devel/2019-02/threads.html#00818
>
> I believe the actual problem is either:
>
> 1) The radeon/amdgpu drivers are calling ttm_populate_and_map_pages()
> which *should* work on a Xen PV host, but doesn't and needs to be
> fixed.
>
> or
>
> 2) The radeon/amdgpu drivers are calling ttm_populate_and_map_pages()
> which *cannot* work in Xen, and they should go back to calling
> ttm_dma_populate() in that case.

The Nvidia one has this (correct):

1583 #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
1584         if (swiotlb_nr_tbl()) {
1585                 return ttm_dma_populate((void *)ttm, dev, ctx);
1586         }
1587 #endif

The Radeon one has this - where now it adds 'need_swiotlb':

 695 #ifdef CONFIG_SWIOTLB
 696         if (rdev->need_swiotlb && swiotlb_nr_tbl()) {
 697                 return ttm_dma_populate(&gtt->ttm, rdev->dev, ctx);
 698         }
 699 #endif

The problem is fairly simple - the platform _requires_ use of the DMA
API. But the drivers have their own 'need_swiotlb', which ignores the
platform and sets it based on the device's DMA width:

	rdev->need_swiotlb = drm_get_max_iomem() > ((u64)1 << dma_bits);

There should be an extra check to see if the platform requires use of
the DMA API.

> I'm having a hard time figuring out which of those is correct.
>
> --
> Michael D Labriola
> 21 Rip Van Winkle Cir
> Warwick, RI 02886
> 401-316-9844 (cell)
> 401-848-8871 (work)
> 401-234-1306 (home)
Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code
On Thu, Jan 10, 2019 at 04:26:43PM +0100, Sibren Vasse wrote:
> On Thu, 10 Jan 2019 at 14:57, Christoph Hellwig wrote:
> >
> > On Thu, Jan 10, 2019 at 10:59:02AM +0100, Michel Dänzer wrote:
> > >
> > > Hi Christoph,
> > >
> > > https://bugs.freedesktop.org/109234 (please ignore comments #6-#9) was
> > > bisected to your commit 55897af63091 "dma-direct: merge swiotlb_dma_ops
> > > into the dma_direct code". Any ideas?
> >
> > From the trace it looks like we hit the case where swiotlb tries
> > to copy back data from a bounce buffer, but hits a dangling or NULL
> > pointer. So a couple questions for the submitter:
> My apologies if I misunderstand something, this subject matter is new to me.
>
> > - does the system have more than 4GB memory and thus use swiotlb?
> My system has 8GB memory. The other report on the bug tracker had 16GB.
>
> >   (check /proc/meminfo, and if something SWIOTLB appears in dmesg)
> /proc/meminfo: https://ptpb.pw/4rxI
> Can I grep dmesg for a string?

Can you attach the 'dmesg'?

> > - does the device this happens on have a DMA mask smaller than
> >   the available memory, that is should swiotlb be used here to start
> >   with?
> It's a MSI Radeon RX 570 Gaming X 4GB. The other report was a RX 580.
> lshw output: https://ptpb.pw/6s0H
>
> > Regards,
>
> Sibren
Re: [PATCH 08/44] xen-swiotlb: implement ->mapping_error
On Thu, Jun 08, 2017 at 03:25:33PM +0200, Christoph Hellwig wrote:
> DMA_ERROR_CODE is going to go away, so don't rely on it.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Re: [PATCH 07/44] xen-swiotlb: consolidate xen_swiotlb_dma_ops
On Thu, Jun 08, 2017 at 03:25:32PM +0200, Christoph Hellwig wrote:
> ARM and x86 had duplicated versions of the dma_ops structure, the
> only difference is that x86 hasn't wired up the set_dma_mask,
> mmap, and get_sgtable ops yet. On x86 all of them are identical
> to the generic version, so they aren't needed but harmless.
>
> All the symbols used only for xen_swiotlb_dma_ops can now be marked
> static as well.
>
> Signed-off-by: Christoph Hellwig <h...@lst.de>
> ---
>  arch/arm/xen/mm.c              | 17
>  arch/x86/xen/pci-swiotlb-xen.c | 14 ---
>  drivers/xen/swiotlb-xen.c      | 93 ++
>  include/xen/swiotlb-xen.h      | 62 +---
>  4 files changed, 49 insertions(+), 137 deletions(-)

Yeeey!

Reviewed-by: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Re: [PATCH 16/22] xen-blkfront: Make use of the new sg_map helper function
On Tue, Apr 18, 2017 at 09:42:20AM -0600, Logan Gunthorpe wrote:
>
> On 18/04/17 08:27 AM, Konrad Rzeszutek Wilk wrote:
> > Interesting that you didn't CC any of the maintainers. Could you
> > do that in the future please?
>
> Please read the cover letter. The distribution list for the patchset
> would have been way too large to cc every maintainer (even as limited as
> it was, I had mailing lists yelling at me). My plan was to get buy in

I am not sure if you know, but you can add on each patch the respective
maintainer via 'CC'. That way you can have certain maintainers CCed only
on the subsystems they cover.

You put it after (or before) your SoB and git send-email happily picks
it up. It does mean that for every patch you have to run something like
this:

$ more add_cc
#!/bin/bash
git diff HEAD^.. > /tmp/a
echo "---"
scripts/get_maintainer.pl --no-l /tmp/a | while read file
do
	echo "Cc: $file"
done

Or such.

> for the first patch, get it merged and resend the rest independently to
> their respective maintainers. Of course, though, I'd be open to other
> suggestions.
>
> >>>
> >>> Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
> >>> ---
> >>>  drivers/block/xen-blkfront.c | 33 +++--
> >>>  1 file changed, 27 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> >>> index 5067a0a..7dcf41d 100644
> >>> --- a/drivers/block/xen-blkfront.c
> >>> +++ b/drivers/block/xen-blkfront.c
> >>> @@ -807,8 +807,19 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
> >>>  		BUG_ON(sg->offset + sg->length > PAGE_SIZE);
> >>>
> >>>  		if (setup.need_copy) {
> >>> -			setup.bvec_off = sg->offset;
> >>> -			setup.bvec_data = kmap_atomic(sg_page(sg));
> >>> +			setup.bvec_off = 0;
> >>> +			setup.bvec_data = sg_map(sg, SG_KMAP_ATOMIC);
> >>> +			if (IS_ERR(setup.bvec_data)) {
> >>> +				/*
> >>> +				 * This should really never happen unless
> >>> +				 * the code is changed to use memory that is
> >>> +				 * not mappable in the sg. Seeing there is a
> >>> +				 * questionable error path out of here,
> >>> +				 * we WARN.
> >>> +				 */
> >>> +				WARN(1, "Non-mappable memory used in sg!");
> >>> +				return 1;
> >>> +			}
> >> ...
> >>
> >> Perhaps add a flag to mark failure as 'unexpected' and trace (and panic?)
> >> inside sg_map().
>
> Thanks, that's a good suggestion. I'll make the change for v2.
>
> Logan
Re: [PATCH 16/22] xen-blkfront: Make use of the new sg_map helper function
On Tue, Apr 18, 2017 at 02:13:59PM +, David Laight wrote:
> From: Logan Gunthorpe
> > Sent: 13 April 2017 23:05
> > Straightforward conversion to the new helper, except due to
> > the lack of error path, we have to warn if unmapable memory
> > is ever present in the sgl.

Interesting that you didn't CC any of the maintainers. Could you
do that in the future please?

> >
> > Signed-off-by: Logan Gunthorpe
> > ---
> >  drivers/block/xen-blkfront.c | 33 +++--
> >  1 file changed, 27 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> > index 5067a0a..7dcf41d 100644
> > --- a/drivers/block/xen-blkfront.c
> > +++ b/drivers/block/xen-blkfront.c
> > @@ -807,8 +807,19 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
> >  		BUG_ON(sg->offset + sg->length > PAGE_SIZE);
> >
> >  		if (setup.need_copy) {
> > -			setup.bvec_off = sg->offset;
> > -			setup.bvec_data = kmap_atomic(sg_page(sg));
> > +			setup.bvec_off = 0;
> > +			setup.bvec_data = sg_map(sg, SG_KMAP_ATOMIC);
> > +			if (IS_ERR(setup.bvec_data)) {
> > +				/*
> > +				 * This should really never happen unless
> > +				 * the code is changed to use memory that is
> > +				 * not mappable in the sg. Seeing there is a
> > +				 * questionable error path out of here,
> > +				 * we WARN.
> > +				 */
> > +				WARN(1, "Non-mappable memory used in sg!");
> > +				return 1;
> > +			}
> ...
>
> Perhaps add a flag to mark failure as 'unexpected' and trace (and panic?)
> inside sg_map().
>
> David
i915 regression in kernel 4.10
On Tue, Dec 20, 2016 at 09:42:46AM -0500, Konrad Rzeszutek Wilk wrote: > On Mon, Dec 19, 2016 at 03:16:44PM +0100, Juergen Gross wrote: > > On 19/12/16 13:29, Chris Wilson wrote: > > > On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote: > > >> With recent 4.10 kernel the graphics isn't coming up under Xen. First > > >> failure message is: > > >> > > >> [ 46.656649] i915 :00:02.0: swiotlb buffer is full (sz: 1630208 > > >> bytes) > > > > > > Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we > > > call dma_map_sg() and pass the sg to swiotlb (in this case) for > > > remapping, and we do check for an error value of 0. After that error, > > > SWIOTLB_MAP_ERROR is propagated back and converted to 0 for > > > dma_map_sg(). That looks valid, and we should report ENOMEM back to the > > > caller. > > > > > >> Later I see splats like: > > >> > > >> [ 49.393583] general protection fault: [#1] SMP > > > > > > What was the faulting address? RAX is particularly non-pointer-like so I > > > wonder if we walked onto an uninitialised portion of the sgtable. We may > > > have tripped over a bug in our sg_page iterator. > > > > During the bisect process there have been either GP or NULL pointer > > dereferences or other page faults. Typical addresses where: > > > > xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0018 > > xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 03020118 > > > > > > > > The attached patch should prevent an early ENOMEM following the swiotlb > > > allocation failure. But I suspect that we will still be tripping up the > > > failure in the sg walker when binding to the GPU. > > > -Chris > > > > > > > The patch is working not too bad. :-) > > > > Still several "swiotlb buffer is full" messages (some with sz:, most > > without), but no faults any more (neither GP nor NULL pointer > > dereference). Graphical login is working now. > > > I think I know why. 
> The optimization that was added assumes that
> bus addresses is the same as physical address. Hence it packs all
> of the virtual addresses in the sg, and hands it off to SWIOTLB
> which walks each one and realizes that it has to use the bounce
> buffer.
>
> I am wondering if it would make sense to pull 'swiotlb_max_size' inside
> of SWIOTLB and make it a library-ish - so Xen-SWIOTLB can register
> as well and report, say, that it can only provide one page
> (unless it is running on bare metal).
>
> Or make the usage of 'max_segment' and 'page_to_pfn(page) != last_pfn + 1'
> in i915_gem_object_get_pages_gtt use something similar to
> xen_biovec_phys_mergeable?

Chris, I was thinking of something like this (which Juergen has already
tested).

From f196f1294fd25f1402c3dd1231babb8d7f5db2e7 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Date: Tue, 20 Dec 2016 10:02:02 -0500
Subject: [PATCH] swiotlb: Export swiotlb_max_segment to users

So they can figure out what is the optimal number of pages that
can be contiguously stitched together without fear of bounce
buffer.

We also expose a mechanism for sub-users of the SWIOTLB API, such
as Xen-SWIOTLB, to set the max segment value. And lastly if
swiotlb=force is set (which mandates we bounce buffer everything)
we set max_segment so at least we can bounce buffer one 4K page
instead of a giant 512KB one for which we may not have space.
Reported-and-Tested-by: Juergen Gross
Signed-off-by: Konrad Rzeszutek Wilk
---
 drivers/gpu/drm/i915/i915_gem.c | 11 +--
 drivers/xen/swiotlb-xen.c       |  4
 include/linux/swiotlb.h         |  3 +++
 lib/swiotlb.c                   | 26 ++
 4 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d0dcaf3..115fa39 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2290,15 +2290,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	mutex_unlock(&obj->mm.lock);
 }
 
-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
-	return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
-	return 0;
-#endif
-}
-
 static void i915_sg_trim(struct sg_table *orig_st)
 {
 	struct sg_table new_st;
@@ -2345,7 +2336,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
-
i915 regression in kernel 4.10
On Mon, Dec 19, 2016 at 03:16:44PM +0100, Juergen Gross wrote: > On 19/12/16 13:29, Chris Wilson wrote: > > On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote: > >> With recent 4.10 kernel the graphics isn't coming up under Xen. First > >> failure message is: > >> > >> [ 46.656649] i915 :00:02.0: swiotlb buffer is full (sz: 1630208 > >> bytes) > > > > Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we > > call dma_map_sg() and pass the sg to swiotlb (in this case) for > > remapping, and we do check for an error value of 0. After that error, > > SWIOTLB_MAP_ERROR is propagated back and converted to 0 for > > dma_map_sg(). That looks valid, and we should report ENOMEM back to the > > caller. > > > >> Later I see splats like: > >> > >> [ 49.393583] general protection fault: [#1] SMP > > > > What was the faulting address? RAX is particularly non-pointer-like so I > > wonder if we walked onto an uninitialised portion of the sgtable. We may > > have tripped over a bug in our sg_page iterator. > > During the bisect process there have been either GP or NULL pointer > dereferences or other page faults. Typical addresses where: > > xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0018 > xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 03020118 > > > > > The attached patch should prevent an early ENOMEM following the swiotlb > > allocation failure. But I suspect that we will still be tripping up the > > failure in the sg walker when binding to the GPU. > > -Chris > > > > The patch is working not too bad. :-) > > Still several "swiotlb buffer is full" messages (some with sz:, most > without), but no faults any more (neither GP nor NULL pointer > dereference). Graphical login is working now. I think I know why. The optimization that was added assumes that bus addresses is the same as physical address. 
Hence it packs all of the virtual addresses in the sg, and hands it off to SWIOTLB which walks each one and realizes that it has to use the bounce buffer. I am wondering if it would make sense to pull 'swiotlb_max_size' inside of SWIOTLB and make it a library-ish - so Xen-SWIOTLB can register as well and report, say, that it can only provide one page (unless it is running on bare metal). Or make the usage of 'max_segment' and 'page_to_pfn(page) != last_pfn + 1' in i915_gem_object_get_pages_gtt use something similar to xen_biovec_phys_mergeable?
[RFC v2] dma-mapping: Use unsigned long for dma_attrs
On Mon, May 30, 2016 at 01:54:06PM +0200, Krzysztof Kozlowski wrote:
> The dma-mapping core and the implementations do not change the
> DMA attributes passed by pointer. Thus the pointer can point to const
> data. However the attributes do not have to be a bitfield. Instead
> unsigned long will do fine:
>
> 1. This is just simpler. Both in terms of reading the code and setting
>    attributes. Instead of initializing local attributes on the stack and
>    passing a pointer to it to dma_set_attr(), just set the bits.
>
> 2. It brings safeness and checking for const correctness because the
>    attributes are passed by value.

.. why not go the next step and do an enum?

Perhaps that should be mentioned as part of the description?

Thanks.

> Please have in mind that this is RFC, not finished yet. Only ARM and
> ARM64 are fixed (and not everywhere).
> However other API users also have to be converted which is quite
> intrusive. I would rather avoid it until the overall approach is
> accepted.
>
> Signed-off-by: Krzysztof Kozlowski
> ---
>  Documentation/DMA-API.txt                 |   2 +-
>  Documentation/DMA-attributes.txt          |   2 +-
>  arch/arm/include/asm/dma-mapping.h        |  13 ++--
>  arch/arm/include/asm/xen/page-coherent.h  |  16 ++---
>  arch/arm/mm/dma-mapping.c                 |  82 +++
>  arch/arm/xen/mm.c                         |   4 +-
>  arch/arm64/mm/dma-mapping.c               |  57
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   2 +-
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c   |   1 -
>  drivers/gpu/drm/exynos/exynos_drm_gem.c   |  20 +++---
>  drivers/gpu/drm/exynos/exynos_drm_gem.h   |   2 +-
>  drivers/iommu/dma-iommu.c                 |   6 +-
>  drivers/xen/swiotlb-xen.c                 |  14 ++--
>  include/linux/dma-attrs.h                 |  71
>  include/linux/dma-iommu.h                 |   6 +-
>  include/linux/dma-mapping.h               | 105 +-
>  include/linux/swiotlb.h                   |  10 +--
>  include/xen/swiotlb-xen.h                 |  12 ++--
>  lib/dma-noop.c                            |   9 +--
>  lib/swiotlb.c                             |  13 ++--
>  20 files changed, 195 insertions(+), 252 deletions(-)
>  delete mode 100644 include/linux/dma-attrs.h
>
> diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
> index 45ef3f279c3b..0b55cb7c5aaa 100644
> --- a/Documentation/DMA-API.txt
> +++ b/Documentation/DMA-API.txt
> @@ -391,7 +391,7 @@ without the _attrs suffixes, except that they pass an optional
>  struct dma_attrs*.
>
>  struct dma_attrs encapsulates a set of "DMA attributes". For the
> -definition of struct dma_attrs see linux/dma-attrs.h.
> +definition of struct dma_attrs see linux/dma-mapping.h.
>
>  The interpretation of DMA attributes is architecture-specific, and
>  each attribute should be documented in Documentation/DMA-attributes.txt.
> diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
> index e8cf9cf873b3..2d455a5cf671 100644
> --- a/Documentation/DMA-attributes.txt
> +++ b/Documentation/DMA-attributes.txt
> @@ -2,7 +2,7 @@
>  ==
>
>  This document describes the semantics of the DMA attributes that are
> -defined in linux/dma-attrs.h.
> +defined in linux/dma-mapping.h.
>
>  DMA_ATTR_WRITE_BARRIER
>  --
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index a83570f10124..d009f7911ffc 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -5,7 +5,6 @@
>
>  #include
>  #include
> -#include
>  #include
>
>  #include
> @@ -174,7 +173,7 @@ static inline void dma_mark_clean(void *addr, size_t size) { }
>   * to be the device-viewed address.
>   */
>  extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> -			   gfp_t gfp, struct dma_attrs *attrs);
> +			   gfp_t gfp, unsigned long attrs);
>
>  /**
>   * arm_dma_free - free memory allocated by arm_dma_alloc
> @@ -191,7 +190,7 @@ extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
>   * during and after this call executing are illegal.
>   */
>  extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
> -			 dma_addr_t handle, struct dma_attrs *attrs);
> +			 dma_addr_t handle, unsigned long attrs);
>
>  /**
>   * arm_dma_mmap - map a coherent DMA allocation into user space
> @@ -208,7 +207,7 @@ extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   */
>  extern int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>  			void *cpu_addr, dma_addr_t dma_addr, size_t size,
> -			struct dma_attrs *attrs);
> +			unsigned long attrs);
>
>  /*
>   * This
[PATCH 1/6] swiotlb: Add helper to know if it is in use for a specific device.
On Wed, Aug 26, 2015 at 04:31:50PM -0400, Jerome Glisse wrote: > On Wed, Aug 26, 2015 at 03:44:52PM -0400, Konrad Rzeszutek Wilk wrote: > > On Wed, Aug 26, 2015 at 03:26:42PM -0400, Jerome Glisse wrote: > > > On Wed, Aug 26, 2015 at 03:02:31PM -0400, Konrad Rzeszutek Wilk wrote: > > > > On Wed, Aug 26, 2015 at 02:52:02PM -0400, jglisse at redhat.com wrote: > > > > > From: Jérôme Glisse > > > > > > > > > > Some device like GPU do things differently if swiotlb is in use. We > > > > > use to rely on swiotlb_nr_tbl() to know if swiotlb was enabled or not > > > > > but this is unreliable. Patch add a simple helpers to check if any of > > > > > > > > Why is it unreliable? > > > > > > Alex reported on irc that swiotlb_nr_tbl() returns non zero even if > > > swiotlb > > > is disabled. This seems to be due to > > > ac2cbab21f318e19bc176a7f38a120cec835220f > > > which cleanup swiotlb init and always allocate default size. Which i > > > believe > > > is a waste of memory. So we need to add a real helper to know if swiotlb > > > is > > > in use or not and we should not rely on expectation of some swiotlb value. > > > > Ah right, that patch. That should have been part of the description > > I believe. > > > > > > > > > > > > > > the dma_ops associated with a device points to the swiotlb functions, > > > > > making swiotlb check reliable for a device. 
> > > > > > > > > > Signed-off-by: Jérôme Glisse > > > > > Cc: Konrad Rzeszutek Wilk > > > > > Cc: Alex Deucher > > > > > Cc: Ben Skeggs > > > > > Cc: Dave Airlie > > > > > Cc: lkml at vger.kernel.org > > > > > Cc: Daniel Vetter > > > > > --- > > > > > include/linux/dma-mapping.h | 18 ++ > > > > > 1 file changed, 18 insertions(+) > > > > > > > > > > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h > > > > > index ac07ff0..eac911e 100644 > > > > > --- a/include/linux/dma-mapping.h > > > > > +++ b/include/linux/dma-mapping.h > > > > > @@ -314,4 +314,22 @@ static inline int dma_mmap_writecombine(struct > > > > > device *dev, > > > > > #define dma_unmap_len_set(PTR, LEN_NAME, VAL)do { } while (0) > > > > > #endif > > > > > > > > > > + > > > > > +#ifdef CONFIG_SWIOTLB > > > > > +static inline bool swiotlb_in_use(struct device *dev) > > > > > +{ > > > > > + struct dma_map_ops *ops = get_dma_ops(dev); > > > > > + > > > > > + return (ops->map_sg == swiotlb_map_sg_attrs || > > > > > + ops->unmap_sg == swiotlb_unmap_sg_attrs || > > > > > + ops->map_page == swiotlb_map_page); > > > > > > > > That won't work. What if we use xen-swiotlb which has different function > > > > names? > > > > > > I didn't thought about xen, always doing things differently, i think xen > > > is > > > just a matter of also testing for the xen function. I just wanted to have > > > the helper in common code and only rely on common things, instead of > > > having > > > to add a per arch helper. > > > > There has to be a better way. Perhaps you can expand SWIOTLB to actually > > check if it is in use? > > This would require per arch modifications which is what i was trying to avoid. How? If you modify 'swiotlb_nr_tbl' to return true only if it has been used then the modifications are only in the lib/swiotlb.c ? > > Cheers, > Jérôme
[PATCH 1/6] swiotlb: Add helper to know if it is in use for a specific device.
On Wed, Aug 26, 2015 at 03:26:42PM -0400, Jerome Glisse wrote: > On Wed, Aug 26, 2015 at 03:02:31PM -0400, Konrad Rzeszutek Wilk wrote: > > On Wed, Aug 26, 2015 at 02:52:02PM -0400, jglisse at redhat.com wrote: > > > From: Jérôme Glisse > > > > > > Some device like GPU do things differently if swiotlb is in use. We > > > use to rely on swiotlb_nr_tbl() to know if swiotlb was enabled or not > > > but this is unreliable. Patch add a simple helpers to check if any of > > > > Why is it unreliable? > > Alex reported on irc that swiotlb_nr_tbl() returns non zero even if swiotlb > is disabled. This seems to be due to ac2cbab21f318e19bc176a7f38a120cec835220f > which cleanup swiotlb init and always allocate default size. Which i believe > is a waste of memory. So we need to add a real helper to know if swiotlb is > in use or not and we should not rely on expectation of some swiotlb value. Ah right, that patch. That should have been part of the description I believe. > > > > > > the dma_ops associated with a device points to the swiotlb functions, > > > making swiotlb check reliable for a device. 
> > > > > > Signed-off-by: Jérôme Glisse > > > Cc: Konrad Rzeszutek Wilk > > > Cc: Alex Deucher > > > Cc: Ben Skeggs > > > Cc: Dave Airlie > > > Cc: lkml at vger.kernel.org > > > Cc: Daniel Vetter > > > --- > > > include/linux/dma-mapping.h | 18 ++ > > > 1 file changed, 18 insertions(+) > > > > > > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h > > > index ac07ff0..eac911e 100644 > > > --- a/include/linux/dma-mapping.h > > > +++ b/include/linux/dma-mapping.h > > > @@ -314,4 +314,22 @@ static inline int dma_mmap_writecombine(struct > > > device *dev, > > > #define dma_unmap_len_set(PTR, LEN_NAME, VAL)do { } while (0) > > > #endif > > > > > > + > > > +#ifdef CONFIG_SWIOTLB > > > +static inline bool swiotlb_in_use(struct device *dev) > > > +{ > > > + struct dma_map_ops *ops = get_dma_ops(dev); > > > + > > > + return (ops->map_sg == swiotlb_map_sg_attrs || > > > + ops->unmap_sg == swiotlb_unmap_sg_attrs || > > > + ops->map_page == swiotlb_map_page); > > > > That won't work. What if we use xen-swiotlb which has different function > > names? > > I didn't thought about xen, always doing things differently, i think xen is > just a matter of also testing for the xen function. I just wanted to have > the helper in common code and only rely on common things, instead of having > to add a per arch helper. There has to be a better way. Perhaps you can expand SWIOTLB to actually check if it is in use? > > Cheers, > Jérôme
[PATCH 1/6] swiotlb: Add helper to know if it is in use for a specific device.
On Wed, Aug 26, 2015 at 02:52:02PM -0400, jglisse at redhat.com wrote:
> From: Jérôme Glisse
>
> Some devices like GPUs do things differently if swiotlb is in use. We
> used to rely on swiotlb_nr_tbl() to know if swiotlb was enabled or not
> but this is unreliable. This patch adds a simple helper to check if any of

Why is it unreliable?

> the dma_ops associated with a device points to the swiotlb functions,
> making the swiotlb check reliable for a device.
>
> Signed-off-by: Jérôme Glisse
> Cc: Konrad Rzeszutek Wilk
> Cc: Alex Deucher
> Cc: Ben Skeggs
> Cc: Dave Airlie
> Cc: lkml at vger.kernel.org
> Cc: Daniel Vetter
> ---
>  include/linux/dma-mapping.h | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index ac07ff0..eac911e 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -314,4 +314,22 @@ static inline int dma_mmap_writecombine(struct device *dev,
>  #define dma_unmap_len_set(PTR, LEN_NAME, VAL)	do { } while (0)
>  #endif
>
> +
> +#ifdef CONFIG_SWIOTLB
> +static inline bool swiotlb_in_use(struct device *dev)
> +{
> +	struct dma_map_ops *ops = get_dma_ops(dev);
> +
> +	return (ops->map_sg == swiotlb_map_sg_attrs ||
> +		ops->unmap_sg == swiotlb_unmap_sg_attrs ||
> +		ops->map_page == swiotlb_map_page);

That won't work. What if we use xen-swiotlb which has different function
names?

> +}
> +#else
> +static inline bool swiotlb_in_use(struct device *dev)
> +{
> +	return false;
> +}
> +#endif
> +
> +
>  #endif
> --
> 2.1.0
>
[PATCH 1/2] drm/ttm: fix object deallocation to properly fill in the page pool.
On Wed, Jul 08, 2015 at 02:16:36PM -0400, j.glisse at gmail.com wrote:
> From: Jérôme Glisse
>
> Current code never allowed the page pool to actually fill in any way.
> This fixes it, so that we only start freeing pages from the pool when
> we go over the pool size.
>
> Signed-off-by: Jérôme Glisse
> Reviewed-by: Mario Kleiner
> Tested-by: Michel Dänzer
> Cc: Thomas Hellstrom
> Cc: Konrad Rzeszutek Wilk

Reviewed-by: Konrad Rzeszutek Wilk

> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index c96db43..0194a93 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -953,14 +953,8 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  	} else {
>  		pool->npages_free += count;
>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> -		npages = count;
> -		if (pool->npages_free > _manager->options.max_size) {
> +		if (pool->npages_free > _manager->options.max_size)
>  			npages = pool->npages_free - _manager->options.max_size;
> -			/* free at least NUM_PAGES_TO_ALLOC number of pages
> -			 * to reduce calls to set_memory_wb */
> -			if (npages < NUM_PAGES_TO_ALLOC)
> -				npages = NUM_PAGES_TO_ALLOC;
> -		}
>  	}
>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
>
> --
> 1.8.3.1
>
[PATCH 2/2] drm/ttm: improve uncached page deallocation.
On Wed, Jul 08, 2015 at 02:16:37PM -0400, j.glisse at gmail.com wrote:
> From: Jérôme Glisse
>
> Calls to set_memory_wb() incur heavy TLB flush and IPI cost. To
> minimize those, wait until the pool grows beyond the batch size
> before draining the pool.
>
> Signed-off-by: Jérôme Glisse
> Reviewed-by: Mario Kleiner
> Cc: Michel Dänzer
> Cc: Thomas Hellstrom
> Cc: Konrad Rzeszutek Wilk

Reviewed-by: Konrad Rzeszutek Wilk

> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index 0194a93..8028dd6 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -953,7 +953,12 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  	} else {
>  		pool->npages_free += count;
>  		list_splice(&ttm_dma->pages_list, &pool->free_list);
> -		if (pool->npages_free > _manager->options.max_size)
> +		/*
> +		 * Wait to have at least NUM_PAGES_TO_ALLOC number of pages
> +		 * to free in order to minimize calls to set_memory_wb().
> +		 */
> +		if (pool->npages_free >= (_manager->options.max_size +
> +					  NUM_PAGES_TO_ALLOC))
>  			npages = pool->npages_free - _manager->options.max_size;
>  	}
>  	spin_unlock_irqrestore(&pool->lock, irq_flags);
> --
> 1.8.3.1
>
[PATCH] drm/ttm: dma: Don't crash on memory in the vmalloc range
On Tue, May 19, 2015 at 08:33:59PM +0900, Alexandre Courbot wrote: > On 05/16/2015 04:55 AM, Konrad Rzeszutek Wilk wrote: > >On Fri, May 15, 2015 at 04:09:54PM +0900, Alexandre Courbot wrote: > >>dma_alloc_coherent() can return memory in the vmalloc range. > >>virt_to_page() cannot handle such addresses and crashes. This > >>patch detects such cases and obtains the struct page * using > >>vmalloc_to_page() instead. > >> > >>Signed-off-by: Alexandre Courbot > >>--- > >>This patch is a follow-up of the following discussion: > >> > >>https://www.marc.info/?l=dri-devel=141579595431254=3 > >> > >>It works for me on both 32-bit and 64-bit Tegra, so I am not convinced > >>that Thierry's initial change from virt_to_page() to phys_to_page() is > >>still required - Thierry, can you confirm whether your patch is still > >>relevant after this one? > >> > >> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++--- > >> 1 file changed, 6 insertions(+), 3 deletions(-) > >> > >>diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > >>b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > >>index 01e1d27eb078..3077f1554099 100644 > >>--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > >>+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > >>@@ -342,9 +342,12 @@ static struct dma_page *__ttm_dma_alloc_page(struct > >>dma_pool *pool) > >>d_page->vaddr = dma_alloc_coherent(pool->dev, pool->size, > >> _page->dma, > >> pool->gfp_flags); > >>- if (d_page->vaddr) > >>- d_page->p = virt_to_page(d_page->vaddr); > >>- else { > >>+ if (d_page->vaddr) { > >>+ if (is_vmalloc_addr(d_page->vaddr)) > >>+ d_page->p = vmalloc_to_page(d_page->vaddr); > >>+ else > >>+ d_page->p = virt_to_page(d_page->vaddr); > >>+ } else { > > > > > >Looks OK to me. > > Thanks guys. Could we translate these approvals into Acked-bys/Reviewed-bys > so Dave (?) can merge this patch? Acked-by: Konrad Rzeszutek Wilk
[PATCH] drm/ttm: dma: Don't crash on memory in the vmalloc range
On Fri, May 15, 2015 at 04:09:54PM +0900, Alexandre Courbot wrote:
> dma_alloc_coherent() can return memory in the vmalloc range.
> virt_to_page() cannot handle such addresses and crashes. This
> patch detects such cases and obtains the struct page * using
> vmalloc_to_page() instead.
>
> Signed-off-by: Alexandre Courbot
> ---
> This patch is a follow-up of the following discussion:
>
> https://www.marc.info/?l=dri-devel=141579595431254=3
>
> It works for me on both 32-bit and 64-bit Tegra, so I am not convinced
> that Thierry's initial change from virt_to_page() to phys_to_page() is
> still required - Thierry, can you confirm whether your patch is still
> relevant after this one?
>
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index 01e1d27eb078..3077f1554099 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -342,9 +342,12 @@ static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
>  	d_page->vaddr = dma_alloc_coherent(pool->dev, pool->size,
>  					   &d_page->dma,
>  					   pool->gfp_flags);
> -	if (d_page->vaddr)
> -		d_page->p = virt_to_page(d_page->vaddr);
> -	else {
> +	if (d_page->vaddr) {
> +		if (is_vmalloc_addr(d_page->vaddr))
> +			d_page->p = vmalloc_to_page(d_page->vaddr);
> +		else
> +			d_page->p = virt_to_page(d_page->vaddr);
> +	} else {

Looks OK to me.

>  		kfree(d_page);
>  		d_page = NULL;
>  	}
> --
> 2.4.0
>
[PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
On Tue, Aug 12, 2014 at 11:52:05PM -0400, Jérôme Glisse wrote: > From: Jérôme Glisse > > Current code never allowed the page pool to actually fill in anyway. This fixes > it and also allows it to grow over its limit until it grows beyond the batch > size for allocation and deallocation. > > Signed-off-by: Jérôme Glisse > Reviewed-by: Mario Kleiner > Tested-by: Michel Dänzer > Cc: Thomas Hellstrom > Cc: Konrad Rzeszutek Wilk Reviewed-by: Konrad Rzeszutek Wilk > --- > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > index c96db43..a076ff3 100644 > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, > struct device *dev) > } else { > pool->npages_free += count; > list_splice(&ttm_dma->pages_list, &pool->free_list); > - npages = count; > - if (pool->npages_free > _manager->options.max_size) { > + if (pool->npages_free >= (_manager->options.max_size + > + NUM_PAGES_TO_ALLOC)) > npages = pool->npages_free - _manager->options.max_size; > - /* free at least NUM_PAGES_TO_ALLOC number of pages > - * to reduce calls to set_memory_wb */ > - if (npages < NUM_PAGES_TO_ALLOC) > - npages = NUM_PAGES_TO_ALLOC; > - } > } > spin_unlock_irqrestore(&pool->lock, irq_flags); > > -- > 1.9.3 >
[RFC] drm/ttm: dma: Fixes for 32-bit and 64-bit ARM
On Wed, Nov 12, 2014 at 06:03:49PM +0100, Arnd Bergmann wrote: > On Wednesday 12 November 2014 09:18:59 Konrad Rzeszutek Wilk wrote: > > On Wed, Nov 12, 2014 at 01:39:05PM +0100, Thierry Reding wrote: > > > From: Thierry Reding > > > > > > dma_alloc_coherent() returns a kernel virtual address that is part of > > > the linear range. Passing such an address to virt_to_page() is illegal > > > on non-coherent architectures. This causes the kernel to oops on 64-bit > > > ARM because the struct page * obtained from virt_to_page() points to > > > unmapped memory. > > > > Oh! That is not good! > > > I think what Thierry meant is that the returned pointer is /not/ in the > linear range. > > > > Until that time, this temporary fix will allow TTM to work on 32-bit > > > and 64-bit ARM as well, provided that no IOMMU translations are enabled > > > for the GPU. > > > > Is there a way to query the 'struct device' to see if the IOMMU translation > > is enabled/disabled for said device? ? > > > > Now your patch looks to get the 'struct page' by doing some form of > > translation. Could you explain to me which type of memory have a 'struct > > page' > > and which ones do not ? > > > > It is OK if you explain this in nauseating details > > Basically there are two types of memory that have a struct page: > > - directly mapped cacheable memory, i.e. anything that can be accessed > through a kernel pointer without having to go though ioremap/vmalloc/... > > - highmem pages on 32-bit system. > > On noncoherent ARM systems, dma_alloc_coherent will return memory that > is was unmapped from the linear range to avoid having both cacheable and > noncachable mappings for the same page. > > Arnd
[RFC] drm/ttm: dma: Fixes for 32-bit and 64-bit ARM
On Wed, Nov 12, 2014 at 01:39:05PM +0100, Thierry Reding wrote: > From: Thierry Reding > > dma_alloc_coherent() returns a kernel virtual address that is part of > the linear range. Passing such an address to virt_to_page() is illegal > on non-coherent architectures. This causes the kernel to oops on 64-bit > ARM because the struct page * obtained from virt_to_page() points to > unmapped memory. Oh! That is not good! > > This commit fixes this by using phys_to_page() since we get a physical > address from dma_alloc_coherent(). Note that this is not a proper fix > because if an IOMMU is set up to translate addresses for the GPU this > address will be an I/O virtual address rather than a physical one. The > proper fix probably involves not getting a pointer to the struct page > in the first place, but that would be a much more intrusive change, if > at all possible. What type of caching types are there on ARM? We use the 'struct page' on the set_pages_to_[wc|uc|wb] but all of those are X86 specfic. But I think you could by passing the 'struct dma_page' instead of 'struct page' (and the array uses) around. That should solve the touching of 'struct page' and we can treat it as an opaque type. > > Until that time, this temporary fix will allow TTM to work on 32-bit > and 64-bit ARM as well, provided that no IOMMU translations are enabled > for the GPU. Is there a way to query the 'struct device' to see if the IOMMU translation is enabled/disabled for said device? Now your patch looks to get the 'struct page' by doing some form of translation. Could you explain to me which type of memory have a 'struct page' and which ones do not ? It is OK if you explain this in nauseating details :-) > > Signed-off-by: Thierry Reding > --- > Arnd, I realize that this isn't a proper fix according to what we discussed on > IRC yesterday, but I can't see a way to remove access to the pages array that > would be as simple as this. 
I've marked this as RFC in the hope that it will > trigger some discussion that will lead to a proper solution. > > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > index c96db433f8af..d7993985752c 100644 > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > @@ -343,7 +343,11 @@ static struct dma_page *__ttm_dma_alloc_page(struct > dma_pool *pool) > _page->dma, > pool->gfp_flags); > if (d_page->vaddr) > +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64) > + d_page->p = phys_to_page(d_page->dma); > +#else > d_page->p = virt_to_page(d_page->vaddr); > +#endif > else { > kfree(d_page); > d_page = NULL; > -- > 2.1.3 >
CONFIG_DMA_CMA causes ttm performance problems/hangs.
On Tue, Aug 12, 2014 at 02:12:07PM +0200, Mario Kleiner wrote: > On 08/11/2014 05:17 PM, Jerome Glisse wrote: > >On Mon, Aug 11, 2014 at 12:11:21PM +0200, Thomas Hellstrom wrote: > >>On 08/10/2014 08:02 PM, Mario Kleiner wrote: > >>>On 08/10/2014 01:03 PM, Thomas Hellstrom wrote: > >>>>On 08/10/2014 05:11 AM, Mario Kleiner wrote: > >>>>>Resent this time without HTML formatting which lkml doesn't like. > >>>>>Sorry. > >>>>> > >>>>>On 08/09/2014 03:58 PM, Thomas Hellstrom wrote: > >>>>>>On 08/09/2014 03:33 PM, Konrad Rzeszutek Wilk wrote: > >>>>>>>On August 9, 2014 1:39:39 AM EDT, Thomas > >>>>>>>Hellstrom wrote: > >>>>>>>>Hi. > >>>>>>>> > >>>>>>>Hey Thomas! > >>>>>>> > >>>>>>>>IIRC I don't think the TTM DMA pool allocates coherent pages more > >>>>>>>>than > >>>>>>>>one page at a time, and _if that's true_ it's pretty unnecessary for > >>>>>>>>the > >>>>>>>>dma subsystem to route those allocations to CMA. Maybe Konrad could > >>>>>>>>shed > >>>>>>>>some light over this? > >>>>>>>It should allocate in batches and keep them in the TTM DMA pool for > >>>>>>>some time to be reused. > >>>>>>> > >>>>>>>The pages that it gets are in 4kb granularity though. > >>>>>>Then I feel inclined to say this is a DMA subsystem bug. Single page > >>>>>>allocations shouldn't get routed to CMA. > >>>>>> > >>>>>>/Thomas > >>>>>Yes, seems you're both right. I read through the code a bit more and > >>>>>indeed the TTM DMA pool allocates only one page during each > >>>>>dma_alloc_coherent() call, so it doesn't need CMA memory. The current > >>>>>allocators don't check for single page CMA allocations and therefore > >>>>>try to get it from the CMA area anyway, instead of skipping to the > >>>>>much cheaper fallback. > >>>>> > >>>>>So the callers of dma_alloc_from_contiguous() could need that little > >>>>>optimization of skipping it if only one page is requested. 
For > >>>>> > >>>>>dma_generic_alloc_coherent > >>>>><http://lxr.free-electrons.com/ident?i=dma_generic_alloc_coherent> > >>>>> > >>>>>and intel_alloc_coherent > >>>>><http://lxr.free-electrons.com/ident?i=intel_alloc_coherent> > >>>>>this > >>>>>seems easy to do. Looking at the arm arch variants, e.g., > >>>>> > >>>>>http://lxr.free-electrons.com/source/arch/arm/mm/dma-mapping.c#L1194 > >>>>> > >>>>> > >>>>>and > >>>>> > >>>>>http://lxr.free-electrons.com/source/arch/arm64/mm/dma-mapping.c#L44 > >>>>> > >>>>> > >>>>>i'm not sure if it is that easily done, as there aren't any fallbacks > >>>>>for such a case and the code looks to me as if that's at least > >>>>>somewhat intentional. > >>>>> > >>>>>As far as TTM goes, one quick one-line fix to prevent it from using > >>>>>the CMA at least on SWIOTLB, NOMMU and Intel IOMMU (when using the > >>>>>above metho
CONFIG_DMA_CMA causes ttm performance problems/hangs.
On August 9, 2014 1:39:39 AM EDT, Thomas Hellstrom wrote: >Hi. > Hey Thomas! >IIRC I don't think the TTM DMA pool allocates coherent pages more than >one page at a time, and _if that's true_ it's pretty unnecessary for >the >dma subsystem to route those allocations to CMA. Maybe Konrad could >shed >some light over this? It should allocate in batches and keep them in the TTM DMA pool for some time to be reused. The pages that it gets are in 4kb granularity though. > >/Thomas > > >On 08/08/2014 07:42 PM, Mario Kleiner wrote: >> Hi all, >> >> there is a rather severe performance problem i accidentally found >when >> trying to give Linux 3.16.0 a final test on a x86_64 MacBookPro under >> Ubuntu 14.04 LTS with nouveau as graphics driver. >> >> I was lazy and just installed the Ubuntu precompiled mainline kernel. >> That kernel happens to have CONFIG_DMA_CMA=y set, with a default CMA >> (contiguous memory allocator) size of 64 MB. Older Ubuntu kernels >> weren't compiled with CMA, so i only observed this on 3.16, but >> previous kernels would likely be affected too. >> >> After a few minutes of regular desktop use like switching workspaces, >> scrolling text in a terminal window, Firefox with multiple tabs open, >> Thunderbird etc. (tested with KDE/Kwin, with/without desktop >> composition), i get chunky desktop updates, then multi-second >freezes, >> after a few minutes the desktop hangs for over a minute on almost any >> GUI action like switching windows etc. --> Unuseable. >> >> ftrace'ing shows the culprit being this callchain (typical good/bad >> example ftrace snippets at the end of this mail): >> >> ...ttm dma coherent memory allocations, e.g., from >> __ttm_dma_alloc_page() ... --> dma_alloc_coherent() --> platform >> specific hooks ... 
-> dma_generic_alloc_coherent() [on x86_64] --> >> dma_alloc_from_contiguous() >> >> dma_alloc_from_contiguous() is a no-op without CONFIG_DMA_CMA, or >when >> the machine is booted with kernel boot cmdline parameter "cma=0", so >> it triggers the fast alloc_pages_node() fallback at least on x86_64. >> >> With CMA, this function becomes progressively more slow with every >> minute of desktop use, e.g., runtimes going up from < 0.3 usecs to >> hundreds or thousands of microseconds (before it gives up and >> alloc_pages_node() fallback is used), so this causes the >> multi-second/minute hangs of the desktop. >> >> So it seems ttm memory allocations quickly fragment and/or exhaust >the >> CMA memory area, and dma_alloc_from_contiguous() tries very hard to >> find a fitting hole big enough to satisfy allocations with a retry >> loop (see >> >http://lxr.free-electrons.com/source/drivers/base/dma-contiguous.c#L339) >> that takes forever. I am curious why it does not end up using the pool. As in use the TTM DMA pool to pick pages instead of allocating (and freeing) new ones? >> >> This is not good, also not for other devices which actually need a >> non-fragmented CMA for DMA, so what to do? I doubt most current gpus >> still need physically contiguous dma memory, maybe with exception of >> some embedded gpus? Oh. If I understood you correctly - the CMA ends up giving huge chunks of contiguous area. But if the sizes are 4kb I wonder why it would do that? The modern GPUs on x86 can deal with scatter gather and as you surmise don't need contiguous physical contiguous areas. >> >> My naive approach would be to add a new gfp_t flag a la >> ___GFP_AVOIDCMA, and make callers of dma_alloc_from_contiguous() >> refrain from doing so if they have some fallback for getting memory. 
>> And then add that flag to ttm's ttm_dma_populate() gfp_flags, e.g., >> around here: >> >http://lxr.free-electrons.com/source/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c#L884 >> >> However i'm not familiar enough with memory management, so likely >> greater minds here have much better ideas on how to deal with this? >> That is a bit of hack to deal with CMA being slow. Hmm. Let's first figure out why TTM DMA pool is not reusing pages. >> thanks, >> -mario >> >> Typical snippet from an example trace of a badly stalling desktop >with >> CMA (alloc_pages_node() fallback may have been missing in this traces >> ftrace_filter settings): >> >> 1) | ttm_dma_pool_get_pages >> [ttm]() { >> 1) | ttm_dma_page_pool_fill_locked [ttm]() { >> 1) | ttm_dma_pool_alloc_new_pages [ttm]() { >> 1) | __ttm_dma_alloc_page [ttm]() { >> 1) | dma_generic_alloc_coherent() { >> 1) ! 1873.071 us | dma_alloc_from_contiguous(); >> 1) ! 1874.292 us | } >> 1) ! 1875.400 us |} >> 1) | __ttm_dma_alloc_page [ttm]() { >> 1) | dma_generic_alloc_coherent() { >> 1) ! 1868.372 us | dma_alloc_from_contiguous(); >> 1) ! 1869.586 us | } >> 1) ! 1870.053 us |} >> 1) |
[Xen-devel] [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type
On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote: > On Thu, Jul 10, 2014 at 09:08:24PM +, Tian, Kevin wrote: > > actually I'm curious whether it's still necessary to __detect__ PCH. Could > > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard > > code the knowledge: > > > > } else if (IS_BROADWELL(dev)) { > > dev_priv->pch_type = PCH_LPT; > > dev_priv->pch_id = > > INTEL_PCH_LPT_LP_DEVICE_ID_TYPE; > > DRM_DEBUG_KMS("This is Broadwell, assuming " > > "LynxPoint LP PCH\n"); > > > > Or if there is real usage on non-fixed mapping (not majority), could it be > > a > > better option to have fixed mapping as a fallback instead of leaving as > > PCH_NONE? Then even when Qemu doesn't provide a special tweaked PCH, > > the majority case just works. > > I guess we can do it, at least I haven't seen any strange combinations in > the wild outside of Intel ... How big is the QA matrix for this? Would it make sense to just include the latest hardware (say going two generations back) and ignore the older one?
[Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
On Wed, Jul 02, 2014 at 06:29:23PM +0200, Paolo Bonzini wrote: > Il 02/07/2014 17:27, Michael S. Tsirkin ha scritto: > > At some level, maybe Paolo is right. Ignore existing drivers and ask > > intel developers to update their drivers to do something sane on > > hypervisors, even if they do ugly things on real hardware. > > > > A simple proposal since what I wrote earlier though apparently wasn't > > very clear: > > > > Detect Xen subsystem vendor id on vga card. > > If there, avoid poking at chipset. Instead > > - use subsystem device # for card type > > You mean for PCH type (aka PCH device id). > > > - use second half of BAR0 of device > > - instead of access to pci host > > > > hypervisors will simply take BAR0 and double it in size, > > make second part map to what would be the pci host. > > Nice. Detecting the backdoor via the subsystem vendor id > is clever. > > I'm not sure if it's possible to just double the size of BAR0 > or not, but my laptop has: > > Region 0: Memory at d000 (64-bit, non-prefetchable) [size=4M] > Region 2: Memory at c000 (64-bit, prefetchable) [size=256M] > Region 4: I/O ports at 5000 [size=64] > > and I hope we can reserve a few KB for hypervisors within those > 4M, or 8 bytes for an address/data pair (like cf8/cfc) within BAR4's > 64 bytes (or grow BAR4 to 128 bytes, or something like that). > > Xen can still add the hacky machine type if they want for existing > hosts, but this would be a nice way forward. It would be good to understand first why i915 in the first place needs to setup the bridge MBAR if it has not been set. As in, why is this region needed? Is it needed to flush the pipeline (as in you need to write there?) or .. Perhaps it is not needed anymore with the current hardware and what can be done is put a stake in the ground saying that only genX or later will be supported. The commit ids allude to power managament and the earlier commits did poke there - but I don't see it on the latest tree. 
> > Paolo > > ___ > Xen-devel mailing list > Xen-devel at lists.xen.org > http://lists.xen.org/xen-devel
[Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
On Wed, Jul 02, 2014 at 05:08:43PM +0300, Michael S. Tsirkin wrote: > On Wed, Jul 02, 2014 at 10:00:33AM -0400, Konrad Rzeszutek Wilk wrote: > > On Wed, Jul 02, 2014 at 01:33:09PM +0200, Paolo Bonzini wrote: > > > Il 01/07/2014 19:39, Ross Philipson ha scritto: > > > > > > > >We do IGD pass-through in our project (XenClient). The patches > > > >originally came from our project. We surface the same ISA bridge and > > > >have never had activation issues on any version of Widows from XP to > > > >Win8. We do not normally run server platforms so I can't say for sure > > > >there. > > > > > > The problem is not activation, the problem is that the patches are making > > > assumptions on the driver and the firmware that might work today but are > > > IMHO just not sane. > > > > > > I would have no problem with a clean patchset that adds a new machine type > > > and doesn't touch code in "-M pc", but it looks like mst disagrees. > > > Ultimately, if a patchset is too hacky for upstream, you can include it in > > > your downstream XenClient (and XenServer) QEMU branch. It happens. > > > > And then this discussion will come back again in a year when folks > > rebase and ask: Why hasn't this been done upstream. > > > > Then the discussion resumes .. > > > > With this long thread I lost a bit context about the challenges > > that exists. But let me try summarizing it here - which will hopefully > > get some consensus. > > Before I answer could you clarify please: > by Southbridge do you mean the PCH at slot 1f or the MCH at slot 0 or both? MCH slot. We read/write from this (see intel_setup_mchbar) from couple of registers (0x44 and 0x48 if gen >= 4, otherwise 0x54). It is hard-coded in the i915_get_bridge_dev (see ec2a4c3fdc8e82fe82a25d800e85c1ea06b74372) as 0:0.0 BDF. The PCH (does not matter where it sits) we only use the model:vendor id to figure out the pch_type (see intel_detect_pch). 
I don't see why that model:vendor_id can't be exposed via checking the type of device:vendor_id of the IGD itself. CC-ing some Intel i915 authors. So for the discussion here, when I say Southbridge I mean MCH. > > > 1). Fix IGD hardware to not use Southbridge magic addresses. > > We can moan and moan but I doubt it is going to change. > > > > 2). Since we need the Southbridge magic addresses, we can expose > > an bridge. [I think everybody agrees that we need to do > > that since 1) is no go). > > > > 3). What kind of bridge. We can do: > > > > a) Two bridges - one 'passthrough' and the legacy ISA bridge > > that QEMU emulates. Both Linux and Windows are OK with > > two bridges (even thought it is pretty weird). > > > > b) One bridge - the one that QEMU emulates - and lets emulate > > more of the registers (by emulate - I mean for some get the > > data from the real hardware). > > > >b1). We can't use the legacy because the registers are > > above 256 (is that correct? Did I miss something?) > > > >b2) We would need to use the Q35. > > b2a). If we need Q35, that needs to be exposed in > > for Xen guests. That means exposing the > > MMCONFIG and restructing the E820 to fit that > > in. > > Problem: > > - Migration is not working with Q35. > > (But for v1 you wouldn't migrate, however > >later hardware will surely have SR-IOV so > >we will need to migrate). > > > > - There are no developers who have an OK > > from their management to focus on this. > >(Potential solution: Poke Intel management to see > > if they can get more developers on it) > > > > > > 4). Code does a bit of sysfs that could use some refacturing with > > the KVM code. > > Problem: More time needed to do the code restructing. > > > > > > Is that about correct? > > > > What are folks timezones and the best days next week to talk about > > this on either Google Hangout or the phone?
[PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions.
On Sat, May 31, 2014 at 12:00:45PM +0900, Tetsuo Handa wrote: > >From 4e8d1a83629c5966bfd401c5f2187355624194f2 Mon Sep 17 00:00:00 2001 > From: Tetsuo Handa > Date: Sat, 31 May 2014 09:59:44 +0900 > Subject: [PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock > inside shrinker functions. > > I can observe that RHEL7 environment stalls with 100% CPU usage when a > certain type of memory pressure is given. While the shrinker functions > are called by shrink_slab() before the OOM killer is triggered, the stall > lasts for many minutes. > > One of reasons of this stall is that > ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and > are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with > _manager->lock held causes someone (including kswapd) to deadlock when > these functions are called due to memory pressure. This patch changes > "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to > avoid deadlock. > > Signed-off-by: Tetsuo Handa > Cc: stable [3.3+] > --- > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > index d8e59f7..620da39 100644 > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > @@ -1014,7 +1014,8 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, > struct shrink_control *sc) > if (list_empty(&_manager->pools)) > return SHRINK_STOP; > > - mutex_lock(&_manager->lock); > + if (!mutex_lock(&_manager->lock)) > + return SHRINK_STOP; Hmm.. /home/konrad/linux/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c: In function 'ttm_dma_pool_shrink_scan': /home/konrad/linux/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:1015:2: error: invalid use of void expression if (!mutex_lock(&_manager->lock)) This is based on v3.15 with these patches.
> if (!_manager->npools) > goto out; > pool_offset = ++start_pool % _manager->npools; > @@ -1047,7 +1048,8 @@ ttm_dma_pool_shrink_count(struct shrinker *shrink, > struct shrink_control *sc) > struct device_pools *p; > unsigned long count = 0; > > - mutex_lock(&_manager->lock); > + if (!mutex_trylock(&_manager->lock)) > + return 0; > list_for_each_entry(p, &_manager->pools, pools) > count += p->pool->npages_free; > mutex_unlock(&_manager->lock); > -- > 1.7.1
[PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions.
On Thu, May 29, 2014 at 11:34:59PM +0900, Tetsuo Handa wrote: > Tetsuo Handa wrote: > > Konrad Rzeszutek Wilk wrote: > > > On Sat, May 24, 2014 at 11:22:09PM +0900, Tetsuo Handa wrote: > > > > Hello. > > > > > > > > I tried to test whether it is OK (from point of view of reentrant) to > > > > use > > > > mutex_lock() or mutex_lock_killable() inside shrinker functions when > > > > shrinker > > > > functions do memory allocation, for > > > > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c is > > > > doing memory allocation with mutex lock held inside > > > > ttm_dma_pool_shrink_scan(). > > > > > > > > If I compile a test module shown below which mimics extreme case of what > > > > ttm_dma_pool_shrink_scan() will do > > > > > > And ttm_pool_shrink_scan. > > > > I don't know why but ttm_pool_shrink_scan() does not take mutex. > > > Well, it seems to me that ttm_pool_shrink_scan() not taking mutex is a bug > which could lead to stack overflow if kmalloc() in ttm_page_pool_free() > triggered recursion. > > shrink_slab() > => ttm_pool_shrink_scan() > => ttm_page_pool_free() > => kmalloc(GFP_KERNEL) >=> shrink_slab() > => ttm_pool_shrink_scan() > => ttm_page_pool_free() > => kmalloc(GFP_KERNEL) > > Maybe shrink_slab() should be updated not to call same shrinker in parallel? > > Also, it seems to me that ttm_dma_pool_shrink_scan() has potential division > by 0 bug as described below. Is this patch correct? Looks OK. I would need to test it first. Could you send both patches to me please so I can just test them and queue them up together? Thank you! > -- > >From 4a65744a300e14e5e202c5f13ba2759e1e797d29 Mon Sep 17 00:00:00 2001 > From: Tetsuo Handa > Date: Thu, 29 May 2014 18:25:42 +0900 > Subject: [PATCH] gpu/drm/ttm: Use mutex_trylock() for shrinker functions. > > I can observe that RHEL7 environment stalls with 100% CPU usage when a > certain type of memory pressure is given. 
While the shrinker functions > are called by shrink_slab() before the OOM killer is triggered, the stall > lasts for many minutes. > > One of reasons of this stall is that > ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and > are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with > _manager->lock held causes someone (including kswapd) to deadlock when > these functions are called due to memory pressure. This patch changes > "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to > avoid deadlock. > > At the same time, this patch fixes potential division by 0 due to > unconditionally doing "% _manager->npools". This is because > list_empty(&_manager->pools) being false does not guarantee that > _manager->npools != 0 after taking the _manager->lock because > _manager->npools is updated under the _manager->lock. > > At the same time, this patch moves updating of start_pool variable > in order to avoid skipping when choosing a pool to shrink in > round-robin style. The start_pool is changed from "atomic_t" to > "unsigned int" because it is now updated under the _manager->lock. 
> > Signed-off-by: Tetsuo Handa > Cc: stable [3.3+] > --- > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 15 ++++++++++----- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > index fb8259f..5e332b4 100644 > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > @@ -1004,9 +1004,9 @@ EXPORT_SYMBOL_GPL(ttm_dma_unpopulate); > static unsigned long > ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) > { > - static atomic_t start_pool = ATOMIC_INIT(0); > + static unsigned int start_pool; > unsigned idx = 0; > - unsigned pool_offset = atomic_add_return(1, &start_pool); > + unsigned pool_offset; > unsigned shrink_pages = sc->nr_to_scan; > struct device_pools *p; > unsigned long freed = 0; > @@ -1014,8 +1014,11 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, > struct shrink_control *sc) > if (list_empty(&_manager->pools)) > return SHRINK_STOP; > > - mutex_lock(&_manager->lock); > - pool_offset = pool_offset % _manager->npools; > + if (!mutex_trylock(&_manager->lock)) > + return SHRINK_STOP; > +
[PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions.
On Thu, May 29, 2014 at 06:47:49AM +0900, Tetsuo Handa wrote: > Konrad Rzeszutek Wilk wrote: > > On Sat, May 24, 2014 at 11:22:09PM +0900, Tetsuo Handa wrote: > > > Hello. > > > > > > I tried to test whether it is OK (from point of view of reentrant) to use > > > mutex_lock() or mutex_lock_killable() inside shrinker functions when > > > shrinker > > > functions do memory allocation, for > > > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c is > > > doing memory allocation with mutex lock held inside > > > ttm_dma_pool_shrink_scan(). > > > > > > If I compile a test module shown below which mimics extreme case of what > > > ttm_dma_pool_shrink_scan() will do > > > > And ttm_pool_shrink_scan. > > I don't know why but ttm_pool_shrink_scan() does not take mutex. > > > > and load the test module and do > > > > > > # echo 3 > /proc/sys/vm/drop_caches > > > > > > the system stalls with 0% CPU usage because of mutex deadlock > > > (with prior lockdep warning). > > > > > > Is this because wrong gfp flags are passed to kmalloc() ? Is this because > > > the test module's shrinker functions return wrong values? Is this because > > > doing memory allocation with mutex held inside shrinker functions is > > > forbidden? Can anybody tell me what is wrong with my test module? > > > > What is the sc->gfp_flags? What if you use GFP_ATOMIC? > > > I didn't check it but at least I'm sure that __GFP_WAIT bit is set. > Thus, GFP_ATOMIC or GFP_NOWAIT will solve this problem. > > > In regards to the lockdep warning below it looks like > > > > > > Regards. > > > > > > [ 48.077353] > > > [ 48.077999] = > > > [ 48.080023] [ INFO: inconsistent lock state ] > > > [ 48.080023] 3.15.0-rc6-00190-g1ee1cea #203 Tainted: G OE > > > [ 48.080023] - > > > [ 48.080023] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. 
> > > [ 48.086745] kswapd0/784 [HC0[0]:SC0[0]:HE1:SE1] takes: > > > [ 48.086745] (lock#2){+.+.?.}, at: [] > > > shrink_test_count+0x12/0x60 [test] > > > [ 48.086745] {RECLAIM_FS-ON-W} state was registered at: > > > > > > You have the scenario you described below, that is: > > > > shrink_test_scan > > mutex_lock_killable() > > -> kmalloc > > -> shrink_test_count > > mutex_lock_killable() > > > > And 'mutex_lock_killable' is the same (in at least this context) > > the same as 'mutex_lock'. In other words, your second 'mutex_lock' > > is going to spin - which is a deadlock. > > > > Perhaps a way of not getting in this scenario is: > > > > 1). Try to take the mutex (ie, one that won't spin if it can't > > get it). > > > > 2). Use the GFP_ATOMIC in the shrinker so that we never > > end up calling ourselves in case of memory pressure > > > > ? > > Yes, I think so as well. > > > > > > This patch changes "mutex_lock();" to "if (mutex_lock_killable()) > > > > > return ...;" > > > > > so that any threads can promptly give up. (By the way, as far as I > > > > > tested, > > > > > changing to "if (!mutex_trylock()) return ...;" likely shortens the > > > > > duration > > > > > of stall. Maybe we don't need to wait for mutex if someone is already > > > > > calling > > > > > these functions.) > > > > > > > > > > > > > While discussing about XFS problem, I got a question. Is it OK (from > > > > point > > > > of view of reentrant) to use mutex_lock() or mutex_lock_killable() > > > > inside > > > > shrinker's entry point functions? Can senario shown below possible? > > > > > > > > (1) kswapd is doing memory reclaim which does not need to hold mutex. > > > > > > > > (2) Someone in GFP_KERNEL context (not kswapd) calls > > > > ttm_dma_pool_shrink_count() and then calls > > > > ttm_dma_pool_shrink_scan() > > > > from direct reclaim path. > > > > > > > > (3) Inside ttm_dma_pool_shrink_scan(), GFP_KERNEL allocation is issued > > > > while mutex is held by the someone. 
> > > > > > > > (4) GFP_KERNEL all
[PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions.
On Sat, May 24, 2014 at 11:22:09PM +0900, Tetsuo Handa wrote:
> Hello.
>
> I tried to test whether it is OK (from the point of view of reentrancy) to use
> mutex_lock() or mutex_lock_killable() inside shrinker functions when shrinker
> functions do memory allocation, for drivers/gpu/drm/ttm/ttm_page_alloc_dma.c is
> doing memory allocation with the mutex held inside ttm_dma_pool_shrink_scan().
>
> If I compile a test module shown below which mimics an extreme case of what
> ttm_dma_pool_shrink_scan() will do

And ttm_pool_shrink_scan.

> -- test.c start --
> #include <linux/module.h>
> #include <linux/mutex.h>
> #include <linux/slab.h>
> #include <linux/shrinker.h>
>
> static DEFINE_MUTEX(lock);
>
> static unsigned long shrink_test_count(struct shrinker *shrinker,
>                                        struct shrink_control *sc)
> {
>         if (mutex_lock_killable(&lock)) {
>                 printk(KERN_WARNING "Process %u (%s) gave up waiting for mutex"
>                        "\n", current->pid, current->comm);
>                 return 0;
>         }
>         mutex_unlock(&lock);
>         return 1;
> }
>
> static unsigned long shrink_test_scan(struct shrinker *shrinker,
>                                       struct shrink_control *sc)
> {
>         LIST_HEAD(list);
>         int i = 0;
>
>         if (mutex_lock_killable(&lock)) {
>                 printk(KERN_WARNING "Process %u (%s) gave up waiting for mutex"
>                        "\n", current->pid, current->comm);
>                 return 0;
>         }
>         while (1) {
>                 struct list_head *l = kmalloc(PAGE_SIZE, sc->gfp_mask);
>                 if (!l)
>                         break;
>                 list_add_tail(l, &list);
>                 i++;
>         }
>         printk(KERN_WARNING "Process %u (%s) allocated %u pages\n",
>                current->pid, current->comm, i);
>         while (i--) {
>                 struct list_head *l = list.next;
>                 list_del(l);
>                 kfree(l);
>         }
>         mutex_unlock(&lock);
>         return 1;
> }
>
> static struct shrinker recursive_shrinker = {
>         .count_objects = shrink_test_count,
>         .scan_objects = shrink_test_scan,
>         .seeks = DEFAULT_SEEKS,
> };
>
> static int __init recursive_shrinker_init(void)
> {
>         register_shrinker(&recursive_shrinker);
>         return 0;
> }
>
> static void recursive_shrinker_exit(void)
> {
>         unregister_shrinker(&recursive_shrinker);
> }
>
> module_init(recursive_shrinker_init);
> module_exit(recursive_shrinker_exit);
> MODULE_LICENSE("GPL");
> -- test.c end --
>
> and load the test module and do
>
> # echo 3 > /proc/sys/vm/drop_caches
>
> the system stalls with 0% CPU usage because of mutex deadlock
> (with prior lockdep warning).
>
> Is this because wrong gfp flags are passed to kmalloc()? Is this because
> the test module's shrinker functions return wrong values? Is this because
> doing memory allocation with a mutex held inside shrinker functions is
> forbidden? Can anybody tell me what is wrong with my test module?

What is the sc->gfp_mask? What if you use GFP_ATOMIC?

In regards to the lockdep warning below it looks like

> Regards.
>
> [   48.077353]
> [   48.077999] =================================
> [   48.080023] [ INFO: inconsistent lock state ]
> [   48.080023] 3.15.0-rc6-00190-g1ee1cea #203 Tainted: G           OE
> [   48.080023] ---------------------------------
> [   48.080023] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> [   48.086745] kswapd0/784 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [   48.086745] (lock#2){+.+.?.}, at: [] shrink_test_count+0x12/0x60 [test]
> [   48.086745] {RECLAIM_FS-ON-W} state was registered at:

You have the scenario you described below, that is:

  shrink_test_scan
    mutex_lock_killable()
      -> kmalloc
        -> shrink_test_count
             mutex_lock_killable()

And 'mutex_lock_killable' is (in at least this context) the same as
'mutex_lock'. In other words, your second 'mutex_lock' is going to block
forever - which is a deadlock.

Perhaps a way of not getting in this scenario is:

 1). Try to take the mutex (ie, use a variant that won't block if it
     can't get it).

 2). Use GFP_ATOMIC in the shrinker so that we never end up calling
     ourselves in case of memory pressure.

?
> [ 48.086745] [] mark_held_locks+0x68/0x90 > [ 48.086745] [] lockdep_trace_alloc+0x9a/0xe0 > [ 48.086745] [] kmem_cache_alloc+0x23/0x170 > [ 48.086745] [] shrink_test_scan+0x3a/0xf90 [test] > [ 48.086745] [] shrink_slab_node+0x13e/0x1d0 > [ 48.086745] [] shrink_slab+0x61/0xe0 > [ 48.086745] [] drop_caches_sysctl_handler+0x69/0xf0 > [ 48.086745] [] proc_sys_call_handler+0x6a/0xa0 > [ 48.086745] [] proc_sys_write+0x1a/0x20 > [ 48.086745] [] vfs_write+0xa0/0x190 > [ 48.086745] [] SyS_write+0x56/0xc0 > [ 48.086745] [] syscall_call+0x7/0xb > [ 48.086745] irq event stamp: 39 > [ 48.086745] hardirqs last enabled at (39): [] > count_shadow_nodes+0x20/0x40 > [
[PATCH] gpu/drm/ttm: Pass GFP flags in order to avoid deadlock.
On Tue, May 20, 2014 at 01:40:40AM +0100, Dave Airlie wrote: > > cc'ing dri-devel. It looks pretty simple and correct . I can test it tomorrow and make sure it works right. > > > >From d0d57745ba23faf605b0f249b57d283fe1a8ee60 Mon Sep 17 00:00:00 2001 > > From: Tetsuo Handa > > Date: Mon, 19 May 2014 17:59:03 +0900 > > Subject: [PATCH] gpu/drm/ttm: Pass GFP flags in order to avoid deadlock. > > > > Commit 7dc19d5a "drivers: convert shrinkers to new count/scan API" added > > deadlock warnings that ttm_page_pool_free() and ttm_dma_page_pool_free() > > are currently doing GFP_KERNEL allocation. > > > > But these functions did not get updated to receive gfp_t argument. > > This patch explicitly passes sc->gfp_mask or GFP_KERNEL to these functions, > > and removes the deadlock warning. > > > > Signed-off-by: Tetsuo Handa > > --- > > drivers/gpu/drm/ttm/ttm_page_alloc.c | 19 ++- > > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 19 +-- > > 2 files changed, 19 insertions(+), 19 deletions(-) > > > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c > > b/drivers/gpu/drm/ttm/ttm_page_alloc.c > > index 863bef9..ba8f78e 100644 > > --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c > > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c > > @@ -297,8 +297,10 @@ static void ttm_pool_update_free_locked(struct > > ttm_page_pool *pool, > > * > > * @pool: to free the pages from > > * @free_all: If set to true will free all pages in pool > > + * @gfp: GFP flags. 
> > **/ > > -static int ttm_page_pool_free(struct ttm_page_pool *pool, unsigned nr_free) > > +static int ttm_page_pool_free(struct ttm_page_pool *pool, unsigned nr_free, > > + gfp_t gfp) > > { > > unsigned long irq_flags; > > struct page *p; > > @@ -309,8 +311,7 @@ static int ttm_page_pool_free(struct ttm_page_pool > > *pool, unsigned nr_free) > > if (NUM_PAGES_TO_ALLOC < nr_free) > > npages_to_free = NUM_PAGES_TO_ALLOC; > > > > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), > > - GFP_KERNEL); > > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp); > > if (!pages_to_free) { > > pr_err("Failed to allocate memory for pool free operation\n"); > > return 0; > > @@ -382,9 +383,7 @@ out: > > * > > * XXX: (dchinner) Deadlock warning! > > * > > - * ttm_page_pool_free() does memory allocation using GFP_KERNEL. that > > means > > - * this can deadlock when called a sc->gfp_mask that is not equal to > > - * GFP_KERNEL. > > + * We need to pass sc->gfp_mask to ttm_page_pool_free(). 
> > * > > * This code is crying out for a shrinker per pool > > */ > > @@ -405,7 +404,8 @@ ttm_pool_shrink_scan(struct shrinker *shrink, struct > > shrink_control *sc) > > if (shrink_pages == 0) > > break; > > pool = &_manager->pools[(i + pool_offset)%NUM_POOLS]; > > - shrink_pages = ttm_page_pool_free(pool, nr_free); > > + shrink_pages = ttm_page_pool_free(pool, nr_free, > > + sc->gfp_mask); > > freed += nr_free - shrink_pages; > > } > > return freed; > > @@ -706,7 +706,7 @@ static void ttm_put_pages(struct page **pages, unsigned > > npages, int flags, > > } > > spin_unlock_irqrestore(>lock, irq_flags); > > if (npages) > > - ttm_page_pool_free(pool, npages); > > + ttm_page_pool_free(pool, npages, GFP_KERNEL); > > } > > > > /* > > @@ -846,7 +846,8 @@ void ttm_page_alloc_fini(void) > > ttm_pool_mm_shrink_fini(_manager); > > > > for (i = 0; i < NUM_POOLS; ++i) > > - ttm_page_pool_free(&_manager->pools[i], FREE_ALL_PAGES); > > + ttm_page_pool_free(&_manager->pools[i], FREE_ALL_PAGES, > > + GFP_KERNEL); > > > > kobject_put(&_manager->kobj); > > _manager = NULL; > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > > index fb8259f..1b79bf0 100644 > > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > > @@ -411,8 +411,10 @@ static void ttm_dma_page_put(struct dma_pool *pool, > > struct dma_page *d_page) > > * > > * @pool: to free the pages from > > * @nr_free: If set to true will free all pages in pool > > + * @gfp: GFP flags. 
> > **/ > > -static unsigned ttm_dma_page_pool_free(struct dma_pool *pool, unsigned > > nr_free) > > +static unsigned ttm_dma_page_pool_free(struct dma_pool *pool, unsigned > > nr_free, > > + gfp_t gfp) > > { > > unsigned long irq_flags; > > struct dma_page *dma_p, *tmp; > > @@ -430,8 +432,7 @@ static unsigned ttm_dma_page_pool_free(struct dma_pool > > *pool, unsigned nr_free) > > npages_to_free, nr_free); > > } > > #endif > > - pages_to_free = kmalloc(npages_to_free *
dri-top?
Hey, I should know this but I am a bit behind on the latest drm-debug tools. Is there a way to figure out which applications are using GEM/TTM buffers? Or, even simpler - which application is using which DRM pages? Michael (CC-ed here) is finding that TTM is hitting the memory ceiling quite often and we are not sure whether the problem is with a leaking application or something entirely different. Thanks!
[PATCH v2] ACPI: Fix acpi_evaluate_object() return value check
Yijing Wang wrote: >Fix acpi_evaluate_object() return value check, >shoud acpi_status not int. Should be? Your mailer also ate the word 'to' . > >Signed-off-by: Yijing Wang >--- > >v1->v2: Add CC to the related subsystem MAINTAINERS. > >--- > drivers/gpu/drm/i915/intel_acpi.c | 13 +++-- > drivers/gpu/drm/nouveau/core/subdev/mxm/base.c |6 +++--- > drivers/gpu/drm/nouveau/nouveau_acpi.c | 13 +++-- > drivers/pci/pci-label.c|6 +++--- > 4 files changed, 20 insertions(+), 18 deletions(-) > >diff --git a/drivers/gpu/drm/i915/intel_acpi.c >b/drivers/gpu/drm/i915/intel_acpi.c >index dfff090..7ea00e5 100644 >--- a/drivers/gpu/drm/i915/intel_acpi.c >+++ b/drivers/gpu/drm/i915/intel_acpi.c >@@ -35,7 +35,7 @@ static int intel_dsm(acpi_handle handle, int func) > union acpi_object params[4]; > union acpi_object *obj; > u32 result; >- int ret = 0; >+ acpi_status status; > > input.count = 4; > input.pointer = params; >@@ -50,8 +50,8 @@ static int intel_dsm(acpi_handle handle, int func) > params[3].package.count = 0; > params[3].package.elements = NULL; > >- ret = acpi_evaluate_object(handle, "_DSM", , ); >- if (ret) { >+ status = acpi_evaluate_object(handle, "_DSM", , ); >+ if (ACPI_FAILURE(status)) { > DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret); > return ret; > } >@@ -141,7 +141,8 @@ static void intel_dsm_platform_mux_info(void) > struct acpi_object_list input; > union acpi_object params[4]; > union acpi_object *pkg; >- int i, ret; >+ acpi_status status; >+ int i; > > input.count = 4; > input.pointer = params; >@@ -156,9 +157,9 @@ static void intel_dsm_platform_mux_info(void) > params[3].package.count = 0; > params[3].package.elements = NULL; > >- ret = acpi_evaluate_object(intel_dsm_priv.dhandle, "_DSM", , >+ acpi_status = acpi_evaluate_object(intel_dsm_priv.dhandle, "_DSM", >, > ); >- if (ret) { >+ if (ACPI_FAILURE(status)) { > DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret); > goto out; > } >diff --git a/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c 
>b/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c >index 1291204..3920943 100644 >--- a/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c >+++ b/drivers/gpu/drm/nouveau/core/subdev/mxm/base.c >@@ -114,14 +114,14 @@ mxm_shadow_dsm(struct nouveau_mxm *mxm, u8 >version) > struct acpi_buffer retn = { ACPI_ALLOCATE_BUFFER, NULL }; > union acpi_object *obj; > acpi_handle handle; >- int ret; >+ acpi_status status; > > handle = ACPI_HANDLE(>pdev->dev); > if (!handle) > return false; > >- ret = acpi_evaluate_object(handle, "_DSM", , ); >- if (ret) { >+ status = acpi_evaluate_object(handle, "_DSM", , ); >+ if (ACPI_FAILURE(status)) { > nv_debug(mxm, "DSM MXMS failed: %d\n", ret); > return false; > } >diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c >b/drivers/gpu/drm/nouveau/nouveau_acpi.c >index ba0183f..6f810f2 100644 >--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c >+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c >@@ -82,7 +82,8 @@ static int nouveau_optimus_dsm(acpi_handle handle, >int func, int arg, uint32_t * > struct acpi_object_list input; > union acpi_object params[4]; > union acpi_object *obj; >- int i, err; >+ acpi_status status; >+ int i; > char args_buff[4]; > > input.count = 4; >@@ -101,8 +102,8 @@ static int nouveau_optimus_dsm(acpi_handle handle, >int func, int arg, uint32_t * > args_buff[i] = (arg >> i * 8) & 0xFF; > params[3].buffer.pointer = args_buff; > >- err = acpi_evaluate_object(handle, "_DSM", , ); >- if (err) { >+ status = acpi_evaluate_object(handle, "_DSM", , ); >+ if (ACPI_FAILURE(status)) { > printk(KERN_INFO "failed to evaluate _DSM: %d\n", err); > return err; > } >@@ -134,7 +135,7 @@ static int nouveau_dsm(acpi_handle handle, int >func, int arg, uint32_t *result) > struct acpi_object_list input; > union acpi_object params[4]; > union acpi_object *obj; >- int err; >+ acpi_status status; > > input.count = 4; > input.pointer = params; >@@ -148,8 +149,8 @@ static int nouveau_dsm(acpi_handle handle, int >func, int arg, uint32_t *result) > params[3].type 
= ACPI_TYPE_INTEGER; > params[3].integer.value = arg; > >- err = acpi_evaluate_object(handle, "_DSM", , ); >- if (err) { >+ status = acpi_evaluate_object(handle, "_DSM", , ); >+ if (ACPI_FAILURE(status)) { > printk(KERN_INFO "failed to evaluate _DSM: %d\n", err); > return err; > } >diff --git a/drivers/pci/pci-label.c
Radeon driver crashes when rebooting with 3.13-rc2
I keep on getting this: Any thoughts? I hadn't tried to do a bisection or such. [ 1784.126124] Console: switching to colour VGA+ 80x25 [ 1784.128286] drm_kms_helper: drm: unregistered panic notifier [ 1784.133601] [drm] radeon: finishing device. [ 1784.135934] BUG: unable to handle kernel NULL pointer dereference at 0378 [ 1784.135942] IP: [] post_xfer+0x18/0xb0 [radeon] [ 1784.135965] PGD 0 [ 1784.135968] Oops: [#1] SMP [ 1784.135972] Modules linked in: bnep bluetooth fuse tun xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec eeepc_wmi asus_wmi sparse_keymap snd_hwdep rfkill snd_seq snd_seq_device snd_pcm crc32c_intel ghash_clmulni_intel joydev snd_page_alloc serio_raw edac_core snd_timer pcspkr k10temp fam15h_power snd edac_mce_amd sp5100_tco r8169 i2c_piix4 soundcore mii shpchp uinput nouveau radeon video i2c_algo_bit drm_kms_helper ttm mxm_wmi drm usb_storage hid_logitech_dj i2c_core wmi ecryptfs encrypted_keys trusted tpm [last unloaded: iptable_raw] [ 1784.136027] CPU: 0 PID: 128 Comm: kworker/0:2 Tainted: GW 3.13.0-rc2+ #21 [ 1784.136031] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 LE R2.0, BIOS 1903 07/11/2013 [ 1784.136047] Workqueue: events radeon_hotplug_work_func [radeon] [ 1784.136051] task: 880130fc4860 ti: 88000a64a000 task.ti: 88000a64a000 [ 1784.136055] RIP: e030:[] [] post_xfer+0x18/0xb0 [radeon] [ 1784.136070] RSP: e02b:88000a64bbf8 EFLAGS: 00010246 [ 1784.136073] RAX: RBX: RCX: [ 1784.136076] RDX: f761bf4d RSI: 7e28 RDI: 880009d9e848 [ 1784.136080] RBP: 88000a64bc08 R08: 0200 R09: a00b9120 [ 1784.136084] R10: R11: 0001 R12: [ 1784.136087] R13: 0007 R14: 88000a64bd38 R15: [ 1784.136094] FS: 7f2c95b48740() GS:88013fe0() knlGS: [ 1784.136098] CS: e033 DS: ES: CR0: 8005003b [ 1784.136102] CR2: 0378 CR3: 01c0c000 CR4: 00040660 [ 1784.136106] Stack: [ 1784.136108] 0002 88000a64bc88 a00659e3 [ 1784.136115] 880009d9e848 880009d9eb80 0002 0002 [ 1784.136121] 88000a64bd30 0001f59b42d1 880009d9e800 0a64bd370007 [ 1784.136128] Call Trace: [ 1784.136134] [] bit_xfer+0x273/0x890 [i2c_algo_bit] [ 1784.136142] [] __i2c_transfer+0x5c/0x70 [i2c_core] [ 1784.136149] [] i2c_transfer+0x5c/0xe0 [i2c_core] [ 1784.136164] [] radeon_ddc_probe+0xde/0x120 [radeon] [ 1784.136179] [] radeon_dvi_detect+0x69/0x500 [radeon] [ 1784.136186] [] ? sched_clock+0x9/0x10 [ 1784.136193] [] drm_helper_hpd_irq_event+0x9d/0x160 [drm_kms_helper] [ 1784.136212] [] radeon_hotplug_work_func+0x5d/0x70 [radeon] [ 1784.136218] [] process_one_work+0x17b/0x460 [ 1784.136223] [] worker_thread+0x11b/0x400 [ 1784.136228] [] ? rescuer_thread+0x3e0/0x3e0 [ 1784.136233] [] kthread+0xd2/0xf0 [ 1784.136238] [] ? kthread_create_on_node+0x180/0x180 [ 1784.136244] [] ret_from_fork+0x7c/0xb0 [ 1784.136249] [] ? 
kthread_create_on_node+0x180/0x180 [ 1784.136252] Code: 44 d1 31 c9 09 c2 e8 48 d5 00 00 5b 41 5c 41 5d 5d c3 90 66 66 66 66 90 55 48 83 c7 48 48 89 e5 41 54 53 e8 1b 92 2b e1 48 89 c3 <48> 8b 80 78 03 00 00 31 d2 8b b3 cc 03 4.136301] RIP [] post_xfer+0x18/0xb0 [radeon] [ 1784.136316] RSP [ 1784.136318] CR2: 0378 [ 1784.136322] ---[ end trace ce465373f0865c07 ]--- [ 1784.136421] BUG: unable to handle kernel paging request at ffd8 [ 1784.136426] IP: [] kthread_data+0x10/0x20 [ 1784.136430] PGD 1c0f067 PUD 1c11067 PMD 0 [ 1784.136435] Oops: [#2] SMP [ 1784.136438] Modules linked in: bnep bluetooth fuse tun xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec eeepc_wmi asus_wmi sparse_keymap snd_hwdep rfkill snd_seq snd_seq_device snd_pcm crc32c_intel ghash_clmulni_intel joydev snd_page_alloc serio_raw edac_core snd_timer pcspkr k10temp fam15h_power snd edac_mce_amd sp5100_tco r8169 i2c_piix4 soundcore mii shpchp uinput nouveau radeon video i2c_algo_bit drm_kms_helper ttm mxm_wmi drm usb_storage hid_logitech_dj i2c_core wmi ecryptfs encrypted_keys trusted tpm [last unloaded: iptable_raw] [ 1784.136492] CPU: 0 PID: 128 Comm: kworker/0:2 Tainted: G D W 3.13.0-rc2+ #21 [ 1784.136495] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 LE R2.0, BIOS 1903 07/11/2013 [ 1784.136511] task: 880130fc4860 ti: 88000a64a000 task.ti: 88000a64a000 [ 1784.136514] RIP: e030:[] [] kthread_data+0x10/0x20 [ 1784.136520] RSP: e02b:88000a64b838 EFLAGS: 00010002 [ 1784.136523] RAX: RBX: RCX: 0001
[PATCH 2/2] drm/vmwgfx: Use the linux DMA api to get valid device addresses of pages
On Mon, Nov 04, 2013 at 05:57:39AM -0800, Thomas Hellstrom wrote: > The code handles three different cases: > 1) physical page addresses. The ttm page array is used. > 2) DMA subsystem addresses. A scatter-gather list is used. > 3) Coherent pages. The ttm dma pool is used, together with the dma_ttm > array os dma_addr_t > > Signed-off-by: Thomas Hellstrom > Reviewed-by: Jakob Bornecrantz I looked at it from the TTM DMA perspective and it looks OK for me. > --- > drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c | 379 > ++-- > drivers/gpu/drm/vmwgfx/vmwgfx_drv.c| 87 +++- > drivers/gpu/drm/vmwgfx/vmwgfx_drv.h| 98 - > drivers/gpu/drm/vmwgfx/vmwgfx_gmr.c| 150 ++--- > 4 files changed, 620 insertions(+), 94 deletions(-) > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c > b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c > index 96dc84d..7776e6f 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c > @@ -141,37 +141,374 @@ struct ttm_placement vmw_srf_placement = { > }; > > struct vmw_ttm_tt { > - struct ttm_tt ttm; > + struct ttm_dma_tt dma_ttm; > struct vmw_private *dev_priv; > int gmr_id; > + struct sg_table sgt; > + struct vmw_sg_table vsgt; > + uint64_t sg_alloc_size; > + bool mapped; > }; > > +/** > + * Helper functions to advance a struct vmw_piter iterator. > + * > + * @viter: Pointer to the iterator. > + * > + * These functions return false if past the end of the list, > + * true otherwise. Functions are selected depending on the current > + * DMA mapping mode. > + */ > +static bool __vmw_piter_non_sg_next(struct vmw_piter *viter) > +{ > + return ++(viter->i) < viter->num_pages; > +} > + > +static bool __vmw_piter_sg_next(struct vmw_piter *viter) > +{ > + return __sg_page_iter_next(>iter); > +} > + > + > +/** > + * Helper functions to return a pointer to the current page. > + * > + * @viter: Pointer to the iterator > + * > + * These functions return a pointer to the page currently > + * pointed to by @viter. 
Functions are selected depending on the > + * current mapping mode. > + */ > +static struct page *__vmw_piter_non_sg_page(struct vmw_piter *viter) > +{ > + return viter->pages[viter->i]; > +} > + > +static struct page *__vmw_piter_sg_page(struct vmw_piter *viter) > +{ > + return sg_page_iter_page(>iter); > +} > + > + > +/** > + * Helper functions to return the DMA address of the current page. > + * > + * @viter: Pointer to the iterator > + * > + * These functions return the DMA address of the page currently > + * pointed to by @viter. Functions are selected depending on the > + * current mapping mode. > + */ > +static dma_addr_t __vmw_piter_phys_addr(struct vmw_piter *viter) > +{ > + return page_to_phys(viter->pages[viter->i]); > +} > + > +static dma_addr_t __vmw_piter_dma_addr(struct vmw_piter *viter) > +{ > + return viter->addrs[viter->i]; > +} > + > +static dma_addr_t __vmw_piter_sg_addr(struct vmw_piter *viter) > +{ > + return sg_page_iter_dma_address(>iter); > +} > + > + > +/** > + * vmw_piter_start - Initialize a struct vmw_piter. > + * > + * @viter: Pointer to the iterator to initialize > + * @vsgt: Pointer to a struct vmw_sg_table to initialize from > + * > + * Note that we're following the convention of __sg_page_iter_start, so that > + * the iterator doesn't point to a valid page after initialization; it has > + * to be advanced one step first. 
> + */ > +void vmw_piter_start(struct vmw_piter *viter, const struct vmw_sg_table > *vsgt, > + unsigned long p_offset) > +{ > + viter->i = p_offset - 1; > + viter->num_pages = vsgt->num_pages; > + switch (vsgt->mode) { > + case vmw_dma_phys: > + viter->next = &__vmw_piter_non_sg_next; > + viter->dma_address = &__vmw_piter_phys_addr; > + viter->page = &__vmw_piter_non_sg_page; > + viter->pages = vsgt->pages; > + break; > + case vmw_dma_alloc_coherent: > + viter->next = &__vmw_piter_non_sg_next; > + viter->dma_address = &__vmw_piter_dma_addr; > + viter->page = &__vmw_piter_non_sg_page; > + viter->addrs = vsgt->addrs; > + break; > + case vmw_dma_map_populate: > + case vmw_dma_map_bind: > + viter->next = &__vmw_piter_sg_next; > + viter->dma_address = &__vmw_piter_sg_addr; > + viter->page = &__vmw_piter_sg_page; > + __sg_page_iter_start(>iter, vsgt->sgt->sgl, > + vsgt->sgt->orig_nents, p_offset); > + break; > + default: > + BUG(); > + } > +} > + > +/** > + * vmw_ttm_unmap_from_dma - unmap device addresses previsouly mapped for > + * TTM pages > + * > + * @vmw_tt: Pointer to a struct vmw_ttm_backend > + * > + * Used to free dma mappings previously mapped by vmw_ttm_map_for_dma. > + */ > +static void
[PATCH 1/2] drm/ttm: Enable the dma page pool also for intel IOMMUs
On Mon, Nov 04, 2013 at 05:57:38AM -0800, Thomas Hellstrom wrote: > Used by the vmwgfx driver That looks OK to me. And baremetal should not be affected as the Intel VT-d driver turns of the SWIOTLB driver - so it will still use the classic ttm pool code. Reviewed-by: Konrad Rzeszutek Wilk > > Signed-off-by: Thomas Hellstrom > Reviewed-by: Jakob Bornecrantz > --- > drivers/gpu/drm/ttm/Makefile |6 +- > drivers/gpu/drm/ttm/ttm_page_alloc_dma.c |3 +++ > include/drm/ttm/ttm_page_alloc.h | 11 ++- > 3 files changed, 14 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile > index b2b33dd..b433b9f 100644 > --- a/drivers/gpu/drm/ttm/Makefile > +++ b/drivers/gpu/drm/ttm/Makefile > @@ -5,10 +5,6 @@ ccflags-y := -Iinclude/drm > ttm-y := ttm_agp_backend.o ttm_memory.o ttm_tt.o ttm_bo.o \ > ttm_bo_util.o ttm_bo_vm.o ttm_module.o \ > ttm_object.o ttm_lock.o ttm_execbuf_util.o ttm_page_alloc.o \ > - ttm_bo_manager.o > - > -ifeq ($(CONFIG_SWIOTLB),y) > -ttm-y += ttm_page_alloc_dma.o > -endif > + ttm_bo_manager.o ttm_page_alloc_dma.o > > obj-$(CONFIG_DRM_TTM) += ttm.o > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > index 7957bee..fb8259f 100644 > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c > @@ -33,6 +33,7 @@ > * when freed). 
> */ > > +#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU) > #define pr_fmt(fmt) "[TTM] " fmt > > #include > @@ -1142,3 +1143,5 @@ int ttm_dma_page_alloc_debugfs(struct seq_file *m, void > *data) > return 0; > } > EXPORT_SYMBOL_GPL(ttm_dma_page_alloc_debugfs); > + > +#endif > diff --git a/include/drm/ttm/ttm_page_alloc.h > b/include/drm/ttm/ttm_page_alloc.h > index 706b962..d1f61bf 100644 > --- a/include/drm/ttm/ttm_page_alloc.h > +++ b/include/drm/ttm/ttm_page_alloc.h > @@ -62,7 +62,7 @@ extern void ttm_pool_unpopulate(struct ttm_tt *ttm); > extern int ttm_page_alloc_debugfs(struct seq_file *m, void *data); > > > -#ifdef CONFIG_SWIOTLB > +#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU) > /** > * Initialize pool allocator. > */ > @@ -94,6 +94,15 @@ static inline int ttm_dma_page_alloc_debugfs(struct > seq_file *m, void *data) > { > return 0; > } > +static inline int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, > +struct device *dev) > +{ > + return -ENOMEM; > +} > +static inline void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, > + struct device *dev) > +{ > +} > #endif > > #endif > -- > 1.7.10.4 -- November Webinars for C, C++, Fortran Developers Accelerate application performance with scalable programming models. Explore techniques for threading, error checking, porting, and tuning. Get the most from the latest Intel processors and coprocessors. See abstracts and register http://pubads.g.doubleclick.net/gampad/clk?id=60136231=/4140/ostg.clktrk -- ___ Dri-devel mailing list Dri-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 0/2] Make the vmwgfx driver reasonably DMA-API compliant
On Mon, Nov 04, 2013 at 05:57:37AM -0800, Thomas Hellstrom wrote:
> These patches make the vmwgfx driver use the DMA API to obtain valid
> device addresses rather than blindly using physical addresses.
>
> The main motivation is to be able to use a virtual IOMMU in the future.

Ooooh. Neat! Are there any RFC patches available?

> Other TTM drivers typically map pages one by one rather than using a
> scatter-gather list, but since we can benefit from having a single dma
> address region set up by the IOMMU, we use a scatter-gather list instead.
>
> Finally to be able to handle all the dma mapping modes, we locally extend the
> scatter-gather list page iterator to handle also the direct physical- and
> coherent cases.
>
> Finally, the TTM DMA page pool is enabled also when the Intel IOMMU is active.
Re: Use of pci_map_page in nouveau, radeon TTM.
On Tue, Oct 01, 2013 at 12:16:16PM +0200, Thomas Hellstrom wrote:
> Jerome, Konrad
>
> Forgive an ignorant question, but it appears like both Nouveau and Radeon
> may use pci_map_page() when populating TTMs on pages obtained using the
> ordinary (not DMA) pool. These pages will, if I understand things
> correctly, not be pages allocated with DMA_ALLOC_COHERENT.

Not always. That depends on whether the SWIOTLB buffer has been enabled,
which happens if you have a Calgary IOMMU, an AMD GART, or if you run
under Xen.

> From what I understand, at least for the corresponding dma_map_page() it's
> illegal for the CPU to access these pages without calling
> dma_sync_xx_for_cpu(). And before the device is allowed to access them
> again, you need to call dma_sync_xx_for_device().

Correct.

> So mapping for PCI really invalidates the TTM interleaved CPU / device
> access model.

Unless you use the TTM DMA one which allocates them from the coherent
pool - in which case they are already mapped. Granted the part of using
the DMA export/import API is not finished (so moving from the TTM pool to
V4L for example) and it will blow up with the right mix.

> Or did I miss something here?

That is it. But for most of the use cases the drivers have been able to
skirt this restriction b/c pci_map_page/pci_unmap_page set up a DMA
mapping that is static (until the pci_unmap_page) and on x86 the memory
is coherent. So the map is good regardless of the PCI devices. Naturally
if you have multiple IOMMUs per bridge this all falls apart :-(

This all falls flat also with non-coherent memory and I believe that is
what some of the PA-RISC folks are hitting their heads against. And
probably also on ARM once they start using these chipsets.

> Thanks,
> Thomas

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: drm/radeon: ring test failed on PA-RISC Linux
On Sat, Sep 21, 2013 at 07:39:10AM +0400, Alex Ivanov wrote:

21.09.2013, at 1:27, Alex Deucher alexdeuc...@gmail.com wrote:

On Tue, Sep 17, 2013 at 3:33 PM, Alex Ivanov gnido...@p0n4ik.tk wrote:

17.09.2013, at 18:24, Alex Deucher alexdeuc...@gmail.com wrote:

On Tue, Sep 17, 2013 at 5:23 AM, Alex Ivanov gnido...@p0n4ik.tk wrote:

Alex,

10.09.2013, at 16:37, Alex Deucher alexdeuc...@gmail.com wrote:

The dummy page isn't really going to help much. That page is just used as a
safety placeholder for gart entries that aren't mapped on the GPU. TTM
(drivers/gpu/drm/ttm) actually does the allocation of the backing pages for
the gart. You may want to look there.

Ah, sorry. Indeed. Though, my idea with:

On Tue, Sep 10, 2013 at 5:20 AM, Alex Ivanov gnido...@p0n4ik.tk wrote:

Thanks! I'll try. Meanwhile i've tried a switch from page_alloc() to
dma_alloc_coherent() in radeon_dummy_page_*(), which didn't help :(

doesn't make sense at the TTM part as well.

After the driver is loaded, you can dump some info from debugfs:

r100_rbbm_info
r100_cp_ring_info
r100_cp_csq_fifo

Which will dump a bunch of registers and internal fifos so we can see what
the chip actually processed.

Alex

Reading of r100_cp_ring_info leads to a KP:

r100_debugfs_cp_ring_info():

	count = (rdp + ring->ring_size - wdp) & ring->ptr_mask;
	for (j = 0; j <= count; j++) {
		i = (rdp + j) & ring->ptr_mask;
		/* Here at first iteration: count = 262080, i = 0 */
		seq_printf(m, "r[%04d]=0x%08x\n", i, ring->ring[i]);
	}

Reading of radeon_ring_gfx (which i've additionally tried to read) throws
an MCE:

radeon_debugfs_ring_info():

	count = (ring->ring_size / 4) - ring->ring_free_dw;
	i = (ring->rptr + ring->ptr_mask + 1 - 32) & ring->ptr_mask;
	for (j = 0; j <= (count + 32); j++) {
		/* Here at first iteration: i = 262112, j = 0 */
		seq_printf(m, "r[%5d]=0x%08x\n", i, ring->ring[i]);
		i = (i + 1) & ring->ptr_mask;
	}

I'm attaching debug outputs on a kernel built with these loops commented out.
The register writes seem to be going through the register backbone correctly:

[0x00B] 0x15E0=0x
[0x00C] 0x15E4=0xCAFEDEAD
[0x00D] 0x4274=0x000F
[0x00E] 0x42C8=0x0007
[0x00F] 0x4018=0x001D
[0x010] 0x170C=0x8000
[0x011] 0x3428=0x00020100
[0x012] 0x15E4=0xCAFEDEAD

You can see the 0xCAFEDEAD written to the scratch register via MMIO from the
ring_test(). The CP fifo however seems to be full of garbage. The CP is busy
though, so it seems to be functional. I guess it's just fetching garbage
rather than commands.

If it is fetching garbage, that would imply the DMA (or bus) addresses that
are programmed in the GART are bogus. If you dump them and try to figure out
whether bus address -> physical address -> virtual address == virtual
address -> bus address, that could help. And perhaps seeing what the virtual
address has - and/or poisoning it with known data? Or perhaps the card has
picked up an incorrect page table? Meaning the (bus) address given to it is
not the correct one?

Does doing a posted write when writing to the ring buffer help?

Unfortunately, no.

diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index a890756..b4f04d2 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -324,12 +324,14 @@ static int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ri
  */
 void radeon_ring_write(struct radeon_ring *ring, uint32_t v)
 {
+	u32 tmp;
 #if DRM_DEBUG_CODE
 	if (ring->count_dw <= 0) {
 		DRM_ERROR("radeon: writing more dwords to the ring than expected!\n");
 	}
 #endif
 	ring->ring[ring->wptr++] = v;
+	tmp = ring->ring[ring->wptr - 1];
 	ring->wptr &= ring->ptr_mask;
 	ring->count_dw--;
 	ring->ring_free_dw--;
Re: drm/radeon: ring test failed on PA-RISC Linux
On Tue, Sep 10, 2013 at 01:20:57PM +0400, Alex Ivanov wrote: Alex, 09.09.2013, at 21:43, Alex Deucher alexdeuc...@gmail.com wrote: On Mon, Sep 9, 2013 at 12:44 PM, Alex Ivanov gnido...@p0n4ik.tk wrote: Folks, We (people at the linux-parisc @ vger.kernel.org mail list) are trying to make the native video options of the latest PA-RISC servers and workstations (these are ATIs, most of which are based on R100/R300/R420 chips) work correctly on this platform (big endian pa-risc). However, we haven't had much success. DRM fails every time with "ring test failed" for both AGP and PCI. Maybe you could give us some suggestions on what we could check? Topic started here: http://www.spinics.net/lists/linux-parisc/msg04908.html And continued there: http://www.spinics.net/lists/linux-parisc/msg04995.html http://www.spinics.net/lists/linux-parisc/msg05006.html Problems we've already resolved without any signs of progress: - Checked the successful microcode load - parisc AGP GART code writes IOMMU entries in the wrong byte order and doesn't add the coherency information the SBA code adds - our PCI BAR setup doesn't really work very well together with the Radeon DRM address setup. DRM will generate addresses which are even outside of the connected LBA. Things planned for a check: - The drivers/video/aty driver uses an endian config bit DRM doesn't use, but I haven't tested whether this makes a difference and how it is connected to the overall picture. I don't think that will make any difference. radeon kms works fine on other big endian platforms such as powerpc. Good! I'll opt it out then. The Rage128 product revealed a weakness in some motherboard chipsets in that there is no mechanism to guarantee that data written by the CPU to memory is actually in a readable state before the Graphics Controller receives an update to its copy of the Write Pointer.
In an effort to alleviate this problem, we've introduced a mechanism into the Graphics Controller that will delay the actual write to the Write Pointer for some programmable amount of time, in order to give the chipset time to flush its internal write buffers to memory. There are two register fields that control this mechanism: PRE_WRITE_TIMER and PRE_WRITE_LIMIT. In the radeon DRM codebase I didn't find anyone using/setting those registers. Maybe PA-RISC has some problem here?... I doubt it. If you are using AGP, I'd suggest disabling it and first trying to get things working using the on-chip gart rather than AGP. Load radeon with agpmode=-1. Already tried this without any luck. Anyway, the radeon driver falls back to PCI mode in our case, so does it really matter? In addition, people with PCI cards are experiencing the same issue... The on-chip gart always uses cache-snooped pci transactions and the driver assumes pci is cache coherent. On AGP/PCI chips, the on-chip gart mechanism stores the gart table in system ram. On PCIE asics, the gart table is stored in vram. The gart page table maps system pages to a contiguous aperture in the GPU's address space. The ring lives in gart memory. The GPU sees a contiguous buffer and the gart mechanism handles the access to the backing pages via the page table. I'd suggest verifying that the entries written to the gart page table are valid, and then that the information written to the ring buffer is valid, before updating the ring's wptr in radeon_ring_unlock_commit(). Changing the wptr is what causes the CP to start fetching data from the ring. Thanks! I'll try. Meanwhile I've tried a switch from page_alloc() to dma_alloc_coherent() in radeon_dummy_page_*(), which didn't help :( Is this platform enabling the SWIOTLB layer?
The reason I am asking is b/c if you do indeed enable it you end up using the TTM DMA pool, which allocates pages using dma_alloc_coherent - which means that all of the pages that come out of TTM are already 'DMA' mapped. And that means radeon_gart_bind and all its friends use the DMA addresses that have been constructed by the SWIOTLB IOMMU. Perhaps the PA-RISC IOMMU creates the DMA addresses differently? When the card gets programmed, you do end up using ttm_agp_bind, right? I am wondering if something like this: https://lkml.org/lkml/2010/12/6/512 is needed to pass in the right DMA address? --- radeon_device.c.orig 2013-09-10 08:55:05.0 + +++ radeon_device.c 2013-09-10 09:12:17.0 + @@ -673,15 +673,13 @@ int radeon_dummy_page_init(struct radeon_device *rdev) { if (rdev->dummy_page.page) return 0; - rdev->dummy_page.page = alloc_page(GFP_DMA32 | GFP_KERNEL | __GFP_ZERO); - if (rdev->dummy_page.page == NULL) + rdev->dummy_page.page = dma_alloc_coherent(&rdev->pdev->dev, PAGE_SIZE, + &rdev->dummy_page.addr, GFP_DMA32|GFP_KERNEL); + if (!rdev->dummy_page.page) return -ENOMEM; -
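On the byte-order suspicion raised earlier in the thread: the GPU consumes its gart page table as little-endian 32-bit entries, so a big-endian host has to swap each entry before storing it - one plausible way a PA-RISC gart/IOMMU path could hand the card bogus bus addresses. The sketch below is purely illustrative; the entry layout (low bits used as flags) and the helper names are made up, not the radeon or parisc code.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap a 32-bit value, as a cpu_to_le32()-style conversion would
 * on a big-endian host. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Build a little-endian gart entry from a bus address. Hypothetical
 * layout: page-aligned address in the high bits, a "valid" flag in
 * bit 0. A big-endian CPU must swap before the store; forgetting to
 * do so gives the GPU a scrambled address - i.e. it fetches garbage. */
static uint32_t gart_entry_le(uint32_t bus_addr, int host_is_big_endian)
{
    uint32_t entry = (bus_addr & ~0xfffu) | 0x1u;
    return host_is_big_endian ? swap32(entry) : entry;
}
```

The symptom would match the report: register writes via MMIO (which the bus byte-lane-swaps correctly) succeed, while table-driven DMA fetches return junk.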
Re: [PATCH 2/3] drm/radeon: Implement radeon_pci_shutdown
On Sun, Sep 08, 2013 at 02:10:58PM +0200, Markus Trippelsdorf wrote: Currently radeon devices are not properly shut down during kexec. This causes a variety of issues, e.g. dpm initialization failures. Fix this by implementing a radeon_pci_shutdown function that unloads the driver cleanly. Signed-off-by: Markus Trippelsdorf mar...@trippelsdorf.de --- drivers/gpu/drm/radeon/radeon_drv.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index cb4445f..d71edee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -380,6 +380,15 @@ static const struct file_operations radeon_driver_kms_fops = { #endif }; + +static void +radeon_pci_shutdown(struct pci_dev *pdev) +{ + struct drm_device *dev = pci_get_drvdata(pdev); + + radeon_driver_unload_kms(dev); +} + static struct drm_driver kms_driver = { .driver_features = DRIVER_USE_AGP | @@ -453,6 +462,7 @@ static struct pci_driver radeon_kms_pci_driver = { .remove = radeon_pci_remove, .suspend = radeon_pci_suspend, .resume = radeon_pci_resume, + .shutdown = radeon_pci_shutdown, }; static int __init radeon_init(void) -- Markus ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] DRM: TTM: Fix memory leak issue in ttm_agp_tt_create().
On Wed, Sep 04, 2013 at 11:58:10AM +0530, Manjunath Goudar wrote: From: Jeyaraman R jeyaraman.rangas...@lge.com This patch adds kfree() in ttm_agp_tt_create() to avoid a memory leak; without this there is a chance of a memory leak in the ttm_tt_init() failure case. Signed-off-by: Jeyaraman R jeyaraman.rangas...@lge.com Signed-off-by: Manjunath Goudar manjunath.gou...@linaro.org Cc: David Airlie airl...@linux.ie Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: David Howells dhowe...@redhat.com Cc: Thomas Gleixner t...@linutronix.de Cc: Dave Jones da...@redhat.com Cc: Dave Airlie airl...@redhat.com Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- drivers/gpu/drm/ttm/ttm_agp_backend.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c b/drivers/gpu/drm/ttm/ttm_agp_backend.c index 3302f99..764be36 100644 --- a/drivers/gpu/drm/ttm/ttm_agp_backend.c +++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c @@ -126,6 +126,7 @@ struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, agp_be->ttm.func = &ttm_agp_func; if (ttm_tt_init(&agp_be->ttm, bdev, size, page_flags, dummy_read_page)) { + kfree(agp_be); return NULL; } -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
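The fix follows the usual kernel ownership rule: whatever a create function allocates before a failing init step must be freed on that error path, because the caller only ever sees NULL. A standalone miniature of the pattern (hypothetical names, not the real TTM API):

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal sketch of the leak pattern fixed above: the object allocated
 * first must be freed when a later init step fails. 'backend' and the
 * function names are hypothetical stand-ins for ttm_agp_backend /
 * ttm_tt_init(). */
struct backend { int initialized; };

static int backend_init(struct backend *be, int should_fail)
{
    if (should_fail)
        return -1;          /* init failed: caller still owns 'be' */
    be->initialized = 1;
    return 0;
}

static struct backend *backend_create(int should_fail)
{
    struct backend *be = calloc(1, sizeof(*be));
    if (!be)
        return NULL;
    if (backend_init(be, should_fail)) {
        free(be);           /* the kfree() the patch adds */
        return NULL;
    }
    return be;
}
```

Without the free() on the failure branch, every failed create leaks one allocation that nothing can ever reach again - exactly the situation the one-line kfree() patch closes.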
[PATCH 3/6] drm/nouveau: hook up cache sync functions
On Wed, Aug 28, 2013 at 06:58:37PM +0200, Lucas Stach wrote: > On Wednesday, 28.08.2013 at 12:43 -0400, Konrad Rzeszutek Wilk wrote: > > On Wed, Aug 28, 2013 at 02:00:47AM +0200, Lucas Stach wrote: > > > Signed-off-by: Lucas Stach > > > --- > > > drivers/gpu/drm/nouveau/nouveau_bo.c | 4 > > > drivers/gpu/drm/nouveau/nouveau_gem.c | 5 + > > > 2 files changed, 9 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > > > b/drivers/gpu/drm/nouveau/nouveau_bo.c > > > index af20fba..f4a2eb9 100644 > > > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > > > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > > > @@ -411,6 +411,10 @@ nouveau_bo_validate(struct nouveau_bo *nvbo, bool > > > interruptible, > > > { > > > int ret; > > > > > > + if (nvbo->bo.ttm && nvbo->bo.ttm->caching_state == tt_cached) > > > > You don't want to do it also for tt_wc ? > > > No the point of using writecombined memory for BOs is to explicitly > avoid the need for this cache sync. An uncached MMIO read from the > device should already flush out all writecombining buffers and this read > is always happening when submitting a pushbuf. Could you include this explanation in the git commit description please? 
> > > > + ttm_dma_tt_cache_sync_for_device((struct ttm_dma_tt > > > *)nvbo->bo.ttm, > > > + &nouveau_bdev(nvbo->bo.ttm->bdev)->dev->pdev->dev); > > > + > > > ret = ttm_bo_validate(&nvbo->bo, &nvbo->placement, > > > interruptible, no_wait_gpu); > > > if (ret) > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c > > > b/drivers/gpu/drm/nouveau/nouveau_gem.c > > > index 830cb7b..f632b92 100644 > > > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c > > > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c > > > @@ -901,6 +901,11 @@ nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, > > > void *data, > > > ret = ttm_bo_wait(&nvbo->bo, true, true, no_wait); > > > spin_unlock(&nvbo->bo.bdev->fence_lock); > > > drm_gem_object_unreference_unlocked(gem); > > > + > > > + if (!ret && nvbo->bo.ttm && nvbo->bo.ttm->caching_state == tt_cached) > > > > Ditto? > cpu_prep is used to make the kernel aware of a following userspace read. > Writecombined mappings are essentially uncached from the read > perspective. > > > > > > + ttm_dma_tt_cache_sync_for_cpu((struct ttm_dma_tt *)nvbo->bo.ttm, > > > + &dev->pdev->dev); > > > + > > > return ret; > > > } > > > > > > -- > > > 1.8.3.1 > > > > > > ___ > > > dri-devel mailing list > > > dri-devel at lists.freedesktop.org > > > http://lists.freedesktop.org/mailman/listinfo/dri-devel > >
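The policy argued in this exchange - sync only CPU-cached TTM mappings, because write-combined buffers are drained by the uncached MMIO read done at pushbuf submit - can be written down as a small decision helper. This is a sketch of the reasoning, not the nouveau code:

```c
#include <assert.h>

/* Sketch of the policy discussed above (illustrative, not the real
 * driver): only CPU-cached TTM mappings need an explicit cache sync
 * before the device reads them. Write-combined and uncached mappings
 * do not, since the WC buffers are flushed by the uncached MMIO read
 * that always happens when a pushbuf is submitted. */
enum caching_state { tt_uncached, tt_wc, tt_cached };

static int needs_cache_sync_for_device(enum caching_state s)
{
    return s == tt_cached;
}

static int needs_cache_sync_for_cpu(enum caching_state s)
{
    /* Reads through WC mappings are essentially uncached, so the same
     * rule applies on the CPU-read side (cpu_prep). */
    return s == tt_cached;
}
```

Which is exactly why the patch gates both ttm_dma_tt_cache_sync_for_device() and ttm_dma_tt_cache_sync_for_cpu() on caching_state == tt_cached.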
[PATCH 3/6] drm/nouveau: hook up cache sync functions
On Wed, Aug 28, 2013 at 02:00:47AM +0200, Lucas Stach wrote: > Signed-off-by: Lucas Stach > --- > drivers/gpu/drm/nouveau/nouveau_bo.c | 4 > drivers/gpu/drm/nouveau/nouveau_gem.c | 5 + > 2 files changed, 9 insertions(+) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > b/drivers/gpu/drm/nouveau/nouveau_bo.c > index af20fba..f4a2eb9 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -411,6 +411,10 @@ nouveau_bo_validate(struct nouveau_bo *nvbo, bool > interruptible, > { > int ret; > > + if (nvbo->bo.ttm && nvbo->bo.ttm->caching_state == tt_cached) You don't want to do it also for tt_wc ? > + ttm_dma_tt_cache_sync_for_device((struct ttm_dma_tt > *)nvbo->bo.ttm, > + &nouveau_bdev(nvbo->bo.ttm->bdev)->dev->pdev->dev); > + > ret = ttm_bo_validate(&nvbo->bo, &nvbo->placement, > interruptible, no_wait_gpu); > if (ret) > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c > b/drivers/gpu/drm/nouveau/nouveau_gem.c > index 830cb7b..f632b92 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c > @@ -901,6 +901,11 @@ nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void > *data, > ret = ttm_bo_wait(&nvbo->bo, true, true, no_wait); > spin_unlock(&nvbo->bo.bdev->fence_lock); > drm_gem_object_unreference_unlocked(gem); > + > + if (!ret && nvbo->bo.ttm && nvbo->bo.ttm->caching_state == tt_cached) Ditto? > + ttm_dma_tt_cache_sync_for_cpu((struct ttm_dma_tt *)nvbo->bo.ttm, > + &dev->pdev->dev); > + > return ret; > } > > -- > 1.8.3.1 > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 6/6] drm/nouveau: use MSI interrupts
On Wed, Aug 28, 2013 at 09:28:57AM +0200, Lucas Stach wrote: > On Wednesday, 28.08.2013 at 17:09 +1000, Ben Skeggs wrote: > > On Wed, Aug 28, 2013 at 10:00 AM, Lucas Stach wrote: > > > MSIs were only problematic on some old, broken chipsets. But now that we > > > already see systems where PCI legacy interrupts are somewhat flaky, it's > > > really time to move to MSIs. > > > > > > Signed-off-by: Lucas Stach > > > --- > > > drivers/gpu/drm/nouveau/core/include/subdev/mc.h | 1 + > > > drivers/gpu/drm/nouveau/core/subdev/mc/base.c| 17 + > > > 2 files changed, 18 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/mc.h > > > b/drivers/gpu/drm/nouveau/core/include/subdev/mc.h > > > index 9d2cd20..ce6569f 100644 > > > --- a/drivers/gpu/drm/nouveau/core/include/subdev/mc.h > > > +++ b/drivers/gpu/drm/nouveau/core/include/subdev/mc.h > > > @@ -12,6 +12,7 @@ struct nouveau_mc_intr { > > > struct nouveau_mc { > > > struct nouveau_subdev base; > > > const struct nouveau_mc_intr *intr_map; > > > + bool use_msi; > > > }; > > > > > > static inline struct nouveau_mc * > > > diff --git a/drivers/gpu/drm/nouveau/core/subdev/mc/base.c > > > b/drivers/gpu/drm/nouveau/core/subdev/mc/base.c > > > index ec9cd6f..02b337e 100644 > > > --- a/drivers/gpu/drm/nouveau/core/subdev/mc/base.c > > > +++ b/drivers/gpu/drm/nouveau/core/subdev/mc/base.c > > > @@ -23,6 +23,7 @@ > > > */ > > > > > > #include <subdev/mc.h> > > > +#include <core/option.h> > > > > > > static irqreturn_t > > > nouveau_mc_intr(int irq, void *arg) > > > @@ -43,6 +44,9 @@ nouveau_mc_intr(int irq, void *arg) > > > map++; > > > } > > > > > > + if (pmc->use_msi) > > > + nv_wr08(pmc->base.base.parent, 0x00088068, 0xff); > > Register not present everywhere. > > > > At the very least, the enabling of MSI should be disallowed on the > > earlier chipsets where it's not supported. Though, it's perhaps > > possible that the pci_enable_msi() call will fail in all of these > > cases anyway.. I'm not certain. 
> > > MSIs are required property for everything doing PCIe. So the only cases > where this should fail is plain PCI/AGP devices. I don't really have a > test system for those old cards set up. That is not true. You can boot a machine with pci=nomsi that has PCIe and it will work. Legacy interrupts still work on PCIe > > But I remember Ilia having some legacy things plugged in, so maybe he > could test this patch and see how it goes? > > > > + > > > if (intr) { > > > nv_error(pmc, "unknown intr 0x%08x\n", stat); > > > } > > > @@ -75,6 +79,8 @@ _nouveau_mc_dtor(struct nouveau_object *object) > > > struct nouveau_device *device = nv_device(object); > > > struct nouveau_mc *pmc = (void *)object; > > > free_irq(device->pdev->irq, pmc); > > > + if (pmc->use_msi) > > > + pci_disable_msi(device->pdev); > > > nouveau_subdev_destroy(&pmc->base); > > > } > > > > > > @@ -96,6 +102,17 @@ nouveau_mc_create_(struct nouveau_object *parent, > > > struct nouveau_object *engine, > > > > > > pmc->intr_map = intr_map; > > > > > > + pmc->use_msi = nouveau_boolopt(device->cfgopt, "NvMSI", true); > > > + if (pmc->use_msi) { > > > + ret = pci_enable_msi(device->pdev); > > > + if (ret) { > > > + pmc->use_msi = false; > > > + } else { > > > + nv_wr08(device, 0x00088068, 0xff); > > > + nv_info(pmc, "MSI interrupts enabled\n"); > > > + } > > > + } > > > + > > > ret = request_irq(device->pdev->irq, nouveau_mc_intr, > > > IRQF_SHARED, "nouveau", pmc); > > > if (ret < 0) > > > -- > > > 1.8.3.1 > > > > > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
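The enable-with-fallback flow the patch implements (and the pci=nomsi point above) boils down to: try MSI, and keep legacy INTx when enabling fails. A standalone sketch of just that control flow, with a function pointer standing in for pci_enable_msi() (the real call takes a struct pci_dev and is kernel-only):

```c
#include <assert.h>

/* Standalone sketch of the fallback pattern in the patch: attempt MSI,
 * and drop back to legacy line interrupts when enabling fails - as it
 * will with pci=nomsi, or on plain PCI/AGP parts. */
enum irq_mode { IRQ_LEGACY, IRQ_MSI };

static enum irq_mode pick_irq_mode(int want_msi, int (*enable_msi)(void))
{
    if (want_msi && enable_msi() == 0)
        return IRQ_MSI;          /* also arm the MSI rearm register */
    return IRQ_LEGACY;           /* request_irq() works either way */
}

/* Stubs modelling the two possible outcomes of pci_enable_msi(). */
static int msi_ok(void)   { return 0; }
static int msi_fail(void) { return -1; }
```

The key property, mirrored from the patch, is that failure is silent and non-fatal: use_msi is simply cleared and request_irq() proceeds on the legacy line.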
[PATCH 1/2] [RFC PATCH v6] dmabuf-sync: Add a buffer synchronization framework
> > > +EXPORT_SYMBOL(is_dmabuf_sync_supported); > > > > _GPL ? > > > > I would also prefix it with 'dmabuf_is_sync_supported' just to make > > all of the library calls start with 'dmabuf' > > > > Seems better. Will change it to dmabuf_is_sync_supported, and use > EXPORT_SYMBOL_GPL. One thing though - while I suggest that you use the GPL variant, I think you should check who the consumers are. As in, if nvidia wants to use it, it might make their lawyers unhappy - and in turn means that their engineers won't be able to use these symbols. So - if there is a strong argument to not have it GPL - then please say so.
[PATCH 1/2] [RFC PATCH v6] dmabuf-sync: Add a buffer synchronization framework
On Tue, Aug 13, 2013 at 06:19:35PM +0900, Inki Dae wrote: > This patch adds a buffer synchronization framework based on DMA BUF[1] > and based on ww-mutexes[2] for the lock mechanism. > > The purpose of this framework is to provide not only buffer access control > to CPU and DMA but also easy-to-use interfaces for device drivers and > user application. This framework can be used for all dma devices using > system memory as dma buffer, especially for most ARM based SoCs. > > Changelog v6: > - Fix sync lock to multiple reads. > - Add select system call support. > . Wake up poll_wait when a dmabuf is unlocked. > - Remove the unnecessary use of mutex lock. > - Add private backend ops callbacks. > . This ops has one callback for device drivers to clean up their > sync object resource when the sync object is freed. For this, > device drivers should implement the free callback properly. > - Update document file. > > Changelog v5: > - Remove a dependence on reservation_object: the reservation_object is used > to hook up to ttm and dma-buf for easy sharing of reservations across > devices. However, the dmabuf sync can be used for all dma devices; v4l2 > and drm based drivers, so doesn't need the reservation_object anymore. > With regard to this, it adds 'void *sync' to dma_buf structure. > - All patches are rebased on mainline, Linux v3.10. > > Changelog v4: > - Add user side interface for buffer synchronization mechanism and update > descriptions related to the user side interface. > > Changelog v3: > - remove cache operation relevant codes and update document file. > > Changelog v2: > - use atomic_add_unless to avoid potential bug. > - add a macro for checking valid access type. > - code clean. > > The mechanism of this framework has the following steps, > 1. Register dmabufs to a sync object - A task gets a new sync object and > can add one or more dmabufs that the task wants to access. 
> This registering should be performed when a device context or an event > context such as a page flip event is created or before CPU accesses a > shared > buffer. > > dma_buf_sync_get(a sync object, a dmabuf); > > 2. Lock a sync object - A task tries to lock all dmabufs added in its own > sync object. Basically, the lock mechanism uses ww-mutex[1] to avoid dead > lock issue and for race condition between CPU and CPU, CPU and DMA, and > DMA > and DMA. Taking a lock means that others cannot access all locked dmabufs > until the task that locked the corresponding dmabufs, unlocks all the > locked > dmabufs. > This locking should be performed before DMA or CPU accesses these dmabufs. > > dma_buf_sync_lock(a sync object); > > 3. Unlock a sync object - The task unlocks all dmabufs added in its own > sync > object. The unlock means that the DMA or CPU accesses to the dmabufs have > been completed so that others may access them. > This unlocking should be performed after DMA or CPU has completed accesses > to the dmabufs. > > dma_buf_sync_unlock(a sync object); > > 4. Unregister one or all dmabufs from a sync object - A task unregisters > the given dmabufs from the sync object. This means that the task doesn't > want to lock the dmabufs. > The unregistering should be performed after DMA or CPU has completed > accesses to the dmabufs or when dma_buf_sync_lock() is failed. > > dma_buf_sync_put(a sync object, a dmabuf); > dma_buf_sync_put_all(a sync object); > > The described steps may be summarized as: > get -> lock -> CPU or DMA access to a buffer/s -> unlock -> put > > This framework includes the following two features. > 1. read (shared) and write (exclusive) locks - A task is required to > declare > the access type when the task tries to register a dmabuf; > READ, WRITE, READ DMA, or WRITE DMA. > > The below is example codes, > struct dmabuf_sync *sync; > > sync = dmabuf_sync_init(...); > ... > > dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_R); > ... 
> > And the below can be used as access types: > DMA_BUF_ACCESS_R - CPU will access a buffer for read. > DMA_BUF_ACCESS_W - CPU will access a buffer for read or write. > DMA_BUF_ACCESS_DMA_R - DMA will access a buffer for read > DMA_BUF_ACCESS_DMA_W - DMA will access a buffer for read or > write. > > 2. Mandatory resource releasing - a task cannot hold a lock indefinitely. > A task may never try to unlock a buffer after taking a lock to the buffer. > In this case, a timer handler to the corresponding sync object is called > in five (default) seconds and then the timed-out buffer is unlocked by > work > queue handler to avoid lockups and to enforce resources of the buffer. > > The below is how to use interfaces for device driver: > 1. Allocate
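The get -> lock -> access -> unlock -> put sequence described in the changelog can be seen end-to-end in a tiny mock. The dmabuf_sync_* functions below are stand-in stubs that only track state so the required ordering is visible; the real proposal registers actual dmabufs, locks ww-mutexes, and enforces a timeout, none of which is modelled here.

```c
#include <assert.h>

/* Mock walk-through of the get -> lock -> access -> unlock -> put flow
 * from the patch description. These stubs only track a state machine;
 * they are not the proposed implementation. */
enum { IDLE, GOT, LOCKED };

struct dmabuf_sync { int state; };

static void dmabuf_sync_get(struct dmabuf_sync *s)     { s->state = GOT; }

static int dmabuf_sync_lock(struct dmabuf_sync *s)
{
    if (s->state != GOT)
        return -1;                 /* must register buffers first */
    s->state = LOCKED;             /* ww-mutex lock in the real code */
    return 0;
}

static void dmabuf_sync_unlock(struct dmabuf_sync *s)  { s->state = GOT; }
static void dmabuf_sync_put_all(struct dmabuf_sync *s) { s->state = IDLE; }

static int use_buffer(struct dmabuf_sync *s)
{
    dmabuf_sync_get(s);            /* register dmabufs, declare access */
    if (dmabuf_sync_lock(s))
        return -1;
    /* ... CPU or DMA touches the buffer here ... */
    dmabuf_sync_unlock(s);
    dmabuf_sync_put_all(s);        /* or put() on lock failure */
    return 0;
}
```

The mock also makes the failure rule concrete: lock() without a prior get() is an error, matching the description that dma_buf_sync_put() is the correct response when dma_buf_sync_lock() fails.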
[PATCH 1/2] drm/i915: unpin backing storage in dmabuf_unmap
On Wed, Aug 07, 2013 at 12:09:32PM +0200, Daniel Vetter wrote:
> This fixes a WARN in i915_gem_free_object when the
> obj->pages_pin_count isn't 0.
>
> v2: Add locking to unmap, noticed by Chris Wilson. Note that even
> though we call unmap with our own dev->struct_mutex held that won't
> result in an immediate deadlock since we never go through the dma_buf
> interfaces for our own, reimported buffers. But it's still easy to
> blow up and anger lockdep, but that's already the case with our ->map
> implementation. Fixing this for real will involve per dma-buf ww mutex
> locking by the callers. And lots of fun. So go with the duct-tape
> approach for now.
>
> Cc: Chris Wilson
> Reported-by: Maarten Lankhorst
> Cc: Maarten Lankhorst
> Tested-by: Armin K. (v1)
> Signed-off-by: Daniel Vetter
> ---
> drivers/gpu/drm/i915/i915_gem_dmabuf.c | 8
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index 63ee1a9..f7e1682 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -85,9 +85,17 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
> 				   struct sg_table *sg,
> 				   enum dma_data_direction dir)
> {
> +	struct drm_i915_gem_object *obj = attachment->dmabuf->priv;
> +
> +	mutex_lock(&obj->base.dev->struct_mutex);
> +
> 	dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir);
> 	sg_free_table(sg);
> 	kfree(sg);
> +
> +	i915_gem_object_unpin_pages(obj);

I am curious - would the logic of first unpinning, and then doing the
dma_unmap_sg, make more sense?
As in, in the map path we do:

	dma_map
	pin

And in here you do the same:

	dma_unmap
	unpin

But I would have thought that on an unroll you would do it in reverse
order, so:

	unpin
	dma_unmap

> +
> +	mutex_unlock(&obj->base.dev->struct_mutex);
> }
>
> static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
> --
> 1.8.3.2
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Regression introduced by 31f731a drm/radeon/dpm: fix calculations in si_calculate_leakage_for_v_and_t_formula
Hey Alex,

Your recent patch "31f731a drm/radeon/dpm: fix calculations in
si_calculate_leakage_for_v_and_t_formula" causes a build regression:

drivers/built-in.o: In function `si_calculate_leakage_for_v_and_t_formula':
/home/build/linux-linus/drivers/gpu/drm/radeon/si_dpm.c:1770: undefined reference to `__divdi3'

Which I think is due to the misplaced parentheses.
Is: Regression introduced by 0108bc808107b97e101b15af9705729626be6447 - drm/nouveau: do not allow negative sizes for now (Was:Re: nouveau crash with 3.11-rc2)
On Fri, Jul 26, 2013 at 04:37:32PM -0400, Ilia Mirkin wrote: > On Fri, Jul 26, 2013 at 2:28 PM, konrad wilk > wrote: > > I just saw this on a box of mine (rc1 worked) I hadn't done yet a bisection. > > Any suggestions? > > > > ring 0 polarity 1 > > [6.023776] Already setup the GSI :22 > > ^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G^G[6.036680] nouveau [ > > DEVICE][:00:0d.0] BOOT0 : 0x04c000a2 > > [6.036740] nouveau [ DEVICE][:00:0d.0] Chipset: C61 (NV4C) > > [6.036792] nouveau [ DEVICE][:00:0d.0] Family : NV40 > > [6.038554] nouveau [ VBIOS][:00:0d.0] checking PRAMIN for > > image... > > [6.062295] ata1: SATA link down (SStatus 0 SControl 300) > > [6.07] nouveau [ VBIOS][:00:0d.0] ... appears to be valid > > [6.077829] nouveau [ VBIOS][:00:0d.0] using image from PRAMIN > > [6.078024] nouveau [ VBIOS][:00:0d.0] BIT signature found > > [6.078076] nouveau [ VBIOS][:00:0d.0] version 05.61.32.22.01 > > [6.078666] skge :01:08.0 eth2: addr 00:0a:5e:65:74:93 > > [6.086529] nouveau [ PFB][:00:0d.0] RAM type: unknown > > [6.086594] nouveau [ PFB][:00:0d.0] RAM size: 128 MiB > > [6.086648] nouveau [ PFB][:00:0d.0]ZCOMP: 0 tags > > [6.115583] nouveau [ PTHERM][:00:0d.0] FAN control: none / > > external > > [6.115645] nouveau [ PTHERM][:00:0d.0] fan management: disabled > > [6.115698] nouveau [ PTHERM][:00:0d.0] internal sensor: no > > [6.140702] [TTM] Zone kernel: Available graphics memory: 461396 kiB > > [6.140763] [TTM] Initializing pool allocator > > [6.140852] [TTM] Initializing DMA pool allocator > > [6.141034] Failed to add WC MTRR for > > [e000-efff]; performance may suffer. 
> > [6.141095] nouveau [ DRM] VRAM: 125 MiB > > [6.141189] nouveau [ DRM] GART: 512 MiB > > [6.141242] nouveau [ DRM] TMDS table version 1.1 > > [6.141293] nouveau [ DRM] DCB version 3.0 > > [6.141342] nouveau [ DRM] DCB outp 00: 01000310 0023 > > [6.141421] nouveau [ DRM] DCB outp 01: 00110204 97e5 > > [6.141471] nouveau [ DRM] DCB conn 00: > > [6.141839] nouveau [ DRM] Saving VGA fonts > > [6.180531] BUG: unable to handle kernel NULL pointer dereference at > > (null) > > [6.180657] IP: [] nouveau_bo_new+0x36/0x330 [nouveau] > > [6.180775] PGD 29449067 PUD 28aad067 PMD 0 > > [6.180907] Oops: [#1] SMP > > [6.181013] Modules linked in: nouveau(+) skge e1000 fbcon tileblit font > > bitblit ttm softcursor ata_generic sata_nv drm_kms_helper mxm_wmi video wmi > > libata scsi_mod mperf xen_blkfront xen_netfront fb_sys_fops sysimgblt > > sysfillrect syscopyarea xenfs xen_privcmd > > [6.181953] CPU: 0 PID: 428 Comm: kworker/0:1 Not tainted > > 3.11.0-rc2upstream-00185-g07bc9dc #1 > > [6.182016] Hardware name: BIOSTAR Group N61PB-M2S/N61PB-M2S, BIOS 6.00 > > PG 09/03/2009 > > [6.182084] Workqueue: events work_for_cpu_fn > > [6.182167] task: 880037e69000 ti: 88003791a000 task.ti: > > 88003791a000 > > [6.182228] RIP: e030:[] [] > > nouveau_bo_new+0x36/0x330 [nouveau] > > [6.182344] RSP: e02b:88003791ba88 EFLAGS: 00010287 > > [6.182396] RAX: RBX: 880028f7f000 RCX: > > 0004 > > [6.182457] RDX: 0100 RSI: 4000 RDI: > > 88002b576800 > > [6.182511] RBP: 88003791bb08 R08: R09: > > > > [6.182565] R10: 0004 R11: 0100 R12: > > 4000 > > [6.182619] R13: 88002b1e3000 R14: a01fe740 R15: > > > > [6.182679] FS: 7f3da25417a0() GS:88003de0() > > knlGS: > > [6.182748] CS: e033 DS: ES: CR0: 8005003b > > [6.182798] CR2: CR3: 27416000 CR4: > > 0660 > > [6.182851] Stack: > > [6.182896] 88003791bae8 811aa1f9 88002b576800 > > 880027aad240 > > [6.183108] 880080d0 8142a566 880028f7f000 > > 880028f7f000 > > [6.183295] 88002b576800 a01fe740 > > 880028f7f000 > > [6.183488] Call Trace: > > [6.183544] [] ? 
__kmalloc+0x259/0x2a0 > > [6.183603] [] ? > > drm_mode_crtc_set_gamma_size+0x26/0x60 > > [6.183680] [] nv04_crtc_create+0xdf/0x160 [nouveau] > > [6.183757] [] nv04_display_create+0x11a/0x400 > > [nouveau] > > [6.183813] [] ? __cancel_work_timer+0x7e/0x110 > > [6.183886] [] nouveau_display_create+0x598/0x5a0 > > [nouveau] > > [6.183981] [] nouveau_drm_load+0x25c/0x670 [nouveau] > > [6.184040] [] ? device_register+0x19/0x20 > > [6.184098] [] ? drm_get_minor+0x1fc/0x280 > > [6.187041] [] drm_get_pci_dev+0x178/0x2a0 > > [6.187096] [] ? pcibios_set_master+0x83/0xb0 > > [
FAILED: patch "[PATCH] drm/i915: make compact dma scatter lists creation work with" failed to apply to 3.10-stable tree
On Tue, Jul 23, 2013 at 09:54:57AM -0700, gregkh at linuxfoundation.org wrote:
>
> The patch below does not apply to the 3.10-stable tree.

You can drop it. The patch is already in 3.10 and it does not need to be
backported to earlier trees.

There was a snafu where I thought David would post this in the v3.11 merge
window - hence the stable at vger.kernel.org tag. But David came in to
rescue me and at the last minute sent a git pull to Linus for v3.10.

So all good. Thanks!

> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to .
>
> thanks,
>
> greg k-h
>
> -- original commit in Linus's tree --
>
> >From 1625e7e549c50fb57a1e1ab1cb0f5735c84c9029 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk
> Date: Mon, 24 Jun 2013 11:47:48 -0400
> Subject: [PATCH] drm/i915: make compact dma scatter lists creation work with
>  SWIOTLB backend.
>
> Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4
> ("drm/i915: create compact dma scatter lists for gem objects") makes
> certain assumptions about the under laying DMA API that are not always
> correct.
>
> On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
> I see:
>
> [drm:intel_pipe_set_base] *ERROR* pin & fence failed
> [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28
>
> Bit of debugging traced it down to dma_map_sg failing (in
> i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).
>
> That unfortunately are sizes that the SWIOTLB is incapable of handling -
> the maximum it can handle is a an entry of 512KB of virtual contiguous
> memory for its bounce buffer. (See IO_TLB_SEGSIZE).
>
> Previous to the above mention git commit the SG entries were of 4KB, and
> the code introduced by above git commit squashed the CPU contiguous PFNs
> in one big virtual address provided to DMA API.
> > This patch is a simple semi-revert - were we emulate the old behavior > if we detect that SWIOTLB is online. If it is not online then we continue > on with the new compact scatter gather mechanism. > > An alternative solution would be for the the '.get_pages' and the > i915_gem_gtt_prepare_object to retry with smaller max gap of the > amount of PFNs that can be combined together - but with this issue > discovered during rc7 that might be too risky. > > Reported-and-Tested-by: Konrad Rzeszutek Wilk > CC: Chris Wilson > CC: Imre Deak > CC: Daniel Vetter > CC: David Airlie > CC: > Signed-off-by: Konrad Rzeszutek Wilk > Cc: stable at vger.kernel.org > Signed-off-by: Daniel Vetter > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index a6178bac..e31eeb1d 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -1802,7 +1802,14 @@ i915_gem_object_get_pages_gtt(struct > drm_i915_gem_object *obj) > gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD; > gfp &= ~(__GFP_IO | __GFP_WAIT); > } > - > +#ifdef CONFIG_SWIOTLB > + if (swiotlb_nr_tbl()) { > + st->nents++; > + sg_set_page(sg, page, PAGE_SIZE, 0); > + sg = sg_next(sg); > + continue; > + } > +#endif > if (!i || page_to_pfn(page) != last_pfn + 1) { > if (i) > sg = sg_next(sg); > @@ -1813,8 +1820,10 @@ i915_gem_object_get_pages_gtt(struct > drm_i915_gem_object *obj) > } > last_pfn = page_to_pfn(page); > } > - > - sg_mark_end(sg); > +#ifdef CONFIG_SWIOTLB > + if (!swiotlb_nr_tbl()) > +#endif > + sg_mark_end(sg); > obj->pages = st; > > if (i915_gem_object_needs_bit17_swizzle(obj)) >
[PATCH] drm/prime: remove cargo-cult locking from map_sg helper
> So after a bit of irc chatting with Maarten this seems to be more
> involved. The above check is to cache the dma mapping, but the
> implementation is bogus in tons of ways:
> - If direction changes we don't bother with unmaping and freeing the
> mapping, but simply leak it.
> - This will break if the dma mapping needs explicit syncing since the
> helpers don't call sync_to_cpu/sync_to_device anywhere.

Right, and I believe I signed up for that.

> So I think I'll decline to poke around more in this hornet nest and
> leave it at the locking removal.

.. and I get the hornet nest :-). Is there an IRC log of what you guys
talked about so I don't omit certain pieces of code.
drm-next status (or: drm-openchrome will not be in 3.11)
James Simmons wrote:
>
>> Am Freitag, 28. Juni 2013, 13:31:50 schrieb Dave Airlie:
>> > Okay drm-next is pretty big, possibly the biggest ever.
>> >
>> > Outstanding things I know about, and will merge, if they arrive soon:
>> > exynos -next
>> > nouveau -next
>> >
>> > Big things I've merged:
>> > new rcar driver
>> > intel next
>> > radeon next
>> > tegra next
>> > shmob next
>> > core/mutexes
>> > ttm -> reservation conversion
>> > tilcdc patches acked by Rob
>> > mtrr reworking
>> > prime + gem patches from samsung
>> > Laurent's documentation updates
>> > various mgag200 patches
>> >
>> > Otherwise I'm sure I've missed some changes, please let me know of
>> > anything you think has fallen down the cracks asap.
>> >
>> > Slow down people :-P
>> >
>> > Dave.
>>
>> IRC #dri-devel:
>>
>> <jobermayr_> airlied: drm-openchrome will not be part of Kernel 3.11
>> because jsimmons has not responded?
>>
>> <airlied> jobermayr: seems likely, I don't merge just because someone
>> posts patches
>>
>> Tasks to do:
>> http://lists.freedesktop.org/archives/dri-devel/2013-June/039695.html
>> http://lists.freedesktop.org/archives/dri-devel/2013-June/039796.html
>
> The VIA driver is pretty huge. It's going to take time to merge it. Plus
> I'm the new guy so I don't have the level of trust the other maintainers
> have.

Could you repost it with some of the reviewers' comments addressed, please?

--
Sent from my Android phone. Please excuse my brevity.
[PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
> >> Dave.
> >
> > Hey Dave
> > Of course I will investigate.
> >
> > The SWIOTLB is unfortunately used because it is a fallback (and I am the
> > maintainer of it) and if a real IOMMU is activated it can be mitigated
> > differently. When you say 'passed through' you mean in terms of an IOMMU
> > in a guest? There are no IOMMU inside a guest when passing in a PCI device.
>
> I just don't understand why anyone would see swiotlb in this case, the
> host would be using hw iommu, why would the guest then need to use a
> bounce buffer?

Hey Dave,

Sorry for the late response.

The guest has no concept of the HW IOMMU, as it is not 'emulated' and there
is no plumbing for it to interface with the host's IOMMU. It means that if
it has more than 4GB it will automatically turn on SWIOTLB (b/c hey, it
might have 32-bit capable devices and it needs to bounce-buffer the data to
an area above 4GB).

Normally the SWIOTLB bounce buffers won't be used unless:

a) the pages are not contiguous. This is not the case for HVM guests (as an
HVM guest _thinks_ its PFNs are always contiguous - albeit in reality they
might not be, but it is the job of the host EPT/IOMMU to construct this
fake view), but for Xen PV - which has a mapping of PFN -> machine
addresses - it _knows_ the real machine address of a PFN. And as guests are
created from random swaths of memory, some of the PFNs might be contiguous
and some might not. In other words, for RAM regions:

	pfn_to_mfn(pfn + 1) != (pfn_to_mfn(pfn) + 1)

mfn is the real physical address bitshifted (PAGE_SHIFT). For an HVM guest
(pfn_to_mfn returns the pfn value, so the above formula becomes):

	pfn + 1 == pfn + 1

If this does not make any sense to you - that is OK :-) I can try to
explain more but it might just put you to sleep - in which case just think:
"Xen PV CPU physical addresses are not the same as the bus (DMA)
addresses" - which means it is similar to Sparc platforms or other
platforms where CPU physical addresses and PCI bus addresses are not the
same.
b) the pages are not page aligned. Less of an issue, but it can still come
up.

c) the DMA mask of the PCI device is 32-bit (common with USB devices, not
so often with graphics cards). But hey - there are quirks that sometimes
make a graphics card DMA up only to a certain bitness.

d) the user provided 'swiotlb=force' and now everything is going through
the bounce buffer.

The nice solution is to have a virtualization-aware version of the IOMMU in
the guest that will disable SWIOTLB (or use it only as a fallback). The AMD
folks were thinking about that for KVM, but nothing came out of it. The
Citrix folks are looking at that for Xen, but nothing yet (though I did see
some RFC patches).

> >
> > Let me start on a new thread on this when I have gotten my head wrapped
> > around dma buf. Hadn't gotten to that yet.
> >
> > Thanks and sorry for getting to this so late in the cycle. New laptop and
> > playing with it and that triggered me finding this.
>
> My main worry is this will regress things for people with swiotlb
> enabled even if the gpu isn't using it, granted it won't be any slower
> than before so probably not something I care about now if I know
> you'll narrow down why all this is necessary later.

I am not sure how it would? The patch makes the i915 construct the scatter
gather list as it was in v3.9, so it _should_ not impact it negatively. I
was trying to follow the spirit of doing a partial revert as closely as
possible so that the risk of regression would be nil.

To summarize, I think (and please correct me if I am mistaken):
 - You or Daniel are thinking to take this patch for v3.10 or v3.11 (and if
   in v3.11 then tack on stable at vger.kernel.org).
 - You will defer all SWIOTLB-related issues to me. In other words, if you
   see something that is i915 and swiotlb, you will happily shout "Konrad!
   Tag!" and wash your hands. Hopefully you can also send me some of the
   past bugs that you suspect are SWIOTLB related.
- You expect me to look at dma-buf and figure out how it can coexist with SWIOTLB. Sounds about right? > > Dave.
Re: [PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Dave. Hey Dave Of course I will investigate. The SWIOTLB is unfortunately used because it is a fallback (and I am the maintainer of it) and if a real IOMMU is activated it can be mitigated differently. When you say 'passed through' you mean in terms of an IOMMU in a guest? There are no IOMMU inside a guest when passing in a PCI device. I just don't understand why anyone would see swiotlb in this case, the host would be using hw iommu, why would the guest then need to use a bounce buffer? Hey Dave, Sorry for the late response. The guest has no concept of the HW IOMMU as it is not 'emulated' or there are no plumbing for it to interface with the host's IOMMU. It means that if it has more than 4GB it will automatically turn on SWIOTLB (b/c hey it might have 32-bit capable devices and it needs to bounce buffer the data to an area above 4GB). Normally the SWIOTLB bounce buffers won't be used unless: a) the pages are not contingous. This is not a case for HVM guests (as it _thinks_ its PFN are always contingous - albeit in reality in might not be, but that is the job of the host EPT/IOMMU to construct this fake view), but for Xen PV - which has a mapping of the PFN - machine addresses - it _knows_ that the real machine address of a PFN. And as guests are created from random swaths of memory - some of the PFNs might be contingous but some might not. In other words for RAM regions: pfn_to_mfn(pfn + 1) != (pfn_to_mfn(pfn) + 1) mfn is the real physical address bitshifted (PAGE_SHIFT). For HVM guest: (pfn_to_mfn returns the pfn value, so the above formula is): pfn+1 == pfn+1 If this does not make any sense to you - that is OK :-) I can try to explain more but it might just put you to sleep - in which case just think: Xen PV CPU physical addresses are not the same as the bus(DMA) addresses. - which means it is similar to Sparc platforms or other platforms where the IOMMU has no address CPU-PCI machinery. b) the pages are not page aligned. 
Less of an issue, but it can still come up. c) the DMA mask of the PCI device is 32-bit (common with USB devices, not so often with graphics cards). But hey - there are quirks that sometimes make a graphics card DMA up to only a certain bitness. d) the user provided 'swiotlb=force' and now everything is going through the bounce buffer. The nice solution is to have a virtualization-aware version of an IOMMU in the guest that will disable SWIOTLB (or use it only as a fallback). The AMD folks were thinking about that for KVM, but nothing came out of that. The Citrix folks are looking at that for Xen, but nothing yet (though I did see some RFC patches). Let me start a new thread on this when I have gotten my head wrapped around dma-buf. Hadn't gotten to that yet. Thanks, and sorry for getting to this so late in the cycle. New laptop, and playing with it is what triggered me finding this. My main worry is this will regress things for people with swiotlb enabled even if the gpu isn't using it; granted, it won't be any slower than before, so probably not something I care about now if I know you'll narrow down why all this is necessary later. I am not sure how it would? The patch makes i915 construct the scatter-gather list as it was in v3.9, so it _should_ not impact it negatively. I was trying to follow the spirit of doing a partial revert as closely as possible so that the risk of regression would be nil. To summarize, I think (and please correct me if I am mistaken): - You or Daniel are thinking of taking this patch for v3.10 or v3.11 (and if in v3.11, then tack on sta...@vger.kernel.org). - You will defer all SWIOTLB-related issues to me. In other words, if you see something that is i915 and swiotlb, you will happily shout Konrad! Tag! and wash your hands. Hopefully you can also send me some of the past bugs that you suspect are SWIOTLB-related. - You expect me to look at dma-buf and figure out how it can coexist with SWIOTLB. Sounds about right? Dave. 
i915 mapping large (3MB) scatter list, hitting limits on certain IOMMUs that can only map contiguous regions up to 2MB.
On Tue, Jun 25, 2013 at 11:03:01AM -0400, Jerome Glisse wrote: > On Fri, Jun 21, 2013 at 3:28 PM, Konrad Rzeszutek Wilk > wrote: > > Hey, > > > > I am using a ThinkPad X230 with an Intel HD 4000. With a stock Fedora 18 > > (3.9.6) I can get it to boot and work just fine with Xen. If I use v3.10-rc6 > > I found that i915 would halt with a > > > > [drm:intel_pipe_set_base] *ERROR* pin & fence failed > > [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = > > -28 > > > > after a bit of debugging (see patch below) I traced it down > > to the fact that the scatter list that is provided at the end has > > a huge (3MB) page. I am wondering if anybody knows what patch might > > have introduced it to grab such a large memory segment? > > > > The other thing I am wondering is if there are some fallbacks when the > > underlying IOMMU can't deal with a request for contiguous regions > > that are more than 2MB? > > There is no fallback afaik, but most gpus have their own mmu so they > don't really need contiguous iommu mapping; all they need is at the > very least being able to access all pages of a large object. Probably > something we should take a look at. They can see. It is just that the underlying DMA was not capable of dealing with a large chunk of the scatter gather. > > Cheers, > Jerome > > > Thanks. > > > > From a681a4adb4738c32cb1acdf6f5161bf877816b01 Mon Sep 17 00:00:00 2001 > > From: Konrad Rzeszutek Wilk > > Date: Fri, 21 Jun 2013 11:17:55 -0400 > > Subject: [PATCH] dbug: print scatterlist. 
> > > > [ cut here ] > > WARNING: at drivers/gpu/drm/i915/i915_gem_gtt.c:418 > > i915_gem_gtt_prepare_object+0x180/0x200() > > 10 but got 0 > > Modules linked in: > > crc32_pclmul sdhci_pci crc32c_intel sdhci mmc_core ghash_clmulni_intel > > CPU: 0 PID: 216 Comm: plymouthd Not tainted 3.10.0-rc6+ #16 > > Hardware name: LENOVO 2325DV4/2325DV4, BIOS G2ET86WW (2.06 ) 11/13/2012 > > 0009 8801fa42d958 816e6d89 8801fa42d998 > > 8105d2b0 8801fa42d988 8801fb0f4d80 > > 81c172e0 8801fa76f000 000a 8801fa42d9f8 > > Call Trace: > > [] dump_stack+0x19/0x1b > > [] warn_slowpath_common+0x70/0xa0 > > [] warn_slowpath_fmt+0x46/0x50 > > [] i915_gem_gtt_prepare_object+0x180/0x200 > > [] i915_gem_object_pin+0x321/0x670 > > [] i915_gem_object_pin_to_display_plane+0x81/0x190 > > [] intel_pin_and_fence_fb_obj+0x85/0x1a0 > > [] intel_pipe_set_base+0x7c/0x220 > > [] intel_crtc_set_config+0x89e/0x990 > > [] drm_mode_set_config_internal+0x2e/0x60 > > [] drm_mode_setcrtc+0xfb/0x620 > > [] ? kmem_cache_alloc_trace+0x39/0x1f0 > > [] ? drm_vm_open_locked+0x57/0x90 > > [] drm_ioctl+0x549/0x680 > > [] ? drm_mode_setplane+0x3b0/0x3b0 > > [] do_vfs_ioctl+0x97/0x580 > > [] ? inode_has_perm.isra.32.constprop.62+0x2a/0x30 > > [] ? 
file_has_perm+0x97/0xb0 > > [] SyS_ioctl+0x91/0xb0 > > [] tracesys+0xdd/0xe2 > > ---[ end trace 7b6adc5450d9a9e1 ]--- > > i915 :00:02.0: i915_gem_gtt_prepare_object: Mapping 10 pages, mapped: 0 > > [0] virT:8801fd37c000 dma: 1fd37c000, size:4096 > > [1] virT:8801fd37b000 dma: 1fd37b000, size:4096 > > [2] virT:8801fd37a000 dma: 1fd37a000, size:4096 > > [3] virT:8801fd378000 dma: 1fd378000, size:4096 > > [4] virT:8801fd131000 dma: 1fd131000, size:4096 > > [5] virT:880200c36000 dma: 200c36000, size:4096 > > [6] virT:8801fd1a4000 dma: 1fd1a4000, size:69632 > > [7] virT:8801fd3bb000 dma: 1fd3bb000, size:4096 > > [8] virT:ffff8801fd3c0000 dma: 1fd3c, size:262144 > > [9] virT:8801f940 dma: 1f940, size:3866624 > > [drm] 3011: ret:-28 > > [drm] 3540: ret:-28 > > [drm] 3364: ret:-28 > > [drm:intel_pin_and_fence_fb_obj] *ERROR* > > i915_gem_object_pin_to_display_plane failed: -28 > > [drm:intel_pipe_set_base] *ERROR* pin & fence failed > > [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = > > -28 > > > > Signed-off-by: Konrad Rzeszutek Wilk > > --- > > drivers/gpu/drm/i915/i915_drv.c | 6 ++ > > drivers/gpu/drm/i915/i915_gem_gtt.c | 28 +--- > > 2 files changed, 31 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c > > b/drivers/gpu/drm/i915/i915_drv.c > > index 8411942..141c6fb 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.c > > +++ b/drivers/gpu/drm/i915/i
Re: i915 mapping large (3MB) scatter list, hitting limits on certain IOMMUs that can only map contiguous regions up to 2MB.
On Tue, Jun 25, 2013 at 11:03:01AM -0400, Jerome Glisse wrote: On Fri, Jun 21, 2013 at 3:28 PM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: Hey, I am using a ThinkPad X230 with an Intel HD 4000. With a stock Fedora 18 (3.9.6) I can get it to boot and work just fine with Xen. If I use v3.10-rc6 I found that i915 would halt with a [drm:intel_pipe_set_base] *ERROR* pin & fence failed [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 after a bit of debugging (see patch below) I traced it down to the fact that the scatter list that is provided at the end has a huge (3MB) page. I am wondering if anybody knows what patch might have introduced it to grab such a large memory segment? The other thing I am wondering is if there are some fallbacks when the underlying IOMMU can't deal with a request for contiguous regions that are more than 2MB? There is no fallback afaik, but most gpus have their own mmu so they don't really need contiguous iommu mapping; all they need is at the very least being able to access all pages of a large object. Probably something we should take a look at. They can see. It is just that the underlying DMA was not capable of dealing with a large chunk of the scatter gather. Cheers, Jerome Thanks. From a681a4adb4738c32cb1acdf6f5161bf877816b01 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk konrad.w...@oracle.com Date: Fri, 21 Jun 2013 11:17:55 -0400 Subject: [PATCH] dbug: print scatterlist. 
[ cut here ] WARNING: at drivers/gpu/drm/i915/i915_gem_gtt.c:418 i915_gem_gtt_prepare_object+0x180/0x200() 10 but got 0 Modules linked in: crc32_pclmul sdhci_pci crc32c_intel sdhci mmc_core ghash_clmulni_intel CPU: 0 PID: 216 Comm: plymouthd Not tainted 3.10.0-rc6+ #16 Hardware name: LENOVO 2325DV4/2325DV4, BIOS G2ET86WW (2.06 ) 11/13/2012 0009 8801fa42d958 816e6d89 8801fa42d998 8105d2b0 8801fa42d988 8801fb0f4d80 81c172e0 8801fa76f000 000a 8801fa42d9f8 Call Trace: [816e6d89] dump_stack+0x19/0x1b [8105d2b0] warn_slowpath_common+0x70/0xa0 [8105d396] warn_slowpath_fmt+0x46/0x50 [8142a740] i915_gem_gtt_prepare_object+0x180/0x200 [81423581] i915_gem_object_pin+0x321/0x670 [81423951] i915_gem_object_pin_to_display_plane+0x81/0x190 [814381b5] intel_pin_and_fence_fb_obj+0x85/0x1a0 [8143999c] intel_pipe_set_base+0x7c/0x220 [814409be] intel_crtc_set_config+0x89e/0x990 [813ffdae] drm_mode_set_config_internal+0x2e/0x60 [814023ab] drm_mode_setcrtc+0xfb/0x620 [811850f9] ? kmem_cache_alloc_trace+0x39/0x1f0 [813f9767] ? drm_vm_open_locked+0x57/0x90 [813f2e39] drm_ioctl+0x549/0x680 [814022b0] ? drm_mode_setplane+0x3b0/0x3b0 [811aef77] do_vfs_ioctl+0x97/0x580 [81295dca] ? inode_has_perm.isra.32.constprop.62+0x2a/0x30 [81297397] ? 
file_has_perm+0x97/0xb0 [811af4f1] SyS_ioctl+0x91/0xb0 [816f63e7] tracesys+0xdd/0xe2 ---[ end trace 7b6adc5450d9a9e1 ]--- i915 :00:02.0: i915_gem_gtt_prepare_object: Mapping 10 pages, mapped: 0 [0] virT:8801fd37c000 dma: 1fd37c000, size:4096 [1] virT:8801fd37b000 dma: 1fd37b000, size:4096 [2] virT:8801fd37a000 dma: 1fd37a000, size:4096 [3] virT:8801fd378000 dma: 1fd378000, size:4096 [4] virT:8801fd131000 dma: 1fd131000, size:4096 [5] virT:880200c36000 dma: 200c36000, size:4096 [6] virT:8801fd1a4000 dma: 1fd1a4000, size:69632 [7] virT:8801fd3bb000 dma: 1fd3bb000, size:4096 [8] virT:8801fd3c dma: 1fd3c, size:262144 [9] virT:8801f940 dma: 1f940, size:3866624 [drm] 3011: ret:-28 [drm] 3540: ret:-28 [drm] 3364: ret:-28 [drm:intel_pin_and_fence_fb_obj] *ERROR* i915_gem_object_pin_to_display_plane failed: -28 [drm:intel_pipe_set_base] *ERROR* pin fence failed [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- drivers/gpu/drm/i915/i915_drv.c | 6 ++ drivers/gpu/drm/i915/i915_gem_gtt.c | 28 +--- 2 files changed, 31 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 8411942..141c6fb 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -133,6 +133,12 @@ module_param_named(coherent, use_coherent, int, 0600); MODULE_PARM_DESC(use_coherent, Use coherent DMA API calls (default: false)); +int i915_sgl __read_mostly = 0; +module_param_named(sgl, i915_sgl, int, 0600); +MODULE_PARM_DESC(sgl, +Print scatterlist SG's when DMA mapping them (default: false
[PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Dave Airlie wrote: >On Tue, Jun 25, 2013 at 4:34 AM, Konrad Rzeszutek Wilk > wrote: >> On Mon, Jun 24, 2013 at 08:26:18PM +0200, Daniel Vetter wrote: >>> On Mon, Jun 24, 2013 at 7:32 PM, Konrad Rzeszutek Wilk >>> wrote: >>> > On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote: >>> >> On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk >wrote: >>> >> > Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 >>> >> > ("drm/i915: create compact dma scatter lists for gem objects") >makes >>> >> > certain assumptions about the underlying DMA API that are not >always >>> >> > correct. >>> >> > >>> >> > On a ThinkPad X230 with an Intel HD 4000 with Xen during the >bootup >>> >> > I see: >>> >> > >>> >> > [drm:intel_pipe_set_base] *ERROR* pin & fence failed >>> >> > [drm:intel_crtc_set_config] *ERROR* failed to set mode on >[CRTC:3], err = -28 >>> >> > >>> >> > Bit of debugging traced it down to dma_map_sg failing (in >>> >> > i915_gem_gtt_prepare_object) as some of the SG entries were >huge (3MB). >>> >> > >>> >> > Those unfortunately are sizes that the SWIOTLB is incapable of >handling - >>> >> > the maximum it can handle is an entry of 512KB of virtual >contiguous >>> >> > memory for its bounce buffer. (See IO_TLB_SEGSIZE). >>> >> > >>> >> > Previous to the above mentioned git commit the SG entries were of >4KB, and >>> >> > the code introduced by above git commit squashed the CPU >contiguous PFNs >>> >> > in one big virtual address provided to DMA API. >>> >> > >>> >> > This patch is a simple semi-revert - where we emulate the old >behavior >>> >> > if we detect that SWIOTLB is online. If it is not online then >we continue >>> >> > on with the new compact scatter gather mechanism. 
>>> >> > >>> >> > An alternative solution would be for the '.get_pages' and >the >>> >> > i915_gem_gtt_prepare_object to retry with a smaller max gap of >the >>> >> > amount of PFNs that can be combined together - but with this >issue >>> >> > discovered during rc7 that might be too risky. >>> >> > >>> >> > Reported-and-Tested-by: Konrad Rzeszutek Wilk > >>> >> > CC: Chris Wilson >>> >> > CC: Imre Deak >>> >> > CC: Daniel Vetter >>> >> > CC: David Airlie >>> >> > CC: >>> >> > Signed-off-by: Konrad Rzeszutek Wilk >>> >> >>> >> Two things: >>> > >>> > Hey Daniel, >>> > >>> >> >>> >> - SWIOTLB usage should seriously blow up all over the place in >drm/i915. >>> >> We really rely on the everywhere else true fact that the pages >and their >>> >> dma mapping point at the same backing storage. >>> > >>> > It works. As in, it seems to work for just a normal desktop user. >I don't >>> > see much of dma_sync_* sprinkled around the drm/i915 so I would >think that >>> > there are some issues that would be hit as well - but at the first >glance >>> > when using it on a laptop it looks OK. >>> >>> Yeah, we have a pretty serious case of "roll our own coherency >stuff". >>> The biggest reason is that for a long time i915.ko didn't care one >bit >>> about iommus, and the thing we care about (flushing cpu caches for >>> dma) isn't supported on x86 since on x86 every dma is coherent (well, >not >>> quite, but we don't have support for it). I think longer-term it >would >>> make sense to move the clflushing we're doing into the dma layer. >>> >>> >> - How is this solved elsewhere when constructing sg tables? Or >are we >>> >> really the only guys who try to construct such big sg entries? >I >>> >> expected somewhat that the dma mapping backend would fill in the >segment >>> >> limits accordingly, but I haven't found anything really on a >quick >>> >> search. >>> > >>> > The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which >will >>> > construct the dma mapped pages. 
That allows it to construct >"SWI
[PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
On Mon, Jun 24, 2013 at 08:26:18PM +0200, Daniel Vetter wrote: > On Mon, Jun 24, 2013 at 7:32 PM, Konrad Rzeszutek Wilk > wrote: > > On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote: > >> On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk wrote: > >> > Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 > >> > ("drm/i915: create compact dma scatter lists for gem objects") makes > >> > certain assumptions about the underlying DMA API that are not always > >> > correct. > >> > > >> > On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup > >> > I see: > >> > > >> > [drm:intel_pipe_set_base] *ERROR* pin & fence failed > >> > [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err > >> > = -28 > >> > > >> > Bit of debugging traced it down to dma_map_sg failing (in > >> > i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). > >> > > >> > Those unfortunately are sizes that the SWIOTLB is incapable of handling - > >> > the maximum it can handle is an entry of 512KB of virtual contiguous > >> > memory for its bounce buffer. (See IO_TLB_SEGSIZE). > >> > > >> > Previous to the above mentioned git commit the SG entries were of 4KB, and > >> > the code introduced by above git commit squashed the CPU contiguous PFNs > >> > in one big virtual address provided to DMA API. > >> > > >> > This patch is a simple semi-revert - where we emulate the old behavior > >> > if we detect that SWIOTLB is online. If it is not online then we continue > >> > on with the new compact scatter gather mechanism. > >> > > >> > An alternative solution would be for the '.get_pages' and the > >> > i915_gem_gtt_prepare_object to retry with a smaller max gap of the > >> > amount of PFNs that can be combined together - but with this issue > >> > discovered during rc7 that might be too risky. 
> >> > > >> > Reported-and-Tested-by: Konrad Rzeszutek Wilk > >> > CC: Chris Wilson > >> > CC: Imre Deak > >> > CC: Daniel Vetter > >> > CC: David Airlie > >> > CC: > >> > Signed-off-by: Konrad Rzeszutek Wilk > >> > >> Two things: > > > > Hey Daniel, > > > >> > >> - SWIOTLB usage should seriously blow up all over the place in drm/i915. > >> We really rely on the everywhere else true fact that the pages and their > >> dma mapping point at the same backing storage. > > > > It works. As in, it seems to work for just a normal desktop user. I don't > > see much of dma_sync_* sprinkled around the drm/i915 so I would think that > > there are some issues that would be hit as well - but at the first glance > > when using it on a laptop it looks OK. > > Yeah, we have a pretty serious case of "roll our own coherency stuff". > The biggest reason is that for a long time i915.ko didn't care one bit > about iommus, and the thing we care about (flushing cpu caches for > dma) isn't supported on x86 since on x86 every dma is coherent (well, not > quite, but we don't have support for it). I think longer-term it would > make sense to move the clflushing we're doing into the dma layer. > > >> - How is this solved elsewhere when constructing sg tables? Or are we > >> really the only guys who try to construct such big sg entries? I > >> expected somewhat that the dma mapping backend would fill in the segment > >> limits accordingly, but I haven't found anything really on a quick > >> search. > > > > The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which will > > construct the dma mapped pages. That allows it to construct > > "SWIOTLB-approved" > > pages that won't need to go through dma_map/dma_unmap as they are > > already mapped and ready to go. > > > > Coming back to your question - I think that i915 is the one that I've > > encountered. > > That's a bit surprising. 
With dma_buf graphics people will use sg > tables much more (there's even a nice sg_alloc_table_from_pages helper > to construct them), and those sg tables tend to have large segments. I > guess we need some more generic solution here ... Yes. I don't grok the full picture yet so I am not sure how to help with this right now. Is there a roadmap or Wiki on how this was envisioned? > > For now I guess we can live with your CONFIG_SWIOTLB hack. > -Daniel OK,
[PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote: > On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk wrote: > > Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 > > ("drm/i915: create compact dma scatter lists for gem objects") makes > > certain assumptions about the underlying DMA API that are not always > > correct. > > > > On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup > > I see: > > > > [drm:intel_pipe_set_base] *ERROR* pin & fence failed > > [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = > > -28 > > > > Bit of debugging traced it down to dma_map_sg failing (in > > i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). > > > > Those unfortunately are sizes that the SWIOTLB is incapable of handling - > > the maximum it can handle is an entry of 512KB of virtual contiguous > > memory for its bounce buffer. (See IO_TLB_SEGSIZE). > > > > Previous to the above mentioned git commit the SG entries were of 4KB, and > > the code introduced by above git commit squashed the CPU contiguous PFNs > > in one big virtual address provided to DMA API. > > > > This patch is a simple semi-revert - where we emulate the old behavior > > if we detect that SWIOTLB is online. If it is not online then we continue > > on with the new compact scatter gather mechanism. > > > > An alternative solution would be for the '.get_pages' and the > > i915_gem_gtt_prepare_object to retry with a smaller max gap of the > > amount of PFNs that can be combined together - but with this issue > > discovered during rc7 that might be too risky. > > > > Reported-and-Tested-by: Konrad Rzeszutek Wilk > > CC: Chris Wilson > > CC: Imre Deak > > CC: Daniel Vetter > > CC: David Airlie > > CC: > > Signed-off-by: Konrad Rzeszutek Wilk > > Two things: Hey Daniel, > > - SWIOTLB usage should seriously blow up all over the place in drm/i915. 
> We really rely on the everywhere else true fact that the pages and their > dma mapping point at the same backing storage. It works. As in, it seems to work for just a normal desktop user. I don't see much of dma_sync_* sprinkled around the drm/i915 so I would think that there are some issues that would be hit as well - but at the first glance when using it on a laptop it looks OK. > - How is this solved elsewhere when constructing sg tables? Or are we > really the only guys who try to construct such big sg entries? I > expected somewhat that the dma mapping backend would fill in the segment > limits accordingly, but I haven't found anything really on a quick > search. The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which will construct the dma mapped pages. That allows it to construct "SWIOTLB-approved" pages that won't need to go through dma_map/dma_unmap as they are already mapped and ready to go. Coming back to your question - I think that i915 is the one that I've encountered. 
> > > Cheers, Daniel > > > --- > > drivers/gpu/drm/i915/i915_gem.c | 15 --- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c > > b/drivers/gpu/drm/i915/i915_gem.c > > index 970ad17..7045f45 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -1801,7 +1801,14 @@ i915_gem_object_get_pages_gtt(struct > > drm_i915_gem_object *obj) > > gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD; > > gfp &= ~(__GFP_IO | __GFP_WAIT); > > } > > - > > +#ifdef CONFIG_SWIOTLB > > + if (swiotlb_nr_tbl()) { > > + st->nents++; > > + sg_set_page(sg, page, PAGE_SIZE, 0); > > + sg = sg_next(sg); > > + continue; > > + } > > +#endif > > if (!i || page_to_pfn(page) != last_pfn + 1) { > > if (i) > > sg = sg_next(sg); > > @@ -1812,8 +1819,10 @@ i915_gem_object_get_pages_gtt(struct > > drm_i915_gem_object *obj) > > } > > last_pfn = page_to_pfn(page); > > } > > - > > - sg_mark_end(sg); > > +#ifdef CONFIG_SWIOTLB > > + if (!swiotlb_nr_tbl()) > > +#endif > > + sg_mark_end(sg); > > obj->pages = st; > > > > if (i915_gem_object_needs_bit17_swizzle(obj)) > > -- > > 1.8.1.4 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
[PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") makes certain assumptions about the underlying DMA API that are not always correct. On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see: [drm:intel_pipe_set_base] *ERROR* pin & fence failed [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 Bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). Those unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is an entry of 512KB of virtually contiguous memory for its bounce buffer. (See IO_TLB_SEGSIZE). Previous to the above mentioned git commit the SG entries were of 4KB, and the code introduced by the above git commit squashed the CPU-contiguous PFNs into one big virtual address provided to the DMA API. This patch is a simple semi-revert - where we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism. An alternative solution would be for the '.get_pages' and i915_gem_gtt_prepare_object to retry with a smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky. 
Reported-and-Tested-by: Konrad Rzeszutek Wilk CC: Chris Wilson CC: Imre Deak CC: Daniel Vetter CC: David Airlie CC: Signed-off-by: Konrad Rzeszutek Wilk --- drivers/gpu/drm/i915/i915_gem.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 970ad17..7045f45 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1801,7 +1801,14 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD; gfp &= ~(__GFP_IO | __GFP_WAIT); } - +#ifdef CONFIG_SWIOTLB + if (swiotlb_nr_tbl()) { + st->nents++; + sg_set_page(sg, page, PAGE_SIZE, 0); + sg = sg_next(sg); + continue; + } +#endif if (!i || page_to_pfn(page) != last_pfn + 1) { if (i) sg = sg_next(sg); @@ -1812,8 +1819,10 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) } last_pfn = page_to_pfn(page); } - - sg_mark_end(sg); +#ifdef CONFIG_SWIOTLB + if (!swiotlb_nr_tbl()) +#endif + sg_mark_end(sg); obj->pages = st; if (i915_gem_object_needs_bit17_swizzle(obj)) -- 1.8.1.4
[PATCH] Bootup regression of v3.10-rc6 + SWIOTLB + Intel 4000.
Hey Dave, Chris, Imre, Attached is a fix that makes v3.10-rc6 boot on Intel HD 4000 when the SWIOTLB bounce buffer is in use. The SWIOTLB can only handle up to a 512KB swath of memory to create bounce buffers for, and Imre's patch made it possible to provide more than that to the DMA API, which caused it to fail with dma_map_sg. Since this is rc7 time I did the less risky way of fixing it - by just doing what said code did before 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") was introduced, using a check to see if SWIOTLB is enabled. It is not the best fix but I figured the least risky. drivers/gpu/drm/i915/i915_gem.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) I think that a better approach (in v3.11?) would be to do some form of retry mechanism: (not compile tested, not run at all): diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b9d00dc..0f9079d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1110,8 +1110,12 @@ struct drm_i915_gem_object_ops { * will therefore most likely be called when the object itself is * being released or under memory pressure (where we attempt to * reap pages for the shrinker). +* +* max is the maximum size an sg entry can be. Usually it is +* PAGE_SIZE but if the backend (IOMMU) can deal with larger +* then a larger value might be used as well. 
*/ - int (*get_pages)(struct drm_i915_gem_object *); + int (*get_pages)(struct drm_i915_gem_object *, unsigned long max); void (*put_pages)(struct drm_i915_gem_object *); }; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7045f45..a29e7db 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1738,7 +1738,7 @@ i915_gem_shrink_all(struct drm_i915_private *dev_priv) } static int -i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) +i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj, unsigned long max) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; int page_count, i; @@ -1809,7 +1809,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) continue; } #endif - if (!i || page_to_pfn(page) != last_pfn + 1) { + if ((!i || (page_to_pfn(page) != last_pfn + 1)) && (sg->length < max)) { if (i) sg = sg_next(sg); st->nents++; @@ -1847,7 +1847,7 @@ err_pages: * or as the object is itself released. 
*/ int -i915_gem_object_get_pages(struct drm_i915_gem_object *obj) +i915_gem_object_get_pages(struct drm_i915_gem_object *obj, unsigned int max) { struct drm_i915_private *dev_priv = obj->base.dev->dev_private; const struct drm_i915_gem_object_ops *ops = obj->ops; @@ -1863,7 +1863,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) BUG_ON(obj->pages_pin_count); - ret = ops->get_pages(obj); + ret = ops->get_pages(obj, max); if (ret) return ret; @@ -2942,7 +2942,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, u32 size, fence_size, fence_alignment, unfenced_alignment; bool mappable, fenceable; int ret; + static unsigned int max_size = 4 * 1024 * 1024; /* 4MB */ +#ifdef CONFIG_SWIOTLB + if (swiotlb_nr_tbl()) + max_size = PAGE_SIZE; +#endif fence_size = i915_gem_get_gtt_size(dev, obj->base.size, obj->tiling_mode); @@ -2972,8 +2977,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, DRM_ERROR("Attempting to bind an object larger than the aperture\n"); return -E2BIG; } - - ret = i915_gem_object_get_pages(obj); + retry: + ret = i915_gem_object_get_pages(obj, max_size); if (ret) return ret; @@ -3015,6 +3020,10 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, if (ret) { i915_gem_object_unpin_pages(obj); drm_mm_put_block(node); + if (max_size > PAGE_SIZE) { + max_size >>= 1; + goto retry; + } return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index dc53a52..8101387 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -230,7 +230,8 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, return dma_buf_export(obj, _dmabuf_ops, obj->base.size, flags); } -static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj)
Regression introduced by 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") Was: Re: i915 mapping large (3MB) scatter list, hitting limits on certai
On Sat, Jun 22, 2013 at 03:22:59PM +0100, Chris Wilson wrote: > On Fri, Jun 21, 2013 at 10:03:43PM -0400, Konrad Rzeszutek Wilk wrote: > > On Fri, Jun 21, 2013 at 03:28:28PM -0400, Konrad Rzeszutek Wilk wrote: > > > Hey, > > > > CC-ing Imre, > > > > Imre, your patch 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 > > ("drm/i915: create compact dma scatter lists for gem objects") is the cause > > of the regression. > > > > If I revert your patch it boots fine without any trouble. > > > > I am not entirely sure why that is - as I added some debug code in > > lib/swiotlb.c to trigger when it can't find a 3MB area (which > > is what I thought initially was the issue) - but none of the debug > > code seems to be hit. > > > > Any thoughts? > > You should be hitting drivers/iommu/intel-iommu.c for the dma > translation. It looks like the contiguous 3MiB segment will be special > as it is the first sg that __domain_mapping() will attempt to allocate a > superpage (2MiB) for. What goes wrong at that point, I am not sure, but I > would suggest peppering intel-iommu.c with printk to track down the error. I figured it out. The issue was that I am backed by the SWIOTLB, which can only allocate up to 128*4K chunks of contiguous bounce buffer (see IO_TLB_SEGSIZE) - which means it can only set up buffers of up to 512kB, while one of the SG entries tries to give it one past that size (3MB). The change Imre introduced assumes that the CPU addresses (virtual) are the same as the bus addresses. That is correct on most platforms, but on some (for example when booting a Xen PV guest with i915 as PCI passthrough) the virt_to_phys() values != bus addresses. Which means that the nice check of: if (!i || page_to_pfn(page) != last_pfn + 1) { if (i) sg = sg_next(sg); st->nents++; sg_set_page(sg, page, PAGE_SIZE, 0); } else { sg->length += PAGE_SIZE; } is too simplistic. What it ought to do is consult the DMA API whether the next PFN (page_to_pfn(page)) is really contiguous in the DMA space. 
And also that it does got past the DMA mask for said device. Unfortunatly such calls do not exist. Those checks are all done when dma_map_page_* is done (which is where it failed for me). The best (for an rc7 stage) fix I came up with is to revert just a bit of the old behavior and still retain the sg compact code behavior. See following patch: diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 970ad17..9edd2eb 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1801,7 +1801,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD; gfp &= ~(__GFP_IO | __GFP_WAIT); } - + if (swiotlb_nr_tbl()) { + st->nents++; + sg_set_page(sg, page, PAGE_SIZE, 0); + sg = sg_next(sg); + continue; + } if (!i || page_to_pfn(page) != last_pfn + 1) { if (i) sg = sg_next(sg); @@ -1813,7 +1818,8 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) last_pfn = page_to_pfn(page); } - sg_mark_end(sg); + if (!swiotlb_nr_tbl()) + sg_mark_end(sg); obj->pages = st; if (i915_gem_object_needs_bit17_swizzle(obj))
[PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") makes certain assumptions about the underlying DMA API that are not always correct.

On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see:

[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

A bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).

Those unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is an entry of 512KB of virtually contiguous memory for its bounce buffer (see IO_TLB_SEGSIZE).

Previous to the above-mentioned git commit the SG entries were of 4KB, and the code introduced by that commit squashed the CPU-contiguous PFNs into one big virtual address provided to the DMA API.

This patch is a simple semi-revert - where we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism.

An alternative solution would be for the '.get_pages' and the i915_gem_gtt_prepare_object to retry with a smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky.
Reported-and-Tested-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
CC: Chris Wilson ch...@chris-wilson.co.uk
CC: Imre Deak imre.d...@intel.com
CC: Daniel Vetter daniel.vet...@ffwll.ch
CC: David Airlie airl...@linux.ie
CC: dri-devel@lists.freedesktop.org
Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
---
 drivers/gpu/drm/i915/i915_gem.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 970ad17..7045f45 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1801,7 +1801,14 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
 			gfp &= ~(__GFP_IO | __GFP_WAIT);
 		}
-
+#ifdef CONFIG_SWIOTLB
+		if (swiotlb_nr_tbl()) {
+			st->nents++;
+			sg_set_page(sg, page, PAGE_SIZE, 0);
+			sg = sg_next(sg);
+			continue;
+		}
+#endif
 		if (!i || page_to_pfn(page) != last_pfn + 1) {
 			if (i)
 				sg = sg_next(sg);
@@ -1812,8 +1819,10 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 		}
 		last_pfn = page_to_pfn(page);
 	}
-
-	sg_mark_end(sg);
+#ifdef CONFIG_SWIOTLB
+	if (!swiotlb_nr_tbl())
+#endif
+		sg_mark_end(sg);
 
 	obj->pages = st;
 
 	if (i915_gem_object_needs_bit17_swizzle(obj))
--
1.8.1.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] Bootup regression of v3.10-rc6 + SWIOTLB + Intel 4000.
Hey Dave, Chris, Imre,

Attached is a fix that makes v3.10-rc6 boot on Intel HD 4000 when the SWIOTLB bounce buffer is in use. The SWIOTLB can only handle up to a 512KB swath of memory to create bounce buffers for, and Imre's patch made it possible to provide more than that to the DMA API, which caused it to fail in dma_map_sg.

Since this is rc7 time I did the less risky way of fixing it - by just doing what said code did before 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") was introduced, using a check to see if SWIOTLB is enabled. It is not the best fix but I figured the least risky.

 drivers/gpu/drm/i915/i915_gem.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

I think that a better approach (in v3.11?) would be to do some form of retry mechanism (not compile tested, not run at all):

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b9d00dc..0f9079d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1110,8 +1110,12 @@ struct drm_i915_gem_object_ops {
 	 * will therefore most likely be called when the object itself is
 	 * being released or under memory pressure (where we attempt to
 	 * reap pages for the shrinker).
+	 *
+	 * max is the maximum size an sg entry can be. Usually it is
+	 * PAGE_SIZE but if the backend (IOMMU) can deal with larger,
+	 * then a larger value might be used as well.
 	 */
-	int (*get_pages)(struct drm_i915_gem_object *);
+	int (*get_pages)(struct drm_i915_gem_object *, unsigned long max);
 	void (*put_pages)(struct drm_i915_gem_object *);
 };

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7045f45..a29e7db 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1738,7 +1738,7 @@ i915_gem_shrink_all(struct drm_i915_private *dev_priv)
 }
 
 static int
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj, unsigned long max)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	int page_count, i;
@@ -1809,7 +1809,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			continue;
 		}
 #endif
-		if (!i || page_to_pfn(page) != last_pfn + 1) {
+		if ((!i || (page_to_pfn(page) != last_pfn + 1)) && (sg->length < max)) {
 			if (i)
 				sg = sg_next(sg);
 			st->nents++;
@@ -1847,7 +1847,7 @@ err_pages:
 * or as the object is itself released.
 	 */
 int
-i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
+i915_gem_object_get_pages(struct drm_i915_gem_object *obj, unsigned int max)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	const struct drm_i915_gem_object_ops *ops = obj->ops;
@@ -1863,7 +1863,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->pages_pin_count);
 
-	ret = ops->get_pages(obj);
+	ret = ops->get_pages(obj, max);
 	if (ret)
 		return ret;
@@ -2942,7 +2942,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	int ret;
+	static unsigned int max_size = 4 * 1024 * 1024; /* 4MB */
 
+#ifdef CONFIG_SWIOTLB
+	if (swiotlb_nr_tbl())
+		max_size = PAGE_SIZE;
+#endif
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -2972,8 +2977,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		DRM_ERROR("Attempting to bind an object larger than the aperture\n");
 		return -E2BIG;
 	}
-
-	ret = i915_gem_object_get_pages(obj);
+ retry:
+	ret = i915_gem_object_get_pages(obj, max_size);
 	if (ret)
 		return ret;
@@ -3015,6 +3020,10 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	if (ret) {
 		i915_gem_object_unpin_pages(obj);
 		drm_mm_put_block(node);
+		if (max_size > PAGE_SIZE) {
+			max_size >>= 1;
+			goto retry;
+		}
 		return ret;
 	}

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index dc53a52..8101387 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -230,7 +230,8 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 	return dma_buf_export(obj, &i915_dmabuf_ops, obj->base.size, flags);
 }
 
-static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj)
+static int
Re: [PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote: On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk wrote: Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 (drm/i915: create compact dma scatter lists for gem objects) makes certain assumptions about the under laying DMA API that are not always correct. On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see: [drm:intel_pipe_set_base] *ERROR* pin fence failed [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 Bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). That unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is a an entry of 512KB of virtual contiguous memory for its bounce buffer. (See IO_TLB_SEGSIZE). Previous to the above mention git commit the SG entries were of 4KB, and the code introduced by above git commit squashed the CPU contiguous PFNs in one big virtual address provided to DMA API. This patch is a simple semi-revert - were we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism. An alternative solution would be for the the '.get_pages' and the i915_gem_gtt_prepare_object to retry with smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky. Reported-and-Tested-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Chris Wilson ch...@chris-wilson.co.uk CC: Imre Deak imre.d...@intel.com CC: Daniel Vetter daniel.vet...@ffwll.ch CC: David Airlie airl...@linux.ie CC: dri-devel@lists.freedesktop.org Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Two things: Hey Daniel, - SWIOTLB usage should seriously blow up all over the place in drm/i915. 
We really rely on the everywhere else true fact that the pages and their dma mapping point at the same backing storage. It works. As in, it seems to work for just a normal desktop user. I don't see much of dma_sync_* sprinkled around the drm/i915 so I would think that there are some issues would be hit as well - but at the first glance when using it on a laptop it looks OK. - How is this solved elsewhere when constructing sg tables? Or are we really the only guys who try to construct such big sg entries? I expected somewhat that the dma mapping backed would fill in the segment limits accordingly, but I haven't found anything really on a quick search. The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which will construct the dma mapped pages. That allows it to construct SWIOTLB-approved pages that won't need to go through dma_map/dma_unmap as they are already mapped and ready to go. Coming back to your question - I think that i915 is the one that I've encountered. Cheers, Daniel --- drivers/gpu/drm/i915/i915_gem.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 970ad17..7045f45 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1801,7 +1801,14 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD; gfp = ~(__GFP_IO | __GFP_WAIT); } - +#ifdef CONFIG_SWIOTLB + if (swiotlb_nr_tbl()) { + st-nents++; + sg_set_page(sg, page, PAGE_SIZE, 0); + sg = sg_next(sg); + continue; + } +#endif if (!i || page_to_pfn(page) != last_pfn + 1) { if (i) sg = sg_next(sg); @@ -1812,8 +1819,10 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) } last_pfn = page_to_pfn(page); } - - sg_mark_end(sg); +#ifdef CONFIG_SWIOTLB + if (!swiotlb_nr_tbl()) +#endif + sg_mark_end(sg); obj-pages = st; if (i915_gem_object_needs_bit17_swizzle(obj)) -- 1.8.1.4 -- Daniel Vetter Software 
Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
On Mon, Jun 24, 2013 at 08:26:18PM +0200, Daniel Vetter wrote: On Mon, Jun 24, 2013 at 7:32 PM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote: On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk wrote: Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 (drm/i915: create compact dma scatter lists for gem objects) makes certain assumptions about the under laying DMA API that are not always correct. On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see: [drm:intel_pipe_set_base] *ERROR* pin fence failed [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 Bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). That unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is a an entry of 512KB of virtual contiguous memory for its bounce buffer. (See IO_TLB_SEGSIZE). Previous to the above mention git commit the SG entries were of 4KB, and the code introduced by above git commit squashed the CPU contiguous PFNs in one big virtual address provided to DMA API. This patch is a simple semi-revert - were we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism. An alternative solution would be for the the '.get_pages' and the i915_gem_gtt_prepare_object to retry with smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky. 
Reported-and-Tested-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Chris Wilson ch...@chris-wilson.co.uk CC: Imre Deak imre.d...@intel.com CC: Daniel Vetter daniel.vet...@ffwll.ch CC: David Airlie airl...@linux.ie CC: dri-devel@lists.freedesktop.org Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Two things: Hey Daniel, - SWIOTLB usage should seriously blow up all over the place in drm/i915. We really rely on the everywhere else true fact that the pages and their dma mapping point at the same backing storage. It works. As in, it seems to work for just a normal desktop user. I don't see much of dma_sync_* sprinkled around the drm/i915 so I would think that there are some issues would be hit as well - but at the first glance when using it on a laptop it looks OK. Yeah, we have a pretty serious case of roll our own coherency stuff. The biggest reason is that for a long time i915.ko didn't care one bit about iommus, and the thing we care about (flushing cpu caches for dma) isn't supported on x86 since x86 every dma is coherent (well, not quite, but we don't have support for it). I think longer-term it would make sense to move the clfushing we're doing into the dma layer. - How is this solved elsewhere when constructing sg tables? Or are we really the only guys who try to construct such big sg entries? I expected somewhat that the dma mapping backed would fill in the segment limits accordingly, but I haven't found anything really on a quick search. The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which will construct the dma mapped pages. That allows it to construct SWIOTLB-approved pages that won't need to go through dma_map/dma_unmap as they are already mapped and ready to go. Coming back to your question - I think that i915 is the one that I've encountered. That's a bit surprising. 
With dma_buf graphics people will use sg tables much more (there's even a nice sg_alloc_table_from_pages helper to construct them), and those sg tables tend to have large segments. I guess we need some more generic solution here ... Yes. I don't grok the full picture yet so I am not sure how to help with this right now. Is there a roadmap or Wiki on how this was envisioned? For now I guess we can live with your CONFIG_SWIOTLB hack. -Daniel OK, I read that as an Ack-ed-by. Should I send the patch to Dave Airlie in a GIT PULL or some other way to make it on the v3.10-rc7 train? Cheers, Daniel --- drivers/gpu/drm/i915/i915_gem.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 970ad17..7045f45 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1801,7 +1801,14 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD; gfp = ~(__GFP_IO | __GFP_WAIT); } - +#ifdef CONFIG_SWIOTLB + if (swiotlb_nr_tbl()) { + st-nents
Re: [PATCH] drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Dave Airlie airl...@gmail.com wrote: On Tue, Jun 25, 2013 at 4:34 AM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: On Mon, Jun 24, 2013 at 08:26:18PM +0200, Daniel Vetter wrote: On Mon, Jun 24, 2013 at 7:32 PM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: On Mon, Jun 24, 2013 at 07:09:12PM +0200, Daniel Vetter wrote: On Mon, Jun 24, 2013 at 11:47:48AM -0400, Konrad Rzeszutek Wilk wrote: Git commit 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 (drm/i915: create compact dma scatter lists for gem objects) makes certain assumptions about the under laying DMA API that are not always correct. On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see: [drm:intel_pipe_set_base] *ERROR* pin fence failed [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 Bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). That unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is a an entry of 512KB of virtual contiguous memory for its bounce buffer. (See IO_TLB_SEGSIZE). Previous to the above mention git commit the SG entries were of 4KB, and the code introduced by above git commit squashed the CPU contiguous PFNs in one big virtual address provided to DMA API. This patch is a simple semi-revert - were we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism. An alternative solution would be for the the '.get_pages' and the i915_gem_gtt_prepare_object to retry with smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky. 
Reported-and-Tested-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Chris Wilson ch...@chris-wilson.co.uk CC: Imre Deak imre.d...@intel.com CC: Daniel Vetter daniel.vet...@ffwll.ch CC: David Airlie airl...@linux.ie CC: dri-devel@lists.freedesktop.org Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Two things: Hey Daniel, - SWIOTLB usage should seriously blow up all over the place in drm/i915. We really rely on the everywhere else true fact that the pages and their dma mapping point at the same backing storage. It works. As in, it seems to work for just a normal desktop user. I don't see much of dma_sync_* sprinkled around the drm/i915 so I would think that there are some issues would be hit as well - but at the first glance when using it on a laptop it looks OK. Yeah, we have a pretty serious case of roll our own coherency stuff. The biggest reason is that for a long time i915.ko didn't care one bit about iommus, and the thing we care about (flushing cpu caches for dma) isn't supported on x86 since x86 every dma is coherent (well, not quite, but we don't have support for it). I think longer-term it would make sense to move the clfushing we're doing into the dma layer. - How is this solved elsewhere when constructing sg tables? Or are we really the only guys who try to construct such big sg entries? I expected somewhat that the dma mapping backed would fill in the segment limits accordingly, but I haven't found anything really on a quick search. The TTM layer (so radeon, nouveau) uses pci_alloc_coherent which will construct the dma mapped pages. That allows it to construct SWIOTLB-approved pages that won't need to go through dma_map/dma_unmap as they are already mapped and ready to go. Coming back to your question - I think that i915 is the one that I've encountered. That's a bit surprising. 
With dma_buf graphics people will use sg tables much more (there's even a nice sg_alloc_table_from_pages helper to construct them), and those sg tables tend to have large segments. I guess we need some more generic solution here ... Yes. I don't grok the full picture yet so I am not sure how to help with this right now. Is there a roadmap or Wiki on how this was envisioned? For now I guess we can live with your CONFIG_SWIOTLB hack. -Daniel OK, I read that as an Ack-ed-by. Should I send the patch to Dave Airlie in a GIT PULL or some other way to make it on the v3.10-rc7 train? I don't like this at all, I'll accept the patch on the condition you investigate further :-) If you are using swiotlb on i915 things should break, I know I've investigated problems before where swiotlb was being incorrectly used due to page masks or other issues. Shouldn't you be passing through using the real iommu? Dave. Hey Dave Of course I will investigate. The SWIOTLB is unfortunately used because it is a fallback (and I am the maintainer of it) and if a real IOMMU is activated it can be mitigated differently. When you say 'passed through' you mean in terms of an IOMMU
Regression introduced by 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") Was:Re: i915 mapping large (3MB) scatter list, hitting limits on certain IOMMUs that can only map contiguous regions up to 2MB.
On Fri, Jun 21, 2013 at 03:28:28PM -0400, Konrad Rzeszutek Wilk wrote: > Hey, CC-ing Imre, Imre, your patch 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 ("drm/i915: create compact dma scatter lists for gem objects") is the cause of the regression. If I revert your patch it boots fine without any trouble. I am not entirely sure why that is - as I added some debug code in lib/swiotlb.c to trigger when it can't find 3MB area (which is what I thought initially was the issue) - but none of the debug code seems to be hit. Any thoughts? > > I am using an ThinkPad X230 with an Intel HD 4000. With a stock Fedora 18 > (3.9.6) I can get it to boot and work just fine with Xen. If I use v3.10-rc6 > I found that i915 would halt with a > > [drm:intel_pipe_set_base] *ERROR* pin & fence failed > [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 > > after a bit of debugging (see patch below) I traced it down > to the fact that the scatter list that is provided at the end has > a huge (3MB) page. I am wondering if anybody knows what patch might > have introduced it to grab such a large memory segment? > > The other thing I am wondering is if there are some fallbacks when the > underlaying IOMMU can't deal with a request for contingous regions > that are more than 2MB? > > Thanks. > > >From a681a4adb4738c32cb1acdf6f5161bf877816b01 Mon Sep 17 00:00:00 2001 > From: Konrad Rzeszutek Wilk > Date: Fri, 21 Jun 2013 11:17:55 -0400 > Subject: [PATCH] dbug: print scatterlist. 
> > [ cut here ] > WARNING: at drivers/gpu/drm/i915/i915_gem_gtt.c:418 > i915_gem_gtt_prepare_object+0x180/0x200() > 10 but got 0 > Modules linked in: > crc32_pclmul sdhci_pci crc32c_intel sdhci mmc_core ghash_clmulni_intel > CPU: 0 PID: 216 Comm: plymouthd Not tainted 3.10.0-rc6+ #16 > Hardware name: LENOVO 2325DV4/2325DV4, BIOS G2ET86WW (2.06 ) 11/13/2012 > 0009 8801fa42d958 816e6d89 8801fa42d998 > 8105d2b0 8801fa42d988 8801fb0f4d80 > 81c172e0 8801fa76f000 000a 8801fa42d9f8 > Call Trace: > [] dump_stack+0x19/0x1b > [] warn_slowpath_common+0x70/0xa0 > [] warn_slowpath_fmt+0x46/0x50 > [] i915_gem_gtt_prepare_object+0x180/0x200 > [] i915_gem_object_pin+0x321/0x670 > [] i915_gem_object_pin_to_display_plane+0x81/0x190 > [] intel_pin_and_fence_fb_obj+0x85/0x1a0 > [] intel_pipe_set_base+0x7c/0x220 > [] intel_crtc_set_config+0x89e/0x990 > [] drm_mode_set_config_internal+0x2e/0x60 > [] drm_mode_setcrtc+0xfb/0x620 > [] ? kmem_cache_alloc_trace+0x39/0x1f0 > [] ? drm_vm_open_locked+0x57/0x90 > [] drm_ioctl+0x549/0x680 > [] ? drm_mode_setplane+0x3b0/0x3b0 > [] do_vfs_ioctl+0x97/0x580 > [] ? inode_has_perm.isra.32.constprop.62+0x2a/0x30 > [] ? 
file_has_perm+0x97/0xb0 > [] SyS_ioctl+0x91/0xb0 > [] tracesys+0xdd/0xe2 > ---[ end trace 7b6adc5450d9a9e1 ]--- > i915 :00:02.0: i915_gem_gtt_prepare_object: Mapping 10 pages, mapped: 0 > [0] virT:8801fd37c000 dma: 1fd37c000, size:4096 > [1] virT:8801fd37b000 dma: 1fd37b000, size:4096 > [2] virT:8801fd37a000 dma: 1fd37a000, size:4096 > [3] virT:8801fd378000 dma: 1fd378000, size:4096 > [4] virT:8801fd131000 dma: 1fd131000, size:4096 > [5] virT:880200c36000 dma: 200c36000, size:4096 > [6] virT:8801fd1a4000 dma: 1fd1a4000, size:69632 > [7] virT:8801fd3bb000 dma: 1fd3bb000, size:4096 > [8] virT:8801fd3c dma: 1fd3c, size:262144 > [9] virT:8801f940 dma: 1f940, size:3866624 > [drm] 3011: ret:-28 > [drm] 3540: ret:-28 > [drm] 3364: ret:-28 > [drm:intel_pin_and_fence_fb_obj] *ERROR* i915_gem_object_pin_to_display_plane > failed: -28 > [drm:intel_pipe_set_base] *ERROR* pin & fence failed > [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28 > > Signed-off-by: Konrad Rzeszutek Wilk > --- > drivers/gpu/drm/i915/i915_drv.c | 6 ++ > drivers/gpu/drm/i915/i915_gem_gtt.c | 28 +--- > 2 files changed, 31 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c > index 8411942..141c6fb 100644 > --- a/drivers/gpu/drm/i915/i915_drv.c > +++ b/drivers/gpu/drm/i915/i915_drv.c > @@ -133,6 +133,12 @@ module_param_named(coherent, use_coherent, int, 0600); > MODULE_PARM_DESC(use_coherent, >"Use coherent DMA API calls (default: false)"); > > +int i915_sgl __read_mostly = 0; > +module_param_named(sgl, i915_sgl, int, 0600); > +MODULE_PARM_DESC(sgl, > + "Print scatterlist SG's when DMA mapping them (default: > false)"); > + > + > static struct d
i915 mapping large (3MB) scatter list, hitting limits on certain IOMMUs that can only map contiguous regions up to 2MB.
Hey,

I am using a ThinkPad X230 with an Intel HD 4000. With a stock Fedora 18
(3.9.6) I can get it to boot and work just fine with Xen. If I use v3.10-rc6
I found that i915 would halt with a

[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

After a bit of debugging (see patch below) I traced it down to the fact that
the scatter list that is provided at the end has a huge (3MB) page. I am
wondering if anybody knows what patch might have introduced it to grab such a
large memory segment?

The other thing I am wondering is if there are some fallbacks when the
underlying IOMMU can't deal with a request for contiguous regions that are
more than 2MB?

Thanks.

From a681a4adb4738c32cb1acdf6f5161bf877816b01 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Date: Fri, 21 Jun 2013 11:17:55 -0400
Subject: [PATCH] dbug: print scatterlist.

[ cut here ]
WARNING: at drivers/gpu/drm/i915/i915_gem_gtt.c:418 i915_gem_gtt_prepare_object+0x180/0x200()
10 but got 0
Modules linked in: crc32_pclmul sdhci_pci crc32c_intel sdhci mmc_core ghash_clmulni_intel
CPU: 0 PID: 216 Comm: plymouthd Not tainted 3.10.0-rc6+ #16
Hardware name: LENOVO 2325DV4/2325DV4, BIOS G2ET86WW (2.06 ) 11/13/2012
 0009 8801fa42d958 816e6d89 8801fa42d998
 8105d2b0 8801fa42d988 8801fb0f4d80 81c172e0
 8801fa76f000 000a 8801fa42d9f8
Call Trace:
 [816e6d89] dump_stack+0x19/0x1b
 [8105d2b0] warn_slowpath_common+0x70/0xa0
 [8105d396] warn_slowpath_fmt+0x46/0x50
 [8142a740] i915_gem_gtt_prepare_object+0x180/0x200
 [81423581] i915_gem_object_pin+0x321/0x670
 [81423951] i915_gem_object_pin_to_display_plane+0x81/0x190
 [814381b5] intel_pin_and_fence_fb_obj+0x85/0x1a0
 [8143999c] intel_pipe_set_base+0x7c/0x220
 [814409be] intel_crtc_set_config+0x89e/0x990
 [813ffdae] drm_mode_set_config_internal+0x2e/0x60
 [814023ab] drm_mode_setcrtc+0xfb/0x620
 [811850f9] ? kmem_cache_alloc_trace+0x39/0x1f0
 [813f9767] ? drm_vm_open_locked+0x57/0x90
 [813f2e39] drm_ioctl+0x549/0x680
 [814022b0] ? drm_mode_setplane+0x3b0/0x3b0
 [811aef77] do_vfs_ioctl+0x97/0x580
 [81295dca] ? inode_has_perm.isra.32.constprop.62+0x2a/0x30
 [81297397] ? file_has_perm+0x97/0xb0
 [811af4f1] SyS_ioctl+0x91/0xb0
 [816f63e7] tracesys+0xdd/0xe2
---[ end trace 7b6adc5450d9a9e1 ]---
i915 0000:00:02.0: i915_gem_gtt_prepare_object: Mapping 10 pages, mapped: 0
[0] virt:8801fd37c000 dma: 1fd37c000, size:4096
[1] virt:8801fd37b000 dma: 1fd37b000, size:4096
[2] virt:8801fd37a000 dma: 1fd37a000, size:4096
[3] virt:8801fd378000 dma: 1fd378000, size:4096
[4] virt:8801fd131000 dma: 1fd131000, size:4096
[5] virt:880200c36000 dma: 200c36000, size:4096
[6] virt:8801fd1a4000 dma: 1fd1a4000, size:69632
[7] virt:8801fd3bb000 dma: 1fd3bb000, size:4096
[8] virt:8801fd3c dma: 1fd3c, size:262144
[9] virt:8801f940 dma: 1f940, size:3866624
[drm] 3011: ret:-28
[drm] 3540: ret:-28
[drm] 3364: ret:-28
[drm:intel_pin_and_fence_fb_obj] *ERROR* i915_gem_object_pin_to_display_plane failed: -28
[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

Signed-off-by: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
---
 drivers/gpu/drm/i915/i915_drv.c     |  6 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c | 28 +---
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8411942..141c6fb 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -133,6 +133,12 @@ module_param_named(coherent, use_coherent, int, 0600);
 MODULE_PARM_DESC(use_coherent, "Use coherent DMA API calls (default: false)");
 
+int i915_sgl __read_mostly = 0;
+module_param_named(sgl, i915_sgl, int, 0600);
+MODULE_PARM_DESC(sgl,
+	"Print scatterlist SG's when DMA mapping them (default: false)");
+
+
 static struct drm_driver driver;
 extern int intel_agp_enabled;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index acb3b3f..292179c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -28,6 +28,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+extern int use_coherent;
 typedef uint32_t gen6_gtt_pte_t;
 
 /* PPGTT stuff */
@@ -403,15 +404,36 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	i915_gem_chipset_flush(dev);
 }
 
-
+extern int i915_sgl;
 int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
 {
+	int elem;
 	if (obj->has_dma_mapping)
 		return 0;
 
-	if (!dma_map_sg(&obj->base.dev->pdev->dev,
+	elem = dma_map_sg(&obj->base.dev->pdev->dev,
 			obj->pages->sgl, obj->pages->nents,
-			PCI_DMA_BIDIRECTIONAL))
+			PCI_DMA_BIDIRECTIONAL);
+
Regression introduced by 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 (drm/i915: create compact dma scatter lists for gem objects) Was: Re: i915 mapping large (3MB) scatter list, hitting limits on certain IOMMUs that can only map contiguous regions up to 2MB.
On Fri, Jun 21, 2013 at 03:28:28PM -0400, Konrad Rzeszutek Wilk wrote:
> Hey,

CC-ing Imre.

Imre, your patch 90797e6d1ec0dfde6ba62a48b9ee3803887d6ed4 (drm/i915: create
compact dma scatter lists for gem objects) is the cause of the regression.
If I revert your patch it boots fine without any trouble.

I am not entirely sure why that is - as I added some debug code in
lib/swiotlb.c to trigger when it can't find a 3MB area (which is what I
thought initially was the issue) - but none of the debug code seems to be
hit. Any thoughts?

> I am using a ThinkPad X230 with an Intel HD 4000. With a stock Fedora 18
> (3.9.6) I can get it to boot and work just fine with Xen. If I use
> v3.10-rc6 I found that i915 would halt with a
>
> [drm:intel_pipe_set_base] *ERROR* pin & fence failed
> [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28
>
> after a bit of debugging (see patch below) I traced it down to the fact
> that the scatter list that is provided at the end has a huge (3MB) page.
> I am wondering if anybody knows what patch might have introduced it to
> grab such a large memory segment?
>
> The other thing I am wondering is if there are some fallbacks when the
> underlying IOMMU can't deal with a request for contiguous regions that
> are more than 2MB?
>
> Thanks.
..snip..
Re: [PATCH 0/3] fbdev no more!
On Sun, Jun 16, 2013 at 04:57:17PM +0200, Daniel Vetter wrote:
> Hi all,
>
> So I've taken a look again at the locking mess in our fbdev support and
> cried. Fixing up the console_lock mess around the fbdev notifier will be
> real work, semantically the fbdev layer does lots of stupid things (like
> the radeon resume issue I've just debugged) and the panic notifier is
> pretty much a lost cause.
>
> So I've decided to instead rip it all out. It seems to work \o/

When you say 'locking mess in our fbdev support' you mean the general core
fbdev driver? Not necessarily the i915 driver?

I am asking b/c that would imply that the other fbdev drivers still hit the
locking mess?

Thanks!

> Of course a general purpose distro probably wants David's kmscon for any
> fallback needs and a system compositor to ditch the VT subsystem - atm my
> machine here runs with the dummy console so that VT switching between
> different X sessions still works ;-)
>
> Oh and: At least Fedora's boot splash seems to be unhappy about the lack
> of an fbdev (it doesn't seem to do anything), which breaks early disk
> encryption a bit. The black screen itself shouldn't be a big issue at
> least for i915, since with all the fastboot work we can just hang onto
> the current config and framebuffer (one missing patch from Chris for the
> fb preservation). So as long as the bios/grub put up something nice,
> it'll look ok.
>
> So just a small step here really, but imo into the right direction. Now,
> please bring on the flames!
>
> Aside: We can hide the #ifdef mess a bit better in drm/i915 I think, but
> I'd like to wait for a bit of feedback first. And one more: This also
> removes the console_lock completely from our critical path in
> suspend/resume!
>
> One thing I haven't wasted a single thought about is kgdb and panic
> notifier support. But since the current code is pretty decently broken
> already (we have _tons_ of mutex grabbing and waits in there) I don't
> think people care that much about it anyway. Using a sprite to smash the
> kgdb/panic output on top of whatever's currently displaying might be an
> approach.
>
> Cheers, Daniel
>
> Daniel Vetter (3):
>   drm: Add separate Kconfig option for fbdev helpers
>   drm/i915: Kconfig option to disable the legacy fbdev support
>   drm/i915: rename intel_fb.c to intel_fbdev.c
>
>  drivers/gpu/drm/Kconfig              |  57 ++-
>  drivers/gpu/drm/Makefile             |   3 +-
>  drivers/gpu/drm/ast/Kconfig          |   1 +
>  drivers/gpu/drm/cirrus/Kconfig       |   1 +
>  drivers/gpu/drm/exynos/Kconfig       |   1 +
>  drivers/gpu/drm/gma500/Kconfig       |   1 +
>  drivers/gpu/drm/i915/Kconfig         |  56 +++
>  drivers/gpu/drm/i915/Makefile        |   3 +-
>  drivers/gpu/drm/i915/i915_debugfs.c  |   4 +-
>  drivers/gpu/drm/i915/i915_dma.c      |   8 +-
>  drivers/gpu/drm/i915/i915_drv.h      |   2 +
>  drivers/gpu/drm/i915/intel_display.c |  12 +-
>  drivers/gpu/drm/i915/intel_drv.h     |  39 -
>  drivers/gpu/drm/i915/intel_fb.c      | 314 ---
>  drivers/gpu/drm/i915/intel_fbdev.c   | 314 +++
>  drivers/gpu/drm/mgag200/Kconfig      |   1 +
>  drivers/gpu/drm/nouveau/Kconfig      |   1 +
>  drivers/gpu/drm/omapdrm/Kconfig      |   1 +
>  drivers/gpu/drm/qxl/Kconfig          |   1 +
>  drivers/gpu/drm/shmobile/Kconfig     |   1 +
>  drivers/gpu/drm/tilcdc/Kconfig       |   1 +
>  drivers/gpu/drm/udl/Kconfig          |   1 +
>  drivers/gpu/host1x/drm/Kconfig       |   1 +
>  drivers/staging/imx-drm/Kconfig      |   1 +
>  24 files changed, 452 insertions(+), 373 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/Kconfig
>  delete mode 100644 drivers/gpu/drm/i915/intel_fb.c
>  create mode 100644 drivers/gpu/drm/i915/intel_fbdev.c
>
> --
> 1.7.11.7
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel