2014-09-29 5:45 GMT+09:00 Chuck Ebbert <[email protected]>:
> On Mon, 29 Sep 2014 00:52:03 +0900
> Akinobu Mita <[email protected]> wrote:
>
>> If CONFIG_DMA_CMA is enabled, dma_generic_alloc_coherent() tries to
>> allocate memory region by dma_alloc_from_contiguous() before trying to
>> use alloc_pages().
>>
>> This wastes CMA region by small DMA-coherent buffers which can be
>> allocated by alloc_pages(). And it also causes performance degradation,
>> as this is trying to drive _all_ dma mapping allocations through a
>> _very_ small window, reported by Peter Hurley.
>>
>> This fixes it by trying to allocate by alloc_pages() first in
>> dma_generic_alloc_coherent() as dma_alloc_from_contiguous should be
>> called only for huge allocation.
>>
>> Signed-off-by: Akinobu Mita <[email protected]>
>> Reported-by: Peter Hurley <[email protected]>
>> Cc: Peter Hurley <[email protected]>
>> Cc: Marek Szyprowski <[email protected]>
>> Cc: Konrad Rzeszutek Wilk <[email protected]>
>> Cc: David Woodhouse <[email protected]>
>> Cc: Don Dutile <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Ingo Molnar <[email protected]>
>> Cc: "H. Peter Anvin" <[email protected]>
>> Cc: Andi Kleen <[email protected]>
>> Cc: Yinghai Lu <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> ---
>> arch/x86/kernel/pci-dma.c | 12 ++++++------
>> 1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index a25e202..0402266 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -99,20 +99,20 @@ void *dma_generic_alloc_coherent(struct device *dev,
>> size_t size,
>>
>> flag &= ~__GFP_ZERO;
>> again:
>> - page = NULL;
>> + page = alloc_pages_node(dev_to_node(dev), flag | __GFP_NOWARN,
>> + get_order(size));
>
> Only try small allocs here, like when order < PAGE_ALLOC_COSTLY_ORDER ?
>
>> /* CMA can be used only in the context which permits sleeping */
>> - if (flag & __GFP_WAIT) {
>> + if (!page && (flag & __GFP_WAIT)) {
>> page = dma_alloc_from_contiguous(dev, count, get_order(size));
>> if (page && page_to_phys(page) + size > dma_mask) {
>> dma_release_from_contiguous(dev, page, count);
>> page = NULL;
>> }
>> }
>> - /* fallback */
>> - if (!page)
>> - page = alloc_pages_node(dev_to_node(dev), flag,
>> get_order(size));
>
> (I forgot to add this in my first reply). I think it should try for a
> small alloc without CMA first, then try CMA, and then this final
> fallback for larger allocs.

I'm concerned about the performance problem reported by Peter Hurley.
Your suggestion could be a solution, but I would like to hear Peter's
opinion first. For now, I prefer the approach in this patch because it
has less impact when CONFIG_DMA_CMA is enabled. It can be improved
later on.
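
To make the two orderings under discussion easier to compare, here is a minimal userspace sketch, not kernel code. It only models which allocator would end up serving a request. The names `model_state`, `alloc_source`, `patch_order()`, `tiered_order()` and `MODEL_COSTLY_ORDER` are hypothetical and introduced for illustration; `MODEL_COSTLY_ORDER` is assumed to mirror the kernel's PAGE_ALLOC_COSTLY_ORDER (3). The dma_mask check and the `again:` retry from the real function are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only: none of these names are kernel APIs.
 * MODEL_COSTLY_ORDER is assumed to mirror PAGE_ALLOC_COSTLY_ORDER (3).
 */
#define MODEL_COSTLY_ORDER 3

enum alloc_source {
	SRC_NONE,           /* every path failed */
	SRC_PAGES_FIRST,    /* page allocator, tried before CMA */
	SRC_CMA,            /* stands in for dma_alloc_from_contiguous() */
	SRC_PAGES_FALLBACK, /* page allocator, tried after CMA */
};

struct model_state {
	bool pages_ok;  /* would alloc_pages_node() succeed? */
	bool cma_ok;    /* would the CMA area satisfy the request? */
	bool can_sleep; /* models (flag & __GFP_WAIT) */
};

/* Ordering in the patch: page allocator first, CMA only if that fails. */
enum alloc_source patch_order(const struct model_state *s, unsigned int order)
{
	(void)order; /* the patch does not look at the allocation order */
	if (s->pages_ok)
		return SRC_PAGES_FIRST;
	if (s->can_sleep && s->cma_ok)
		return SRC_CMA;
	return SRC_NONE;
}

/* Ordering suggested in review: small allocs first, then CMA, then fallback. */
enum alloc_source tiered_order(const struct model_state *s, unsigned int order)
{
	if (order < MODEL_COSTLY_ORDER && s->pages_ok)
		return SRC_PAGES_FIRST;
	if (s->can_sleep && s->cma_ok)
		return SRC_CMA;
	if (s->pages_ok)
		return SRC_PAGES_FALLBACK;
	return SRC_NONE;
}
```

In this model the two orderings differ only for requests of order >= MODEL_COSTLY_ORDER in a sleepable context: the patch keeps them in the page allocator whenever it can, while the tiered variant sends them to CMA first.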

