Re: [PATCH v2 0/8] Improve performance of VM translation on x86_64

2012-11-01 Thread Alexander Duyck
On 10/11/2012 03:58 PM, H. Peter Anvin wrote: On 10/12/2012 06:40 AM, Andi Kleen wrote: Patch series looks good to me. Thanks for doing this properly. Reviewed-by: Andi Kleen a...@linux.intel.com Agreed. Acked-by: H. Peter Anvin h...@zytor.com I will pick this up after the merge window

Re: [PATCH v3 0/7] Improve swiotlb performance by using physical addresses

2012-11-02 Thread Alexander Duyck
On 11/02/2012 09:21 AM, Konrad Rzeszutek Wilk wrote: On Mon, Oct 29, 2012 at 03:05:56PM -0400, Konrad Rzeszutek Wilk wrote: On Mon, Oct 29, 2012 at 11:18:09AM -0700, Alexander Duyck wrote: On Mon, Oct 15, 2012 at 10:19 AM, Alexander Duyck alexander.h.du...@intel.com wrote: While working

Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support

2012-10-03 Thread Alexander Duyck
On 10/03/2012 10:51 AM, Yinghai Lu wrote: Need ixgbe guys to close the loop to use set_max_vfs instead kernel parameters. Signed-off-by: Yinghai Lu ying...@kernel.org Cc: Jeff Kirsher jeffrey.t.kirs...@intel.com Cc: Jesse Brandeburg jesse.brandeb...@intel.com Cc: Greg Rose

[RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-03 Thread Alexander Duyck
0.17% [k] swiotlb_dma_mapping_error --- Alexander Duyck (7): swiotlb: Do not export swiotlb_bounce since there are no external consumers swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single swiotlb

[RFC PATCH 1/7] swiotlb: Instead of tracking the end of the swiotlb region just calculate it

2012-10-03 Thread Alexander Duyck
. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 25 - 1 files changed, 12 insertions(+), 13 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index f114bf6..5cc4d4e 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -57,11 +57,11

[RFC PATCH 2/7] swiotlb: Make io_tlb_start a physical address instead of a virtual address

2012-10-03 Thread Alexander Duyck
This change makes it so that io_tlb_start contains a physical address instead of a virtual address. The advantage to this is that we can avoid costly translations between virtual and physical addresses when comparing the io_tlb_start against DMA addresses. Signed-off-by: Alexander Duyck
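
As a rough illustration of the point above: once the region bounds are stored as physical addresses, a membership test needs no translation at all. The sketch below is a user-space stand-in, not the kernel code; `is_bounce_buffer` and the example bounds are hypothetical names and values chosen for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Hypothetical region bounds, already held as physical addresses. */
static phys_addr_t io_tlb_start = 0x04000000ULL;
static phys_addr_t io_tlb_end   = 0x08000000ULL;

/*
 * Check whether a physical address falls inside the bounce-buffer region.
 * Because both bounds are physical, this is two integer compares with no
 * virtual-to-physical conversion on either side.
 */
static bool is_bounce_buffer(phys_addr_t paddr)
{
	return paddr >= io_tlb_start && paddr < io_tlb_end;
}
```

With virtual bounds, the same check would have to translate each bound (or the incoming address) first, which is the per-call cost the patch description refers to.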

[RFC PATCH 4/7] swiotlb: Return physical addresses when calling swiotlb_tbl_map_single

2012-10-03 Thread Alexander Duyck
for use. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c | 22 +++--- include/linux/swiotlb.h | 11 +-- lib/swiotlb.c | 73 +++-- 3 files changed, 56 insertions(+), 50 deletions(-) diff

[RFC PATCH 5/7] swiotlb: Use physical addresses for swiotlb_tbl_unmap_single

2012-10-03 Thread Alexander Duyck
This change makes it so that the unmap functionality also uses physical addresses. This helps to further reduce the use of virt_to_phys and phys_to_virt functions. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |4 ++-- include/linux/swiotlb.h

[RFC PATCH 6/7] swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single

2012-10-03 Thread Alexander Duyck
This change makes it so that the sync functionality also uses physical addresses. This helps to further reduce the use of virt_to_phys and phys_to_virt functions. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |3 +-- include/linux/swiotlb.h

[RFC PATCH 7/7] swiotlb: Do not export swiotlb_bounce since there are no external consumers

2012-10-03 Thread Alexander Duyck
of a virtual one. This is the last piece in essentially pushing all of the DMA address values to use physical addresses in swiotlb. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- include/linux/swiotlb.h |3 --- lib/swiotlb.c | 30 +- 2 files

[RFC PATCH 3/7] swiotlb: Make io_tlb_overflow_buffer a physical address

2012-10-03 Thread Alexander Duyck
that depended on that functionality be updated. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 61 - 1 files changed, 34 insertions(+), 27 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 02abb72

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-04 Thread Alexander Duyck
On 10/04/2012 05:55 AM, Konrad Rzeszutek Wilk wrote: On Wed, Oct 03, 2012 at 05:38:41PM -0700, Alexander Duyck wrote: While working on 10Gb/s routing performance I found a significant amount of time was being spent in the swiotlb DMA handler. Further digging found that a significant amount

Re: [RFC PATCH 1/7] swiotlb: Instead of tracking the end of the swiotlb region just calculate it

2012-10-04 Thread Alexander Duyck
On 10/04/2012 06:01 AM, Konrad Rzeszutek Wilk wrote: On Wed, Oct 03, 2012 at 05:38:47PM -0700, Alexander Duyck wrote: In the case of swiotlb we already have the start of the region and the number of slabs that give us the region size. Instead of having to call virt_to_phys on two pointers we
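
The idea quoted above (derive the end of the region from its start and slab count rather than storing it) can be sketched as follows. This is a minimal user-space illustration; the slab shift of 11 (2 KiB slabs) is an assumption about kernels of this period, and `region_end` is a hypothetical helper name.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Assumption: each slab is 2 KiB, i.e. a shift of 11. */
#define IO_TLB_SHIFT 11

/*
 * Hypothetical helper: compute the first address past the region from
 * its start and the number of slabs, so no second stored pointer (and
 * no second translation) is needed.
 */
static phys_addr_t region_end(phys_addr_t start, unsigned long nslabs)
{
	return start + ((phys_addr_t)nslabs << IO_TLB_SHIFT);
}
```

For instance, 32768 slabs of 2 KiB give a 64 MiB region.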

Re: [RFC PATCH 2/7] swiotlb: Make io_tlb_start a physical address instead of a virtual address

2012-10-04 Thread Alexander Duyck
On 10/04/2012 06:18 AM, Konrad Rzeszutek Wilk wrote: On Wed, Oct 03, 2012 at 05:38:53PM -0700, Alexander Duyck wrote: This change makes it so that io_tlb_start contains a physical address instead of a virtual address. The advantage to this is that we can avoid costly translations between

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-04 Thread Alexander Duyck
On 10/04/2012 06:33 AM, Konrad Rzeszutek Wilk wrote: On Wed, Oct 03, 2012 at 05:38:41PM -0700, Alexander Duyck wrote: While working on 10Gb/s routing performance I found a significant amount of time was being spent in the swiotlb DMA handler. Further digging found that a significant amount

Re: [RFC PATCH 2/7] swiotlb: Make io_tlb_start a physical address instead of a virtual address

2012-10-04 Thread Alexander Duyck
On 10/04/2012 10:19 AM, Konrad Rzeszutek Wilk wrote: @@ -450,7 +451,7 @@ void *swiotlb_tbl_map_single(struct device *hwdev, dma_addr_t tbl_dma_addr, io_tlb_list[i] = 0; for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1)

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Alexander Duyck
On 10/05/2012 09:55 AM, Andi Kleen wrote: Alexander Duyck alexander.h.du...@intel.com writes: While working on 10Gb/s routing performance I found a significant amount of time was being spent in the swiotlb DMA handler. Further digging found that a significant amount of this was due

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Alexander Duyck
On 10/05/2012 01:02 PM, Andi Kleen wrote: I was thinking the issue was all of the calls to relatively small functions occurring in quick succession. The way most of this code is setup it seems like it is one small function call in turn calling another, and then another, and I would imagine

[PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Alexander Duyck
to be confused with a bus address. --- Alexander Duyck (7): swiotlb: Do not export swiotlb_bounce since there are no external consumers swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single swiotlb: Use physical addresses for swiotlb_tbl_unmap_single

[PATCH 1/7] swiotlb: Instead of tracking the end of the swiotlb region just calculate it

2012-10-05 Thread Alexander Duyck
. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 25 - 1 files changed, 12 insertions(+), 13 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index f114bf6..5cc4d4e 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -57,11 +57,11

[PATCH 2/7] swiotlb: Replace virtual io_tlb_start with physical io_tlb_addr

2012-10-05 Thread Alexander Duyck
to the physical one needed for testing an existing DMA address. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 67 + 1 files changed, 34 insertions(+), 33 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index

[PATCH 4/7] swiotlb: Return physical addresses when calling swiotlb_tbl_map_single

2012-10-05 Thread Alexander Duyck
buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c | 22 ++--- include/linux/swiotlb.h | 11 +- lib/swiotlb.c | 78 +++-- 3 files changed, 59 insertions(+), 52 deletions(-) diff

[PATCH 3/7] swiotlb: Make io_tlb_overflow_buffer a physical address

2012-10-05 Thread Alexander Duyck
that depended on that functionality be updated. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 61 - 1 files changed, 34 insertions(+), 27 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 3c45f10

[PATCH 6/7] swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single

2012-10-05 Thread Alexander Duyck
to orig_addr, and dma_addr to tlb_addr. This way it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |3 +-- include/linux/swiotlb.h

[PATCH 7/7] swiotlb: Do not export swiotlb_bounce since there are no external consumers

2012-10-05 Thread Alexander Duyck
it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- include/linux/swiotlb.h |3 --- lib/swiotlb.c | 35 --- 2 files

[PATCH 5/7] swiotlb: Use physical addresses for swiotlb_tbl_unmap_single

2012-10-05 Thread Alexander Duyck
to orig_addr, and dma_addr to tlb_addr. This way it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |4 ++-- include/linux/swiotlb.h

[PATCH v3 0/8] Improve performance of VM translation on x86_64

2012-11-05 Thread Alexander Duyck
expensive. However the default build for x86_64 increases the vmlinux size by 3.5K with this change applied. --- Alexander Duyck (8): x86/lguest: Use __pa_symbol instead of __pa on C visible symbols x86/acpi: Use __pa_symbol instead of __pa on C visible symbols x86/xen: Use __pa_symbol

[PATCH v3 2/8] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

2012-11-05 Thread Alexander Duyck
system this reduced the size for __pa_symbol from 5 instructions totalling 30 bytes to 3 instructions totalling 16 bytes. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/include/asm/page.h |3 ++- arch/x86/include/asm/page_32.h |1 + arch/x86/include

[PATCH v3 1/8] x86: Improve __phys_addr performance by making use of carry flags and inlining

2012-11-05 Thread Alexander Duyck
their type from UL to ULL. Finally I also applied the same logic changes to __virt_addr_valid since it used the same general code flow as __phys_addr and could achieve similar gains through these changes. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- v3: Added changes

[PATCH v3 3/8] x86: Drop 4 unnecessary calls to __pa_symbol

2012-11-05 Thread Alexander Duyck
to just change the two cases I found so that they are always just treated as x - y. As such I am casting the values to phys_addr_t and then doing simple subtraction so that the correct type and value is returned. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel
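
The reasoning in this snippet is that translating two symbols and then subtracting gives the same result as subtracting the addresses directly, since the translation offset cancels out. A small user-space sketch of the "cast and subtract" form follows; `region` and `span_between` are hypothetical stand-ins for a linker-defined range and are not taken from the patch.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Stand-in for a range bounded by two linker-defined symbols. */
static char region[4096];

/*
 * Size of the range between two addresses: cast both to an integer
 * address type and subtract. Any constant offset that a translation
 * would add to both operands cancels, so no translation is needed.
 */
static phys_addr_t span_between(const void *start, const void *end)
{
	return (phys_addr_t)(uintptr_t)end - (phys_addr_t)(uintptr_t)start;
}
```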

[PATCH v3 4/8] x86: Use __pa_symbol instead of __pa on C visible symbols

2012-11-05 Thread Alexander Duyck
was able to reduce the overhead of kernel symbol to virtual memory translation by using a combination of __va(__pa_symbol()) instead of page_address(virt_to_page()). Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- v3: Added changes to init_64.c function mark_rodata_ro to avoid

[PATCH v3 5/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

2012-11-05 Thread Alexander Duyck
however if we know that the instruction pointer is somewhere between _text and _etext we know that we are going to be translating an address from the kernel text space. Cc: Steven Rostedt rost...@goodmis.org Cc: Frederic Weisbecker fweis...@gmail.com Signed-off-by: Alexander Duyck alexander.h.du

[PATCH v3 6/8] x86/xen: Use __pa_symbol instead of __pa on C visible symbols

2012-11-05 Thread Alexander Duyck
. Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/xen/mmu.c | 21 +++-- 1 files changed, 11 insertions(+), 10 deletions(-) diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index 4a05b39..a63e5f9 100644

[PATCH v3 7/8] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

2012-11-05 Thread Alexander Duyck
Brown len.br...@intel.com Cc: Pavel Machek pa...@ucw.cz Cc: Rafael J. Wysocki r...@sisk.pl Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel/acpi/sleep.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86

[PATCH v3 8/8] x86/lguest: Use __pa_symbol instead of __pa on C visible symbols

2012-11-05 Thread Alexander Duyck
The function lguest_write_cr3 is using __pa to convert swapper_pg_dir and initial_page_table from virtual addresses to physical. The correct function to use for these values is __pa_symbol since they are C visible symbols. Cc: Rusty Russell ru...@rustcorp.com.au Signed-off-by: Alexander Duyck
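
To illustrate why a symbol-only translation can be cheaper than a general one: a kernel symbol is known to live in the kernel image mapping, so the conversion reduces to a fixed subtraction plus the load offset, with no range test. The sketch below is a user-space approximation under stated assumptions; the map base constant and the `phys_base_sketch` value are illustrative, and `pa_symbol_sketch` is a hypothetical name, not the kernel macro.

```c
#include <stdint.h>

/* Assumption: base of the x86_64 kernel image mapping. */
#define START_KERNEL_MAP 0xffffffff80000000ULL

/* Hypothetical physical load offset of the kernel image. */
static uint64_t phys_base_sketch = 0x1000000ULL;

/*
 * Translate the virtual address of a kernel symbol to a physical
 * address. No branch is needed because the input is assumed to be
 * inside the kernel image mapping.
 */
static uint64_t pa_symbol_sketch(uint64_t sym_vaddr)
{
	return sym_vaddr - START_KERNEL_MAP + phys_base_sketch;
}
```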

Re: [PATCH v3 1/8] x86: Improve __phys_addr performance by making use of carry flags and inlining

2012-11-05 Thread Alexander Duyck
On 11/05/2012 12:24 PM, Kirill A. Shutemov wrote: On Mon, Nov 05, 2012 at 11:04:06AM -0800, Alexander Duyck wrote: This patch is meant to improve overall system performance when making use of the __phys_addr call. To do this I have implemented several changes. First if CONFIG_DEBUG_VIRTUAL

Re: [PATCH v2 1/7] swiotlb: Make io_tlb_end a physical address instead of a virtual one

2012-10-19 Thread Alexander Duyck
On 10/19/2012 07:18 AM, Konrad Rzeszutek Wilk wrote: On Thu, Oct 18, 2012 at 08:53:33AM -0700, Alexander Duyck wrote: end to be physical instead of virtual. I reviewed the code and realized that I wasn't saving anything by removing it since the overall code was larger as a result so I just

Re: [PATCH] x86: Improve 64 bit __phys_addr call performance

2012-10-24 Thread Alexander Duyck
On 10/24/2012 03:25 AM, Ingo Molnar wrote: * Alexander Duyck alexander.h.du...@intel.com wrote: This patch is meant to improve overall system performance when making use of the __phys_addr call on 64 bit x86 systems. To do this I have implemented several changes. First

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-08 Thread Alexander Duyck
On 10/06/2012 10:57 AM, Andi Kleen wrote: Inlining everything did speed things up a bit, but I still didn't reach the same speed I achieved using the patch set. However I did notice the resulting swiotlb code was considerably larger. Thanks. So your patch makes sense, but imho should pursue

[PATCH] x86: Improve 64 bit __phys_addr call performance

2012-10-09 Thread Alexander Duyck
from UL to ULL. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/include/asm/page_64_types.h | 16 ++-- arch/x86/kernel/x8664_ksyms_64.c |3 +++ arch/x86/mm/physaddr.c | 20 ++-- 3 files changed, 31 insertions(+), 8

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-09 Thread Alexander Duyck
On 10/08/2012 08:43 AM, Alexander Duyck wrote: On 10/06/2012 10:57 AM, Andi Kleen wrote: BTW __pa used to be a simple subtraction, the if () was just added to handle the few call sites for x86-64 that do __pa(text_symbol). Maybe we should just go back to the old __pa_symbol() for those cases

Re: [RFC PATCH 2/7] swiotlb: Make io_tlb_start a physical address instead of a virtual address

2012-10-09 Thread Alexander Duyck
On 10/09/2012 09:43 AM, Konrad Rzeszutek Wilk wrote: On Thu, Oct 04, 2012 at 01:22:58PM -0700, Alexander Duyck wrote: On 10/04/2012 10:19 AM, Konrad Rzeszutek Wilk wrote: @@ -450,7 +451,7 @@ void *swiotlb_tbl_map_single(struct device *hwdev, dma_addr_t tbl_dma_addr

[PATCH] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

2012-10-10 Thread Alexander Duyck
system this reduced the size for __pa_symbol from 5 instructions totalling 30 bytes to 3 instructions totalling 16 bytes. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/include/asm/page.h |3 ++- arch/x86/include/asm/page_32.h |1 + arch/x86/include

Re: [PATCH] x86: Improve 64 bit __phys_addr call performance

2012-10-10 Thread Alexander Duyck
On 10/10/2012 06:58 AM, Andi Kleen wrote: The second change was to streamline the code by making use of the carry flag on an add operation instead of performing a compare on a 64 bit value. The advantage to this is that it allows us to reduce the overall size of the call. On my Xeon E5
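
One way to express the carry-based check described above in portable C is sketched here. It is an illustration only: the layout constants reflect an assumed x86_64 memory map of that period, `phys_base` is set to zero for simplicity, and `phys_addr_sketch` is a hypothetical name. The unsigned subtraction wraps when the input lies below the kernel map, and comparing the input with the result detects that wrap, which a compiler can typically turn into a flag test rather than a separate 64-bit constant compare.

```c
#include <stdint.h>

/* Assumptions for illustration, not values taken from the thread. */
#define START_KERNEL_MAP 0xffffffff80000000ULL
#define PAGE_OFFSET_BASE 0xffff880000000000ULL
static uint64_t phys_base = 0;

static uint64_t phys_addr_sketch(uint64_t x)
{
	uint64_t y = x - START_KERNEL_MAP;

	/*
	 * x > y means the subtraction did not wrap, so x was in the kernel
	 * image mapping; otherwise x was a direct-map address and the
	 * offset is corrected to be relative to PAGE_OFFSET_BASE.
	 */
	return y + ((x > y) ? phys_base : (START_KERNEL_MAP - PAGE_OFFSET_BASE));
}
```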

[PATCH v2 1/7] swiotlb: Make io_tlb_end a physical address instead of a virtual one

2012-10-11 Thread Alexander Duyck
to the physical one needed for testing an existing DMA address. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 24 +--- 1 files changed, 13 insertions(+), 11 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index f114bf6..19aac9f 100644 --- a/lib

[PATCH v2 2/7] swiotlb: Make io_tlb_start a physical address instead of a virtual one

2012-10-11 Thread Alexander Duyck
to the physical one needed for testing an existing DMA address. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 58 + 1 files changed, 29 insertions(+), 29 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index

[PATCH v2 5/7] swiotlb: Use physical addresses for swiotlb_tbl_unmap_single

2012-10-11 Thread Alexander Duyck
to orig_addr, and dma_addr to tlb_addr. This way it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |4 ++-- include/linux/swiotlb.h

[PATCH v2 3/7] swiotlb: Make io_tlb_overflow_buffer a physical address

2012-10-11 Thread Alexander Duyck
that depended on that functionality be updated. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 61 - 1 files changed, 34 insertions(+), 27 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index c492b84

[PATCH v2 4/7] swiotlb: Return physical addresses when calling swiotlb_tbl_map_single

2012-10-11 Thread Alexander Duyck
buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c | 22 ++--- include/linux/swiotlb.h | 11 +- lib/swiotlb.c | 78 +++-- 3 files changed, 59 insertions(+), 52 deletions(-) diff

[PATCH v2 6/7] swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single

2012-10-11 Thread Alexander Duyck
to orig_addr, and dma_addr to tlb_addr. This way it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |3 +-- include/linux/swiotlb.h

[PATCH v2 7/7] swiotlb: Do not export swiotlb_bounce since there are no external consumers

2012-10-11 Thread Alexander Duyck
it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- include/linux/swiotlb.h |3 --- lib/swiotlb.c | 35 --- 2 files

[PATCH v2 0/7] Improve swiotlb performance by using physical addresses

2012-10-11 Thread Alexander Duyck
to physical addresses. As such I have updated the patch so that it instead is converting io_tlb_end from a virtual address to a physical address. This actually helps to reduce the overhead for is_swiotlb_buffer and swiotlb_dma_supported by several instructions. --- Alexander Duyck (7

[PATCH v2 0/8] Improve performance of VM translation on x86_64

2012-10-11 Thread Alexander Duyck
in the 1% to 2% increase in overall performance. The remaining patches are various cleanups for a number of spots where __pa or virt_to_phys was being called and was not needed or __pa_symbol could have been used. --- Alexander Duyck (8): x86/lguest: Use __pa_symbol instead of __pa on C visible

[PATCH v2 2/8] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

2012-10-11 Thread Alexander Duyck
system this reduced the size for __pa_symbol from 5 instructions totalling 30 bytes to 3 instructions totalling 16 bytes. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/include/asm/page.h |3 ++- arch/x86/include/asm/page_32.h |1 + arch/x86/include

[PATCH v2 4/8] x86: Use __pa_symbol instead of __pa on C visible symbols

2012-10-11 Thread Alexander Duyck
When I made an attempt at separating __pa_symbol and __pa I found that there were a number of cases where __pa was used on an obvious symbol. I also caught one non-obvious case as _brk_start and _brk_end are based on the address of __brk_base which is a C visible symbol. Signed-off-by: Alexander

[PATCH v2 5/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

2012-10-11 Thread Alexander Duyck
however if we know that the instruction pointer is somewhere between _text and _etext we know that we are going to be translating an address from the kernel text space. Cc: Steven Rostedt rost...@goodmis.org Cc: Frederic Weisbecker fweis...@gmail.com Signed-off-by: Alexander Duyck alexander.h.du

[PATCH v2 1/8] x86: Improve __phys_addr performance by making use of carry flags and inlining

2012-10-11 Thread Alexander Duyck
their type from UL to ULL. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/include/asm/page_64_types.h | 17 +++-- arch/x86/kernel/x8664_ksyms_64.c |3 +++ arch/x86/mm/physaddr.c | 20 ++-- 3 files changed, 32

[PATCH v2 7/8] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

2012-10-11 Thread Alexander Duyck
Brown len.br...@intel.com Cc: Pavel Machek pa...@ucw.cz Cc: Rafael J. Wysocki r...@sisk.pl Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel/acpi/sleep.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86

[PATCH v2 6/8] x86/xen: Use __pa_symbol instead of __pa on C visible symbols

2012-10-11 Thread Alexander Duyck
. Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/xen/mmu.c | 19 ++- 1 files changed, 10 insertions(+), 9 deletions(-) diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index fd28d86..c50a87e 100644

[PATCH v2 8/8] x86/lguest: Use __pa_symbol instead of __pa on C visible symbols

2012-10-11 Thread Alexander Duyck
The function lguest_write_cr3 is using __pa to convert swapper_pg_dir and initial_page_table from virtual addresses to physical. The correct function to use for these values is __pa_symbol since they are C visible symbols. Cc: Rusty Russell ru...@rustcorp.com.au Signed-off-by: Alexander Duyck

[PATCH v2 3/8] x86: Drop 4 unnecessary calls to __pa_symbol

2012-10-11 Thread Alexander Duyck
to just change the two cases I found so that they are always just treated as x - y. As such I am casting the values to phys_addr_t and then doing simple subtraction so that the correct type and value is returned. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel

Re: [PATCH v2 0/8] Improve performance of VM translation on x86_64

2012-10-12 Thread Alexander Duyck
On 10/12/2012 08:15 AM, Andi Kleen wrote: Could you also add a blurb in the appropriate Documentation/ file for device driver writers mentioning that the usage of __pa_symbol is preferred? Device driver writers shouldn't use any of this anyway; they should always use the PCI DMA APIs and never

Re: [PATCH v2 0/7] Improve swiotlb performance by using physical addresses

2012-10-12 Thread Alexander Duyck
On 10/11/2012 01:34 PM, Alexander Duyck wrote: While working on 10Gb/s routing performance I found a significant amount of time was being spent in the swiotlb DMA handler. Further digging found that a significant amount of this was due to virtual to physical address translation and calling

Re: [PATCH v3 0/7] Improve swiotlb performance by using physical addresses

2012-10-29 Thread Alexander Duyck
On Mon, Oct 15, 2012 at 10:19 AM, Alexander Duyck alexander.h.du...@intel.com wrote: While working on 10Gb/s routing performance I found a significant amount of time was being spent in the swiotlb DMA handler. Further digging found that a significant amount of this was due to virtual

Re: [PATCH v2 1/7] swiotlb: Make io_tlb_end a physical address instead of a virtual one

2012-10-15 Thread Alexander Duyck
On 10/13/2012 05:52 AM, Hillf Danton wrote: Hi Alexander, On Fri, Oct 12, 2012 at 4:34 AM, Alexander Duyck alexander.h.du...@intel.com wrote: This change replaces all references to the virtual address for io_tlb_end with references to the physical address io_tlb_end. The main advantage

[PATCH v3 0/7] Improve swiotlb performance by using physical addresses

2012-10-15 Thread Alexander Duyck
realized I was causing some namespace pollution since a static char * was being replaced with phys_addr_t when it should have been static phys_addr_t. As such I have updated the first 3 patches to correctly replace static pointers with static physical addresses. --- Alexander Duyck (7

[PATCH v3 1/7] swiotlb: Make io_tlb_end a physical address instead of a virtual one

2012-10-15 Thread Alexander Duyck
to the physical one needed for testing an existing DMA address. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 24 +--- 1 files changed, 13 insertions(+), 11 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index f114bf6..c0cbfa1 100644 --- a/lib

[PATCH v3 2/7] swiotlb: Make io_tlb_start a physical address instead of a virtual one

2012-10-15 Thread Alexander Duyck
to the physical one needed for testing an existing DMA address. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 58 + 1 files changed, 29 insertions(+), 29 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index

[PATCH v3 3/7] swiotlb: Make io_tlb_overflow_buffer a physical address

2012-10-15 Thread Alexander Duyck
that depended on that functionality be updated. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/swiotlb.c | 61 - 1 files changed, 34 insertions(+), 27 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 8c4791f

[PATCH v3 5/7] swiotlb: Use physical addresses for swiotlb_tbl_unmap_single

2012-10-15 Thread Alexander Duyck
to orig_addr, and dma_addr to tlb_addr. This way it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |4 ++-- include/linux/swiotlb.h

[PATCH v3 6/7] swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single

2012-10-15 Thread Alexander Duyck
to orig_addr, and dma_addr to tlb_addr. This way it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c |3 +-- include/linux/swiotlb.h

[PATCH v3 7/7] swiotlb: Do not export swiotlb_bounce since there are no external consumers

2012-10-15 Thread Alexander Duyck
it should be clear that orig_addr is contained within io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- include/linux/swiotlb.h |3 --- lib/swiotlb.c | 35 --- 2 files

[PATCH v3 4/7] swiotlb: Return physical addresses when calling swiotlb_tbl_map_single

2012-10-15 Thread Alexander Duyck
buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- drivers/xen/swiotlb-xen.c | 22 ++--- include/linux/swiotlb.h | 11 +- lib/swiotlb.c | 78 +++-- 3 files changed, 59 insertions(+), 52 deletions(-) diff

Re: [PATCH] e1000 driver RX race condition fixed

2012-10-15 Thread Alexander Duyck
On 10/14/2012 10:19 AM, Dmitry Fleytman wrote: There is a race condition in e1000 driver. It enables HW receive before RX rings initialization. In case of specific timing this may lead to host memory corruption due to DMA write to arbitrary memory location. Following patch fixes this issue by

Re: [PATCH] e1000 driver RX race condition fixed

2012-10-15 Thread Alexander Duyck
it there. Thanks, Dmitry. On Mon, Oct 15, 2012 at 8:53 PM, Alexander Duyck alexander.h.du...@intel.com wrote: On 10/14/2012 10:19 AM, Dmitry Fleytman wrote: There is a race condition in e1000 driver. It enables HW receive before RX rings initialization. In case of specific timing this may lead to host

Re: [PATCH] e1000 driver RX race condition fixed

2012-10-15 Thread Alexander Duyck
, Oct 15, 2012 at 10:03 PM, Alexander Duyck alexander.h.du...@intel.com wrote: Hello Dmitry, My concern is that on many of our parts the behavior is to initialize both the head and tail to 0, enable Rx for either the ring or device depending on the queue configuration, and then allocate buffers

Re: [PATCH v2 1/7] swiotlb: Make io_tlb_end a physical address instead of a virtual one

2012-10-18 Thread Alexander Duyck
On 10/18/2012 05:41 AM, Konrad Rzeszutek Wilk wrote: On Mon, Oct 15, 2012 at 08:43:28AM -0700, Alexander Duyck wrote: On 10/13/2012 05:52 AM, Hillf Danton wrote: Hi Alexander, On Fri, Oct 12, 2012 at 4:34 AM, Alexander Duyck alexander.h.du...@intel.com wrote: This change replaces all

[PATCH 0/2] Address issues in dma-debug API

2013-03-18 Thread Alexander Duyck
that for the sub-maintainers to decide. --- Alexander Duyck (2): dma-debug: Fix locking bug in check_unmap dma-debug: Update DMA debug API to better handle multiple mappings of a buffer lib/dma-debug.c | 42 -- 1 files changed, 28 insertions

[PATCH 2/2] dma-debug: Update DMA debug API to better handle multiple mappings of a buffer

2013-03-18 Thread Alexander Duyck
of multiple false errors per multi-mapped buffer. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/dma-debug.c | 24 +++- 1 files changed, 19 insertions(+), 5 deletions(-) diff --git a/lib/dma-debug.c b/lib/dma-debug.c index 724bd4d..aa465d9 100644 --- a/lib/dma

[PATCH 1/2] dma-debug: Fix locking bug in check_unmap

2013-03-18 Thread Alexander Duyck
making the call to dma_mapping_error. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- lib/dma-debug.c | 18 +- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/lib/dma-debug.c b/lib/dma-debug.c index 5e396ac..724bd4d 100644 --- a/lib/dma-debug.c +++ b

Re: [PATCH v3 1/4] net: Add support for hardware-offloaded encapsulation

2012-12-07 Thread Alexander Duyck
On 12/07/2012 02:07 AM, Ben Hutchings wrote: On Thu, 2012-12-06 at 17:56 -0800, Joseph Gasparakis wrote: This patch adds support in the kernel for offloading in the NIC Tx and Rx checksumming for encapsulated packets (such as VXLAN and IP GRE). [...] --- a/include/linux/netdevice.h +++

Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation

2012-12-10 Thread Alexander Duyck
On 12/10/2012 02:04 AM, saeed bishara wrote: +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb) +{ + return (struct iphdr *)skb_inner_network_header(skb); +} Hi, I'm a little bit bothered because of those inner_ functions, what about the following approach: 1. the

Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation

2012-12-11 Thread Alexander Duyck
; shemmin...@vyatta.com; chr...@sous-sol.org; go...@redhat.com; net...@vger.kernel.org; linux-kernel@vger.kernel.org; Dmitry Kravkov; bhutchi...@solarflare.com; Peter P Waskiewicz Jr; Alexander Duyck Subject: Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation +static inline

Re: [PATCH v3 1/8] x86: Improve __phys_addr performance by making use of carry flags and inlining

2012-11-16 Thread Alexander Duyck
On 11/05/2012 02:08 PM, Kirill A. Shutemov wrote: On Mon, Nov 05, 2012 at 01:56:28PM -0800, Alexander Duyck wrote: On 11/05/2012 12:24 PM, Kirill A. Shutemov wrote: On Mon, Nov 05, 2012 at 11:04:06AM -0800, Alexander Duyck wrote: This patch is meant to improve overall system performance when

[PATCH v4] x86/xen: Use __pa_symbol instead of __pa on C visible symbols

2012-11-16 Thread Alexander Duyck
. Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- v4: I have spun this patch off as a separate patch for v4 due to the fact that this patch doesn't apply cleanly to Linus's tree. As such I am submitting it based off

[PATCH v4 0/8] Improve performance of VM translation on x86_64

2012-11-16 Thread Alexander Duyck
to avoid virt_to_page calls. v4: Spun x86/xen changes off as a separate patch. Added new patch to push address translation into page_64.h. Minor change to __phys_addr_symbol to avoid unnecessary second check. --- Alexander Duyck (8): x86: Move some contents of page_64_types.h

[PATCH v4 1/8] x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

2012-11-16 Thread Alexander Duyck
initialization were already located. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/include/asm/page_64.h | 19 +++ arch/x86/include/asm/page_64_types.h | 22 -- arch/x86/include/asm/pgtable_64.h|5 + 3 files changed, 24

[PATCH v4 2/8] x86: Improve __phys_addr performance by making use of carry flags and inlining

2012-11-16 Thread Alexander Duyck
with this patch applied is slightly faster than the non-debug version without the patch. Finally I also applied the same logic changes to __virt_addr_valid since it used the same general code flow as __phys_addr and could achieve similar gains through these changes. Signed-off-by: Alexander Duyck

[PATCH v4 3/8] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

2012-11-16 Thread Alexander Duyck
system this reduced the size for __pa_symbol from 5 instructions totalling 30 bytes to 3 instructions totalling 16 bytes. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- v4: Dropped yx check in debug version of __phys_addr_symbol since we already checked for y = KERNEL_IMAGE_SIZE

[PATCH v4 4/8] x86: Drop 4 unnecessary calls to __pa_symbol

2012-11-16 Thread Alexander Duyck
to just change the two cases I found so that they are always just treated as x - y. As such I am casting the values to phys_addr_t and then doing simple subtraction so that the correct type and value is returned. Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel/head32

[PATCH v4 5/8] x86: Use __pa_symbol instead of __pa on C visible symbols

2012-11-16 Thread Alexander Duyck
was able to reduce the overhead of kernel symbol to virtual memory translation by using a combination of __va(__pa_symbol()) instead of page_address(virt_to_page()). Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- v3: Added changes to init_64.c function mark_rodata_ro to avoid

[PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

2012-11-16 Thread Alexander Duyck
however if we know that the instruction pointer is somewhere between _text and _etext we know that we are going to be translating an address from the kernel text space. Cc: Steven Rostedt rost...@goodmis.org Cc: Frederic Weisbecker fweis...@gmail.com Signed-off-by: Alexander Duyck alexander.h.du

[PATCH v4 7/8] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

2012-11-16 Thread Alexander Duyck
Brown len.br...@intel.com Cc: Pavel Machek pa...@ucw.cz Cc: Rafael J. Wysocki r...@sisk.pl Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel/acpi/sleep.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel

[PATCH v4 8/8] x86/lguest: Use __pa_symbol instead of __pa on C visible symbols

2012-11-16 Thread Alexander Duyck
The function lguest_write_cr3 is using __pa to convert swapper_pg_dir and initial_page_table from virtual addresses to physical. The correct function to use for these values is __pa_symbol since they are C visible symbols. Cc: Rusty Russell ru...@rustcorp.com.au Signed-off-by: Alexander Duyck

Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

2012-11-16 Thread Alexander Duyck
On 11/16/2012 03:06 PM, H. Peter Anvin wrote: On 11/16/2012 02:45 PM, Steven Rostedt wrote: #define __pa(x) __phys_addr((unsigned long)(x)) #define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x))) I'm confused. __pa_symbol() just calls __pa() with some macro magic to its

[PATCH] x86: Fix warning about cast from pointer to integer of different size

2012-11-19 Thread Alexander Duyck
. Peter Anvin h...@linux.intel.com Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- arch/x86/kernel/head32.c |2 +- arch/x86/kernel/head64.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c index f15db0c

[RESEND][PATCH] x86: Fix warning about cast from pointer to integer of different size

2012-11-19 Thread Alexander Duyck
. Peter Anvin h...@linux.intel.com Signed-off-by: Alexander Duyck alexander.h.du...@intel.com --- Resending patch as I realized I forgot to add --auto to stgit command line and as such the Cc was ignored. Sorry for the extra noise on the list. arch/x86/kernel/head32.c |2 +- arch/x86/kernel

Re: [PATCH v4 8/9] pci: Tune secondary bus reset timing

2013-08-06 Thread Alexander Duyck
On 08/05/2013 12:37 PM, Alex Williamson wrote: The PCI spec indicates that with stable power, reset needs to be asserted for a minimum of 1ms (Trst). Seems like we should be able to assume power is stable for a runtime secondary bus reset. The current code has always used 100ms with no

Re: [PATCH v4 8/9] pci: Tune secondary bus reset timing

2013-08-07 Thread Alexander Duyck
On 08/06/2013 07:56 PM, Alex Williamson wrote: On Tue, 2013-08-06 at 16:27 -0700, Alexander Duyck wrote: On 08/05/2013 12:37 PM, Alex Williamson wrote: The PCI spec indicates that with stable power, reset needs to be asserted for a minimum of 1ms (Trst). Seems like we should be able

Re: [PATCH] dma-debug: enhance dma_debug_device_change() to check for mapping errors

2013-11-12 Thread Alexander Duyck
I think this might be overdoing the error checking by a bit. I would much rather have the DMA leaked error be visible than have it buried under messages about the failure to check for DMA errors. In my mind the DMA buffer leak is much more serious than the failure to check for mapping errors.
