Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-09 Thread Alexander Duyck
On 10/08/2012 08:43 AM, Alexander Duyck wrote:
> On 10/06/2012 10:57 AM, Andi Kleen wrote:
>> BTW __pa used to be a simple subtraction, the if () was just added to
>> handle the few call sites for x86-64 that do __pa(text_symbol).
>> Maybe we should just go back to the old __pa_symbol() for those

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-08 Thread Alexander Duyck
On 10/06/2012 10:57 AM, Andi Kleen wrote:
>> Inlining everything did speed things up a bit, but I still didn't reach
>> the same speed I achieved using the patch set. However I did notice the
>> resulting swiotlb code was considerably larger.
> Thanks. So your patch makes sense, but imho should

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-06 Thread H. Peter Anvin
On 10/06/2012 10:57 AM, Andi Kleen wrote:
>
> Maybe it's just me, but that's somehow sad for one if() and a subtraction
>
> BTW __pa used to be a simple subtraction, the if () was just added to
> handle the few call sites for x86-64 that do __pa(text_symbol).
> Maybe we should just go back to the
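
The "one if() and a subtraction" remark can be pictured with a small user-space sketch. This is not the kernel's code: the constants and every `sketch_` name below are hypothetical, chosen only to show the shape of the idea, namely that a translation for directly mapped memory is a single subtraction, while a generic helper that must also accept kernel text symbols needs a branch to pick the right base.

```c
#include <stdint.h>

/* Hypothetical layout constants, for illustration only. */
#define SKETCH_PAGE_OFFSET UINT64_C(0xffff880000000000) /* base of the direct mapping */
#define SKETCH_KERNEL_MAP  UINT64_C(0xffffffff80000000) /* base of the kernel text mapping */

/* Hypothetical physical load address of the kernel image. */
static const uint64_t sketch_phys_base = UINT64_C(0x1000000);

/* Direct-mapped address: translation is one subtraction. */
static inline uint64_t sketch_pa_direct(uint64_t vaddr)
{
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Kernel text symbol: a different base, plus the image's load offset. */
static inline uint64_t sketch_pa_symbol(uint64_t vaddr)
{
    return vaddr - SKETCH_KERNEL_MAP + sketch_phys_base;
}

/* Generic helper: the branch exists only so that symbol addresses work too. */
static inline uint64_t sketch_pa_generic(uint64_t vaddr)
{
    if (vaddr >= SKETCH_KERNEL_MAP)
        return sketch_pa_symbol(vaddr);
    return sketch_pa_direct(vaddr);
}
```

Under these assumptions, callers that already know which kind of address they hold could call the specific helper and skip the branch, which is roughly the trade-off being discussed in the quoted text.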

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-06 Thread Andi Kleen
> Inlining everything did speed things up a bit, but I still didn't reach
> the same speed I achieved using the patch set. However I did notice the
> resulting swiotlb code was considerably larger.
Thanks. So your patch makes sense, but imho should pursue the inlining in parallel for other call

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Alexander Duyck
On 10/05/2012 01:02 PM, Andi Kleen wrote:
>> I was thinking the issue was all of the calls to relatively small
>> functions occurring in quick succession. The way most of this code is
>> setup it seems like it is one small function call in turn calling
>> another, and then another, and I would

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Andi Kleen
> I was thinking the issue was all of the calls to relatively small
> functions occurring in quick succession. The way most of this code is
> setup it seems like it is one small function call in turn calling
> another, and then another, and I would imagine the code fragmentation
> can have a

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Alexander Duyck
On 10/05/2012 09:55 AM, Andi Kleen wrote:
> Alexander Duyck writes:
>
>> While working on 10Gb/s routing performance I found a significant amount of
>> time was being spent in the swiotlb DMA handler. Further digging found that a
>> significant amount of this was due to the fact that virtual

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-05 Thread Andi Kleen
Alexander Duyck writes:
> While working on 10Gb/s routing performance I found a significant amount of
> time was being spent in the swiotlb DMA handler. Further digging found that a
> significant amount of this was due to the fact that virtual to physical
> address translation and calling the

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-04 Thread Alexander Duyck
On 10/04/2012 06:33 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 03, 2012 at 05:38:41PM -0700, Alexander Duyck wrote:
>> While working on 10Gb/s routing performance I found a significant amount of
>> time was being spent in the swiotlb DMA handler. Further digging found that a

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-04 Thread Alexander Duyck
On 10/04/2012 05:55 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 03, 2012 at 05:38:41PM -0700, Alexander Duyck wrote:
>> While working on 10Gb/s routing performance I found a significant amount of
>> time was being spent in the swiotlb DMA handler. Further digging found that a

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-04 Thread Konrad Rzeszutek Wilk
On Wed, Oct 03, 2012 at 05:38:41PM -0700, Alexander Duyck wrote:
> While working on 10Gb/s routing performance I found a significant amount of
> time was being spent in the swiotlb DMA handler. Further digging found that a
> significant amount of this was due to the fact that virtual to physical

Re: [RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-04 Thread Konrad Rzeszutek Wilk
On Wed, Oct 03, 2012 at 05:38:41PM -0700, Alexander Duyck wrote:
> While working on 10Gb/s routing performance I found a significant amount of
> time was being spent in the swiotlb DMA handler. Further digging found that a
> significant amount of this was due to the fact that virtual to physical

[RFC PATCH 0/7] Improve swiotlb performance by using physical addresses

2012-10-03 Thread Alexander Duyck
While working on 10Gb/s routing performance I found a significant amount of time was being spent in the swiotlb DMA handler. Further digging found that a significant amount of this was due to the fact that virtual to physical address translation and calling the function that did it. It accounted
