On Thu, 2015-03-26 at 16:00 -0700, David Miller wrote:
> From: casca...@linux.vnet.ibm.com
> Date: Wed, 25 Mar 2015 21:43:42 -0300
>
> > On Mon, Mar 23, 2015 at 10:15:08PM -0400, David Miller wrote:
> >> From: Benjamin Herrenschmidt
> >> Date: Tue, 24 Mar 2015 13:08:10 +1100
> >>
> >> > For the large pool, we don't keep a hint so we don't know it's wrapped, ...
From: casca...@linux.vnet.ibm.com
Date: Wed, 25 Mar 2015 21:43:42 -0300
> On Mon, Mar 23, 2015 at 10:15:08PM -0400, David Miller wrote:
>> From: Benjamin Herrenschmidt
>> Date: Tue, 24 Mar 2015 13:08:10 +1100
>>
>> > For the large pool, we don't keep a hint so we don't know it's
>> > wrapped, in fact we purposefully don't use a hint to limit fragmentation on it, ...
On (03/25/15 21:43), casca...@linux.vnet.ibm.com wrote:
> However, when using large TCP send/recv (I used uperf with 64KB
> writes/reads), I noticed that on the transmit side, largealloc is not
> used, but on the receive side, cxgb4 almost only uses largealloc, while
> qlge seems to have a 1/1 usage ...
On Wed, 2015-03-25 at 21:43 -0300, casca...@linux.vnet.ibm.com wrote:
> On Mon, Mar 23, 2015 at 10:15:08PM -0400, David Miller wrote:
> > From: Benjamin Herrenschmidt
> > Date: Tue, 24 Mar 2015 13:08:10 +1100
> >
> > > For the large pool, we don't keep a hint so we don't know it's
> > > wrapped,
On Mon, Mar 23, 2015 at 10:15:08PM -0400, David Miller wrote:
> From: Benjamin Herrenschmidt
> Date: Tue, 24 Mar 2015 13:08:10 +1100
>
> > For the large pool, we don't keep a hint so we don't know it's
> > wrapped, in fact we purposefully don't use a hint to limit
> > fragmentation on it, but then, it should be used rarely enough that flushing always is, ...
From: Benjamin Herrenschmidt
Date: Tue, 24 Mar 2015 13:08:10 +1100
> For the large pool, we don't keep a hint so we don't know it's
> wrapped, in fact we purposefully don't use a hint to limit
> fragmentation on it, but then, it should be used rarely enough that
> flushing always is, I suspect, a
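Read literally, the policy benh describes is: the small pool keeps a hint and only flushes on wraparound, while the large pool has no hint, so the only safe choice is to flush on every large allocation. Below is a minimal, self-contained sketch of that policy; every name in it (struct iotbl, alloc_small, alloc_large, iommu_flush, the sizes) is invented for illustration, and this is not the actual powerpc or sparc code.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TBL_ENTRIES 4096   /* total mappable entries (made-up size) */
#define LARGE_START 3072   /* last quarter of the table is the "large" pool */

struct iotbl {
    bool used[TBL_ENTRIES];
    size_t small_hint;     /* where the small pool resumes searching */
};

/* Stand-in for the hardware IOTLB flush. */
static void iommu_flush(void)
{
    puts("flush IOTLB");
}

/* Find n consecutive free entries in [lo, hi); returns hi on failure. */
static size_t find_free(struct iotbl *t, size_t lo, size_t hi, size_t n)
{
    for (size_t i = lo; i + n <= hi; i++) {
        size_t j;
        for (j = 0; j < n && !t->used[i + j]; j++)
            ;
        if (j == n)
            return i;
    }
    return hi;
}

/* Small pool: search from the hint; only when we wrap back to the start
 * could a stale translation be reused, so that is the only time we flush. */
static size_t alloc_small(struct iotbl *t, size_t n)
{
    size_t i = find_free(t, t->small_hint, LARGE_START, n);
    if (i == LARGE_START) {
        iommu_flush();
        i = find_free(t, 0, LARGE_START, n);
        if (i == LARGE_START)
            return (size_t)-1;   /* pool exhausted */
    }
    for (size_t k = 0; k < n; k++)
        t->used[i + k] = true;
    t->small_hint = i + n;
    return i;
}

/* Large pool: no hint is kept, so we cannot tell whether this allocation
 * wrapped; the conservative answer is to flush every time, which is only
 * tolerable because large mappings are expected to be rare. */
static size_t alloc_large(struct iotbl *t, size_t n)
{
    size_t i = find_free(t, LARGE_START, TBL_ENTRIES, n);
    if (i == TBL_ENTRIES)
        return (size_t)-1;
    for (size_t k = 0; k < n; k++)
        t->used[i + k] = true;
    iommu_flush();
    return i;
}

int main(void)
{
    static struct iotbl tbl;
    printf("small mapping at %zu\n", alloc_small(&tbl, 4));
    printf("large mapping at %zu\n", alloc_large(&tbl, 128));
    return 0;
}

The asymmetry is deliberate: keeping a hint on the large pool would avoid some flushes there too, but at the cost of fragmenting the region reserved for large mappings, which is exactly the trade-off the quote describes.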
On Mon, 2015-03-23 at 21:44 -0400, David Miller wrote:
> From: Benjamin Herrenschmidt
> Date: Tue, 24 Mar 2015 09:21:05 +1100
>
> > Dave, what's your feeling there ? Does anybody around still have
> > some HW that we can test with ?
>
> I don't see what the actual problem is.
>
> Even if you use multiple pools, which we should for scalability on sun4u too, ...
benh> It might be sufficient to add a flush counter and compare it between runs
benh> if actual wall-clock benchmarks are too hard to do (especially if you
benh> don't have things like very fast network cards at hand).
benh>
benh> Number of flush / number of packets might be a sufficient metric, it
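The instrumentation benh is asking for can be as small as two counters and a ratio. A sketch under the assumption that plain globals are good enough for a quick comparison (a real kernel patch would more likely use per-pool or per-CPU statistics); all names here are invented:

#include <stdio.h>

/* Cheap instrumentation: bump one counter from the flush path and one from
 * the map path, then compare the ratio between the old and new allocators. */
static unsigned long iommu_flush_count;
static unsigned long iommu_map_count;

static void count_flush(void) { iommu_flush_count++; }
static void count_map(void)   { iommu_map_count++; }

static void report_flush_ratio(void)
{
    printf("flushes=%lu mappings=%lu flushes/mapping=%.6f\n",
           iommu_flush_count, iommu_map_count,
           iommu_map_count ? (double)iommu_flush_count / iommu_map_count
                           : 0.0);
}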
From: Benjamin Herrenschmidt
Date: Tue, 24 Mar 2015 09:21:05 +1100
> Dave, what's your feeling there ? Does anybody around still have
> some HW that we can test with ?
I don't see what the actual problem is.
Even if you use multiple pools, which we should for scalability on
sun4u too, just do t
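The multiple-pools point is about lock contention: split the mapping table into several independently locked regions so concurrent mappings rarely fight over one lock. A rough userspace sketch, with pthread mutexes standing in for kernel spinlocks and every name (iommu_arena, pick_pool, the pool count) invented for illustration:

#include <pthread.h>
#include <stddef.h>

#define NR_POOLS  4      /* made-up pool count */
#define POOL_SIZE 1024   /* made-up entries per pool */

struct iommu_pool {
    pthread_mutex_t lock;   /* stands in for a kernel spinlock */
    size_t base;            /* first table entry owned by this pool */
    size_t hint;            /* next offset to try inside this pool */
};

struct iommu_arena {
    struct iommu_pool pools[NR_POOLS];
};

static void arena_init(struct iommu_arena *a)
{
    for (int i = 0; i < NR_POOLS; i++) {
        pthread_mutex_init(&a->pools[i].lock, NULL);
        a->pools[i].base = (size_t)i * POOL_SIZE;
        a->pools[i].hint = 0;
    }
}

/* Spread callers across pools (a kernel version might hash on the CPU id),
 * so two CPUs mapping at the same time usually take different locks. */
static struct iommu_pool *pick_pool(struct iommu_arena *a, unsigned int cpu)
{
    return &a->pools[cpu % NR_POOLS];
}

Freeing has to map a returned table index back to its owning pool (index / POOL_SIZE in this sketch), which is one reason to keep the pool boundaries fixed.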
On (03/24/15 11:47), Benjamin Herrenschmidt wrote:
>
> Yes, pass a function pointer argument that can be NULL or just make it a
> member of the iommu_allocator struct (or whatever you call it) passed to
> the init function and that can be NULL. My point is we don't need a
> separate "ops" structure ...
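In other words: one optional callback, carried by the allocator structure (or passed to its init function), rather than a whole separate ops table. A sketch with invented names, only meant to show the shape of the interface:

#include <stddef.h>

/* One optional callback instead of a separate "ops" structure. */
typedef void (*iommu_flushall_fn)(void *priv);

struct iommu_allocator {
    void *priv;                   /* handed back to the callback */
    iommu_flushall_fn flushall;   /* NULL on hardware that needs no flush */
    /* ... bitmap, pools, hints would live here ... */
};

static void iommu_allocator_init(struct iommu_allocator *a,
                                 void *priv, iommu_flushall_fn flushall)
{
    a->priv = priv;
    a->flushall = flushall;       /* may legitimately be NULL */
}

static void maybe_flushall(struct iommu_allocator *a)
{
    if (a->flushall)              /* only old hardware pays this cost */
        a->flushall(a->priv);
}

A sun4u-style backend would pass its flushall routine here; sun4v and other backends that need no flush would pass NULL and never pay for the indirect call.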
On Mon, 2015-03-23 at 19:19 -0400, Sowmini Varadhan wrote:
> What I've tried to do is to have a bool large_pool arg passed
> to iommu_tbl_pool_init. In my observation (instrumented for scsi, ixgbe),
> we never allocate more than 4 pages at a time, so I pass in
> large_pool == false for all the s
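The call-site shape being described might look roughly like the sketch below; the signature is a guess for illustration (hence the _sketch suffix), not the interface that was actually posted:

#include <stdbool.h>
#include <stddef.h>

struct iommu_table;   /* opaque here; whatever the real code uses */

/* Guessed shape of the init call: the last argument decides whether a
 * large pool is carved out of the table at all. */
void iommu_tbl_pool_init_sketch(struct iommu_table *tbl,
                                size_t num_entries, bool large_pool);

/* Map path: only route a request to the large pool when one was set up
 * and the request is genuinely big (the observed cutoff in the thread
 * was "more than 4 pages at a time"). */
static bool use_large_pool(size_t npages, bool have_large_pool)
{
    return have_large_pool && npages > 4;
}

With large_pool == false the whole table stays available to the small pools, which matches the observation that the instrumented scsi and ixgbe paths never asked for more than 4 pages at a time.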
On Mon, 2015-03-23 at 19:08 -0400, Sowmini Varadhan wrote:
> > Sowmini, I see various options for the second choice. We could stick to
> > 1 pool, and basically do as before, ie, if we fail on the first pass of
> > alloc, it means we wrap around and do a flush, I don't think that will
> > cause a
On Mar 23, 2015 7:13 PM, "Sowmini Varadhan" wrote:
>
> On (03/24/15 09:21), Benjamin Herrenschmidt wrote:
> >
> > So we have two choices here that I can see:
> >
> > - Keep that old platform use the old/simpler allocator
>
> Problem with that approach is that the base "struct iommu" structure
> for sparc gets a split personality: the older one is used with the ...
On (03/24/15 09:36), Benjamin Herrenschmidt wrote:
>
> - One pool only
>
> - Whenever the allocation is before the previous hint, do a flush, that
> should only happen if a wrap around occurred or in some cases if the
> device DMA mask forced it. I think we always update the hint whenever we
>
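The rule benh sketches, stated as code: after the search places an allocation, flush only if it landed before the previous hint (which can only happen after a wraparound or when the device's DMA mask forced a low placement), and always advance the hint. Names below are illustrative only:

#include <stdbool.h>
#include <stddef.h>

struct one_pool {
    size_t hint;   /* end of the previous allocation */
};

/* 'start' is wherever the bitmap search actually placed this allocation;
 * it can land below the hint only after a wraparound, or when the device's
 * DMA mask forced a low placement. */
static bool alloc_finish(struct one_pool *p, size_t start, size_t npages,
                         void (*flushall)(void))
{
    bool need_flush = start < p->hint;   /* went backwards: wrap or mask */

    if (need_flush && flushall)
        flushall();                      /* drop any stale cached entries */

    p->hint = start + npages;            /* always update the hint */
    return need_flush;
}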
On (03/24/15 09:21), Benjamin Herrenschmidt wrote:
>
> So we have two choices here that I can see:
>
> - Keep that old platform use the old/simpler allocator
Problem with that approach is that the base "struct iommu" structure
for sparc gets a split personality: the older one is used with
the o
On Mon, 2015-03-23 at 15:05 -0400, David Miller wrote:
> From: Sowmini Varadhan
> Date: Mon, 23 Mar 2015 12:54:06 -0400
>
> > If it was only an optimization (i.e., removing it would not break
> > any functionality), and if this was done for older hardware,
> > and *if* we believe that the direction of most architectures is to follow the sun4v/HV model, ...
On Mon, 2015-03-23 at 12:54 -0400, Sowmini Varadhan wrote:
> If it was only an optimization (i.e., removing it would not break
> any functionality), and if this was done for older hardware,
> and *if* we believe that the direction of most architectures is to
> follow the sun4v/HV model, then, given ...
On Mon, 2015-03-23 at 15:05 -0400, David Miller wrote:
> From: Sowmini Varadhan
> Date: Mon, 23 Mar 2015 12:54:06 -0400
>
> > If it was only an optimization (i.e., removing it would not break
> > any functionality), and if this was done for older hardware,
> > and *if* we believe that the direction of most architectures is to follow the sun4v/HV model, ...
On (03/23/15 15:05), David Miller wrote:
>
> Why add performance regressions to old machines who already are
> suffering too much from all the bloat we are constantly adding to the
> kernel?
I have no personal opinion on this; it's a matter of choosing
whether we want to have some extra baggage
From: Sowmini Varadhan
Date: Mon, 23 Mar 2015 12:54:06 -0400
> If it was only an optimization (i.e., removing it would not break
> any functionality), and if this was done for older hardware,
> and *if* we believe that the direction of most architectures is to
> follow the sun4v/HV model, then,
On Monday 23 March 2015, Benjamin Herrenschmidt wrote:
> On Mon, 2015-03-23 at 07:04 +0100, Arnd Bergmann wrote:
> >
> > My guess is that the ARM code so far has been concerned mainly with
> > getting things to work in the first place, but scalability problems
> > will only be seen when faster CPU cores become available.
On (03/23/15 12:29), David Miller wrote:
>
> In order to elide the IOMMU flush as much as possible, I implemented
> a scheme for sun4u wherein we always allocated from low IOMMU
> addresses to high IOMMU addresses.
>
> In this regime, we only need to flush the IOMMU when we rolled over
> back to low IOMMU addresses.
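The sun4u scheme described above, reconstructed as a standalone sketch (invented names and sizes, not the historical sparc64 code): keep allocating strictly low to high from the last position, and flush only at the moment the search has to roll over to the bottom of the table.

#include <stddef.h>

#define IOMMU_ENTRIES 8192   /* made-up table size */

struct lowhigh_arena {
    unsigned char used[IOMMU_ENTRIES];
    size_t next;              /* always advances low -> high */
    unsigned long flushes;    /* how many rollovers (and thus flushes) */
};

/* Find n consecutive free entries in [lo, hi); returns hi on failure. */
static size_t scan(struct lowhigh_arena *t, size_t lo, size_t hi, size_t n)
{
    for (size_t i = lo; i + n <= hi; i++) {
        size_t j;
        for (j = 0; j < n && !t->used[i + j]; j++)
            ;
        if (j == n)
            return i;
    }
    return hi;
}

static size_t map_pages(struct lowhigh_arena *t, size_t n, void (*flush)(void))
{
    size_t i = scan(t, t->next, IOMMU_ENTRIES, n);
    if (i == IOMMU_ENTRIES) {
        /* Rolled over to low addresses: only here could an old, still
         * cached translation be handed out again, so flush here only. */
        if (flush)
            flush();
        t->flushes++;
        i = scan(t, 0, IOMMU_ENTRIES, n);
        if (i == IOMMU_ENTRIES)
            return (size_t)-1;   /* table genuinely full */
    }
    for (size_t k = 0; k < n; k++)
        t->used[i + k] = 1;
    t->next = i + n;
    return i;
}

The flushes counter makes the cost visible: with this scheme the flush rate is bounded by how often the whole table is consumed, not by the number of mappings.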
From: Sowmini Varadhan
Date: Sun, 22 Mar 2015 15:27:26 -0400
> That leaves only the odd iommu_flushall() hook, I'm trying
> to find the history behind that (needed for sun4u platforms,
> afaik, and not sure if there are other ways to achieve this).
In order to elide the IOMMU flush as much as possible, ...
On Mon, 2015-03-23 at 07:04 +0100, Arnd Bergmann wrote:
>
> My guess is that the ARM code so far has been concerned mainly with
> getting things to work in the first place, but scalability problems
> will only be seen when faster CPU cores become available.
In any case, I think this is
On Sunday 22 March 2015, Benjamin Herrenschmidt wrote:
> On Sun, 2015-03-22 at 18:07 -0400, Sowmini Varadhan wrote:
> > On (03/23/15 09:02), Benjamin Herrenschmidt wrote:
> > > > How does this relate to the ARM implementation? There is currently
> > > > an effort going on to make that one shared with ARM64 and possibly x86. ...
On Sun, 2015-03-22 at 18:07 -0400, Sowmini Varadhan wrote:
> On (03/23/15 09:02), Benjamin Herrenschmidt wrote:
> > > How does this relate to the ARM implementation? There is currently
> > > an effort going on to make that one shared with ARM64 and possibly
> > > x86. Has anyone looked at both the PowerPC and ARM ways of doing the allocation to see if we could ...
On (03/23/15 09:02), Benjamin Herrenschmidt wrote:
> > How does this relate to the ARM implementation? There is currently
> > an effort going on to make that one shared with ARM64 and possibly
> > x86. Has anyone looked at both the PowerPC and ARM ways of doing the
> > allocation to see if we could
On Sun, 2015-03-22 at 20:36 +0100, Arnd Bergmann wrote:
> How does this relate to the ARM implementation? There is currently
> an effort going on to make that one shared with ARM64 and possibly
> x86. Has anyone looked at both the PowerPC and ARM ways of doing the
> allocation to see if we could p
On Thursday 19 March 2015, David Miller wrote:
> PowerPC folks, we're trying to kill the locking contention in our
> IOMMU allocators and noticed that you guys have a nice solution to
> this in your IOMMU code.
>
> Sowmini put together a patch series that tries to extract out the
> generic parts of ...
Turned out that I was able to iterate over it, and remove
both the ->cookie_to_index and the ->demap indirection from
iommu_tbl_ops.
That leaves only the odd iommu_flushall() hook, I'm trying
to find the history behind that (needed for sun4u platforms,
afaik, and not sure if there are other ways to achieve this).
On 03/19/2015 02:01 PM, Benjamin Herrenschmidt wrote:
Ben> One thing I noticed is the asymmetry in your code between the alloc
Ben> and the free path. The alloc path is similar to us in that the lock
Ben> covers the allocation and that's about it, there's no actual mapping to
Ben> the HW done, it'
On 03/19/2015 02:01 PM, Benjamin Herrenschmidt wrote:
On Wed, 2015-03-18 at 22:25 -0400, David Miller wrote:
PowerPC folks, we're trying to kill the locking contention in our
IOMMU allocators and noticed that you guys have a nice solution to
this in your IOMMU code.
.../...
Adding Alexei too who is currently doing some changes to our iommu code ...
On Wed, 2015-03-18 at 22:25 -0400, David Miller wrote:
> PowerPC folks, we're trying to kill the locking contention in our
> IOMMU allocators and noticed that you guys have a nice solution to
> this in your IOMMU code.
.../...
Adding Alexei too who is currently doing some changes to our iommu
code ...
From: Benjamin Herrenschmidt
Date: Thu, 19 Mar 2015 13:46:15 +1100
> Sounds like a good idea ! CC'ing Anton who wrote the pool stuff. I'll
> try to find somebody to work on that here & will let you know asap.
Thanks a lot Ben.
On Wed, 2015-03-18 at 22:25 -0400, David Miller wrote:
> PowerPC folks, we're trying to kill the locking contention in our
> IOMMU allocators and noticed that you guys have a nice solution to
> this in your IOMMU code.
>
> Sowmini put together a patch series that tries to extract out the
> generic parts of ...