Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andi Kleen
> 
> > Perhaps we just need an ioctl where an X server can switch this.
> 
> Switch what?  Turn on or off transparent translation?

Turn on/off bypass for its device.
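
For concreteness, a minimal sketch of what such a per-device bypass toggle
could look like from user space (purely hypothetical; no such ioctl exists
in these patches, and every name below is invented):

/* Hypothetical sketch only -- illustrates a per-device "bypass the IOMMU"
 * toggle as seen by the X server; none of these names are real.
 */
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct iommu_bypass_req {
        unsigned int enable;    /* 1 = bypass translation, 0 = translate */
};

#define IOMMU_SET_BYPASS _IOW('I', 0x20, struct iommu_bypass_req)

static int set_bypass(int fd, int enable)
{
        struct iommu_bypass_req req = { .enable = enable };

        /* fd would be a handle the X server already holds for its device,
         * e.g. a DRM or sysfs-derived file descriptor. */
        return ioctl(fd, IOMMU_SET_BYPASS, &req);
}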

-Andi


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Jesse Barnes
On Tuesday, June 26, 2007 10:31:57 Andi Kleen wrote:
> > >> (and I think it mostly already doesn't even without that)
> > >
> > > It uses /sys/bus/pci/* which is not any better as seen from the IOMMU.
> > >
> > > Any interface will need to be explicit because user space needs to
> > > know which DMA addresses to put into the hardware. It's not enough
> > > to just transparently translate the mappings.
> >
> > that's what DRM is used for nowadays...
>
> But DRM does support much less hardware than the X server?

Yeah, the number of DRM drivers is relatively small compared to X or 
fbdev, but for simple DMA they're fairly easy to write.

> Perhaps we just need an ioctl where an X server can switch this.

Switch what?  Turn on or off transparent translation?

Jesse


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andi Kleen
> >> (and I think it mostly already doesn't even without that)
> >
> > It uses /sys/bus/pci/* which is not any better as seen from the IOMMU.
> >
> > Any interface will need to be explicit because user space needs to know
> > which DMA addresses to put into the hardware. It's not enough to just
> > transparently translate the mappings.
> 
> that's what DRM is used for nowadays...

But DRM does support much less hardware than the X server?

Perhaps we just need an ioctl where an X server can switch this.

-Andi



Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Arjan van de Ven

Andi Kleen wrote:
> On Tue, Jun 26, 2007 at 08:15:05AM -0700, Arjan van de Ven wrote:
> > > Also the user interface for X server case needs more work.
> >
> > actually with the mode setting of X moving into the kernel... X won't
> > use /dev/mem anymore at all
>
> We'll see if that happens. It has been talked about forever,
> but results are sparse.

jbarnes posted the code a few weeks ago.

> > (and I think it mostly already doesn't even without that)
>
> It uses /sys/bus/pci/* which is not any better as seen from the IOMMU.
>
> Any interface will need to be explicit because user space needs to know which
> DMA addresses to put into the hardware. It's not enough to just transparently
> translate the mappings.

that's what DRM is used for nowadays...


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Muli Ben-Yehuda
On Tue, Jun 26, 2007 at 08:48:04AM -0700, Keshavamurthy, Anil S wrote:

> Our initial benchmark results showed around 3% extra CPU
> utilization overhead when compared to native (i.e. without IOMMU).
> Again, our benchmark was on a small SMP machine and we used iperf and
> a 1G ethernet card.

Please try netperf and a bigger machine for a meaningful comparison :-)
I assume this is with e1000?

> Going forward we will do more benchmark tests and will share the
> results.

Looking forward to it.

Cheers,
Muli


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Keshavamurthy, Anil S
On Tue, Jun 26, 2007 at 11:11:25AM -0400, Muli Ben-Yehuda wrote:
> On Tue, Jun 26, 2007 at 08:03:59AM -0700, Arjan van de Ven wrote:
> > Muli Ben-Yehuda wrote:
> > >How much? we have numbers (to be presented at OLS later this week)
> > >that show that on bare-metal an IOMMU can cost as much as 15%-30% more
> > >CPU utilization for an IO intensive workload (netperf). It will be
> > >interesting to see comparable numbers for VT-d.
> > 
> > for VT-d it is a LOT less. I'll let anil give you his data :)
> 
> Looking forward to it. Note that this is on a large SMP machine with
> Gigabit ethernet, with netperf TCP stream. Comparing numbers for other
> benchmarks on other machines is ... less than useful, but the numbers
> themselves are interesting.

Our initial benchmark results showed around 3% extra CPU
utilization overhead when compared to native (i.e. without IOMMU).
Again, our benchmark was on a small SMP machine and we used
iperf and a 1G ethernet card.

Going forward we will do more benchmark tests and will share the
results.

-Anil


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andi Kleen
On Tue, Jun 26, 2007 at 11:09:40AM -0400, Muli Ben-Yehuda wrote:
> On Tue, Jun 26, 2007 at 05:56:49PM +0200, Andi Kleen wrote:
> 
> > > > - The IOMMU can merge sg lists into a single virtual block. This could
> > > > potentially speed up SG IO when the device is slow walking SG
> > > > lists.  [I long ago benchmarked 5% on some block benchmark with
> > > > an old MPT Fusion; but it probably depends a lot on the HBA]
> > > 
> > > But most devices are SG-capable.
> > 
> > Your point being?
> 
> That the fact that an IOMMU can do SG for non-SG-capable cards is not
> interesting from a "reason for inclusion" POV.

You misunderstood me; my point was that some SG-capable devices
can go faster if they get shorter SG lists.

But yes, for non-SG-capable devices it is also interesting. I expect
it will obsolete most users of that ugly external patch to allocate large
memory areas for IOs. That's a point I didn't mention earlier.

> > Also the user interface for X server case needs more work.
> 
> Is anyone working on it?

It's somewhere on the todo list.


-Andi


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andi Kleen
On Tue, Jun 26, 2007 at 08:15:05AM -0700, Arjan van de Ven wrote:
> >
> >Also the user interface for X server case needs more work.
> >
> 
> actually with the mode setting of X moving into the kernel... X won't 
> use /dev/mem anymore at all

We'll see if that happens. It has been talked about forever,
but results are sparse. 

> (and I think it mostly already doesn't even without that)

It uses /sys/bus/pci/* which is not any better as seen from the IOMMU.

Any interface will need to be explicit because user space needs to know which
DMA addresses to put into the hardware. It's not enough to just transparently
translate the mappings.
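
As a sketch of what "explicit" means here (hypothetical; nothing like this
is in the posted patches, all names invented): user space hands the kernel
a buffer, the kernel sets up the IOMMU mapping, and the resulting bus
address comes back so it can be programmed into the device.

/* Hypothetical uapi sketch -- illustrates an explicit mapping interface
 * only; none of these names exist anywhere.
 */
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct user_dma_map {
        uint64_t user_vaddr;    /* in:  user virtual address of the buffer */
        uint64_t length;        /* in:  length in bytes */
        uint64_t dma_addr;      /* out: bus address to program into the card */
};

#define USER_DMA_MAP    _IOWR('D', 0x01, struct user_dma_map)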

-Andi



Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Arjan van de Ven


> Also the user interface for X server case needs more work.

actually with the mode setting of X moving into the kernel... X won't
use /dev/mem anymore at all

(and I think it mostly already doesn't even without that)


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Muli Ben-Yehuda
On Tue, Jun 26, 2007 at 08:03:59AM -0700, Arjan van de Ven wrote:
> Muli Ben-Yehuda wrote:
> >How much? we have numbers (to be presented at OLS later this week)
> >that show that on bare-metal an IOMMU can cost as much as 15%-30% more
> >CPU utilization for an IO intensive workload (netperf). It will be
> >interesting to see comparable numbers for VT-d.
> 
> for VT-d it is a LOT less. I'll let anil give you his data :)

Looking forward to it. Note that this is on a large SMP machine with
Gigabit ethernet, with netperf TCP stream. Comparing numbers for other
benchmarks on other machines is ... less than useful, but the numbers
themselves are interesting.

Cheers,
Muli



Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Muli Ben-Yehuda
On Tue, Jun 26, 2007 at 05:56:49PM +0200, Andi Kleen wrote:

> > > - The IOMMU can merge sg lists into a single virtual block. This could
> > > potentially speed up SG IO when the device is slow walking SG
> > > lists.  [I long ago benchmarked 5% on some block benchmark with
> > > an old MPT Fusion; but it probably depends a lot on the HBA]
> > 
> > But most devices are SG-capable.
> 
> Your point being?

That the fact that an IOMMU can do SG for non-SG-capable cards is not
interesting from a "reason for inclusion" POV.

> > How much? we have numbers (to be presented at OLS later this week)
> > that show that on bare-metal an IOMMU can cost as much as 15%-30%
> > more CPU utilization for an IO intensive workload (netperf). It
> > will be interesting to see comparable numbers for VT-d.
> 
> That is something that needs more work.

Yup. I'm working on it (mostly in the context of Calgary) but also
looking at improvements to the DMA-API interface and usage.

> We should probably have a switch to use the IOMMU only for specific
> devices (e.g. for the KVM case) or only when remapping is
> needed.

Calgary already does this internally (via calgary=disable=<BUSNUM>)
but that's pretty ugly. It would be better to do it in a generic
fashion when deciding which dma_ops to call (i.e., a dma_ops per bus
or device).
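
For concreteness, a rough sketch of the kind of per-device dispatch meant
here (illustrative only; the names are invented and do not match the
existing dma_ops plumbing):

/* Illustrative sketch of per-device dispatch: use IOMMU-backed DMA ops
 * only for devices that were opted in (e.g. devices assigned to a KVM
 * guest), and keep the cheap 1:1 path for everything else.  All names
 * here are made up.
 */
#include <linux/device.h>
#include <linux/types.h>

struct demo_dma_ops {
        dma_addr_t (*map_single)(struct device *dev, void *ptr,
                                 size_t size, int direction);
};

extern const struct demo_dma_ops demo_nommu_ops;   /* 1:1 fallback */
extern const struct demo_dma_ops demo_iommu_ops;   /* VT-d/Calgary backed */

bool device_wants_iommu(struct device *dev);       /* hypothetical predicate */

static inline const struct demo_dma_ops *demo_get_dma_ops(struct device *dev)
{
        if (dev && device_wants_iommu(dev))
                return &demo_iommu_ops;
        return &demo_nommu_ops;
}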

> Also the user interface for X server case needs more work.

Is anyone working on it?

Cheers,
Muli


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Arjan van de Ven

Muli Ben-Yehuda wrote:
> How much? we have numbers (to be presented at OLS later this week)
> that show that on bare-metal an IOMMU can cost as much as 15%-30% more
> CPU utilization for an IO intensive workload (netperf). It will be
> interesting to see comparable numbers for VT-d.

for VT-d it is a LOT less. I'll let anil give you his data :)


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andi Kleen
Muli Ben-Yehuda <[EMAIL PROTECTED]> writes:

> On Tue, Jun 26, 2007 at 09:12:45AM +0200, Andi Kleen wrote:
> 
> > There are some potential performance benefits too:
> > - When you have a device that cannot address the complete address range
> > an IOMMU can remap its memory instead of bounce buffering. Remapping
> > is likely cheaper than copying. 
> 
> But those devices aren't likely to be found on modern systems.

Not true. I don't see anybody designing DAC-capable USB, firewire,
sound, or TV cards. And there are plenty of non-AHCI SATA interfaces too
(often the BIOS defaults that way because XP doesn't deal
well with AHCI). And video cards generally don't support it
(although they don't like IOMMUs either). It's just that these devices
might all not be performance relevant (except for the video cards).

> > - The IOMMU can merge sg lists into a single virtual block. This could
> > potentially speed up SG IO when the device is slow walking SG lists.
> > [I long ago benchmarked 5% on some block benchmark with an old
> > MPT Fusion; but it probably depends a lot on the HBA]
> 
> But most devices are SG-capable.

Your point being? It depends on whether the SG hardware is slow
enough that it makes a difference. I found one case where that
was true, but it's unknown how common that is.

Only benchmarks can tell.

Also my results were on a pretty slow IOMMU implementation
so with a fast one it might be different too.

> How much? we have numbers (to be presented at OLS later this week)
> that show that on bare-metal an IOMMU can cost as much as 15%-30% more
> CPU utilization for an IO intensive workload (netperf). It will be
> interesting to see comparable numbers for VT-d.

That is something that needs more work.

We should probably have a switch to use the IOMMU only for specific
devices (e.g. for the KVM case) or only when remapping is needed. A boot
option alone is probably not good enough for this. But that is something
that can be worked on once everything is in tree.

Also the user interface for X server case needs more work.

-Andi


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Muli Ben-Yehuda
On Tue, Jun 26, 2007 at 09:12:45AM +0200, Andi Kleen wrote:

> There are some potential performance benefits too:
> - When you have a device that cannot address the complete address range
> an IOMMU can remap its memory instead of bounce buffering. Remapping
> is likely cheaper than copying. 

But those devices aren't likely to be found on modern systems.

> - The IOMMU can merge sg lists into a single virtual block. This could
> potentially speed up SG IO when the device is slow walking SG lists.
> [I long ago benchmarked 5% on some block benchmark with an old
> MPT Fusion; but it probably depends a lot on the HBA]

But most devices are SG-capable.

> And you get better driver debugging because unexpected memory
> accesses from the devices will cause a trappable event.

That and direct access for KVM are the big ones, IMHO, and they definitely
justify merging.

> > Does it slow anything down?
> 
> It adds more overhead to each IO so yes.

How much? we have numbers (to be presented at OLS later this week)
that show that on bare-metal an IOMMU can cost as much as 15%-30% more
CPU utilization for an IO intensive workload (netperf). It will be
interesting to see comparable numbers for VT-d.

Cheers,
Muli


Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andi Kleen
On Tuesday 26 June 2007 08:45:50 Andrew Morton wrote:
> On Tue, 19 Jun 2007 14:37:01 -0700 "Keshavamurthy, Anil S" <[EMAIL 
> PROTECTED]> wrote:
> 
> > This patch supports the upcoming Intel IOMMU hardware
> > a.k.a. Intel(R) Virtualization Technology for Directed I/O 
> > Architecture
> 
> So...  what's all this code for?
> 
> I assume that the intent here is to speed things up under Xen, etc? 

Yes in some cases, but not with this code. That would be the Xen version
of this code, which could potentially assign whole devices to guests.
I expect this to be useful only in some special cases though, because
most hardware is not virtualizable and you typically want a separate
instance for each guest.

At some point KVM might implement this too; it would likely
use this code for that.

> Do we 
> have any benchmark results to help us to decide whether a merge would be
> justified?

The main advantage of doing it in the normal kernel is not performance but
safety: broken devices won't be able to corrupt memory by doing
random DMA.

Unfortunately that doesn't work for graphics yet; for that,
user space interfaces for the X server are needed.

There are some potential performance benefits too:
- When you have a device that cannot address the complete address range,
an IOMMU can remap its memory instead of bounce buffering. Remapping
is likely cheaper than copying.
- The IOMMU can merge sg lists into a single virtual block. This could
potentially speed up SG IO when the device is slow walking SG lists.
[I long ago benchmarked 5% on some block benchmark with an old
MPT Fusion; but it probably depends a lot on the HBA]

And you get better driver debugging because unexpected memory accesses
from the devices will cause a trappable event.
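
To make the sg-merging point concrete, this is roughly how a driver already
consumes the DMA API: dma_map_sg() may return fewer entries than it was
given when the IOMMU coalesces them, and the driver programs the device
from the returned count. A sketch only, error handling trimmed;
hba_write_entry() is a stand-in for whatever the HBA actually needs:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/errno.h>

/* Placeholder for programming one DMA descriptor into the HBA. */
extern void hba_write_entry(int idx, dma_addr_t addr, unsigned int len);

/* With an IOMMU, the count returned by dma_map_sg() can be smaller than
 * nents because discontiguous pages were remapped into a single
 * contiguous bus-address range.
 */
static int demo_map_and_program(struct device *dev, struct scatterlist *sg,
                                int nents)
{
        int i, count;

        count = dma_map_sg(dev, sg, nents, DMA_TO_DEVICE);
        if (count == 0)
                return -EIO;

        /* Program the (possibly merged) entries, not the original nents. */
        for (i = 0; i < count; i++)
                hba_write_entry(i, sg_dma_address(&sg[i]), sg_dma_len(&sg[i]));

        return count;
}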

> 
> Does it slow anything down?

It adds more overhead to each IO so yes.

-Andi



Re: [Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-26 Thread Andrew Morton
On Tue, 19 Jun 2007 14:37:01 -0700 "Keshavamurthy, Anil S" <[EMAIL PROTECTED]> 
wrote:

>   This patch supports the upcoming Intel IOMMU hardware
> a.k.a. Intel(R) Virtualization Technology for Directed I/O 
> Architecture

So...  what's all this code for?

I assume that the intent here is to speed things up under Xen, etc?  Do we
have any benchmark results to help us to decide whether a merge would be
justified?

Does it slow anything down?


[Intel IOMMU 00/10] Intel IOMMU support, take #2

2007-06-19 Thread Keshavamurthy, Anil S
Hi All,
This patch set supports the upcoming Intel IOMMU hardware,
a.k.a. Intel(R) Virtualization Technology for Directed I/O
Architecture; the hardware spec can be found here:
http://www.intel.com/technology/virtualization/index.htm

This version of the patches incorporates several pieces of
feedback from previous postings.

Some of the major changes are:
1) Removed the resource pool (a.k.a. pre-allocate pool) patch.
2) For memory allocation in the DMA map API calls we
   now use the kmem_cache_alloc() and get_zeroed_page() functions
   to allocate memory for internal data structures and for
   page table setup memory.
3) The allocation of memory in the DMA map API calls is
   very critical, and to avoid failures during memory allocation
   in the DMA map API calls we evaluated several techniques:
   a) mempool - We found that mempool is pretty much useless
      if we try to allocate memory with GFP_ATOMIC, which is
      our case. Also we found that it is difficult to judge
      how much to reserve during the creation of the mempool.
   b) PF_MEMALLOC - When a task's flags (current->flags) are
      set with PF_MEMALLOC, watermark checks are avoided
      during the memory allocation.
   We chose the latter (option b) and made it a separate patch
   which can be debated further; please see patch 6/10 and the
   sketch below.
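
For illustration, a minimal sketch of the option (b) pattern referenced
above (this is not the exact helper in patch 6/10, just the general
PF_MEMALLOC-around-GFP_ATOMIC idea):

#include <linux/sched.h>
#include <linux/gfp.h>

/* Sketch of option (b): temporarily mark the task PF_MEMALLOC so the
 * GFP_ATOMIC allocation of page-table memory may dip below the usual
 * watermarks.  The real helper in patch 6/10 may look different.
 */
static unsigned long alloc_pgtable_page_critical(void)
{
        unsigned int pflags = current->flags;
        unsigned long page;

        current->flags |= PF_MEMALLOC;
        page = get_zeroed_page(GFP_ATOMIC);
        if (!(pflags & PF_MEMALLOC))
                current->flags &= ~PF_MEMALLOC;

        return page;
}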

Other minor changes are mostly coding style fixes and
making sure that the patches pass checkpatch.pl.

Please include this set of patches in the next -mm release.

Thanks and regards,
-Anil S Keshavamurthy
E-mail: [EMAIL PROTECTED]
