Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-29 Thread Benjamin Herrenschmidt

> We have an X driver that does minimal performance costing operations.  
> As we should and will have for our other drivers.

Ok, so you use your own DDX and prevent the X vgacrapware from kicking in? Makes
sense.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-27 Thread Zachary Amsden

Benjamin Herrenschmidt wrote:
> On Wed, 2007-08-22 at 16:25 +1000, Rusty Russell wrote:
>> On Wed, 2007-08-22 at 08:34 +0300, Avi Kivity wrote:
>>> Zachary Amsden wrote:
>>>> This patch provides hypercalls for the i386 port I/O instructions,
>>>> which vastly helps guests which use native-style drivers.  For certain
>>>> VMI workloads, this provides a performance boost of up to 30%.  We
>>>> expect KVM and lguest to be able to achieve similar gains on I/O
>>>> intensive workloads.
>>>
>>> Won't these workloads be better off using paravirtualized drivers?
>>> i.e., do the native drivers with paravirt I/O instructions get anywhere
>>> near the performance of paravirt drivers?
>>
>> This patch also means I can kill off the emulation code in
>> drivers/lguest/core.c, which is a real relief.
>
> Hrm... how do you deal with X doing IOs ?
>
> Ben.

We have an X driver that does minimal performance costing operations.
As we should and will have for our other drivers.


Zach



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-27 Thread Benjamin Herrenschmidt
On Wed, 2007-08-22 at 16:25 +1000, Rusty Russell wrote:
> On Wed, 2007-08-22 at 08:34 +0300, Avi Kivity wrote:
> > Zachary Amsden wrote:
> > > This patch provides hypercalls for the i386 port I/O instructions,
> > > which vastly helps guests which use native-style drivers.  For certain
> > > VMI workloads, this provides a performance boost of up to 30%.  We
> > > expect KVM and lguest to be able to achieve similar gains on I/O
> > > intensive workloads.
> > 
> > Won't these workloads be better off using paravirtualized drivers? 
> > i.e., do the native drivers with paravirt I/O instructions get anywhere
> > near the performance of paravirt drivers?
> 
> This patch also means I can kill off the emulation code in
> drivers/lguest/core.c, which is a real relief.

Hrm... how do you deal with X doing IOs ?

Ben.




Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-27 Thread Benjamin Herrenschmidt
On Tue, 2007-08-21 at 22:23 -0700, Zachary Amsden wrote:
> In general, I/O in a virtual guest is subject to performance problems.  
> The I/O can not be completed physically, but must be virtualized.  This 
> means trapping and decoding port I/O instructions from the guest OS.  
> Not only is the trap for a #GP heavyweight, both in the processor and 
> the hypervisor (which usually has a complex #GP path), but this forces 
> the hypervisor to decode the individual instruction which has faulted.  
> Worse, even with hardware assist such as VT, the exit reason alone is 
> not sufficient to determine the true nature of the faulting instruction, 
> requiring a complex and costly instruction decode and simulation.

 .../...

How about userland? Things like X typically do I/O... You still need
to trap/emulate for these, no?

Cheers,
Ben.



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-24 Thread Pavel Machek
Hi!

> >>In general, I/O in a virtual guest is subject to performance problems.
> >>The I/O can not be completed physically, but must be virtualized.  This
> >>means trapping and decoding port I/O instructions from the guest OS.
> >>Not only is the trap for a #GP heavyweight, both in the processor and
> >>the hypervisor (which usually has a complex #GP path), but this forces
> >>the hypervisor to decode the individual instruction which has faulted.
> >>Worse, even with hardware assist such as VT, the exit reason alone is
> >>not sufficient to determine the true nature of the faulting instruction,
> >>requiring a complex and costly instruction decode and simulation.
> >>
> >>This patch provides hypercalls for the i386 port I/O instructions, which
> >>vastly helps guests which use native-style drivers.  For certain VMI
> >>workloads, this provides a performance boost of up to 30%.  We expect
> >>KVM and lguest to be able to achieve similar gains on I/O intensive
> >>workloads.
> >
> >What about cost on hardware?
>
> On modern hardware, port I/O is about the most expensive thing you can
> do.  The extra function call cost is totally masked by the stall.  We
> have measured with port I/O converted like this on real hardware, and
> have seen zero measurable impact on macro-benchmarks.  Micro-benchmarks
> that generate massively repeated port I/O might show some effect on
> ancient hardware, but I can't even imagine a workload which does such a
> thing, other than a polling port I/O loop perhaps - which would not be
> performance critical in any case I can reasonably imagine.

SCSI controller in ISA slot? IDE without DMA enabled?

Yes, those are performance-critical. The second case seems common with
compactflash cards.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Rusty Russell
On Wed, 2007-08-22 at 22:25 +0100, Alan Cox wrote:
> > I still think it's preferable to change some drivers than everybody.
> > 
> > AFAIK BusLogic as real hardware is pretty much dead anyways,
> > so you're probably the only primary user of it anyways.
> > Go wild on it!
> 
> I don't believe anyone is materially maintaining the buslogic driver and
> in time it's going to break completely.
> 
> > Well that might be. I just think it would be a mistake
> > to design paravirt_ops based on someone's short term release engineering
> > considerations.
> 
> Agreed, especially as an interface where each in or out traps into the
> hypervisor is broken even for the model of virtualising hardware. 

I'd really like lguest guests not to do ins and outs, but that's likely
to be more invasive a change than this.  We do it to find the PCI bus
IIRC, and a couple of other early probe bits.

It's just unfortunate that it's the one place lguest has to emulate
because of lack of paravirt_ops coverage.

Rusty.



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
On Wed, Aug 22, 2007 at 05:38:31PM -0700, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> > On Wed, Aug 22, 2007 at 04:14:41PM -0700, Jeremy Fitzhardinge wrote:
> >   
> >> (which would also have VT, since
> >> all new processors do).
> >> 
> >
> > Not true unfortunately. The Intel low end parts like Celerons (which 
> > are actually shipped in very large numbers) don't. Also Intel
> > is still shipping some CPUs that don't support it at all, like
> > the ULV Centrinos which are based on an older core.
> >   
> 
> Likely to be missing VT-d too, right?

VT-d is chipset functionality. So it depends on the chipset.

At least initially the non Intel chipsets and lowend chips are unlikely
to get IOMMUs I guess.

There might be some exceptions. e.g. the GPU vendors seem
to want to do their own IOMMUs, so perhaps graphic devices
might have them anyways.

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Jeremy Fitzhardinge
Andi Kleen wrote:
> On Wed, Aug 22, 2007 at 04:14:41PM -0700, Jeremy Fitzhardinge wrote:
>   
>> (which would also have VT, since
>> all new processors do).
>> 
>
> Not true unfortunately. The Intel low end parts like Celerons (which 
> are actually shipped in very large numbers) don't. Also Intel
> is still shipping some CPUs that don't support it at all, like
> the ULV Centrinos which are based on an older core.
>   

Likely to be missing VT-d too, right?

J


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
On Wed, Aug 22, 2007 at 04:14:41PM -0700, Jeremy Fitzhardinge wrote:
> (which would also have VT, since
> all new processors do).

Not true unfortunately. The Intel low end parts like Celerons (which 
are actually shipped in very large numbers) don't. Also Intel
is still shipping some CPUs that don't support it at all, like
the ULV Centrinos which are based on an older core.

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Jeremy Fitzhardinge
James Courtier-Dutton wrote:
> Ok, so I need to get a new CPU like the Intel Core Duo that has VT
> features? I have an old Pentium 4 at the moment, without any VT features.
>   

No, VT-d (as opposed to VT) is a chipset feature which allows the
hypervisor to control who's allowed to DMA where.  So you'd need a very
new machine with a VT-d capable chipset (which would also have VT, since
all new processors do).

J


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread James Courtier-Dutton
Chris Wright wrote:
> * James Courtier-Dutton ([EMAIL PROTECTED]) wrote:
>> If one could directly expose a device to the guest, this feature could
>> be extremely useful for me.
>> Is it possible? How would it manage to handle the DMA bus mastering?
> 
> Yes it's possible (Xen supports pci pass through).  Without an IOMMU
> (like Intel VT-d or AMD IOMMU) it's not DMA safe.
> 
> thanks,
> -chris

Ok, so I need to get a new CPU like the Intel Core Duo that has VT
features? I have an old Pentium 4 at the moment, without any VT features.

Kind Regards

James




Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Chris Wright
* James Courtier-Dutton ([EMAIL PROTECTED]) wrote:
> Ok, so I need to get a new CPU like the Intel Core Duo that has VT
> features? I have an old Pentium 4 at the moment, without any VT features.

Depends on your goals.  You can certainly give a paravirt Xen guest[1]
physical hardware without any VT extensions.  But that guest will be
able to DMA anywhere in memory without VT-d, so if it's an untrusted
guest you'd be taking a huge risk.

thanks,
-chris

[1] Note: this is with the xenbits.xensource.com kernel, not with a
kernel you'll get from kernel.org ATM.


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread James Courtier-Dutton
Jeremy Fitzhardinge wrote:
> Zachary Amsden wrote:
>> This patch provides hypercalls for the i386 port I/O instructions,
>> which vastly helps guests which use native-style drivers.  For certain
>> VMI workloads, this provides a performance boost of up to 30%.  We
>> expect KVM and lguest to be able to achieve similar gains on I/O
>> intensive workloads.
> 
> Two comments:
> 
> - I should dust off my "break up paravirt_ops" patch, and this would fit
> nicely into it (I think we already discussed this)
> 
> - What happens if you *don't* want to pv some of the io instructions? 
> What if you have a device which is directly exposed to the guest?


If one could directly expose a device to the guest, this feature could
be extremely useful for me.
Is it possible? How would it manage to handle the DMA bus mastering?

James


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Chris Wright
* James Courtier-Dutton ([EMAIL PROTECTED]) wrote:
> If one could directly expose a device to the guest, this feature could
> be extremely useful for me.
> Is it possible? How would it manage to handle the DMA bus mastering?

Yes it's possible (Xen supports pci pass through).  Without an IOMMU
(like Intel VT-d or AMD IOMMU) it's not DMA safe.

thanks,
-chris


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Jeremy Fitzhardinge
Zachary Amsden wrote:
> This patch provides hypercalls for the i386 port I/O instructions,
> which vastly helps guests which use native-style drivers.  For certain
> VMI workloads, this provides a performance boost of up to 30%.  We
> expect KVM and lguest to be able to achieve similar gains on I/O
> intensive workloads.

Two comments:

- I should dust off my "break up paravirt_ops" patch, and this would fit
nicely into it (I think we already discussed this)

- What happens if you *don't* want to pv some of the io instructions? 
What if you have a device which is directly exposed to the guest?

J


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Zachary Amsden

Andi Kleen wrote:
>> We might benefit from it, but would the
>> BusLogic driver?  It sets a nasty precedent for maintenance as different
>> hypervisors and emulators hack up different drivers for their own
>> performance.
>
> I still think it's preferable to change some drivers than everybody.
>
> AFAIK BusLogic as real hardware is pretty much dead anyways,
> so you're probably the only primary user of it anyways.
> Go wild on it!


It is looking juicy.  Maybe another day.

Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Zachary Amsden

Alan Cox wrote:
>> I still think it's preferable to change some drivers than everybody.
>>
>> AFAIK BusLogic as real hardware is pretty much dead anyways,
>> so you're probably the only primary user of it anyways.
>> Go wild on it!
>
> I don't believe anyone is materially maintaining the buslogic driver and
> in time it's going to break completely.

I think I was actually the last person to touch it ;)

>> Well that might be. I just think it would be a mistake
>> to design paravirt_ops based on someone's short term release engineering
>> considerations.
>
> Agreed, especially as an interface where each in or out traps into the
> hypervisor is broken even for the model of virtualising hardware.


Well, it's not necessarily broken, it's just a different model.  At some 
point the cost of maintaining a whole suite of virtual drivers becomes 
greater than leveraging a bunch of legacy drivers.  If you can eliminate 
most of the performance cost of that by changing something at a layer 
below (port I/O), it is a win even if it is not a perfect solution.


But I think I've lost the argument anyways; it doesn't seem to be for 
the greater good of Linux, and there are alternatives we can take.  
Unfortunately for me, they require a lot more work.


Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Alan Cox
> I still think it's preferable to change some drivers than everybody.
> 
> AFAIK BusLogic as real hardware is pretty much dead anyways,
> so you're probably the only primary user of it anyways.
> Go wild on it!

I don't believe anyone is materially maintaining the buslogic driver and
in time it's going to break completely.

> Well that might be. I just think it would be a mistake
> to design paravirt_ops based on someone's short term release engineering
> considerations.

Agreed, especially as an interface where each in or out traps into the
hypervisor is broken even for the model of virtualising hardware. 

> I thought we had that already?  But can't find it now :/

pci_iomap() and friends. 

Alan
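
For readers unfamiliar with that interface: pci_iomap() hands the driver
an opaque cookie, and ioread8()/iowrite8() and friends then work on that
cookie whether the region behind it is port I/O or memory mapped.  The
user-space sketch below only models the idea.  The sketch_ names, the
cookie encoding and the fake port array are assumptions made up for
illustration; this is not the kernel's implementation.

```c
#include <stdint.h>

/* Hypothetical cookie encoding: values in [BASE, BASE + SIZE) mean "port I/O". */
#define SKETCH_PIO_BASE 0x10000u
#define SKETCH_PIO_SIZE 0x10000u

/* Stand-in for the 64K x86 port space so the sketch runs in user space. */
static uint8_t sketch_port_space[SKETCH_PIO_SIZE];
static unsigned sketch_pio_accesses, sketch_mmio_accesses;

/* Turn a port number into an opaque cookie. */
static void *sketch_ioport_map(unsigned port)
{
    return (void *)(uintptr_t)(SKETCH_PIO_BASE + (port & 0xffffu));
}

static int sketch_cookie_is_pio(const void *cookie)
{
    uintptr_t c = (uintptr_t)cookie;
    return c >= SKETCH_PIO_BASE && c < SKETCH_PIO_BASE + SKETCH_PIO_SIZE;
}

/* One accessor for both kinds of region; the driver never has to care. */
static uint8_t sketch_ioread8(void *cookie)
{
    if (sketch_cookie_is_pio(cookie)) {
        sketch_pio_accesses++;
        return sketch_port_space[(uintptr_t)cookie - SKETCH_PIO_BASE];
    }
    sketch_mmio_accesses++;
    return *(volatile uint8_t *)cookie;   /* plain memory access for MMIO */
}

static void sketch_iowrite8(uint8_t value, void *cookie)
{
    if (sketch_cookie_is_pio(cookie)) {
        sketch_pio_accesses++;
        sketch_port_space[(uintptr_t)cookie - SKETCH_PIO_BASE] = value;
        return;
    }
    sketch_mmio_accesses++;
    *(volatile uint8_t *)cookie = value;
}
```

A driver written against accessors of this shape does not need its own
byte / word / long wrappers to switch between the two access methods.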


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
> No, you can't ignore it.  The page protections won't change between the 
> GP and the decoder execution, but the instruction can, causing you to 
> decode into the next page where the processor would not have.  !P 
> becomes obvious, but failure to respect NX or U/S is an exploitable 
> bug.  Put a 1 byte instruction at the end of a page crossing into a NX 
> (or supervisor page).  Remotely, keep switching between the 
> instruction and a segment override.

Then you do this check only if the instruction crosses a page.
That is very cheap to test.
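
A minimal sketch of that test in C, for illustration only.  It assumes
4 KiB pages and the 15-byte x86 instruction length limit; the function
names are made up and come from neither the patch nor any hypervisor.
The decoder would only need to re-validate the page protections
(P, NX, U/S) when the bytes it consumed spill onto the following page:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE    4096u  /* i386 base page size */
#define SKETCH_MAX_INSN_LEN 15u    /* architectural x86 instruction length limit */

/* True if an instruction of 'len' bytes starting at 'eip' touches the next page. */
static bool insn_crosses_page(uint32_t eip, unsigned len)
{
    return (eip & (SKETCH_PAGE_SIZE - 1)) + len > SKETCH_PAGE_SIZE;
}

/* Conservative variant usable before the length is known. */
static bool insn_may_cross_page(uint32_t eip)
{
    return insn_crosses_page(eip, SKETCH_MAX_INSN_LEN);
}
```

Because the instruction bytes can change between the fault and the
decode, the length passed in has to be the number of bytes the decoder
actually read, not the length the processor originally saw.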

> Result: user executes instruction on supervisor code page, learning data 
> as a result of this; code on NX page gets executed.

Most systems probably have gaps between user and supervisor
(like Linux), but ok.

> We already have drivers for all of our hardware in Linux.  Most of the 
> hardware we emulate is physical hardware, and there are no virtual 
> drivers for it.  Should we take the BusLogic driver and "paravirtualize" 
> it by adding VMI hypercalls? 

You're proposing instead to paravirtualize all drivers, even if 99.99%
of those will never ever have a driver model.

> We might benefit from it, but would the 
> BusLogic driver?  It sets a nasty precedent for maintenance as different 
> hypervisors and emulators hack up different drivers for their own 
> performance.

I still think it's preferable to change some drivers than everybody.

AFAIK BusLogic as real hardware is pretty much dead anyways,
so you're probably the only primary user of it anyways.
Go wild on it!

If you worry about it do your own drivers like the other hypervisors.
I still suspect you could go faster if you use a paravirtually
optimized driver, but I'm not going to speculate on the 
reasons why you don't want to do that.


> Our SCSI and IDE emulation and thus the drivers used by Linux are pretty 
> much fixed in stone; we are not going to go about changing a tricky 
> hardware interface to a virtual one, it is simply too risky for 

You wouldn't need to change it; just add a very simple new one
(e.g. the lguest interface is nearly trivial) 

> There is great advantage in talking to our existing device layer faster, 
> and this is something that is valuable today.

Well that might be. I just think it would be a mistake
to design paravirt_ops based on someone's short term release engineering
considerations.

> 
> >Really LinuxHAL^wparavirt ops is already so complicated that
> >any new hooks need an extremely good justification and that is
> >just not here for this.
> >
> >We can add it if you find an equivalent number of hooks
> >to eliminate.
> >  
> 
> Interesting trade.  What if I sanitized the whole I/O messy macros into 
> something fun and friendly:

That would be a cool project anyways. e.g. just moving 
the NUMAQ support separately would clean it up. But probably not enough
on its own, sorry.

But you'll need separate interfaces anyways if you want to
go down the BusLogic change path. That could well be coupled
with a cleanup.

> We might even be able to get rid of the umpteen different 
> places where drivers wrap iospace access with their own byte / word / 
> long functions so they can switch between port I/O and memory mapped I/O 
> by moving it all into common infrastructure.

I thought we had that already?  But can't find it now :/

> 
> We could make similar (unwelcome?) advances on the pte functions if it 
> were not for the regrettable disconnect between pte_high / pte_low and 
> the rest.  Perhaps if it was hidden in macros?

You want to do what exactly? 

If you mean PAE and non PAE In the same binary: that would likely
need abstracted page tables first.

-Andi



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Zachary Amsden

Andi Kleen wrote:
> How is that measured? In a loop? In the same pipeline state?
>
> It seems a little dubious to me.


I did the experiments in a controlled environment, with interrupts 
disabled and care to get the pipeline in the same state.  It was a 
perfectly repeatable experiment.  I don't have exact cycle time anymore, 
but they were the tightest measurements I've ever seen on cycle counts 
because of the unique nature of serializing the processor for the fault 
/ privilege transition.  I tested a variety of different conditions, 
including different types of #GP (yes, the cost does vary), #NP, #PF, 
sysenter, int $0xxx.  Sysenter was the fastest, by far.  Int was about 
5x the cost.  #GP and friends were all about similar costs.  #PF was the 
most expensive.



  
>>>> to verify protection in the page tables mapping the page allows
>>>> execution (P, !NX, and U/S check).  This is a lot more expensive than a
>>>
>>> When the page is not executable or not present you get #PF not #GP.
>>> So the hardware already checks that.
>>>
>>> The only case where you would need to check yourself is if you emulate
>>> NX on non NX capable hardware, but I can't see you doing that.
>>
>> No, it doesn't.  Between the #GP and decode, you have an SMP race where
>> another processor can rewrite the instruction.
>
> That can be ignored imho. If the page goes away you'll notice
> when you handle the page fault on read. If it becomes NX then the execution
> just happened to be logically a little earlier.

  


No, you can't ignore it.  The page protections won't change between the 
GP and the decoder execution, but the instruction can, causing you to 
decode into the next page where the processor would not have.  !P 
becomes obvious, but failure to respect NX or U/S is an exploitable 
bug.  Put a 1 byte instruction at the end of a page crossing into a NX 
(or supervisor page).  Remotely, keep switching between the 
instruction and a segment override.


Result: user executes instruction on supervisor code page, learning data 
as a result of this; code on NX page gets executed.



> Or easier to just write a backend for the lguest virtio drivers,
> that will be likely faster in the end anyways than this gross
> hack.
  


We already have drivers for all of our hardware in Linux.  Most of the 
hardware we emulate is physical hardware, and there are no virtual 
drivers for it.  Should we take the BusLogic driver and "paravirtualize" 
it by adding VMI hypercalls?  We might benefit from it, but would the 
BusLogic driver?  It sets a nasty precedent for maintenance as different 
hypervisors and emulators hack up different drivers for their own 
performance.


Our SCSI and IDE emulation and thus the drivers used by Linux are pretty 
much fixed in stone; we are not going to go about changing a tricky 
hardware interface to a virtual one, it is simply too risky for 
something as critical as storage.  We might be able to move our network 
driver over to virtio, but that is not a short-term prospect either.


There is great advantage in talking to our existing device layer faster, 
and this is something that is valuable today.



> Really LinuxHAL^wparavirt ops is already so complicated that
> any new hooks need an extremely good justification and that is
> just not here for this.
>
> We can add it if you find an equivalent number of hooks
> to eliminate.
  


Interesting trade.  What if I sanitized the whole I/O messy macros into 
something fun and friendly:


native_port_in(int port, iosize_t opsize, int delay)
native_port_out(int port, iosize_t opsize, u32 output, int delay)
native_port_string_in(int port, void *ptr, iosize_t opsize, unsigned count, int delay)
native_port_string_out(int port, void *ptr, iosize_t opsize, unsigned count, int delay)


Then we can be rid of all the macro goo in io.h, which frightens my 
mother.  We might even be able to get rid of the umpteen different 
places where drivers wrap iospace access with their own byte / word / 
long functions so they can switch between port I/O and memory mapped I/O 
by moving it all into common infrastructure.
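
To make the shape of those four entry points concrete, here is a
user-space sketch.  Only the function names and parameter lists come
from the prototypes above.  The return types, the iosize_t values, the
little-endian byte order and the fake port array standing in for real
port space are assumptions for illustration, and the delay argument is
accepted but ignored.

```c
#include <stdint.h>

/* Assumed encoding: the enum value is the access width in bytes. */
typedef enum { IO_BYTE = 1, IO_WORD = 2, IO_LONG = 4 } iosize_t;

/* Stand-in for the 64K port space (padded so a 4-byte access at 0xffff fits). */
static uint8_t fake_port_space[0x10000 + 4];

static uint32_t native_port_in(int port, iosize_t opsize, int delay)
{
    uint32_t value = 0;
    (void)delay;
    for (unsigned i = 0; i < (unsigned)opsize; i++)
        value |= (uint32_t)fake_port_space[(port & 0xffff) + i] << (8 * i);
    return value;
}

static void native_port_out(int port, iosize_t opsize, uint32_t output, int delay)
{
    (void)delay;
    for (unsigned i = 0; i < (unsigned)opsize; i++)
        fake_port_space[(port & 0xffff) + i] = (uint8_t)(output >> (8 * i));
}

/* Read 'count' items from one port into consecutive memory, like INS. */
static void native_port_string_in(int port, void *ptr, iosize_t opsize,
                                  unsigned count, int delay)
{
    uint8_t *dst = ptr;
    for (unsigned n = 0; n < count; n++) {
        uint32_t value = native_port_in(port, opsize, delay);
        for (unsigned i = 0; i < (unsigned)opsize; i++)
            *dst++ = (uint8_t)(value >> (8 * i));
    }
}

/* Write 'count' items from consecutive memory to one port, like OUTS. */
static void native_port_string_out(int port, void *ptr, iosize_t opsize,
                                   unsigned count, int delay)
{
    const uint8_t *src = ptr;
    for (unsigned n = 0; n < count; n++) {
        uint32_t value = 0;
        for (unsigned i = 0; i < (unsigned)opsize; i++)
            value |= (uint32_t)*src++ << (8 * i);
        native_port_out(port, opsize, value, delay);
    }
}
```

With the width passed as a parameter, a paravirt backend would need one
hook per function rather than one per width.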


We could make similar (unwelcome?) advances on the pte functions if it 
were not for the regrettable disconnect between pte_high / pte_low and 
the rest.  Perhaps if it was hidden in macros?


Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
On Wed, Aug 22, 2007 at 10:07:47AM -0700, Zachary Amsden wrote:
> >Also I fail to see the fundamental speed difference between
> >
> >mov index,register
> >int 0x...
> >...
> >switch (register) 
> >case : do emulation
> >  
> 
> Int (on p4 == ~680 cycles).
> 
> >versus
> >
> >out ...
> >#gp
> >-> switch (*eip) {
> >case 0xee:  /* etc. */ 
> > do emulation
> >  
> 
> GP = ~2000 cycles.

How is that measured? In a loop? In the same pipeline state?

It seems a little dubious to me.

> 
> >>to verify protection in the page tables mapping the page allows 
> >>execution (P, !NX, and U/S check).  This is a lot more expensive than a 
> >>
> >
> >When the page is not executable or not present you get #PF not #GP. 
> >So the hardware already checks that.
> >
> >The only case where you would need to check yourself is if you emulate
> >NX on non NX capable hardware, but I can't see you doing that.
> >  
> 
> No, it doesn't.  Between the #GP and decode, you have an SMP race where 
> another processor can rewrite the instruction.

That can be ignored imho. If the page goes away you'll notice
when you handle the page fault on read. If it becomes NX then the execution
just happened to be logically a little earlier.

My other objection to this scheme is that you'll change a zillion
drivers you'll never emulate which seems just stupid. You could
just change the small handful that you emulate to use hypercalls.

Or easier to just write a backend for the lguest virtio drivers,
that will be likely faster in the end anyways than this gross
hack.

Really LinuxHAL^wparavirt ops is already so complicated that
any new hooks need an extremely good justification and that is
just not here for this.

We can add it if you find an equivalent number of hooks
to eliminate.

-Andi
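
For reference, the "switch (*eip)" path quoted above has to do roughly
the following before any emulation can start.  This is a sketch, not
code from the patch or from any hypervisor: the struct and function
names are hypothetical, it assumes 32-bit protected mode, and it handles
only the 0x66 (operand size) and 0xF3 (REP) prefixes.  Segment override
and address size prefixes are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of decoding one faulting port I/O instruction. */
struct pio_insn {
    bool is_out;       /* OUT/OUTS rather than IN/INS */
    bool is_string;    /* INS/OUTS */
    bool rep;          /* REP prefix present */
    bool port_in_dx;   /* port number comes from DX, not an immediate */
    uint8_t imm_port;  /* valid only when !port_in_dx */
    unsigned size;     /* operand size in bytes: 1, 2 or 4 */
    unsigned length;   /* total instruction length in bytes */
};

/* Decode the instruction at code[0..avail). Returns false if it is not port I/O. */
static bool decode_pio(const uint8_t *code, size_t avail, struct pio_insn *out)
{
    struct pio_insn insn = {0};
    bool opsize16 = false;
    size_t i = 0;

    /* Consume the prefixes this sketch understands. */
    while (i < avail && (code[i] == 0x66 || code[i] == 0xF3)) {
        if (code[i] == 0x66)
            opsize16 = true;
        else
            insn.rep = true;
        i++;
    }
    if (i >= avail)
        return false;

    uint8_t op = code[i++];
    switch (op) {
    case 0xE4: case 0xE5: case 0xE6: case 0xE7:  /* IN/OUT with imm8 port */
        if (i >= avail)
            return false;
        insn.imm_port = code[i++];
        break;
    case 0xEC: case 0xED: case 0xEE: case 0xEF:  /* IN/OUT with port in DX */
        insn.port_in_dx = true;
        break;
    case 0x6C: case 0x6D: case 0x6E: case 0x6F:  /* INS/OUTS */
        insn.port_in_dx = true;
        insn.is_string = true;
        break;
    default:
        return false;
    }
    /* In all three groups bit 1 selects OUT and bit 0 selects the wide form. */
    insn.is_out = (op & 0x02) != 0;
    insn.size = (op & 0x01) ? (opsize16 ? 2u : 4u) : 1u;
    insn.length = (unsigned)i;
    *out = insn;
    return true;
}
```

A hypercall passes the port, size and direction as explicit arguments,
so on that path none of this parsing is needed.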



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Alan Cox
> out is usually a single byte. Shouldn't be very expensive
> to decode. In fact it should be roughly equivalent to your
> hypercall multiplex.

Why is a performance critical path on a paravirt kernel even using I/O
instructions and not paravirtual device drivers ?

It clearly makes sense to virtualise I/O operations if you are doing that
(so you can do posting, triggers and predicted reply handling guest side
to keep the trap rate sane) but I don't see why this situation occurs in
the first place for paravirt.
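
One reading of "posting" here is posted writes: the guest queues port
writes locally and hands the whole batch to the hypervisor in a single
trap.  The toy model below only counts traps to show the effect.  Every
name in it, the queue length and the fake device array are assumptions
for illustration; nothing here is taken from an existing hypervisor
interface.

```c
#include <stdint.h>

#define POST_QUEUE_LEN 8   /* arbitrary batch size for the sketch */

struct posted_write {
    uint16_t port;
    uint32_t value;
};

static struct posted_write post_queue[POST_QUEUE_LEN];
static unsigned post_queued;           /* writes waiting in the guest */
static unsigned trap_count;            /* guest -> hypervisor transitions */
static uint32_t fake_device[0x10000];  /* hypervisor-side register state */

/* Hand every queued write to the hypervisor in one trap. */
static void flush_posted(void)
{
    if (post_queued == 0)
        return;
    trap_count++;
    for (unsigned i = 0; i < post_queued; i++)
        fake_device[post_queue[i].port] = post_queue[i].value;
    post_queued = 0;
}

/* Guest-side OUT: queue the write, trap only when the queue fills up. */
static void guest_out(uint16_t port, uint32_t value)
{
    post_queue[post_queued].port = port;
    post_queue[post_queued].value = value;
    post_queued++;
    if (post_queued == POST_QUEUE_LEN)
        flush_posted();
}

/* Guest-side IN: earlier writes must land first, then the read itself traps. */
static uint32_t guest_in(uint16_t port)
{
    flush_posted();
    trap_count++;
    return fake_device[port];
}
```

Eight writes followed by a read cost two traps in this model instead of
nine, at the price of the device seeing the writes late.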



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Zachary Amsden

Andi Kleen wrote:

On Wed, Aug 22, 2007 at 09:48:25AM -0700, Zachary Amsden wrote:
  

Andi Kleen wrote:


On Tue, Aug 21, 2007 at 10:23:14PM -0700, Zachary Amsden wrote:
 
  
In general, I/O in a virtual guest is subject to performance problems.  
The I/O can not be completed physically, but must be virtualized.  This 
means trapping and decoding port I/O instructions from the guest OS.  
Not only is the trap for a #GP heavyweight, both in the processor and 
the hypervisor (which usually has a complex #GP path), but this forces 
the hypervisor to decode the individual instruction which has faulted.  
   


Is that really that expensive? Hard to imagine.
 
  
You have an expensive (16x cost of hypercall on some processors) 



Where is the difference comming from? Are you using SYSENTER
for the hypercall?  I can't really see you using SYSENTER,
because how would you do system calls then? I bet system calls
are more frequent than in/out, so if you have decide between the
two using them for syscalls is likely faster.
  


We use sysenter for hypercalls and also for system calls.  :)


Also I fail to see the fundamental speed difference between

mov index,register
int 0x...
...
switch (register) 
case : do emulation
  


Int (on p4 == ~680 cycles).


> versus
>
> out ...
> #gp
> -> switch (*eip) {
> case 0xee:  /* etc. */
> 	do emulation
  


GP = ~2000 cycles.

> > to verify protection in the page tables mapping the page allows
> > execution (P, !NX, and U/S check).  This is a lot more expensive than a
>
> When the page is not executable or not present you get #PF not #GP.
> So the hardware already checks that.
>
> The only case where you would need to check yourself is if you emulate
> NX on non NX capable hardware, but I can't see you doing that.
  


No, it doesn't.  Between the #GP and decode, you have an SMP race where 
another processor can rewrite the instruction.


Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
On Wed, Aug 22, 2007 at 09:48:25AM -0700, Zachary Amsden wrote:
> Andi Kleen wrote:
> >On Tue, Aug 21, 2007 at 10:23:14PM -0700, Zachary Amsden wrote:
> >  
> >>In general, I/O in a virtual guest is subject to performance problems.  
> >>The I/O can not be completed physically, but must be virtualized.  This 
> >>means trapping and decoding port I/O instructions from the guest OS.  
> >>Not only is the trap for a #GP heavyweight, both in the processor and 
> >>the hypervisor (which usually has a complex #GP path), but this forces 
> >>the hypervisor to decode the individual instruction which has faulted.  
> >>
> >
> >Is that really that expensive? Hard to imagine.
> >  
> 
> You have an expensive (16x cost of hypercall on some processors) 

Where is the difference coming from? Are you using SYSENTER
for the hypercall?  I can't really see you using SYSENTER,
because how would you do system calls then? I bet system calls
are more frequent than in/out, so if you have to decide between the
two using them for syscalls is likely faster.

For an int XYZ gate I wouldn't expect that much difference to
a #GP fault.

Also I fail to see the fundamental speed difference between

mov index,register
int 0x...
...
switch (register) 
case : do emulation

versus

out ...
#gp
-> switch (*eip) {
case 0xee:  /* etc. */ 
do emulation

> privilege transition, you have to decode the instruction, then you have 

out is usually a single byte. Shouldn't be very expensive
to decode. In fact it should be roughly equivalent to your
hypercall multiplex.

> to verify protection in the page tables mapping the page allows 
> execution (P, !NX, and U/S check).  This is a lot more expensive than a 

When the page is not executable or not present you get #PF not #GP. 
So the hardware already checks that.

The only case where you would need to check yourself is if you emulate
NX on non NX capable hardware, but I can't see you doing that.

> There are 24 different possible I/O operations; sometimes with a port 
> encoded in the instruction, sometimes with input in the DX register, 
> sometimes with a rep prefix, and for 3 different operand sizes.

Most of this is a single byte which is the same as the hypercall
demux. Essentially a table lookup if you use the obvious switch()

-Andi



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Zachary Amsden

Andi Kleen wrote:
> On Tue, Aug 21, 2007 at 10:23:14PM -0700, Zachary Amsden wrote:
> > In general, I/O in a virtual guest is subject to performance problems.
> > The I/O can not be completed physically, but must be virtualized.  This
> > means trapping and decoding port I/O instructions from the guest OS.
> > Not only is the trap for a #GP heavyweight, both in the processor and
> > the hypervisor (which usually has a complex #GP path), but this forces
> > the hypervisor to decode the individual instruction which has faulted.
>
> Is that really that expensive? Hard to imagine.
  


You have an expensive (16x cost of hypercall on some processors) 
privilege transition, you have to decode the instruction, then you have 
to verify protection in the page tables mapping the page allows 
execution (P, !NX, and U/S check).  This is a lot more expensive than a 
hypercall.
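
The "P, !NX, and U/S check" can be sketched for a single leaf page-table entry as below. The bit positions are the architectural x86 ones; everything else is an assumption made for illustration. The function name `guest_fetch_allowed` is hypothetical, a real hypervisor has to apply the checks at every level of the page-table walk, and which privilege level a deprivileged guest kernel is treated as is a hypervisor policy that this sketch leaves to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-table entry bits. */
#define PTE_P   (1ULL << 0)    /* present */
#define PTE_US  (1ULL << 2)    /* set: page is accessible from CPL 3 */
#define PTE_NX  (1ULL << 63)   /* no-execute (PAE or long mode, NX enabled) */

/*
 * Would an instruction fetch at privilege level cpl be permitted by this
 * leaf entry?  Simplified: one level only, no other protection features.
 */
bool guest_fetch_allowed(uint64_t pte, int cpl)
{
    assert(cpl >= 0 && cpl <= 3);

    if (!(pte & PTE_P))
        return false;                   /* not present */
    if (pte & PTE_NX)
        return false;                   /* marked no-execute */
    if (cpl == 3 && !(pte & PTE_US))
        return false;                   /* supervisor-only page, user fetch */
    return true;
}
```

The check itself is cheap; the cost described above comes from walking the guest page tables to find the entry and from doing so on every emulated instruction.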



> e.g. you could always have a fast check for inb/outb at the beginning
> of the #GP handler. And is your initial #GP entry really more expensive
> than a hypercall?
  


The number of reasons for #GP is enormous, and there are too many paths 
to optimize with fast checks.  We do have a fast check for inb/outb; 
it's just not fast enough.


On P4, hypercall entry is 120 cycles.  #GP is about 2000.  Modern 
processors are better, but a hypercall is always faster than a fault.  
Many times, the hypercall can be handled and ready to return before a 
#GP would even complete.
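
As a worked check of the P4 figures just quoted (120 cycles for hypercall entry, about 2000 for #GP), the ratio comes out near 16.7, in line with the "16x" mentioned earlier in the thread. The helper names below are hypothetical, and the calculation covers entry cost only, not decode or device emulation.

```c
#include <assert.h>
#include <stdint.h>

/* P4 figures quoted in the message above. */
enum { HYPERCALL_ENTRY_CYCLES = 120, GP_FAULT_ENTRY_CYCLES = 2000 };

/* Entry-cost ratio, scaled by 10 so it stays in integer arithmetic. */
unsigned entry_cost_ratio_x10(void)
{
    return GP_FAULT_ENTRY_CYCLES * 10u / HYPERCALL_ENTRY_CYCLES;
}

/* Entry cycles saved when n_ios port accesses use a hypercall instead of #GP. */
uint64_t entry_cycles_saved(uint64_t n_ios)
{
    const uint64_t delta = GP_FAULT_ENTRY_CYCLES - HYPERCALL_ENTRY_CYCLES;

    assert(n_ios <= UINT64_MAX / delta);    /* guard against overflow */
    return n_ios * delta;
}
```

Under these figures, every thousand port accesses saves roughly 1.9 million cycles of entry overhead.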


On workloads that sit there and hammer network cards, these costs become 
significant, and latency sensitive network benchmarks suffer.


> > Worse, even with hardware assist such as VT, the exit reason alone is
> > not sufficient to determine the true nature of the faulting instruction,
> > requiring a complex and costly instruction decode and simulation.
>
> It's unclear to me why that should be that costly.
>
> Worst case it's a switch()
  


There are 24 different possible I/O operations; sometimes with a port 
encoded in the instruction, sometimes with input in the DX register, 
sometimes with a rep prefix, and for 3 different operand sizes.


Combine that with the MMU checks required, and it's complex and branchy 
enough to justify short-circuiting the whole thing with a simple hypercall.
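
One way to arrive at 24 (an assumption, since the message does not enumerate them): IN and OUT, each with an immediate or DX port, at three operand sizes gives 12; INS and OUTS at three sizes, with and without REP, gives another 12. The decoder below is a minimal sketch of that case analysis, not any hypervisor's actual code. The struct and function names are hypothetical, a 32-bit default operand size is assumed, and only the 0x66 and 0xF3 prefixes are handled (segment overrides, address-size and REPNE prefixes are ignored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one port I/O instruction. */
struct pio_insn {
    bool    is_out;       /* OUT/OUTS rather than IN/INS */
    bool    is_string;    /* INS/OUTS */
    bool    rep;          /* REP prefix seen (string forms only) */
    bool    port_in_dx;   /* port comes from DX, not an immediate byte */
    uint8_t imm_port;     /* immediate port when !port_in_dx */
    uint8_t size;         /* operand size in bytes: 1, 2 or 4 */
    uint8_t length;       /* instruction bytes consumed */
};

/* Returns false if the bytes are not a port I/O instruction. */
bool pio_decode(const uint8_t *code, size_t len, struct pio_insn *out)
{
    size_t i = 0;
    bool opsize16 = false, rep = false;

    assert(code != NULL && out != NULL);

    /* Consume the two prefixes this sketch understands. */
    while (i < len && (code[i] == 0x66 || code[i] == 0xF3)) {
        if (code[i] == 0x66)
            opsize16 = true;
        else
            rep = true;
        i++;
    }
    if (i >= len)
        return false;

    uint8_t op = code[i++];
    struct pio_insn d = {0};

    switch (op) {
    case 0xE4: case 0xE5: case 0xE6: case 0xE7:   /* IN/OUT, imm8 port */
        if (i >= len)
            return false;
        d.imm_port = code[i++];
        break;
    case 0xEC: case 0xED: case 0xEE: case 0xEF:   /* IN/OUT, port in DX */
        d.port_in_dx = true;
        break;
    case 0x6C: case 0x6D: case 0x6E: case 0x6F:   /* INS/OUTS, port in DX */
        d.port_in_dx = true;
        d.is_string = true;
        d.rep = rep;
        break;
    default:
        return false;
    }

    d.is_out = (op & 0x02) != 0;                  /* bit 1 selects direction */
    d.size = (op & 0x01) ? (opsize16 ? 2 : 4) : 1; /* bit 0 selects byte vs wide */
    d.length = (uint8_t)i;
    *out = d;
    return true;
}
```

Even this reduced form has prefix handling, bounds checks and a three-way opcode split before any emulation starts, and it still omits the memory operand handling that the string forms need.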


Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Zachary Amsden

Avi Kivity wrote:
> Since this is only for newer kernels, won't updating the driver to use
> a hypercall be more efficient?  Or is this for existing out-of-tree
> drivers?


Actually, it is for in-tree drivers that we emulate but don't want to 
pollute, and one out of tree driver (that will hopefully be in tree 
soon!) that has no way to determine if making hypercalls is acceptable.


Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Jeff Garzik

Avi Kivity wrote:
> And even then, all the performance
> sensitive stuff uses mmio, no?



Depends on the hardware.

Jeff




Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Avi Kivity

Andi Kleen wrote:
> > Ah.  But that's mostly modules, so real in-core changes should be very
>
> Yes that's the big difference. Nearly all paravirt ops are concentrated
> on the core kernel, but this one affects lots of people.
>
> And why "but"? -- modules are as important as the core kernel. They're
> not second-class citizens.
  


It's not being second class; simply few modules are loaded at runtime, 
so most of the code impact is on disk.  The in-code impact is small.  If 
paravirt i/o insns are worthwhile, I don't think code size is an issue.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
> Ah.  But that's mostly modules, so real in-core changes should be very 

Yes that's the big difference. Nearly all paravirt ops are concentrated
on the core kernel, but this one affects lots of people.

And why "but"? -- modules are as important as the core kernel. They're
not second-class citizens.

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Avi Kivity

Andi Kleen wrote:
> On Wed, Aug 22, 2007 at 01:23:43PM +0300, Avi Kivity wrote:
> > Andi Kleen wrote:
> > > > I don't see why it's intrusive -- they all use the APIs, right?
> > >
> > > Yes, but it still changes them. It might have a larger impact
> > > on code size for example.
> >
> > Only if CONFIG_PARAVIRT is defined.
>
> Which eventually distribution kernels will do.
>
> > And even then, all the performance
> > sensitive stuff uses mmio, no?
>
> Not worried about performance, but just impact on code size etc.
  


Ah.  But that's mostly modules, so real in-core changes should be very 
small (say 10 bytes per call site X 10 callsites per driver X 10 
drivers... even if off by an order of magnitude it's still tiny)



--
error compiling committee.c: too many arguments to function



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
On Wed, Aug 22, 2007 at 01:23:43PM +0300, Avi Kivity wrote:
> Andi Kleen wrote:
> >>I don't see why it's intrusive -- they all use the APIs, right?
> >>
> >
> >Yes, but it still changes them. It might have a larger impact
> >on code size for example. 
> >  
> 
> Only if CONFIG_PARAVIRT is defined.  

Which eventually distribution kernels will do.

> And even then, all the performance 
> sensitive stuff uses mmio, no?

Not worried about performance, but just impact on code size etc.

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Avi Kivity

Andi Kleen wrote:
> > I don't see why it's intrusive -- they all use the APIs, right?
>
> Yes, but it still changes them. It might have a larger impact
> on code size for example.
  


Only if CONFIG_PARAVIRT is defined.  And even then, all the performance 
sensitive stuff uses mmio, no?



--
error compiling committee.c: too many arguments to function



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
> I don't see why it's intrusive -- they all use the APIs, right?

Yes, but it still changes them. It might have a larger impact
on code size for example. 

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Avi Kivity

Andi Kleen wrote:
> > This patch also means I can kill off the emulation code in
> > drivers/lguest/core.c, which is a real relief.
>
> But would it be faster? If not, or only by an insignificant amount, I think I
> would prefer you keep it. Hooking IO is quite intrusive because it's done
> by so many drivers.
  


I don't see why it's intrusive -- they all use the APIs, right?


--
error compiling committee.c: too many arguments to function



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
> This patch also means I can kill off the emulation code in
> drivers/lguest/core.c, which is a real relief.

But would it be faster? If not, or only by an insignificant amount, I think I
would prefer you keep it. Hooking IO is quite intrusive because it's done
by so many drivers.

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Andi Kleen
On Tue, Aug 21, 2007 at 10:23:14PM -0700, Zachary Amsden wrote:
> In general, I/O in a virtual guest is subject to performance problems.  
> The I/O can not be completed physically, but must be virtualized.  This 
> means trapping and decoding port I/O instructions from the guest OS.  
> Not only is the trap for a #GP heavyweight, both in the processor and 
> the hypervisor (which usually has a complex #GP path), but this forces 
> the hypervisor to decode the individual instruction which has faulted.  

Is that really that expensive? Hard to imagine.

e.g. you could always have a fast check for inb/outb at the beginning
of the #GP handler. And is your initial #GP entry really more expensive
than a hypercall? 

> Worse, even with hardware assist such as VT, the exit reason alone is 
> not sufficient to determine the true nature of the faulting instruction, 
> requiring a complex and costly instruction decode and simulation.

It's unclear to me why that should be that costly.

Worst case it's a switch()

-Andi


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-22 Thread Avi Kivity

Zachary Amsden wrote:
> Avi Kivity wrote:
> > Zachary Amsden wrote:
> > > In general, I/O in a virtual guest is subject to performance
> > > problems.  The I/O can not be completed physically, but must be
> > > virtualized.  This means trapping and decoding port I/O instructions
> > > from the guest OS.  Not only is the trap for a #GP heavyweight, both
> > > in the processor and the hypervisor (which usually has a complex #GP
> > > path), but this forces the hypervisor to decode the individual
> > > instruction which has faulted.  Worse, even with hardware assist such
> > > as VT, the exit reason alone is not sufficient to determine the true
> > > nature of the faulting instruction, requiring a complex and costly
> > > instruction decode and simulation.
> > >
> > > This patch provides hypercalls for the i386 port I/O instructions,
> > > which vastly helps guests which use native-style drivers.  For certain
> > > VMI workloads, this provides a performance boost of up to 30%.  We
> > > expect KVM and lguest to be able to achieve similar gains on I/O
> > > intensive workloads.
> >
> > Won't these workloads be better off using paravirtualized drivers?
> > i.e., do the native drivers with paravirt I/O instructions get anywhere
> > near the performance of paravirt drivers?
>
> Yes, in general, this is true (better off with paravirt drivers).
> However, we have "paravirt" drivers which run in both
> fully-paravirtualized and fully traditionally virtualized
> environments.  As a result, they use native port I/O operations to
> interact with virtual hardware.


Suffering from terminology overdose here: "fully traditionally 
virtualized, fully-paravirtuallized, para-fullyvirtualized".


Since this is only for newer kernels, won't updating the driver to use a 
hypercall be more efficient?  Or is this for existing out-of-tree drivers?




> Since not all hypervisors have paravirtualized driver infrastructures
> and guest O/S support yet, these hypercalls can be advantageous to a
> wide range of scenarios.  Using I/O hypercalls as such gives exactly
> the same performance as paravirt drivers for us, by eliminating the
> costly decode path, and the simplicity of using the same driver code
> makes this a huge win in code complexity.


Ah, seems the answer to the last question is yes.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-21 Thread Rusty Russell
On Wed, 2007-08-22 at 08:34 +0300, Avi Kivity wrote:
> Zachary Amsden wrote:
> > This patch provides hypercalls for the i386 port I/O instructions,
> > which vastly helps guests which use native-style drivers.  For certain
> > VMI workloads, this provides a performance boost of up to 30%.  We
> > expect KVM and lguest to be able to achieve similar gains on I/O
> > intensive workloads.
> 
> Won't these workloads be better off using paravirtualized drivers? 
> i.e., do the native drivers with paravirt I/O instructions get anywhere
> near the performance of paravirt drivers?

This patch also means I can kill off the emulation code in
drivers/lguest/core.c, which is a real relief.

Cheers,
Rusty.




Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-21 Thread Zachary Amsden

H. Peter Anvin wrote:
> Zachary Amsden wrote:
> > In general, I/O in a virtual guest is subject to performance problems.
> > The I/O can not be completed physically, but must be virtualized.  This
> > means trapping and decoding port I/O instructions from the guest OS.
> > Not only is the trap for a #GP heavyweight, both in the processor and
> > the hypervisor (which usually has a complex #GP path), but this forces
> > the hypervisor to decode the individual instruction which has faulted.
> > Worse, even with hardware assist such as VT, the exit reason alone is
> > not sufficient to determine the true nature of the faulting instruction,
> > requiring a complex and costly instruction decode and simulation.
> >
> > This patch provides hypercalls for the i386 port I/O instructions, which
> > vastly helps guests which use native-style drivers.  For certain VMI
> > workloads, this provides a performance boost of up to 30%.  We expect
> > KVM and lguest to be able to achieve similar gains on I/O intensive
> > workloads.
>
> What about cost on hardware?
  


On modern hardware, port I/O is about the most expensive thing you can 
do.  The extra function call cost is totally masked by the stall.  We 
have measured with port I/O converted like this on real hardware, and 
have seen zero measurable impact on macro-benchmarks.  Micro-benchmarks 
that generate massively repeated port I/O might show some effect on 
ancient hardware, but I can't even imagine a workload which does such a 
thing, other than a polling port I/O loop perhaps - which would not be 
performance critical in any case I can reasonably imagine.
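
The "extra function call" on native hardware can be pictured as one indirect call through an operations table that is pointed at either a native backend or a hypercall backend. The sketch below runs in user space with a simulated port array; all names (`struct pio_ops`, `pio_set_backend`, `drv_outb`, `drv_inb`) are hypothetical and are not taken from the patch. On real hardware the native backend would execute the actual `out` or `in` instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operations table a driver's port accessors go through. */
struct pio_ops {
    void    (*outb)(uint8_t value, uint16_t port);
    uint8_t (*inb)(uint16_t port);
};

static uint8_t sim_ports[65536];        /* simulated port space */
static unsigned long native_count;      /* accesses via the native backend */
static unsigned long hypercall_count;   /* accesses via the hypercall backend */

static void native_outb(uint8_t value, uint16_t port)
{
    native_count++;
    sim_ports[port] = value;
}

static uint8_t native_inb(uint16_t port)
{
    native_count++;
    return sim_ports[port];
}

static void hv_outb(uint8_t value, uint16_t port)
{
    hypercall_count++;
    sim_ports[port] = value;
}

static uint8_t hv_inb(uint16_t port)
{
    hypercall_count++;
    return sim_ports[port];
}

static const struct pio_ops native_pio_ops    = { native_outb, native_inb };
static const struct pio_ops hypercall_pio_ops = { hv_outb, hv_inb };
static const struct pio_ops *active_pio_ops   = &native_pio_ops;

/* Chosen once at boot, depending on whether a hypervisor is present. */
void pio_set_backend(const struct pio_ops *ops)
{
    assert(ops != NULL && ops->outb != NULL && ops->inb != NULL);
    active_pio_ops = ops;
}

/* What driver code calls; identical in both environments. */
void drv_outb(uint8_t value, uint16_t port)
{
    active_pio_ops->outb(value, port);
}

uint8_t drv_inb(uint16_t port)
{
    return active_pio_ops->inb(port);
}
```

The driver-facing functions never change, so the same driver source works natively and under a hypervisor; the only native overhead in this sketch is the pointer dereference and indirect call.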


Zach


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-21 Thread H. Peter Anvin
Zachary Amsden wrote:
> In general, I/O in a virtual guest is subject to performance problems. 
> The I/O can not be completed physically, but must be virtualized.  This
> means trapping and decoding port I/O instructions from the guest OS. 
> Not only is the trap for a #GP heavyweight, both in the processor and
> the hypervisor (which usually has a complex #GP path), but this forces
> the hypervisor to decode the individual instruction which has faulted. 
> Worse, even with hardware assist such as VT, the exit reason alone is
> not sufficient to determine the true nature of the faulting instruction,
> requiring a complex and costly instruction decode and simulation.
> 
> This patch provides hypercalls for the i386 port I/O instructions, which
> vastly helps guests which use native-style drivers.  For certain VMI
> workloads, this provides a performance boost of up to 30%.  We expect
> KVM and lguest to be able to achieve similar gains on I/O intensive
> workloads.
> 

What about cost on hardware?

-hpa


Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-21 Thread Zachary Amsden

Avi Kivity wrote:
> Zachary Amsden wrote:
> > In general, I/O in a virtual guest is subject to performance
> > problems.  The I/O can not be completed physically, but must be
> > virtualized.  This means trapping and decoding port I/O instructions
> > from the guest OS.  Not only is the trap for a #GP heavyweight, both
> > in the processor and the hypervisor (which usually has a complex #GP
> > path), but this forces the hypervisor to decode the individual
> > instruction which has faulted.  Worse, even with hardware assist such
> > as VT, the exit reason alone is not sufficient to determine the true
> > nature of the faulting instruction, requiring a complex and costly
> > instruction decode and simulation.
> >
> > This patch provides hypercalls for the i386 port I/O instructions,
> > which vastly helps guests which use native-style drivers.  For certain
> > VMI workloads, this provides a performance boost of up to 30%.  We
> > expect KVM and lguest to be able to achieve similar gains on I/O
> > intensive workloads.
>
> Won't these workloads be better off using paravirtualized drivers?
> i.e., do the native drivers with paravirt I/O instructions get anywhere
> near the performance of paravirt drivers?
  


Yes, in general, this is true (better off with paravirt drivers).  
However, we have "paravirt" drivers which run in both 
fully-paravirtualized and fully traditionally virtualized environments.  
As a result, they use native port I/O operations to interact with 
virtual hardware.


Since not all hypervisors have paravirtualized driver infrastructures 
and guest O/S support yet, these hypercalls can be advantageous to a wide 
range of scenarios.  Using I/O hypercalls as such gives exactly the same 
performance as paravirt drivers for us, by eliminating the costly decode 
path, and the simplicity of using the same driver code makes this a huge 
win in code complexity.


Zach



Re: [PATCH] Add I/O hypercalls for i386 paravirt

2007-08-21 Thread Avi Kivity
Zachary Amsden wrote:
> In general, I/O in a virtual guest is subject to performance
> problems.  The I/O can not be completed physically, but must be
> virtualized.  This means trapping and decoding port I/O instructions
> from the guest OS.  Not only is the trap for a #GP heavyweight, both
> in the processor and the hypervisor (which usually has a complex #GP
> path), but this forces the hypervisor to decode the individual
> instruction which has faulted.  Worse, even with hardware assist such
> as VT, the exit reason alone is not sufficient to determine the true
> nature of the faulting instruction, requiring a complex and costly
> instruction decode and simulation.
>
> This patch provides hypercalls for the i386 port I/O instructions,
> which vastly helps guests which use native-style drivers.  For certain
> VMI workloads, this provides a performance boost of up to 30%.  We
> expect KVM and lguest to be able to achieve similar gains on I/O
> intensive workloads.
>


Won't these workloads be better off using paravirtualized drivers? 
i.e., do the native drivers with paravirt I/O instructions get anywhere
near the performance of paravirt drivers?


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
