Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 02:38 PM, Joerg Roedel wrote: On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Good idea. If there is interest I could help to mentor this project. Thanks. I volunteered Anthony, but he may be a little overcommitted. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 07:42 AM, Avi Kivity wrote: On 03/15/2010 02:38 PM, Joerg Roedel wrote: On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Good idea. If there is interest I could help to mentor this project. Thanks. I volunteered Anthony, but he may be a little overcommitted. Joerg, feel free to put your name against too. Regards, Anthony Liguori
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 08:11 AM, Avi Kivity wrote: On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. VMREAD/VMWRITEs are generally optimized by hypervisors as they tend to be costly. KVM is a bit unusual in terms of how many times the instructions are executed per exit. Regards, Anthony Liguori
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 08:24 AM, Joerg Roedel wrote: On Mon, Mar 15, 2010 at 03:11:42PM +0200, Avi Kivity wrote: On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. Does it matter for the ept-on-ept case? The initial patchset of nested-vmx implemented it and they reported a performance drop of around 12% between levels which is reasonable. So I expected the loss of io-performance for l2 also reasonable in this case. My small measurement was also done using npt-on-npt. But that was something like kernbench IIRC which is actually exit light once ept is enabled. Network IO is typically exit heavy and becomes something more of a pathological work load (both for nested ept and nested npt). Regards, Anthony Liguori Joerg
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 03:23 PM, Anthony Liguori wrote: On 03/15/2010 08:11 AM, Avi Kivity wrote: On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. VMREAD/VMWRITEs are generally optimized by hypervisors as they tend to be costly. KVM is a bit unusual in terms of how many times the instructions are executed per exit. Do you know offhand of any unnecessary read/writes? There's update_cr8_intercept(), but on normal exits, I don't see what else we can remove. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Good idea. If there is interest I could help to mentor this project. Joerg
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Cheers, Muli
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Mon, Mar 15, 2010 at 05:53:13AM -0700, Muli Ben-Yehuda wrote: On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Joerg
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Mon, Mar 15, 2010 at 03:11:42PM +0200, Avi Kivity wrote: On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. Does it matter for the ept-on-ept case? The initial patchset of nested-vmx implemented it and they reported a performance drop of around 12% between levels which is reasonable. So I expected the loss of io-performance for l2 also reasonable in this case. My small measurement was also done using npt-on-npt. Joerg
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Mon, Mar 15, 2010 at 02:03:11PM +0100, Joerg Roedel wrote: On Mon, Mar 15, 2010 at 05:53:13AM -0700, Muli Ben-Yehuda wrote: On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Netperf running in L1 with direct access: ~950 Mbps throughput with 25% CPU utilization. Netperf running in L2 with virtio between L2 and L1 and direct assignment between L1 and L0: roughly the same throughput, but over 90% CPU utilization! Now extrapolate to 10GbE. Cheers, Muli
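To make the "extrapolate to 10GbE" remark concrete, here is a small C sketch of the arithmetic. It assumes that CPU cost scales linearly with throughput, which the thread does not claim and which real hardware may not obey; the function name cpus_needed is made up for illustration. The inputs are the figures quoted above (about 950 Mbps at 25% CPU in L1, and at 90% CPU or more in L2).

```c
/*
 * Rough linear extrapolation (an assumption, not a measurement):
 * if `cpu_fraction` of one CPU sustains `measured_mbps`, estimate how
 * many CPUs' worth of cycles `target_mbps` would take.
 */
double cpus_needed(double measured_mbps, double cpu_fraction,
                   double target_mbps)
{
    return cpu_fraction * (target_mbps / measured_mbps);
}
```

Under that linear assumption, cpus_needed(950.0, 0.25, 10000.0) comes to roughly 2.6 CPUs for the L1 case, while cpus_needed(950.0, 0.90, 10000.0) comes to roughly 9.5 CPUs for the L2 case, and the latter is a lower bound because the L2 figure was reported as "over 90%".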
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Mon, Mar 15, 2010 at 08:14:29AM -0500, Anthony Liguori wrote: On 03/15/2010 07:42 AM, Avi Kivity wrote: On 03/15/2010 02:38 PM, Joerg Roedel wrote: On Mon, Mar 15, 2010 at 02:25:41PM +0200, Avi Kivity wrote: On 03/10/2010 11:30 PM, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Good idea. If there is interest I could help to mentor this project. Thanks. I volunteered Anthony, but he may be a little overcommitted. Joerg, feel free to put your name against too. [x] Done. Thanks, Joerg
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/15/2010 10:06 AM, Avi Kivity wrote: On 03/15/2010 03:23 PM, Anthony Liguori wrote: On 03/15/2010 08:11 AM, Avi Kivity wrote: On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. VMREAD/VMWRITEs are generally optimized by hypervisors as they tend to be costly. KVM is a bit unusual in terms of how many times the instructions are executed per exit. Do you know offhand of any unnecessary read/writes? There's update_cr8_intercept(), but on normal exits, I don't see what else we can remove. Yeah, there are a number of examples. vmcs_clear_bits() and vmcs_set_bits() read a field of the VMCS and then immediately write it back. This is unnecessary, as the same information could be kept in a shadow variable. In vmx_fpu_activate(), we call vmcs_clear_bits() followed immediately by vmcs_set_bits(), which means we're reading GUEST_CR0 twice and writing it twice. vmx_get_rflags() reads from the VMCS, and we frequently call get_rflags() followed by a set_rflags() to update a bit. We also don't cache the value between calls, and there are a few spots in the code that make multiple calls. Regards, Anthony Liguori
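The shadow-variable idea in this message can be sketched in C roughly as follows. This is not KVM code: the simulated VMCS array, the access counters, and every name (sim_vmread, sim_vmwrite, rmw_clear_bits, rmw_set_bits, shadow_*) are hypothetical stand-ins, and the sketch assumes the hypervisor is the only writer of the field, which does not hold for every VMCS field.

```c
#include <stdint.h>

/* Simulated VMCS so that accesses can be counted (hypothetical). */
enum { FIELD_GUEST_CR0, FIELD_COUNT };

#define CR0_TS (1ull << 3)   /* illustrative bit masks only */
#define CR0_NE (1ull << 5)

static uint64_t sim_vmcs[FIELD_COUNT];
static unsigned long vmread_count, vmwrite_count;

static uint64_t sim_vmread(int field)
{
    vmread_count++;
    return sim_vmcs[field];
}

static void sim_vmwrite(int field, uint64_t value)
{
    vmwrite_count++;
    sim_vmcs[field] = value;
}

/* Read-modify-write helpers: one read and one write per call. */
static void rmw_clear_bits(int field, uint64_t mask)
{
    sim_vmwrite(field, sim_vmread(field) & ~mask);
}

static void rmw_set_bits(int field, uint64_t mask)
{
    sim_vmwrite(field, sim_vmread(field) | mask);
}

/* Shadowed variant: bit updates touch memory only; flush writes once. */
struct shadow_field {
    int field;
    uint64_t value;
    int dirty;
};

static void shadow_init(struct shadow_field *s, int field, uint64_t initial)
{
    s->field = field;
    s->value = initial;
    s->dirty = 0;
    sim_vmwrite(field, initial);   /* make the VMCS agree with the shadow */
}

static void shadow_clear_bits(struct shadow_field *s, uint64_t mask)
{
    uint64_t next = s->value & ~mask;
    if (next != s->value) {
        s->value = next;
        s->dirty = 1;
    }
}

static void shadow_set_bits(struct shadow_field *s, uint64_t mask)
{
    uint64_t next = s->value | mask;
    if (next != s->value) {
        s->value = next;
        s->dirty = 1;
    }
}

/* Call once before entering the guest. */
static void shadow_flush(struct shadow_field *s)
{
    if (s->dirty) {
        sim_vmwrite(s->field, s->value);
        s->dirty = 0;
    }
}
```

In this sketch, rmw_clear_bits() followed by rmw_set_bits() costs two simulated reads and two simulated writes, while the same pair of updates through the shadow costs no reads and a single write at shadow_flush() time.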
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/16/2010 03:21 AM, Anthony Liguori wrote: On 03/15/2010 10:06 AM, Avi Kivity wrote: On 03/15/2010 03:23 PM, Anthony Liguori wrote: On 03/15/2010 08:11 AM, Avi Kivity wrote: On 03/15/2010 03:03 PM, Joerg Roedel wrote: I will add another project - iommu emulation. Could be very useful for doing device assignment to nested guests, which could make testing a lot easier. Our experiments show that nested device assignment is pretty much required for I/O performance in nested scenarios. Really? I did a small test with virtio-blk in a nested guest (disk read with dd, so not a real benchmark) and got a reasonable read-performance of around 25MB/s from the disk in the l2-guest. Your guest wasn't doing a zillion VMREADs and VMWRITEs every exit. I plan to reduce VMREAD/VMWRITE overhead for kvm, but not much we can do for other guests. VMREAD/VMWRITEs are generally optimized by hypervisors as they tend to be costly. KVM is a bit unusual in terms of how many times the instructions are executed per exit. Do you know offhand of any unnecessary read/writes? There's update_cr8_intercept(), but on normal exits, I don't see what else we can remove. Yeah, there are a number of examples. vmcs_clear_bits() and vmcs_set_bits() read a field of the VMCS and then immediately write it back. This is unnecessary, as the same information could be kept in a shadow variable. In vmx_fpu_activate(), we call vmcs_clear_bits() followed immediately by vmcs_set_bits(), which means we're reading GUEST_CR0 twice and writing it twice. This should be much better these days (2.6.34-rc1) as vmx_fpu_activate() is called at most once per heavyweight exit (and I have evil plans to reduce it even further). Still, that code should be optimized. vmx_get_rflags() reads from the VMCS, and we frequently call get_rflags() followed by a set_rflags() to update a bit. We also don't cache the value between calls, and there are a few spots in the code that make multiple calls.
We definitely should cache that (and segment access from the emulator as well). But I'd have thought this to be relatively infrequent. At least with Linux, using x2apic and virtio allows you to eliminate most emulator access, if you have npt or ept. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
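One way the rflags caching agreed on here could look, as a hedged C sketch. The struct, its fields, and the sketch_* function names are all hypothetical; the hw_rflags member merely stands in for the guest RFLAGS field in the VMCS, and the counters exist only to show how many simulated VMREADs and VMWRITEs occur. The sketch assumes the cache is invalidated on every guest entry, since the guest can change RFLAGS while it runs.

```c
#include <stdbool.h>
#include <stdint.h>

struct vcpu_sketch {
    uint64_t hw_rflags;        /* stand-in for the VMCS guest RFLAGS field */
    uint64_t cached_rflags;    /* software copy, valid for one exit only */
    bool rflags_valid;
    bool rflags_dirty;
    unsigned long vmreads;     /* simulated VMREAD count */
    unsigned long vmwrites;    /* simulated VMWRITE count */
};

/* Read through the cache: at most one simulated VMREAD per exit. */
uint64_t sketch_get_rflags(struct vcpu_sketch *v)
{
    if (!v->rflags_valid) {
        v->vmreads++;
        v->cached_rflags = v->hw_rflags;
        v->rflags_valid = true;
    }
    return v->cached_rflags;
}

/* Update only the cached copy and remember that it must be written back. */
void sketch_set_rflags(struct vcpu_sketch *v, uint64_t value)
{
    v->cached_rflags = value;
    v->rflags_valid = true;
    v->rflags_dirty = true;
}

/* Before re-entering the guest: write back once, then drop the cache. */
void sketch_prepare_entry(struct vcpu_sketch *v)
{
    if (v->rflags_dirty) {
        v->vmwrites++;
        v->hw_rflags = v->cached_rflags;
        v->rflags_dirty = false;
    }
    v->rflags_valid = false;
}
```

With this shape, any number of get/set pairs within one exit costs a single simulated read, and the write-back happens once in sketch_prepare_entry() rather than on every set.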
Re: [Qemu-devel] Ideas wiki for GSoC 2010
Hi Luiz. Around the time when I introduced the new Asynchronous monitor command API we had talked about converting all commands to use this new API so that we can cut down on duplicate code paths and confusing code. I would like to propose this as a GSoC project idea. Do you think it should stand as its own project or should we merge it into your Convert Monitor commands to the QObject API project? -- Thanks, Adam
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Fri, 12 Mar 2010 09:11:31 -0600 Adam Litke a...@us.ibm.com wrote: Hi Luiz. Around the time when I introduced the new Asynchronous monitor command API we had talked about converting all commands to use this new API so that we can cut down on duplicate code paths and confusing code. I would like to propose this as a GSoC project idea. Do you think it should stand as its own project or should we merge it into your Convert Monitor commands to the QObject API project? I think it's a project by itself, but I wonder if it's too easy/short for GSoC. An experienced programmer can do the conversion plus testing in a day or two. There are probably a number of cleanups and adaptions that can take more, but still seems too short.
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Fri, 2010-03-12 at 12:22 -0300, Luiz Capitulino wrote: On Fri, 12 Mar 2010 09:11:31 -0600 Adam Litke a...@us.ibm.com wrote: Hi Luiz. Around the time when I introduced the new Asynchronous monitor command API we had talked about converting all commands to use this new API so that we can cut down on duplicate code paths and confusing code. I would like to propose this as a GSoC project idea. Do you think it should stand as its own project or should we merge it into your Convert Monitor commands to the QObject API project? I think it's a project by itself, but I wonder if it's too easy/short for GSoC. An experienced programmer can do the conversion plus testing in a day or two. There are probably a number of cleanups and adaptions that can take more, but still seems too short. So given the relatively small scope of this additional work, maybe it should be an additional stretch goal to be added to your project. Once the student(s) have gone through the trouble to familiarize themselves with the monitor code, they would be well-positioned to complete this extra bit. How difficult do you imagine it will be to convert the remaining commands over to QObject? -- Thanks, Adam
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Fri, 12 Mar 2010 09:38:47 -0600 Adam Litke a...@us.ibm.com wrote: On Fri, 2010-03-12 at 12:22 -0300, Luiz Capitulino wrote: On Fri, 12 Mar 2010 09:11:31 -0600 Adam Litke a...@us.ibm.com wrote: Hi Luiz. Around the time when I introduced the new Asynchronous monitor command API we had talked about converting all commands to use this new API so that we can cut down on duplicate code paths and confusing code. I would like to propose this as a GSoC project idea. Do you think it should stand as its own project or should we merge it into your Convert Monitor commands to the QObject API project? I think it's a project by itself, but I wonder if it's too easy/short for GSoC. An experienced programmer can do the conversion plus testing in a day or two. There are probably a number of cleanups and adaptions that can take more, but still seems too short. So given the relatively small scope of this additional work, maybe it should be an additional stretch goal to be added to your project. It could be, or we could try to come up with something additional. Once the student(s) have gone through the trouble to familiarize themselves with the monitor code, they would be well-positioned to complete this extra bit. How difficult do you imagine it will be to convert the remaining commands over to QObject? Well, I won't set the goal to convert all of them, because it seems like too much work and Anthony has said that we may not want all the handlers available under QMP. So, this has to be discussed (preferably before GSoC starts for students). Also, I have two other projects that could be related to the async conversion: - Simplify/Improve the QObject API - Improve error handling (QError conversion involved)
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On 03/12/2010 09:56 AM, Luiz Capitulino wrote: Once the student(s) have gone through the trouble to familiarize themselves with the monitor code, they would be well-positioned to complete this extra bit. How difficult do you imagine it will be to convert the remaining commands over to QObject? Well, I won't set the goal to convert all of them, because seems too much work and Anthony has said that we may not want all the handlers available under QMP. So, this has to be discussed (preferably before GSoC starts for students). What I would like to see is a clean break between the human monitor and QMP, where the human monitor is implemented in terms of QMP. For instance, the x and xp commands are not very useful for QMP. However, a generic memory read/write API would be pretty useful. The x/xp commands would be implemented in terms of the memory QMP API. Likewise, the sum command can be implemented in terms of the above API. Regards, Anthony Liguori Also, I have two other projects that could related to the async conversion: - Simplify/Improve the QObject API - Improve error handling (QError conversion involved)
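The layering proposed in this message (a generic memory read primitive, with human-monitor commands such as sum built on top of it) might be sketched in C like this. None of these names come from QEMU: guest_ram, mem_read, and hmp_sum are hypothetical, the fixed 256-byte array merely stands in for guest memory, and the 16-bit rotate-and-add checksum is an illustrative choice in the style of the BSD sum utility rather than a statement of what the real command computes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for guest memory (hypothetical, fixed size for the sketch). */
static uint8_t guest_ram[256];

/*
 * Generic machine-facing primitive: copy `len` bytes starting at `addr`
 * into `buf`. Returns 0 on success, -1 if the range is out of bounds.
 */
int mem_read(uint64_t addr, void *buf, size_t len)
{
    if (addr > sizeof(guest_ram) || len > sizeof(guest_ram) - addr) {
        return -1;
    }
    memcpy(buf, guest_ram + addr, len);
    return 0;
}

/*
 * Human-facing helper built only on mem_read(): a 16-bit checksum that
 * rotates the running sum right by one bit before adding each byte.
 */
int hmp_sum(uint64_t addr, size_t len, uint16_t *out)
{
    uint16_t sum = 0;
    uint8_t byte;

    for (size_t i = 0; i < len; i++) {
        if (mem_read(addr + i, &byte, 1) != 0) {
            return -1;
        }
        sum = (uint16_t)((sum >> 1) | (sum << 15));  /* rotate right by 1 */
        sum = (uint16_t)(sum + byte);
    }
    *out = sum;
    return 0;
}
```

The point of the split is that hmp_sum() never touches guest_ram directly, so a machine-oriented interface only needs to expose the read primitive while presentation-oriented commands stay on the human side.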
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Fri, 12 Mar 2010 10:36:53 -0600 Anthony Liguori aligu...@linux.vnet.ibm.com wrote: On 03/12/2010 09:56 AM, Luiz Capitulino wrote: Once the student(s) have gone through the trouble to familiarize themselves with the monitor code, they would be well-positioned to complete this extra bit. How difficult do you imagine it will be to convert the remaining commands over to QObject? Well, I won't set the goal to convert all of them, because seems too much work and Anthony has said that we may not want all the handlers available under QMP. So, this has to be discussed (preferably before GSoC starts for students). What I would like to see is a clean break between the human monitor and QMP, where the human monitor is implemented in terms of QMP. For instance, the x and xp commands are not very useful for QMP. However, a generic memory read/write API would be pretty useful. The x/xp commands would be implemented in terms of the memory QMP API. Likewise, the sum command can be implemented in terms of the above API. Right, makes sense. So, we need to convert them all but maybe not make all of them available under QMP.
Re: [Qemu-devel] Ideas wiki for GSoC 2010
On Wed, 2010-03-10 at 18:30 -0300, Luiz Capitulino wrote: Hi there, Our wiki page for the Summer of Code 2010 is doing quite well: http://wiki.qemu.org/Google_Summer_of_Code_2010 Just to let you guys know that I'm going to give a talk at the local university (Unicamp) about kvm autotest, and I will spread the word about the qemu and kvm Summer of Code applications and encourage the students to apply for qemu and kvm. The university has ranked 2nd overall in the number of student proposals accepted for GSoC over the last couple of years, with an excellent completion rate, so I believe we could have some good work coming out of it. Now the most important is: 1. Get mentors assigned to projects. Just put your name and email in the right field. It's ok and even desirable to have two mentors per project, but please remember that mentoring is serious work, more info here: http://code.google.com/p/google-summer-of-code/wiki/AdviceforMentors http://gsoc-wiki.osuosl.org/index.php/Main_Page 2. Do we have kvm-specific projects? Can they be part of the QEMU project or do we need a different mentoring organization for it? 3. Fill in the missing information for the suggested project (description, skill level, languages, etc) I will complete our application tomorrow or on Friday. PS: I'm CC'ing everyone who suggested projects there, except one or two I couldn't find the email address.