[ kvm-Bugs-2525768 ] kvm image corruption
Bugs item #2525768, was opened at 2009-01-21 16:03
Message generated for change (Comment added) made by danv
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2525768&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 8
Private: No
Submitted By: Daniel van Vugt (danv)
Assigned to: Nobody/Anonymous (nobody)
Summary: kvm image corruption

Initial Comment:
Creating a new bug from my original problems detailed in bug 2490866, because that is technically a different problem.

Reproduced with kvm-83 and kvm-82 vanilla.
Host: Ubuntu 8.04 amd64, Intel Q6600
Guests: Windows 2003 Server, Windows 7 beta at least.

After shutting down the guest cleanly, its image file is (often but not always) corrupt and unbootable. This is confirmed by the change in the information reported by qemu-img info.

Example 1 (qcow2 image):

$ qemu-img info windows7beta.qcow2
image: windows7beta.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 5.0G
cluster_size: 4096
Snapshot list:
ID  TAG               VM SIZE  DATE                 VM CLOCK
1   Fresh_install     1.3M     2009-01-09 15:34:31  00:00:00.000
2   Activated_kvm-82  1.3M     2009-01-09 15:43:27  00:00:00.000

$ qemu -m 512 -usbdevice tablet -redir tcp:3389::3389 windows7beta.qcow2
(use and then shut down windows)

$ qemu-img info windows7beta.qcow2
image: windows7beta.qcow2
file format: raw INVALID
virtual size: 5.0G (5353566208 bytes) - INVALID
disk size: 5.0G

$ qemu -m 512 -usbdevice tablet -redir tcp:3389::3389 windows7beta.qcow2 -S
(qemu) info snapshots
Snapshot devices: ide0-hd0
bdrv_snapshot_list: error -95
(And the image won't boot)

Example 2 (raw image converted from Example 1):

$ qemu-img info windows7beta.raw
image: windows7beta.raw
file format: raw
virtual size: 20G (21474836480 bytes)
disk size: 4.7G

(Install and test lots on the guest)

$ qemu-img info windows7beta.raw
image: windows7beta.raw
file format: raw
virtual size: 7.5G (8049315840 bytes) - INVALID
disk size: 7.5G
(And now the image won't boot)

--
Comment By: Daniel van Vugt (danv)
Date: 2010-11-27 16:04

Message:
I don't have access to these VMs any more. Also I don't _remember_ experiencing this problem for quite a while. Assume it's fixed because I can no longer test or verify the bug. Close please...

--
Comment By: Jes Sorensen (jessorensen)
Date: 2010-11-26 16:28

Message:
Hi,

There have been a lot of fixes to the qcow2 code since your bug report - are you still able to reproduce it, or can we close the bug?

Thanks,
Jes

--
Comment By: Daniel van Vugt (danv)
Date: 2009-05-06 16:14

Message:
Sad to say... I have reproduced the bug on kvm-85, qemu-0.10.0, qemu-0.10.2 and qemu-0.10.3. They all still corrupt some qcow2 images.

--
Comment By: Daniel van Vugt (danv)
Date: 2009-02-16 11:16

Message:
Confirmed the bug is still present in kvm-84. Just booting up my guest, opening the browser and then shutting down turns the qcow2 image into a corrupt unbootable raw image file.

--
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2525768&group_id=180599
--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
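
The before/after comparison used in the initial comment can be automated. The following is only a sketch: it parses the human-readable text that qemu-img info printed in the report above (the two sample strings are copied from Example 1), and the function names are made up for illustration. It does not call qemu-img itself.

```python
import re


def parse_qemu_img_info(text):
    """Extract the file format and virtual size (bytes) from `qemu-img info` text."""
    fmt = re.search(r"^file format:\s*(\S+)", text, re.MULTILINE)
    size = re.search(r"^virtual size:.*\((\d+) bytes\)", text, re.MULTILINE)
    return {
        "format": fmt.group(1) if fmt else None,
        "virtual_size": int(size.group(1)) if size else None,
    }


def image_changed(before, after):
    """Return the names of the fields that differ between two parsed reports."""
    return [key for key in ("format", "virtual_size") if before[key] != after[key]]


# Output captured before and after the guest run, as quoted in the bug report.
BEFORE = """image: windows7beta.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 5.0G
"""

AFTER = """image: windows7beta.qcow2
file format: raw
virtual size: 5.0G (5353566208 bytes)
disk size: 5.0G
"""

changed = image_changed(parse_qemu_img_info(BEFORE), parse_qemu_img_info(AFTER))
print(changed)
```

Neither field should change across a clean guest shutdown, so a non-empty result is the symptom described in this report.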
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
2010/11/27 Stefan Hajnoczi stefa...@gmail.com:
> On Sat, Nov 27, 2010 at 4:29 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>> 2010/11/27 Blue Swirl blauwir...@gmail.com:
>>> On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>> Hi,
>>>>
>>>> This patch series is a revised version of Kemari for KVM, which applies the comments on the previous post and from KVM Forum 2010. The current code is based on qemu.git f711df67d611e4762966a249742a5f7499e19f99.
>>>>
>>>> For general information about Kemari, I've made a wiki page at qemu.org.
>>>>
>>>> http://wiki.qemu.org/Features/FaultTolerance
>>>>
>>>> The changes from v0.1.1 -> v0.2 are:
>>>>
>>>> - Introduce a queue in event-tap to make VM sync live.
>>>> - Change transaction receiver to a state machine for async receiving.
>>>> - Replace net/block layer functions with event-tap proxy functions.
>>>> - Remove dirty bitmap optimization for now.
>>>> - convert DPRINTF() in ft_trans_file to trace functions.
>>>> - convert fprintf() in ft_trans_file to error_report().
>>>> - improved error handling in ft_trans_file.
>>>> - add a tmp pointer to qemu_del_vm_change_state_handler.
>>>>
>>>> The changes from v0.1 -> v0.1.1 are:
>>>>
>>>> - events are tapped in net/block layer instead of device emulation layer.
>>>> - Introduce a new option for -incoming to accept FT transaction.
>>>> - Removed writev() support to QEMUFile and FdMigrationState for now. I would post this work in a different series.
>>>> - Modified virtio-blk save/load handler to send inuse variable to correctly replay.
>>>> - Removed configure --enable-ft-mode.
>>>> - Removed unnecessary check for qemu_realloc().
>>>>
>>>> The first 6 patches modify several functions of qemu to prepare for introducing Kemari-specific components.
>>>>
>>>> The next 6 patches are the components of Kemari. They introduce event-tap and the FT transaction protocol file based on buffered file. The design document of the FT transaction protocol can be found at,
>>>> http://wiki.qemu.org/images/b/b1/Kemari_sender_receiver_0.5a.pdf
>>>>
>>>> Then the following 4 patches modify dma-helpers, virtio-blk, virtio-net and e1000 to replace net/block layer functions with event-tap proxy functions. Please note that if Kemari is off, event-tap will just pass through, and there is almost no intrusion into existing functions, including normal live migration.
>>>
>>> Would it be possible to make the changes only in the block/net layer, so that the devices are not modified at all? That is, the proxy function would always replace the unproxied version.
>>
>> I understand the benefit of your suggestion. However it seems a bit tricky. It's because event-tap uses functions of emulators and net, but block.c is also linked for utilities like qemu-img that don't need emulators or net. In the previous version, I added function pointers to get around this.
>>
>> http://lists.nongnu.org/archive/html/qemu-devel/2010-05/msg02378.html
>>
>> I wasn't confident of this approach and discussed it at KVM Forum, and decided to give a try to replacing emulator functions with proxies. Suggestions are welcome of course.
>>
>>> Somehow I find some similarities to instrumentation patches. Perhaps the instrumentation framework could be used (maybe with some changes) for Kemari as well? That could be beneficial to both.
>>
>> Yes. I had the same idea but I'm not sure how tracing works. I think Stefan Hajnoczi knows it better.
>>
>> Stefan, is it possible to call arbitrary functions from the trace points?
>
> Yes, if you add code to ./tracetool. I'm not sure I see the connection between Kemari and tracing though.

The connection is that it may be possible to remove Kemari-specific hook points like those in ioport.c and exec.c, and let tracing notify Kemari instead.

> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You have converted e1000, virtio-net, and virtio-blk. How do we make sure new devices that are merged into qemu.git don't break Kemari? How do we go about supporting the existing hw/* devices?

As for whether Kemari adds constraints such as those you mentioned: yes. If the devices (including existing ones) don't call Kemari code, they would certainly break Kemari. Although using proxies looks explicit, to keep people writing device emulation unaware of it, it's possible to remove the proxies and put the changes only into the block/net layer, as Blue suggested.

Yoshi

> Stefan
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
On Sat, Nov 27, 2010 at 8:53 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
> 2010/11/27 Stefan Hajnoczi stefa...@gmail.com:
>> On Sat, Nov 27, 2010 at 4:29 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>> 2010/11/27 Blue Swirl blauwir...@gmail.com:
>>>> On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>>> [...]
>>>>> Then the following 4 patches modify dma-helpers, virtio-blk, virtio-net and e1000 to replace net/block layer functions with event-tap proxy functions. Please note that if Kemari is off, event-tap will just pass through, and there is almost no intrusion into existing functions, including normal live migration.
>>>>
>>>> Would it be possible to make the changes only in the block/net layer, so that the devices are not modified at all? That is, the proxy function would always replace the unproxied version.
>>>
>>> I understand the benefit of your suggestion. However it seems a bit tricky. It's because event-tap uses functions of emulators and net, but block.c is also linked for utilities like qemu-img that don't need emulators or net. In the previous version, I added function pointers to get around this.
>>>
>>> http://lists.nongnu.org/archive/html/qemu-devel/2010-05/msg02378.html
>>>
>>> I wasn't confident of this approach and discussed it at KVM Forum, and decided to give a try to replacing emulator functions with proxies. Suggestions are welcome of course.
>>>
>>>> Somehow I find some similarities to instrumentation patches. Perhaps the instrumentation framework could be used (maybe with some changes) for Kemari as well? That could be beneficial to both.
>>>
>>> Yes. I had the same idea but I'm not sure how tracing works. I think Stefan Hajnoczi knows it better.
>>>
>>> Stefan, is it possible to call arbitrary functions from the trace points?
>>
>> Yes, if you add code to ./tracetool. I'm not sure I see the connection between Kemari and tracing though.
>
> The connection is that it may be possible to remove Kemari-specific hook points like those in ioport.c and exec.c, and let tracing notify Kemari instead.

This all depends on how generic we want the trace points to become. One possible extension to the event injection or instrumentation could be fault injection: based on some rule, make the instrumented function return an error. That would be interesting for testing how the guest handles failure cases.

Maybe it should also be possible to handle event injection in a generic way. Split the instrumented function in two, before and after the tracepoint. The tracepoint registers the tail function in addition to the parameters. This may require a lot of refactoring though.

>> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You have converted e1000, virtio-net, and virtio-blk. How do we make sure new devices that are merged into qemu.git don't break Kemari? How do we go about supporting the existing hw/* devices?
>
> Whether Kemari adds constraints such as
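
The fault-injection idea in this message (based on some rule, make the instrumented function return an error) can be sketched as follows. This is a toy model, not QEMU code: the rule table, the decorator and block_write are all hypothetical names, and a real version would live in whatever tracetool generates.

```python
import functools

# Hypothetical rule table: function name -> error code to return instead of running it.
FAULT_RULES = {}


def instrumented(func):
    """Wrap func with an instrumentation point that can also inject a failure."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        error = FAULT_RULES.get(func.__name__)
        if error is not None:
            return error  # injected fault: the real function is skipped
        return func(*args, **kwargs)
    return wrapper


@instrumented
def block_write(sector, data):
    """Stand-in for a block-layer write; returns 0 on success."""
    return 0


ok = block_write(0, b"x")          # no rule installed: runs normally
FAULT_RULES["block_write"] = -5    # rule: pretend every write fails with -EIO
failed = block_write(0, b"x")
FAULT_RULES.clear()
```

A test harness could install a rule, drive the guest, and observe how it reacts to the injected error, which is the use case the message describes.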
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You have converted e1000, virtio-net, and virtio-blk. How do we make sure new devices that are merged into qemu.git don't break Kemari? How do we go about supporting the existing hw/* devices?

IMO anything that requires devices to act differently is wrong. All external IO already goes through a common API (e.g. qemu_send_packet). You should be putting your transaction code there, not hacking individual devices.

Paul
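
To illustrate the point about the common API, here is a minimal sketch of a tap placed in the shared send path, so that device code stays the same whether fault tolerance is on or off. The class and method names are hypothetical stand-ins (only qemu_send_packet is named in the message), and the hold-then-commit behaviour is an assumption about what "transactional" means here.

```python
class NetLayer:
    """Hypothetical stand-in for the common network send path."""

    def __init__(self):
        self.wire = []       # packets actually released to the outside world
        self.pending = []    # packets held back while fault tolerance is on
        self.ft_enabled = False

    def send_packet(self, packet):
        # Devices always call this one function; they never see the tap.
        if self.ft_enabled:
            self.pending.append(packet)  # hold until commit() is called
        else:
            self.wire.append(packet)     # pass through unchanged

    def commit(self):
        # Assumed to run once the passive host has acknowledged the state.
        self.wire.extend(self.pending)
        self.pending.clear()


def device_transmit(net, payload):
    """Any device model: identical whether or not fault tolerance is on."""
    net.send_packet(payload)


net = NetLayer()
device_transmit(net, b"a")   # released immediately
net.ft_enabled = True
device_transmit(net, b"b")   # held in net.pending
net.commit()                 # now released
```

Because the tap sits below every device, a newly merged device that uses the common send path would be covered without any device-specific change.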
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
On Sat, Nov 27, 2010 at 8:53 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
> 2010/11/27 Stefan Hajnoczi stefa...@gmail.com:
>> On Sat, Nov 27, 2010 at 4:29 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>> 2010/11/27 Blue Swirl blauwir...@gmail.com:
>>>> On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>> Somehow I find some similarities to instrumentation patches. Perhaps the instrumentation framework could be used (maybe with some changes) for Kemari as well? That could be beneficial to both.
>>>
>>> Yes. I had the same idea but I'm not sure how tracing works. I think Stefan Hajnoczi knows it better.
>>>
>>> Stefan, is it possible to call arbitrary functions from the trace points?
>>
>> Yes, if you add code to ./tracetool. I'm not sure I see the connection between Kemari and tracing though.
>
> The connection is that it may be possible to remove Kemari-specific hook points like those in ioport.c and exec.c, and let tracing notify Kemari instead.

I actually think the other way. Tracing just instruments and stashes away values. It does not change inputs or outputs, it does not change control flow, it does not affect state.

Going down the route of side-effects mixes two different things: hooking into a subsystem and instrumentation. For hooking into a subsystem we should define proper interfaces. That interface can explicitly support modifying inputs/outputs or changing control flow.

Tracing is much more ad-hoc and not a clean interface. It's also based on a layer of indirection via the tracetool code generator. That's okay because it doesn't affect the code it is called from and you don't need to debug trace events (they are simple and have almost no behavior).

Hooking via tracing is just taking advantage of the cheap layer of indirection in order to get at interesting events in a subsystem. It's easy to hook up and quick to develop, but it's not a proper interface and will be hard to understand for other developers.

>> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You have converted e1000, virtio-net, and virtio-blk. How do we make sure new devices that are merged into qemu.git don't break Kemari? How do we go about supporting the existing hw/* devices?
>
> As for whether Kemari adds constraints such as those you mentioned: yes. If the devices (including existing ones) don't call Kemari code, they would certainly break Kemari. Although using proxies looks explicit, to keep people writing device emulation unaware of it, it's possible to remove the proxies and put the changes only into the block/net layer, as Blue suggested.

Anything that makes it hard to violate the constraints is good. Otherwise Kemari might get broken in the future and no one will know until a failover behaves incorrectly.

Could you formulate the constraints so developers are aware of them in the future and can protect the codebase? How about expanding the Kemari wiki pages?

Stefan
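
The distinction drawn in this message, between a trace point that only records values and a hook interface that may alter inputs or control flow, can be made concrete with a small sketch. Everything here is hypothetical illustration (trace_event, IoHooks and submit_io are invented names), not how QEMU's tracing or Kemari's event-tap is written.

```python
trace_log = []


def trace_event(name, *values):
    """Instrumentation: records values, returns nothing, changes nothing."""
    trace_log.append((name, values))


class IoHooks:
    """Explicit interface: a hook may rewrite the request or veto it."""

    def __init__(self):
        self._hooks = []

    def register(self, hook):
        self._hooks.append(hook)

    def run(self, request):
        for hook in self._hooks:
            request = hook(request)
            if request is None:   # a hook changed control flow: stop here
                return None
        return request


hooks = IoHooks()


def submit_io(request):
    trace_event("submit_io", request)  # observation only
    request = hooks.run(request)       # may modify the input or stop the call
    if request is None:
        return "deferred"
    return "submitted:" + request
```

With no hooks registered, submit_io behaves exactly as it would without the hook call. The trace call never affects the result either way, which is the property the message argues tracing should keep.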
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
2010/11/27 Blue Swirl blauwir...@gmail.com:
> On Sat, Nov 27, 2010 at 8:53 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>> 2010/11/27 Stefan Hajnoczi stefa...@gmail.com:
>>> On Sat, Nov 27, 2010 at 4:29 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>> 2010/11/27 Blue Swirl blauwir...@gmail.com:
>>>>> On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>>>> [...]
>>>>>> Then the following 4 patches modify dma-helpers, virtio-blk, virtio-net and e1000 to replace net/block layer functions with event-tap proxy functions. Please note that if Kemari is off, event-tap will just pass through, and there is almost no intrusion into existing functions, including normal live migration.
>>>>>
>>>>> Would it be possible to make the changes only in the block/net layer, so that the devices are not modified at all? That is, the proxy function would always replace the unproxied version.
>>>>
>>>> I understand the benefit of your suggestion. However it seems a bit tricky. It's because event-tap uses functions of emulators and net, but block.c is also linked for utilities like qemu-img that don't need emulators or net. In the previous version, I added function pointers to get around this.
>>>>
>>>> http://lists.nongnu.org/archive/html/qemu-devel/2010-05/msg02378.html
>>>>
>>>> I wasn't confident of this approach and discussed it at KVM Forum, and decided to give a try to replacing emulator functions with proxies. Suggestions are welcome of course.
>>>>
>>>>> Somehow I find some similarities to instrumentation patches. Perhaps the instrumentation framework could be used (maybe with some changes) for Kemari as well? That could be beneficial to both.
>>>>
>>>> Yes. I had the same idea but I'm not sure how tracing works. I think Stefan Hajnoczi knows it better.
>>>>
>>>> Stefan, is it possible to call arbitrary functions from the trace points?
>>>
>>> Yes, if you add code to ./tracetool. I'm not sure I see the connection between Kemari and tracing though.
>>
>> The connection is that it may be possible to remove Kemari-specific hook points like those in ioport.c and exec.c, and let tracing notify Kemari instead.
>
> This all depends on how generic we want the trace points to become. One possible extension to the event injection or instrumentation could be fault injection: based on some rule, make the instrumented function return an error. That would be interesting for testing how the guest handles failure cases.
>
> Maybe it should also be possible to handle event injection in a generic way. Split the instrumented function in two, before and after the tracepoint. The tracepoint registers the tail function in addition to the parameters. This may require a lot of refactoring though.

The idea looks cool but it's a bit out of the range I can handle now:-) Let's keep the idea of binding with trace points for now, and focus on how to insert net/block tap points.

>>> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
2010/11/27 Paul Brook p...@codesourcery.com:
>> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You have converted e1000, virtio-net, and virtio-blk. How do we make sure new devices that are merged into qemu.git don't break Kemari? How do we go about supporting the existing hw/* devices?
>
> IMO anything that requires devices to act differently is wrong. All external IO already goes through a common API (e.g. qemu_send_packet). You should be putting your transaction code there, not hacking individual devices.

So you're with Blue's idea to put them in the block/net layer.

Yoshi

> Paul
Re: [Qemu-devel] [PATCH 00/21] Kemari for KVM 0.2
2010/11/27 Stefan Hajnoczi stefa...@gmail.com:
> On Sat, Nov 27, 2010 at 8:53 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>> 2010/11/27 Stefan Hajnoczi stefa...@gmail.com:
>>> On Sat, Nov 27, 2010 at 4:29 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>> 2010/11/27 Blue Swirl blauwir...@gmail.com:
>>>>> On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp wrote:
>>>>> Somehow I find some similarities to instrumentation patches. Perhaps the instrumentation framework could be used (maybe with some changes) for Kemari as well? That could be beneficial to both.
>>>>
>>>> Yes. I had the same idea but I'm not sure how tracing works. I think Stefan Hajnoczi knows it better.
>>>>
>>>> Stefan, is it possible to call arbitrary functions from the trace points?
>>>
>>> Yes, if you add code to ./tracetool. I'm not sure I see the connection between Kemari and tracing though.
>>
>> The connection is that it may be possible to remove Kemari-specific hook points like those in ioport.c and exec.c, and let tracing notify Kemari instead.
>
> I actually think the other way. Tracing just instruments and stashes away values. It does not change inputs or outputs, it does not change control flow, it does not affect state.
>
> Going down the route of side-effects mixes two different things: hooking into a subsystem and instrumentation. For hooking into a subsystem we should define proper interfaces. That interface can explicitly support modifying inputs/outputs or changing control flow.
>
> Tracing is much more ad-hoc and not a clean interface. It's also based on a layer of indirection via the tracetool code generator. That's okay because it doesn't affect the code it is called from and you don't need to debug trace events (they are simple and have almost no behavior).
>
> Hooking via tracing is just taking advantage of the cheap layer of indirection in order to get at interesting events in a subsystem. It's easy to hook up and quick to develop, but it's not a proper interface and will be hard to understand for other developers.
>
>>> One question I have about Kemari is whether it adds new constraints to the QEMU codebase? Fault tolerance seems like a cross-cutting concern - everyone writing device emulation or core QEMU code may need to be aware of new constraints. For example, you are not allowed to release I/O operations to the outside world directly, instead you need to go through Kemari code which makes I/O transactional and communicates with the passive host. You have converted e1000, virtio-net, and virtio-blk. How do we make sure new devices that are merged into qemu.git don't break Kemari? How do we go about supporting the existing hw/* devices?
>>
>> As for whether Kemari adds constraints such as those you mentioned: yes. If the devices (including existing ones) don't call Kemari code, they would certainly break Kemari. Although using proxies looks explicit, to keep people writing device emulation unaware of it, it's possible to remove the proxies and put the changes only into the block/net layer, as Blue suggested.
>
> Anything that makes it hard to violate the constraints is good. Otherwise Kemari might get broken in the future and no one will know until a failover behaves incorrectly.

Blue and Paul prefer to put it into the block/net layer, and you think it's better to provide an API. I have an idea which may fit both, which is to put the hook into the block/net layer, and make a list of devices that Kemari supports. Before turning it on, we can check whether the devices tapped are those on the list. It's Kemari's responsibility to keep checking which devices can be supported. At this point, devices with proxies are on the list.

> Could you formulate the constraints so developers are aware of them in the future and can protect the codebase? How about expanding the Kemari wiki pages?

If you like the idea above, I'm happy to put the list on the wiki page as well.

Yoshi

> Stefan
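
The proposal above (keep a list of devices Kemari supports, and check the tapped devices against it before turning fault tolerance on) amounts to a simple membership check. A minimal sketch follows; the three device names come from the thread, while the data structure, the function names, and the example of an unconverted device are assumptions for illustration.

```python
# Devices converted so far, per the thread. The set itself is a hypothetical structure.
SUPPORTED_DEVICES = {"e1000", "virtio-net", "virtio-blk"}


def unsupported_devices(tapped):
    """Return the tapped devices that are not on the supported list, sorted."""
    return sorted(set(tapped) - SUPPORTED_DEVICES)


def can_enable_ft(tapped):
    """Fault tolerance may be turned on only if every tapped device is supported."""
    return not unsupported_devices(tapped)


# A guest using only converted devices passes the check.
print(can_enable_ft(["e1000", "virtio-blk"]))
# A guest with a device that was never converted is refused, and named.
print(unsupported_devices(["e1000", "rtl8139"]))
```

Refusing to start, and naming the offending device, addresses the worry raised earlier in the thread that a broken setup would otherwise go unnoticed until a failover misbehaves.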
Re: How do I prevent one guest from hogging the disk I/O and preventing other guests from accessing the disk?
blkio may help you.

http://www.mjmwired.net/kernel/Documentation/cgroups/blkio-controller.txt

2010/11/26 Henry Pepper henryp...@gmail.com:
> Hi
>
> I'm running some tests on a KVM setup, based on RHEL 6 beta2. When running a disk test in one guest, no other guests seem to get any disk I/O done. How can I configure KVM to ensure that some sort of round robin or other sharing is being done?
>
> Thanks
> Henry
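
As a rough illustration of what the blkio controller offers: each guest's qemu process is placed in its own cgroup, and a proportional weight is written to that cgroup's blkio.weight file. The sketch below only demonstrates the file write against a temporary directory, so it is safe to run anywhere. The 100-1000 range is an assumption to verify against the linked document, and the helper names and guest names are made up.

```python
import os
import tempfile


def set_blkio_weight(cgroup_dir, weight):
    """Write a proportional I/O weight into a blkio cgroup directory."""
    if not 100 <= weight <= 1000:  # assumed range; check the linked document
        raise ValueError("weight out of range: %d" % weight)
    with open(os.path.join(cgroup_dir, "blkio.weight"), "w") as f:
        f.write(str(weight))


def expected_share(weights):
    """Fraction of the total weight held by each group."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Demonstration against a temporary directory instead of a real cgroup mount.
demo_dir = tempfile.mkdtemp()
set_blkio_weight(demo_dir, 500)
shares = expected_share({"guest-a": 500, "guest-b": 500})
```

On a real host the directory would be the guest's group under the mounted blkio hierarchy, and writing there requires root. With equal weights, two busy guests should each get roughly half of the disk time, which is the kind of sharing the question asks for.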
Re: [PATCHv6 00/16] boot order specification
On Wed, Nov 24, 2010 at 12:03:11PM +0200, Gleb Natapov wrote:
> On Tue, Nov 23, 2010 at 08:19:07PM -0500, Kevin O'Connor wrote:
>> On Tue, Nov 23, 2010 at 05:31:41PM +0200, Gleb Natapov wrote:
>>> On Wed, Nov 17, 2010 at 06:43:47PM +0200, Gleb Natapov wrote:
>>>> I am using open firmware naming scheme to specify device path names. In this version: added SCSI bus support. Pass boot order list as file to firmware.
>>>>
>>>> Names look like this on pci machine:
>>>> [...]
>>>> /p...@i0cf8/u...@1,2/h...@1/netw...@0/ether...@0
>>>> /r...@genroms/linuxboot.bin
>>
>> What's the plan for handling optionroms (ie, BCVs and BEVs)?
>>
>> This is an area which is a bit tricky - mainly due to legacy BIOS crud. An option rom can register either a BEV (eg, gpxe on a network card), or it can register one or more BCVs (eg, a scsi card registering two drives).
>>
>> How do we say boot from the optionrom on the second nic card? If you have a scsi card, how do we communicate that its second drive should be the c: drive?
>
> BEV should be easy. When you register BEV found on pci card you search for device path to that pci card to determine BEV's boot order.

SeaBIOS has two separate optionrom passes - one to extract the roms from the cards and one to find BEVs and BCVs. In order to correlate a rom to a pci device SeaBIOS would have to keep track of each rom it deploys and then correlate it during the BEV/BCV scan.

> BCV should be the same, but since one PCI card may register several BCVs the problem is more complex. Device path has not only path to SCSI PCI card but to specific target,lun too. For instance this path /p...@i0cf8/s...@3,2/d...@0,0 points to SCSI card in pci slot 3 function 2 target 1 lun 1. The question is if BCV provides us with enough information to know what target/lun it is going to boot.

How will seabios even know it's a SCSI card? All seabios sees is a PCI device with a valid option rom bar. Further, I don't see how seabios will know which BCV is which lun.

BTW, I assume you're suggesting that if a disk is found first in the list then seabios should make that drive the c: and make hard drive booting be the first thing attempted? (This is what the seabios boot menu does.)

>> The ugly thing about BCVs is that they are not necessarily registered in the rom for the device that controls it. So, if you have two of the same type of scsi card, each with two drives, it's possible for the optionrom to put all four drives in the rom of the first scsi card.
>
> That's just a broken optionrom. I can't see how we can solve this without communicating with such an optionrom and letting it know what device we want to boot from.

I wouldn't call it broken as the BIOS Boot Spec (BBS) specifically states that optionroms can do this. I don't know how many roms actually do it.

The BCVs and BEVs have a product name string that could be used to identify which one to boot. Unfortunately, there isn't a good way for qemu to find these strings (though maybe it could just hard code them for roms it ships with). SeaBIOS does show them in the boot menu, so a user could manually copy them to a command line.

> There can also be a legacy optionrom that just hooks into int19 during init and hijacks the booting process entirely. I think those problems exist on real HW too.

That's a separate problem which I wouldn't worry too much about. The only roms that I've seen do this today are roms we have the source for and can change.

-Kevin
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 10:41:10AM -0500, Kevin O'Connor wrote:
> On Wed, Nov 24, 2010 at 12:03:11PM +0200, Gleb Natapov wrote:
> > On Tue, Nov 23, 2010 at 08:19:07PM -0500, Kevin O'Connor wrote:
> > > On Tue, Nov 23, 2010 at 05:31:41PM +0200, Gleb Natapov wrote:
> > > > On Wed, Nov 17, 2010 at 06:43:47PM +0200, Gleb Natapov wrote:
> > > > > I am using open firmware naming scheme to specify device path names. In this version: added SCSI bus support. Pass boot order list as file to firmware.
> > > > >
> > > > > Names look like this on pci machine:
> > > > > [...]
> > > > > /p...@i0cf8/u...@1,2/h...@1/netw...@0/ether...@0
> > > > > /r...@genroms/linuxboot.bin

> > > What's the plan for handling optionroms (ie, BCVs and BEVs)? This is an area which is a bit tricky - mainly due to legacy BIOS crud. An option rom can register either a BEV (eg, gpxe on a network card), or it can register one or more BCVs (eg, a scsi card registering two drives). How do we say boot from the optionrom on the second nic card? If you have a scsi card, how do we communicate that its second drive should be the c: drive?

> > BEV should be easy. When you register BEV found on pci card you search for device path to that pci card to determine BEV's boot order.

> SeaBIOS has two separate optionrom passes - one to extract the roms from the cards and one to find BEVs and BCVs. In order to correlate a rom to a pci device SeaBIOS would have to keep track of each rom it deploys and then correlate it during the BEV/BCV scan.

Yeah. I looked at the Seabios code. The simplest would be to change device path to point to rom instead of pci device. So if there is device path /p...@i0cf8/ether...@3, when rom is copied into the memory the path is changed to be /r...@addr where addr is memory address where rom was copied. The same with roms that are copied from qemu. When rom memory is later scanned for bevs/bcvs it will be easy to find boot priority of each one of them.

> > BCV should be the same, but since one PCI card may register several BCVs the problem is more complex. Device path has not only path to SCSI PCI card but to specific target,lun too. For instance this path /p...@i0cf8/s...@3,2/d...@0,0 points to SCSI card in pci slot 3 function 2 target 1 lun 1. The question is if BCV provides us with enough information to know what target/lun it is going to boot.

> How will seabios even know it's a SCSI card? All seabios sees is a PCI device with a valid option rom bar. Further, I don't see how seabios will know which BCV is which lun.

Seabios knows that this is SCSI card from its device class. Unfortunately it looks like bcv does not provide enough info to know what target it corresponds to. I can't think of anything smart we can do here, so let's just treat all bcvs as same priority.

> BTW, I assume you're suggesting that if a disk is found first in the list then seabios should make that drive the c: and make hard drive booting be the first thing attempted? (This is what the seabios boot menu does.)

Yes.

> > > The ugly thing about BCVs is that they are not necessarily registered in the rom for the device that controls it. So, if you have two of the same type of scsi card, each with two drives, it's possible for the optionrom to put all four drives in the rom of the first scsi card.

> > That's just a broken optionrom. I can't see how we can solve this without communicating with such optionrom and letting it know what device we want to boot from.

> I wouldn't call it broken as the BIOS Boot Spec (BBS) specifically states that optionroms can do this. I don't know how many roms actually do it.

BBS tries to document things post factum. I hope it doesn't encourage this type of option roms. But how does it work if we have two scsi cards, both have the same option rom, and each one of them tries to register a bcv for every scsi card it found? Won't we have two bcvs registered for each scsi card then?

> The BCV and BEVs have a product name string that could be used to identify which one to boot. Unfortunately, there isn't a good way for qemu to find these strings (though maybe it could just hard code them for roms it ships with). SeaBIOS does show them in the boot menu, so a user could manually copy them to a command line.

Two disks can have the same product name, no? And qemu can't even know product names for pass through devices. Also I wouldn't worry about optionroms qemu ships. We can fix those.

> > There can be also legacy optionrom that just hooks into int19 during init and hijacks the booting process entirely. I think those problems exist on real HW too.

> That's a separate problem which I wouldn't worry too much about. The only roms that I've seen do this today are roms we have the source for and can change.

Agree.

--
			Gleb.
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 06:22:16PM +0200, Gleb Natapov wrote:
> On Sat, Nov 27, 2010 at 10:41:10AM -0500, Kevin O'Connor wrote:
> > On Wed, Nov 24, 2010 at 12:03:11PM +0200, Gleb Natapov wrote:
> > > BEV should be easy. When you register BEV found on pci card you search for device path to that pci card to determine BEV's boot order.

> > SeaBIOS has two separate optionrom passes - one to extract the roms from the cards and one to find BEVs and BCVs. In order to correlate a rom to a pci device SeaBIOS would have to keep track of each rom it deploys and then correlate it during the BEV/BCV scan.

> Yeah. I looked at the Seabios code. The simplest would be to change device path to point to rom instead of pci device. So if there is device path /p...@i0cf8/ether...@3, when rom is copied into the memory the path is changed to be /r...@addr where addr is memory address where rom was copied.

Seabios would change its local copy of the path?

[...]

> > How will seabios even know it's a SCSI card? All seabios sees is a PCI device with a valid option rom bar. Further, I don't see how seabios will know which BCV is which lun.

> Seabios knows that this is SCSI card from its device class.

I assume ethernet would be from device class as well?

This seems fragile - it would require seabios to keep a list of device classes to name mappings, and a user may not be able to boot from a device if seabios isn't programmed for it (eg, a passthrough device).

> Unfortunately it looks like bcv does not provide enough info to know what target it corresponds to. I can't think of anything smart we can do here, so let's just treat all bcvs as same priority.

There's the product name and there's the order it was registered in (ie, the third bcv on the rom).

[...]

> > > That's just a broken optionrom. I can't see how we can solve this without communicating with such optionrom and letting it know what device we want to boot from.

> > I wouldn't call it broken as the BIOS Boot Spec (BBS) specifically states that optionroms can do this. I don't know how many roms actually do it.

> BBS tries to document things post factum. I hope it doesn't encourage this type of option roms. But how does it work if we have two scsi cards, both have the same option rom, and each one of them tries to register a bcv for every scsi card it found? Won't we have two bcvs registered for each scsi card then?

First optionrom finds all devices for that type of scsi card. Second optionrom detects that its drives have already been probed and resizes itself to zero.

> > The BCV and BEVs have a product name string that could be used to identify which one to boot. Unfortunately, there isn't a good way for qemu to find these strings (though maybe it could just hard code them for roms it ships with). SeaBIOS does show them in the boot menu, so a user could manually copy them to a command line.

> Two disks can have the same product name, no? And qemu can't even know product names for pass through devices. Also I wouldn't worry about optionroms qemu ships. We can fix those.

The product name is supposed to be unique. From the spec:

  The data within each header must be valid. Especially the `BCV' and `Pointer to Product Name String' fields. The BCV should point to a procedure that installs only that device into INT 13h services. It is strongly recommended that the Product Name String for each header uniquely identify the device to which that header belongs, so that when these strings are displayed to the user in a menu, the user can intelligently recognize and choose devices connected to that controller without having to open up the computer.

-Kevin
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 11:49:39AM -0500, Kevin O'Connor wrote:
> On Sat, Nov 27, 2010 at 06:22:16PM +0200, Gleb Natapov wrote:
> > On Sat, Nov 27, 2010 at 10:41:10AM -0500, Kevin O'Connor wrote:
> > > On Wed, Nov 24, 2010 at 12:03:11PM +0200, Gleb Natapov wrote:
> > > > BEV should be easy. When you register BEV found on pci card you search for device path to that pci card to determine BEV's boot order.

> > > SeaBIOS has two separate optionrom passes - one to extract the roms from the cards and one to find BEVs and BCVs. In order to correlate a rom to a pci device SeaBIOS would have to keep track of each rom it deploys and then correlate it during the BEV/BCV scan.

> > Yeah. I looked at the Seabios code. The simplest would be to change device path to point to rom instead of pci device. So if there is device path /p...@i0cf8/ether...@3, when rom is copied into the memory the path is changed to be /r...@addr where addr is memory address where rom was copied.

> Seabios would change its local copy of the path?

Yes.

> [...]

> > > How will seabios even know it's a SCSI card? All seabios sees is a PCI device with a valid option rom bar. Further, I don't see how seabios will know which BCV is which lun.

> > Seabios knows that this is SCSI card from its device class.

> I assume ethernet would be from device class as well?

Yes.

> This seems fragile - it would require seabios to keep a list of device classes to name mappings, and a user may not be able to boot from a device if seabios isn't programmed for it (eg, a passthrough device).

Seabios can ignore device name from device path since the same information is present in pci config space of the device. So the device path can be /p...@i0cf8/s...@4 or /p...@i0cf8/@4. Seabios can detect that device is scsi just by looking at config space of pci device in slot 4 function 0.

For scsi, I think, the proper solution would be to have Seabios support for scsi controller emulated by qemu. This will make all devices bootable from BCV known to Seabios and will not require option rom. The only problem then will be with pass through devices, but since now only the whole scsi controller can be passed through, not individual targets, qemu can point device path only to the controller and not to individual targets.

> > Unfortunately it looks like bcv does not provide enough info to know what target it corresponds to. I can't think of anything smart we can do here, so let's just treat all bcvs as same priority.

> There's the product name and there's the order it was registered in (ie, the third bcv on the rom).

Doesn't help much if we can't correlate bcv to device path.

> [...]

> > > > That's just a broken optionrom. I can't see how we can solve this without communicating with such optionrom and letting it know what device we want to boot from.

> > > I wouldn't call it broken as the BIOS Boot Spec (BBS) specifically states that optionroms can do this. I don't know how many roms actually do it.

> > BBS tries to document things post factum. I hope it doesn't encourage this type of option roms. But how does it work if we have two scsi cards, both have the same option rom, and each one of them tries to register a bcv for every scsi card it found? Won't we have two bcvs registered for each scsi card then?

> First optionrom finds all devices for that type of scsi card. Second optionrom detects that its drives have already been probed and resizes itself to zero.

> > > The BCV and BEVs have a product name string that could be used to identify which one to boot. Unfortunately, there isn't a good way for qemu to find these strings (though maybe it could just hard code them for roms it ships with). SeaBIOS does show them in the boot menu, so a user could manually copy them to a command line.

> > Two disks can have the same product name, no? And qemu can't even know product names for pass through devices. Also I wouldn't worry about optionroms qemu ships. We can fix those.

> The product name is supposed to be unique. From the spec:
>
>   The data within each header must be valid. Especially the `BCV' and `Pointer to Product Name String' fields. The BCV should point to a procedure that installs only that device into INT 13h services. It is strongly recommended that the Product Name String for each header uniquely identify the device to which that header belongs, so that when these strings are displayed to the user in a menu, the user can intelligently recognize and choose devices connected to that controller without having to open up the computer.

The spec strongly recommends this, it does not require it, but I guess if a product violates this recommendation it will have the same problem on real HW too.

--
			Gleb.
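The two identifiers raised in this exchange, the product name string and the order in which a BCV was registered, could be combined roughly as follows. This is only a sketch under stated assumptions: the `Bcv` record, its fields, and `pick_bcv` are hypothetical names for illustration, not SeaBIOS or QEMU structures, and the product names in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bcv:
    """Hypothetical record for one BCV found during the option ROM scan."""
    rom_index: int      # which option ROM registered it (scan order)
    instance: int       # position among the BCVs of that ROM (0 = first)
    product_name: str   # Product Name String from the BCV header


def pick_bcv(bcvs: List[Bcv], product_name: str,
             rom_index: Optional[int] = None,
             instance: Optional[int] = None) -> Optional[Bcv]:
    """Select a BCV by product name, falling back to (rom_index, instance).

    Unique product names are only recommended, so a name may match several
    entries; in that case the registration order has to disambiguate.
    """
    matches = [b for b in bcvs if b.product_name == product_name]
    if len(matches) == 1:
        return matches[0]
    # Zero or several matches: the name alone is not enough.
    candidates = matches if matches else bcvs
    for b in candidates:
        if b.rom_index == rom_index and b.instance == instance:
            return b
    return None
```

With two drives that report the same name, `pick_bcv(bcvs, "DISK B")` returns `None`, while adding `rom_index` and `instance` selects exactly one entry. That mirrors the concern above: the name helps only when the option ROM follows the recommendation.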
Mysterious address matching in assigned_dev_pci_read_config
Hi,

can anyone explain this code in assigned_dev_pci_read_config?

    ...
    /* vga specific, remove later */
    if (address == 0xFC)
        goto do_log;

    fd = pci_dev->real_device.config_fd;

again:
    ret = pread(fd, val, len, address);
    if (ret != len) {
        ...
    }

do_log:
    ...

So this skips any read access to address 0xfc and returns 0 for some vga specific reasons. But even if it's supposed to work around issues with passing through some VGA adapters, this skipping affects all device types, no?

Unfortunately, this hunk was already part of the very first version, no hints in git.

Jan
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 07:06:19PM +0200, Gleb Natapov wrote:
> On Sat, Nov 27, 2010 at 11:49:39AM -0500, Kevin O'Connor wrote:
> > On Sat, Nov 27, 2010 at 06:22:16PM +0200, Gleb Natapov wrote:
> > > Yeah. I looked at the Seabios code. The simplest would be to change device path to point to rom instead of pci device. So if there is device path /p...@i0cf8/ether...@3, when rom is copied into the memory the path is changed to be /r...@addr where addr is memory address where rom was copied.

> > Seabios would change its local copy of the path?

> Yes.

Thinking about this further - since the optionrom must be 2k aligned there are only 96 spots a rom can be in. So, it should be simple to just have optionrom_setup() declare a u16 romaddr_to_bdf[96].

> > > > How will seabios even know it's a SCSI card? All seabios sees is a PCI device with a valid option rom bar. Further, I don't see how seabios will know which BCV is which lun.

> > > Seabios knows that this is SCSI card from its device class.

> > This seems fragile - it would require seabios to keep a list of device classes to name mappings, and a user may not be able to boot from a device if seabios isn't programmed for it (eg, a passthrough device).

> Seabios can ignore device name from device path since the same information is present in pci config space of the device. So the device path can be /p...@i0cf8/s...@4 or /p...@i0cf8/@4. Seabios can detect that device is scsi just by looking at config space of pci device in slot 4 function 0.

I don't think seabios should try to parse the path. Instead, I think seabios should build a name for each device it finds using the same algorithm that qemu uses and then just do a string compare to see if there is a match.

Also, if qemu wants seabios to boot from a rom, I think it should tell seabios that - something like /p...@i0cf8/r...@4.0/b...@0 to mean make the drive declared by the rom on pci device 4 function 0 in the first found bcv the c: drive.

> For scsi, I think, the proper solution would be to have Seabios support for scsi controller emulated by qemu. This will make all devices bootable from BCV known to Seabios and will not require option rom. The only problem then will be with pass through devices, but since now only the whole scsi controller can be passed through, not individual targets, qemu can point device path only to the controller and not to individual targets.

I'm okay with adding scsi support to seabios. However, the problem doesn't go away as network booting still requires a rom.

> > > Unfortunately it looks like bcv does not provide enough info to know what target it corresponds to. I can't think of anything smart we can do here, so let's just treat all bcvs as same priority.

> > There's the product name and there's the order it was registered in (ie, the third bcv on the rom).

> Doesn't help much if we can't correlate bcv to device path.

I'm confused by this. SeaBIOS can't boot the device in this situation - it can only run a rom. I think qemu should try to send info on which rom to run, not which device to boot. Each rom is uniquely identifiable by the pci device it came from (or fw_cfg slot), and each BCV can be identified by either its instance or its product name.

-Kevin
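The 96-entry table mentioned in this message can be sketched as below. This is an illustration in Python, not SeaBIOS code. The 2 KiB alignment and the count of 96 come from the message; the base address 0xC0000 is an assumption (96 slots of 2 KiB then end at 0xF0000), and `NO_DEVICE`, `pci_bdf`, `rom_slot`, `record_rom`, and `lookup_rom` are hypothetical helper names.

```python
ROM_ALIGN = 2048     # option ROMs are 2 KiB aligned (from the message)
ROM_SLOTS = 96       # number of possible ROM positions (from the message)
ROM_BASE = 0xC0000   # assumption: start of the legacy option ROM area
ROM_END = ROM_BASE + ROM_SLOTS * ROM_ALIGN   # 0xF0000 under that assumption

NO_DEVICE = 0xFFFF   # hypothetical marker for "no ROM recorded in this slot"

# One 16-bit bus/device/function value per possible ROM position.
romaddr_to_bdf = [NO_DEVICE] * ROM_SLOTS


def pci_bdf(bus, dev, fn):
    """Pack bus/device/function into the conventional 16-bit BDF value."""
    return (bus << 8) | (dev << 3) | fn


def rom_slot(addr):
    """Map an aligned ROM address to its index in the table."""
    if addr < ROM_BASE or addr >= ROM_END or addr % ROM_ALIGN:
        raise ValueError("not a valid option ROM address: %#x" % addr)
    return (addr - ROM_BASE) // ROM_ALIGN


def record_rom(addr, bdf):
    """Remember which PCI device the ROM copied to addr came from."""
    romaddr_to_bdf[rom_slot(addr)] = bdf


def lookup_rom(addr):
    """Return the BDF recorded for the ROM at addr, or NO_DEVICE."""
    return romaddr_to_bdf[rom_slot(addr)]
```

During the first pass, the firmware would call something like `record_rom(addr, pci_bdf(0, 4, 0))` when it copies a ROM; during the later BEV/BCV scan, `lookup_rom(addr)` recovers the originating device without rewriting any path strings.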
[PATCH 2/4] KVM test: kvm_subprocess: rename get_command_status_output() and friends
get_command_status_output() -> cmd_status_output()
get_command_output() -> cmd_output()
get_command_status() -> cmd_status()

Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_subprocess.py                 | 27 +--
 client/tests/kvm/kvm_test_utils.py                 | 20 +++---
 client/tests/kvm/tests/clock_getres.py             |  2 +-
 client/tests/kvm/tests/ethtool.py                  | 10 +++---
 client/tests/kvm/tests/guest_s4.py                 |  2 +-
 client/tests/kvm/tests/guest_test.py               |  3 +-
 client/tests/kvm/tests/iofuzz.py                   |  2 +-
 client/tests/kvm/tests/ioquit.py                   |  2 +-
 client/tests/kvm/tests/iozone_windows.py           |  4 +-
 client/tests/kvm/tests/kdump.py                    |  6 ++--
 client/tests/kvm/tests/ksm_overcommit.py           |  4 +-
 client/tests/kvm/tests/linux_s3.py                 |  2 +-
 client/tests/kvm/tests/migration.py                |  7 ++---
 client/tests/kvm/tests/multicast.py                |  8 +++---
 client/tests/kvm/tests/netperf.py                  |  4 +-
 client/tests/kvm/tests/nic_promisc.py              |  6 ++--
 client/tests/kvm/tests/nicdriver_unload.py         |  4 +-
 client/tests/kvm/tests/pci_hotplug.py              |  6 ++--
 client/tests/kvm/tests/physical_resources_check.py |  2 +-
 client/tests/kvm/tests/timedrift.py                |  2 +-
 client/tests/kvm/tests/vlan.py                     |  9 +++---
 client/tests/kvm/tests/whql_client_install.py      |  8 +++---
 client/tests/kvm/tests/whql_submission.py          |  4 +-
 23 files changed, 70 insertions(+), 74 deletions(-)

diff --git a/client/tests/kvm/kvm_subprocess.py b/client/tests/kvm/kvm_subprocess.py
index c92910c..c8caab2 100755
--- a/client/tests/kvm/kvm_subprocess.py
+++ b/client/tests/kvm/kvm_subprocess.py
@@ -1103,7 +1103,7 @@ class kvm_shell_session(kvm_expect):
         @param prompt: Regular expression describing the shell's prompt line.
         @param status_test_command: Command to be used for getting the last
                 exit status of commands run inside the shell (used by
-                get_command_status_output() and friends).
+                cmd_status_output() and friends).
         """
         # Init the superclass
         kvm_expect.__init__(self, command, id, auto_close, echo, linesep,
@@ -1193,8 +1193,8 @@ class kvm_shell_session(kvm_expect):
         return o


-    def get_command_output(self, cmd, timeout=30.0, internal_timeout=None,
-                           print_func=None):
+    def cmd_output(self, cmd, timeout=30.0, internal_timeout=None,
+                   print_func=None):
         """
         Send a command and return its output.

@@ -1237,8 +1237,8 @@ class kvm_shell_session(kvm_expect):
         return remove_last_nonempty_line(remove_command_echo(o, cmd))


-    def get_command_status_output(self, cmd, timeout=30.0,
-                                  internal_timeout=None, print_func=None):
+    def cmd_status_output(self, cmd, timeout=30.0, internal_timeout=None,
+                          print_func=None):
         """
         Send a command and return its exit status and output.

@@ -1257,11 +1257,10 @@ class kvm_shell_session(kvm_expect):
         @raise ShellStatusError: Raised if the exit status cannot be obtained
         @raise ShellError: Raised if an unknown error occurs
         """
-        o = self.get_command_output(cmd, timeout, internal_timeout, print_func)
+        o = self.cmd_output(cmd, timeout, internal_timeout, print_func)
         try:
             # Send the 'echo $?' (or equivalent) command to get the exit status
-            s = self.get_command_output(self.status_test_command, 10,
-                                        internal_timeout)
+            s = self.cmd_output(self.status_test_command, 10, internal_timeout)
         except ShellError:
             raise ShellStatusError(cmd, o)

@@ -1273,8 +1272,8 @@ class kvm_shell_session(kvm_expect):
             raise ShellStatusError(cmd, o)


-    def get_command_status(self, cmd, timeout=30.0, internal_timeout=None,
-                           print_func=None):
+    def cmd_status(self, cmd, timeout=30.0, internal_timeout=None,
+                   print_func=None):
         """
         Send a command and return its exit status.

@@ -1292,8 +1291,8 @@ class kvm_shell_session(kvm_expect):
         @raise ShellStatusError: Raised if the exit status cannot be obtained
         @raise ShellError: Raised if an unknown error occurs
         """
-        s, o = self.get_command_status_output(cmd, timeout, internal_timeout,
-                                              print_func)
+        s, o = self.cmd_status_output(cmd, timeout, internal_timeout,
+                                      print_func)
         return s


@@ -1319,8 +1318,8 @@ class kvm_shell_session(kvm_expect):
         @raise ShellError: Raised if an
[PATCH 4/4] KVM test: ksm_overcommit: Trap PexpectTimeoutErrors
On the KSM overcommit test, read_until_last_line_matches is used, and now this function can raise an expect error. Trap it accordingly.

Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
---
 client/tests/kvm/tests/ksm_overcommit.py | 25 ++---
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/client/tests/kvm/tests/ksm_overcommit.py b/client/tests/kvm/tests/ksm_overcommit.py
index f60929e..deadda1 100644
--- a/client/tests/kvm/tests/ksm_overcommit.py
+++ b/client/tests/kvm/tests/ksm_overcommit.py
@@ -27,12 +27,13 @@ def run_ksm_overcommit(test, params, env):
         logging.debug("Starting allocator.py on guest %s", vm.name)
         session.sendline("python /tmp/allocator.py")
-        (match, data) = session.read_until_last_line_matches(["PASS:", "FAIL:"],
-                                                             timeout)
-        if match != 0:
-            raise error.TestFail("Command allocator.py on guest %s failed.\n"
-                                 "return code: %s\n output:\n%s" %
-                                 (vm.name, match, data))
+        try:
+            (match, data) = session.read_until_last_line_matches(
+                                                             ["PASS:", "FAIL:"],
+                                                             timeout)
+        except kvm_subprocess.ExpectProcessTerminatedError, e:
+            raise error.TestFail("Command allocator.py on vm '%s' failed: %s" %
+                                 (vm.name, str(e)))


     def _execute_allocator(command, vm, session, timeout):
@@ -50,12 +51,14 @@ def run_ksm_overcommit(test, params, env):
         logging.debug("Executing '%s' on allocator.py loop, vm: %s, timeout: %s",
                       command, vm.name, timeout)
         session.sendline(command)
-        (match, data) = session.read_until_last_line_matches(["PASS:","FAIL:"],
+        try:
+            (match, data) = session.read_until_last_line_matches(
+                                                             ["PASS:","FAIL:"],
                                                              timeout)
-        if match != 0:
-            raise error.TestFail("Failed to execute '%s' on allocator.py, "
-                                 "vm: %s, output:\n%s" %
-                                 (command, vm.name, data))
+        except kvm_subprocess.ExpectProcessTerminatedError, e:
+            e_str = ("Failed to execute command '%s' on allocator.py, "
+                     "vm '%s': %s" % (command, vm.name, str(e)))
+            raise error.TestFail(e_str)
         return (match, data)

--
1.7.2.3
[PATCH 3/4] KVM test: kvm_subprocess: rename kvm_shell_session and friends
PEP 8 states that class names should use the CapWords convention.

kvm_spawn -> Spawn
kvm_tail -> Tail
kvm_expect -> Expect
kvm_shell_session -> ShellSession

This is an RFC because I wonder if the proposed names are too general, even though they are usually prefixed by 'kvm_subprocess.'.

Signed-off-by: Michael Goldish <mgold...@redhat.com>
---
 client/tests/kvm/kvm_preprocessing.py |  2 +-
 client/tests/kvm/kvm_scheduler.py     |  2 +-
 client/tests/kvm/kvm_subprocess.py    | 60
 client/tests/kvm/kvm_utils.py         | 15
 client/tests/kvm/kvm_vm.py            |  6 ++--
 5 files changed, 42 insertions(+), 43 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index 1ddf99b..5ec38fb 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -204,7 +204,7 @@ def preprocess(test, params, env):
     if "tcpdump" not in env and params.get("run_tcpdump", "yes") == "yes":
         cmd = "%s -npvi any 'dst port 68'" % kvm_utils.find_command("tcpdump")
         logging.debug("Starting tcpdump (%s)...", cmd)
-        env["tcpdump"] = kvm_subprocess.kvm_tail(
+        env["tcpdump"] = kvm_subprocess.Tail(
             command=cmd,
             output_func=_update_address_cache,
             output_params=(env["address_cache"],))
diff --git a/client/tests/kvm/kvm_scheduler.py b/client/tests/kvm/kvm_scheduler.py
index f1adb39..aa581ad 100644
--- a/client/tests/kvm/kvm_scheduler.py
+++ b/client/tests/kvm/kvm_scheduler.py
@@ -78,7 +78,7 @@ class scheduler:
                 for obj in env.values():
                     if isinstance(obj, kvm_vm.VM):
                         obj.destroy()
-                    elif isinstance(obj, kvm_subprocess.kvm_spawn):
+                    elif isinstance(obj, kvm_subprocess.Spawn):
                         obj.close()
                 kvm_utils.dump_env(env, env_filename)
                 w.write("cleanup_done\n")
diff --git a/client/tests/kvm/kvm_subprocess.py b/client/tests/kvm/kvm_subprocess.py
index c8caab2..e0723bb 100755
--- a/client/tests/kvm/kvm_subprocess.py
+++ b/client/tests/kvm/kvm_subprocess.py
@@ -292,12 +292,12 @@ def run_bg(command, termination_func=None, output_func=None, output_prefix="",
     @param timeout: Time duration (in seconds) to wait for the subprocess to
             terminate before returning

-    @return: A kvm_tail object.
+    @return: A Tail object.
     """
-    process = kvm_tail(command=command,
-                       termination_func=termination_func,
-                       output_func=output_func,
-                       output_prefix=output_prefix)
+    process = Tail(command=command,
+                   termination_func=termination_func,
+                   output_func=output_func,
+                   output_prefix=output_prefix)

     end_time = time.time() + timeout
     while time.time() < end_time and process.is_alive():
@@ -338,7 +338,7 @@ def run_fg(command, output_func=None, output_prefix="", timeout=1.0):
     return (status, output)


-class kvm_spawn:
+class Spawn:
     """
     This class is used for spawning and controlling a child process.

@@ -350,7 +350,7 @@ class kvm_spawn:
     The text file can be accessed at any time using get_output().
     In addition, the server opens as many pipes as requested by the client and
     writes the output to them.
-    The pipes are requested and accessed by classes derived from kvm_spawn.
+    The pipes are requested and accessed by classes derived from Spawn.
     These pipes are referred to as readers.
     The server also receives input from the client and sends it to the child
     process.
@@ -634,7 +634,7 @@ _thread_kill_requested = False

 def kill_tail_threads():
     """
-    Kill all kvm_tail threads.
+    Kill all Tail threads.

     After calling this function no new threads should be started.
     """
@@ -646,12 +646,12 @@ def kill_tail_threads():
     _thread_kill_requested = False


-class kvm_tail(kvm_spawn):
+class Tail(Spawn):
     """
     This class runs a child process in the background and sends its output in
     real time, line-by-line, to a callback function.

-    See kvm_spawn's docstring.
+    See Spawn's docstring.

     This class uses a single pipe reader to read data in real time from the child
     process and report it to a given callback function.
@@ -692,10 +692,10 @@ class kvm_tail(kvm_spawn):
         # Add a reader and a close hook
         self._add_reader("tail")
-        self._add_close_hook(kvm_tail._join_thread)
+        self._add_close_hook(Tail._join_thread)

         # Init the superclass
-        kvm_spawn.__init__(self, command, id, auto_close, echo, linesep)
+        Spawn.__init__(self, command, id, auto_close, echo, linesep)

         # Remember some attributes
         self.termination_func = termination_func
@@ -711,11 +711,11 @@ class kvm_tail(kvm_spawn):


     def __getinitargs__(self):
-        return
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 12:47:26PM -0500, Kevin O'Connor wrote:
> On Sat, Nov 27, 2010 at 07:06:19PM +0200, Gleb Natapov wrote:
> > On Sat, Nov 27, 2010 at 11:49:39AM -0500, Kevin O'Connor wrote:
> > > On Sat, Nov 27, 2010 at 06:22:16PM +0200, Gleb Natapov wrote:
> > > > Yeah. I looked at the Seabios code. The simplest would be to change device path to point to rom instead of pci device. So if there is device path /p...@i0cf8/ether...@3, when rom is copied into the memory the path is changed to be /r...@addr where addr is memory address where rom was copied.

> > > Seabios would change its local copy of the path?

> > Yes.

> Thinking about this further - since the optionrom must be 2k aligned there are only 96 spots a rom can be in. So, it should be simple to just have optionrom_setup() declare a u16 romaddr_to_bdf[96].

That will work too.

> > > > > How will seabios even know it's a SCSI card? All seabios sees is a PCI device with a valid option rom bar. Further, I don't see how seabios will know which BCV is which lun.

> > > > Seabios knows that this is SCSI card from its device class.

> > > This seems fragile - it would require seabios to keep a list of device classes to name mappings, and a user may not be able to boot from a device if seabios isn't programmed for it (eg, a passthrough device).

> > Seabios can ignore device name from device path since the same information is present in pci config space of the device. So the device path can be /p...@i0cf8/s...@4 or /p...@i0cf8/@4. Seabios can detect that device is scsi just by looking at config space of pci device in slot 4 function 0.

> I don't think seabios should try to parse the path. Instead, I think seabios should build a name for each device it finds using the same algorithm that qemu uses and then just do a string compare to see if there is a match.
>
> Also, if qemu wants seabios to boot from a rom, I think it should tell seabios that - something like /p...@i0cf8/r...@4.0/b...@0 to mean make the drive declared by the rom on pci device 4 function 0 in the first found bcv the c: drive.

Qemu does not know that Seabios needs optionrom to boot from a device. It knows even less about bcvs in option rom. Qemu knows about device itself, not how firmware boots from it.

> > For scsi, I think, the proper solution would be to have Seabios support for scsi controller emulated by qemu. This will make all devices bootable from BCV known to Seabios and will not require option rom. The only problem then will be with pass through devices, but since now only the whole scsi controller can be passed through, not individual targets, qemu can point device path only to the controller and not to individual targets.

> I'm okay with adding scsi support to seabios. However, the problem doesn't go away as network booting still requires a rom.

But there can be only one bev, so we do not have this problem with bev. Boot priority of a bev is boot priority of pci device it was loaded from.

> > > > Unfortunately it looks like bcv does not provide enough info to know what target it corresponds to. I can't think of anything smart we can do here, so let's just treat all bcvs as same priority.

> > > There's the product name and there's the order it was registered in (ie, the third bcv on the rom).

> > Doesn't help much if we can't correlate bcv to device path.

> I'm confused by this. SeaBIOS can't boot the device in this situation - it can only run a rom. I think qemu should try to send info on which rom to run, not which device to boot. Each rom is uniquely identifiable by the pci device it came from (or fw_cfg slot), and each BCV can be identified by either its instance or its product name.

For Qemu those optionroms are just binary blobs. It doesn't know why they are needed at all (there is no difference for qemu between vga rom and e1000 rom).

BTW are you actually aware of any option rom with multiple BCVs and, if yes, how those BCVs differ?

--
			Gleb.
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 08:15:42PM +0200, Gleb Natapov wrote: On Sat, Nov 27, 2010 at 12:47:26PM -0500, Kevin O'Connor wrote: I don't think seabios should try to parse the path. Instead, I think seabios should build a name for each device it finds using the same algorithm that qemu uses and then just do a string compare to see if there is a match. Also, if qemu wants seabios to boot from a rom, I think it should tell seabios that - something like /p...@i0cf8/r...@4.0/b...@0 to mean make the drive declared by the rom on pci device 4 function 0 in the first found bcv the c: drive. Qemu does not know that Seabios needs optionrom to boot from a device. It knows even less about bcvs in option rom. Qemu knows about device itself, not how firmware boots from it. If the user wants to boot from a device and that device has an optionrom, then it's a safe bet that the optionrom is needed to boot from it. In any case, I'd rather have qemu know which devices seabios can boot then have seabios try to figure out what rom to run from a device path. There's the product name and there's the order it was registered in (ie, the third bcv on the rom). Doesn't help much if we can't correlate bcv to device path. I'm confused by this. SeaBIOS can't boot the device in this situation - it can only run a rom. I think qemu should try to send info on which rom to run, not which device to boot. Each rom is uniquely identifiable by the pci device it came from (or fw_cfg slot), and each BCV can be identified by either its instance or its product name. For Qemu those optionroms are just binary blobs. It doesn't know why they are needed at all (there is no difference for qemu between vga rom and e1000 rom). BTW are you actually aware of any option rom with multiple BCVs and, if yes, how those BCVs differ? Multiple BCVs - yes. A SCSI card will define a BCV for each attached drive. I don't have a scsi card myself, but the support was added by a user who ran into the problem first hand. 
I don't know if there are SCSI card roms that will register all the drives on multiple cards in the first rom. I wouldn't be surprised if there are because of the scarcity of space in the 0xc-0xf space. (Having secondary optionroms resize themselves to zero would be a big savings.) -Kevin
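The string-compare idea proposed in this message (the firmware builds a name for each device it finds with the same algorithm qemu uses, then compares strings instead of parsing paths) can be sketched roughly as follows. This is only an illustration: the naming scheme, the function names and the example paths are all hypothetical and are not taken from qemu or SeaBIOS.

```python
def device_name(bus_prefix, node, unit):
    """Build a path string for a discovered device.

    Hypothetical naming scheme; the point is only that both sides
    must use the same algorithm so the strings come out identical.
    """
    return f"{bus_prefix}/{node}@{unit}"


def find_boot_index(boot_list, name):
    """Return the position of name in boot_list by exact string match, or None."""
    for index, entry in enumerate(boot_list):
        if entry == name:
            return index
    return None


# Hypothetical boot list as it might be handed over by the emulator.
boot_list = ["/bus@0/nic@4", "/bus@0/disk@1"]

# The firmware names a device it found and looks it up without parsing anything.
found = device_name("/bus@0", "disk", 1)
print(find_boot_index(boot_list, found))
```

A device whose generated name is not in the list simply yields None, so the firmware never has to interpret a path it did not build itself.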
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 01:40:12PM -0500, Kevin O'Connor wrote: On Sat, Nov 27, 2010 at 08:15:42PM +0200, Gleb Natapov wrote: On Sat, Nov 27, 2010 at 12:47:26PM -0500, Kevin O'Connor wrote: I don't think seabios should try to parse the path. Instead, I think seabios should build a name for each device it finds using the same algorithm that qemu uses and then just do a string compare to see if there is a match. Also, if qemu wants seabios to boot from a rom, I think it should tell seabios that - something like /p...@i0cf8/r...@4.0/b...@0 to mean make the drive declared by the rom on pci device 4 function 0 in the first found bcv the c: drive. Qemu does not know that Seabios needs optionrom to boot from a device. It knows even less about bcvs in option rom. Qemu knows about device itself, not how firmware boots from it. If the user wants to boot from a device and that device has an optionrom, then it's a safe bet that the optionrom is needed to boot from it. Suppose we add SCSI support to Seabios and suppose the SCSI card Seabios can natively boot from has an optionrom. What will Seabios do in such a situation and how can qemu know it? Besides, qemu support tries to be firmware agnostic. In any case, I'd rather have qemu know which devices seabios can boot than have seabios try to figure out what rom to run from a device path. You run all of them just like you do now. Information you get from qemu is only used for sorting BCV/BEV entries. A BCV/BEV that does not have a corresponding boot path in the boot order list is put at the end. There's the product name and there's the order it was registered in (ie, the third bcv on the rom). Doesn't help much if we can't correlate bcv to device path. I'm confused by this. SeaBIOS can't boot the device in this situation - it can only run a rom. I think qemu should try to send info on which rom to run, not which device to boot.
Each rom is uniquely identifiable by the pci device it came from (or fw_cfg slot), and each BCV can be identified by either its instance or its product name. For Qemu those optionroms are just binary blobs. It doesn't know why they are needed at all (there is no difference for qemu between vga rom and e1000 rom). BTW are you actually aware of any option rom with multiple BCVs and, if yes, how those BCVs differ? Multiple BCVs - yes. A SCSI card will define a BCV for each attached drive. I don't have a scsi card myself, but the support was added by a user who ran into the problem first hand. Optionrom is static. How can the number of BCVs depend on the number of attached drives? I don't know if there are SCSI card roms that will register all the drives on multiple cards in the first rom. I wouldn't be surprised if there are because of the scarcity of space in the 0xc-0xf space. (Having secondary optionroms resize themselves to zero would be a big savings.) -Kevin -- Gleb.
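The sorting rule described in this message (run every rom as before, use the list from qemu only to order the BCV/BEV entries, and put entries without a corresponding boot path at the end) might look like the minimal sketch below. The entry layout, the labels and the paths are hypothetical; the only assumption carried over from the thread is that each entry can be tied to the device path of the pci device its rom was loaded from.

```python
def sort_boot_entries(entries, boot_order):
    """Order BCV/BEV entries by the position of their device path in boot_order.

    Entries whose path is not listed go to the end. sorted() is stable, so
    entries with equal priority (for example several BCVs registered by the
    same rom) keep the order in which they were registered.
    """
    priority = {}
    for index, path in enumerate(boot_order):
        priority.setdefault(path, index)  # first mention of a path wins
    unlisted = len(boot_order)
    return sorted(entries, key=lambda e: priority.get(e["path"], unlisted))


# Hypothetical entries in registration order.
entries = [
    {"label": "bcv-a", "kind": "BCV", "path": "/bus@0/ctrl@3"},
    {"label": "bev-n", "kind": "BEV", "path": "/bus@0/nic@4"},
    {"label": "bcv-x", "kind": "BCV", "path": "/bus@0/ctrl@5"},
    {"label": "bcv-b", "kind": "BCV", "path": "/bus@0/ctrl@3"},
]
boot_order = ["/bus@0/nic@4", "/bus@0/ctrl@3"]

for entry in sort_boot_entries(entries, boot_order):
    print(entry["label"])
```

Because the sort is stable, two BCVs from the same controller end up with the same priority and stay in registration order, which matches the suggestion earlier in the thread to treat all bcvs of one rom alike.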
Re: [PATCHv6 00/16] boot order specification
On 11/23/2010 06:12 PM, Anthony Liguori wrote: On 11/23/2010 09:31 AM, Gleb Natapov wrote: Anthony, Blue No comments on this patch series for almost a week. Can it be applied? Does that mean everyone's happy or have folks not gotten around to review it? IOW, last call if you have objections :-) I haven't reviewed this - I trust the author and maintainers to get it right. But I notice that there is no documentation - surely some is needed? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [PATCHv6 00/16] boot order specification
Trimming CC list, adding seabios list. On Sat, Nov 27, 2010 at 09:04:24PM +0200, Gleb Natapov wrote: On Sat, Nov 27, 2010 at 01:40:12PM -0500, Kevin O'Connor wrote: On Sat, Nov 27, 2010 at 08:15:42PM +0200, Gleb Natapov wrote: Qemu does not know that Seabios needs optionrom to boot from a device. It knows even less about bcvs in option rom. Qemu knows about device itself, not how firmware boots from it. If the user wants to boot from a device and that device has an optionrom, then it's a safe bet that the optionrom is needed to boot from it. Suppose we add SCSI support to Seabios and suppose SCSI card Seabios can natively boot from has optionrom. What Seabios will do in such situation and how qemu can know it? Besides qemu support tries to be firmware agnostic. In such a situation, under my proposal, users wouldn't be able to specify a default boot from their scsi drive until after qemu was also upgraded to know seabios could boot native scsi. (Or, they'd only be able to do it by adding in a command-line option.) In any case, I'd rather have qemu know which devices seabios can boot then have seabios try to figure out what rom to run from a device path. You run all of them just like you do now. Information you get from qemu is only used for sorting BCV/BEV entries. BCV/BEV that does not have corespondent boot path in boot order list is put at the end. If qemu sends in /p...@i0cf8/s...@3/d...@0,0 or /p...@i0cf8/ether...@4/ethernet-...@0 it will expect seabios to boot from the appropriate device. In both cases, seabios would need to run a rom in order to fulfill that request. Trying to figure out which rom is quite painful. That's why I'd prefer to see qemu instead pass in something like /p...@i0cf8/r...@3/b...@0 or /p...@i0cf8/r...@4/bev. That is, if the machine needs to boot via a rom I'd prefer qemu state that explicitly. BTW, in the situation where seabios has native device support (eg, ATA), I don't have any concerns. 
(The names are a bit verbose, but that's not really an issue.) BTW are you actually aware of any option rom with multiple BCVs and, if yes, how those BCVs differ? Multiple BCVs - yes. A SCSI card will define a BCV for each attached drive. I don't have a scsi card myself, but the support was added by a user who ran into the problem first hand. Optionrom is static. How number of BCVs can depend on number of attached drives? Not sure what you mean by Optionrom is static. SeaBIOS unlocks the memory, and the optionrom can and will modify itself with additional PNP headers so that it can list multiple BCVs - one for each drive. In particular, gPXE uses self modification to relocate parts of itself into high ram. -Kevin
suggested vhost link speed settings
Hi list, Being that the virtio interfaces are stated as achieving 5-8 Gb throughput now with vhost, as opposed to 1Gb without, how should their link speed be defined when the choices are 2500M or 1M? I have them plotted out to make a 10Gb bond out of a pair, counting on 5Gb max each, which I'm imagining can be achieved without concern based on what I read. If I set them to 'auto-negotiate' will it internally flap between speeds and cause undesirable consequences? I don't want to set them at 2500 in case they ever do need to reach closer to 5 each. I'm not at a point where I can test anything yet, I'm just planning and preconfiguring so far. For the sake of preventing issues down the line I wanted to see if there's a consensus or standard for this scenario and be as sure as possible ahead of time. Thanks, -C
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 04:07:45PM -0500, Kevin O'Connor wrote: Trimming CC list, adding seabios list. On Sat, Nov 27, 2010 at 09:04:24PM +0200, Gleb Natapov wrote: On Sat, Nov 27, 2010 at 01:40:12PM -0500, Kevin O'Connor wrote: On Sat, Nov 27, 2010 at 08:15:42PM +0200, Gleb Natapov wrote: Qemu does not know that Seabios needs optionrom to boot from a device. It knows even less about bcvs in option rom. Qemu knows about device itself, not how firmware boots from it. If the user wants to boot from a device and that device has an optionrom, then it's a safe bet that the optionrom is needed to boot from it. Suppose we add SCSI support to Seabios and suppose SCSI card Seabios can natively boot from has optionrom. What Seabios will do in such situation and how qemu can know it? Besides qemu support tries to be firmware agnostic. In such a situation, under my proposal, users wouldn't be able to specify a default boot from their scsi drive until after qemu was also upgraded to know seabios could boot native scsi. (Or, they'd only be able to do it by adding in a command-line option.) If a scsi card has an optionrom with only one bcv then Seabios can determine its boot order from the device path, so why not provide the user with this option today? Besides, qemu may be used to emulate sparc with openbios and this combination may be able to boot from a scsi device. Qemu is not just an x86 emulator running Seabios. If there is a problem with scsi boot we let management know, so it will not create an unbootable configuration. Today it is impossible to boot a guest from scsi in qemu btw. In any case, I'd rather have qemu know which devices seabios can boot than have seabios try to figure out what rom to run from a device path. You run all of them just like you do now. Information you get from qemu is only used for sorting BCV/BEV entries. A BCV/BEV that does not have a corresponding boot path in the boot order list is put at the end.
If qemu sends in /p...@i0cf8/s...@3/d...@0,0 or /p...@i0cf8/ether...@4/ethernet-...@0 it will expect seabios to boot from the appropriate device. In both cases, seabios would need to run a rom in order to fulfill that request. Trying to figure out which rom is quite painful. That's why I'd prefer to see qemu instead pass in something like /p...@i0cf8/r...@3/b...@0 or /p...@i0cf8/r...@4/bev. That is, if the machine needs to boot via a rom I'd prefer qemu state that explicitly. It is painful in Seabios; it is impossible in qemu at all. There is no way for qemu to know about BCVs or BEVs in optionroms, especially considering that they are created at runtime like you say below. The best qemu can do is to ask the user what device the user wants to boot from and pass this information to Seabios in the form of a device path. Seabios (or other firmware) has to figure out how to boot from the device or ignore the request if it can't. We can't provide the same functionality as Seabios' f12 menu on the qemu command line since the content of the menu depends on run time. BTW, in the situation where seabios has native device support (eg, ATA), I don't have any concerns. (The names are a bit verbose, but that's not really an issue.) This + network booting are the main use case really. And I do not see what problem we have with BEV devices. /p...@i0cf8/r...@4/bev is not much different from /p...@i0cf8/ether...@4/ethernet-...@0 since there can be only one bev per pci device. It is easy for Seabios to see that to boot from the pci device in slot 4 func 0 it has to execute the BEV. BTW are you actually aware of any option rom with multiple BCVs and, if yes, how those BCVs differ? Multiple BCVs - yes. A SCSI card will define a BCV for each attached drive. I don't have a scsi card myself, but the support was added by a user who ran into the problem first hand. Optionrom is static. How number of BCVs can depend on number of attached drives? Not sure what you mean by Optionrom is static.
SeaBIOS unlocks the memory, and the optionrom can and will modify itself with additional PNP headers so that it can list multiple BCVs - one for each drive. In particular, gPXE uses self modification to relocate parts of itself into high ram. Optionrom is static was my misunderstanding. As you say here, an optionrom can create BEVs/BCVs at runtime, which makes it impossible for qemu to know about them even if qemu examines the optionroms of devices. -- Gleb.
Re: [PATCHv6 00/16] boot order specification
On Sat, Nov 27, 2010 at 10:56:10PM +0200, Avi Kivity wrote: On 11/23/2010 06:12 PM, Anthony Liguori wrote: On 11/23/2010 09:31 AM, Gleb Natapov wrote: Anthony, Blue No comments on this patch series for almost a week. Can it be applied? Does that mean everyone's happy or have folks not gotten around to review it? IOW, last call if you have objections :-) I haven't reviewed this - I trust the author and maintainers to get it right. But I notice that there is no documentation - surely some is needed? The patch creates an Openfirmware device path from the qdev hierarchy. Each element of a device path depends on the type of bus the device resides on. You can find various bus bindings here: http://playground.sun.com/1275/bindings/ and the main spec is here http://forthworks.com/standards/of1275.pdf. The format in which the list of device paths is passed to firmware is documented by a comment (it is very simple). The only thing missing is command line option documentation. I will add it and resend if no more changes are needed for the patch to be accepted. -- Gleb.
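As a rough illustration of "each element of a device path depends on the type of bus the device resides on", the sketch below walks a chain of devices from the root down and lets a per-bus formatter produce the unit address of each element. It is not the code from the patch series: the bus names, node names and address formats are hypothetical stand-ins, and the real formats are defined by the bus bindings linked above.

```python
def pci_unit(addr):
    """Format a (device, function) pair; the function is omitted when it is 0.

    Illustrative format only, not copied from the PCI binding.
    """
    dev, fn = addr
    return f"{dev:x}" if fn == 0 else f"{dev:x},{fn:x}"


def drive_unit(addr):
    """Format a (channel, unit) pair for a drive behind a controller (hypothetical)."""
    channel, unit = addr
    return f"{channel:x},{unit:x}"


# One formatter per bus type: the bus a device sits on decides how its
# address is written in the path element.
FORMATTERS = {"pci": pci_unit, "drive": drive_unit}


def build_path(chain):
    """Build a path from a root-to-leaf chain of (bus_type, node_name, address)."""
    parts = []
    for bus_type, node_name, addr in chain:
        parts.append(f"{node_name}@{FORMATTERS[bus_type](addr)}")
    return "/" + "/".join(parts)


# Hypothetical hierarchy: a controller in pci slot 3 with one attached disk.
print(build_path([("pci", "ctrl", (3, 0)), ("drive", "disk", (0, 0))]))
```

Keeping the formatters in a table keyed by bus type means a new bus only needs a new entry, while the walk over the hierarchy stays unchanged.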