Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/24/2011 07:58 PM, Anthony Liguori wrote: If you move the cdrom to a different IDE channel, you have to update the stateful non-config file. Whereas if you do $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img $ qemu -cdrom ~/foo-media-tray.img the cd-rom tray state will be tracked in the image file. Yeah, but how do you move it? There is no need to move the file at all. Simply point the new drive at the media tray. If you do a remove/add through QMP, then the config file will reflect things just fine. If all access to the state file is through QMP then it becomes more palatable. A bit on that later. If you want to do it outside of QEMU, then you can just ignore the config file and manage all of the state yourself. But it's never going to work as well (it will be racy) and you're pushing a tremendous amount of knowledge that ultimately belongs in QEMU (what state needs to persist) to something that isn't QEMU which means it's probably not going to be done correctly. I know you're a big fan of the omnipotent management tool but my experience has been that we need to help the management tooling folks more by expecting less from them. I thought that's what I'm doing by separating the state out. It's easy for management to assemble configuration from their database and convert it into a centralized representation (like a qemu command line). It's a lot harder to disassemble a central state representation and move it back to the database. Using QMP is better than directly accessing the state file since qemu does the disassembly for you (provided the command references the device using its normal path, not some random key). The file just becomes a way to survive a crash, and all management needs to know about is to make it available and back it up. But it means that everything must be done via QMP, including assembly of the machine, otherwise the state file can become stale. Separating the state out to the device is even easier, since management is already expected to take care of disk images. All that's needed is to create the media tray image once, then you can forget about it completely. Again the question is who is the authoritative source of the configuration. Is it the management tool or is it qemu? QEMU. No question about it. At any point in time, we are the authoritative source of what the guest's configuration is. There's no doubt about it. A management tool can try to keep up with us, but ultimately we are the only ones that know for sure. We have all of this information internally. Just persisting it is not a major architectural change. It's something we should have been doing (arguably) from the very beginning. That's a huge divergence from how management tools are written. Currently they contain the required guest configuration, a representation of what's the current live configuration, and they issue monitor commands to move the live configuration towards the required configuration (or just generate a qemu command line). What you're describing is completely different, I'm not even sure what it is. The management tool already has to keep track of (the optional parts of) the guest device tree. It cannot start reading the stateful non-config file at random points in time. So all that is left is the guest controlled portions of the device tree, which are pretty rare, and random events like live-copy migration. I think that introducing a new authoritative source of information will create a lot of problems. QEMU has always been the authoritative source. Nothing new has been introduced. We never persisted the machine's configuration which meant management tools had to try to aggressively keep up with us which is intrinsically error prone. Fixing this will only improve existing management tools. If you look at management tools, they believe they are the authoritative source of configuration information (not guest state, which is more or less ignored). Right, but we should make it easy, not hard. Yeah, I fail to see how this makes it hard. We conveniently are saying, hey, this is all the state that needs to be persisted. We'll persist it for you if you want, otherwise, we'll expose it in a central location. The state-in-a-file is just a blob. Don't expect the tool to parse it and reassociate the various bits to its own representation. Exposing it via QMP commands is a lot better though. If the tool wants to ignore it and guess based on various combinations of other commands, more power to it. I don't see why it doesn't work. Please explain. 1) guest eject 2) qemu posts eject event 3) qemu acknowledges eject to the guest 4) management tool sees eject event and updates guest config There's a race between 3 4. It can only be addressed by interposing 4 between 2 and 3 OR making qemu persist this state between 2 and 3 such that the management tool
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/24/2011 07:45 PM, Anthony Liguori wrote: Yeah, it feels like we're introducing QEMU level RAID. We already did, with sheepdog (well the RAID code is really outside qemu itself). At what point are we going to add RAID5 support and not just RAID1. And it certainly begs the question of whether this use-case can be satisfied by just using Linux's RAID stack leaving one drive degraded and enabling the other drive when the time comes to fail over. The magic word is btrfs, we already have atomic live copy outside qemu ('cp --reflink'). If you also want an atomic switch you have to move it inside qemu so you can switch to the new file. Using one of the Linux RAID-1 implementations (md, dm, or drbd) is also possible but these are really designed for block devices, not files. -- error compiling committee.c: too many arguments to function
[Qemu-devel] Re: [PATCH v2 upstream 07/22] add assertions on the owner of a QemuMutex
On 2011-02-26 16:40, Paolo Bonzini wrote: These are already present in the Win32 implementation, add them to the pthread wrappers as well. Use PTHREAD_MUTEX_ERRORCHECK for mutex operations, and track the owner separately for cond_signal/broadcast. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- qemu-thread-posix.c | 23 +-- qemu-thread-posix.h |1 + 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c index e307773..a4c6e25 100644 --- a/qemu-thread-posix.c +++ b/qemu-thread-posix.c @@ -16,9 +16,12 @@ #include time.h #include signal.h #include stdint.h +#include assert.h #include string.h #include qemu-thread.h +static pthread_t pthread_null; + static void error_exit(int err, const char *msg) { fprintf(stderr, qemu: %s: %s\n, msg, strerror(err)); @@ -28,8 +31,13 @@ static void error_exit(int err, const char *msg) void qemu_mutex_init(QemuMutex *mutex) { int err; +pthread_mutexattr_t mutexattr; -err = pthread_mutex_init(mutex-lock, NULL); +mutex-owner = pthread_null; +pthread_mutexattr_init(mutexattr); +pthread_mutexattr_settype(mutexattr, PTHREAD_MUTEX_ERRORCHECK); +err = pthread_mutex_init(mutex-lock, mutexattr); +pthread_mutexattr_destroy(mutexattr); if (err) error_exit(err, __func__); } @@ -48,13 +56,20 @@ void qemu_mutex_lock(QemuMutex *mutex) int err; err = pthread_mutex_lock(mutex-lock); +mutex-owner = pthread_self(); if (err) error_exit(err, __func__); } int qemu_mutex_trylock(QemuMutex *mutex) { -return pthread_mutex_trylock(mutex-lock); +int err; +err = pthread_mutex_trylock(mutex-lock); +if (err == 0) { +mutex-owner = pthread_self(); +} + +return !!err; } static void timespec_add_ms(struct timespec *ts, uint64_t msecs) @@ -85,6 +100,7 @@ void qemu_mutex_unlock(QemuMutex *mutex) { int err; +mutex-owner = pthread_null; err = pthread_mutex_unlock(mutex-lock); if (err) error_exit(err, __func__); @@ -130,7 +146,10 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex) { int err; +assert(pthread_equal(mutex-owner, pthread_self())); +mutex-owner = pthread_null; err = pthread_cond_wait(cond-cond, mutex-lock); Though POSIX is not 100% explicit on this, every sane pthread_cond_wait implementation will apply the same error checking as on pthread_mutex_unlock when the given mutex is of PTHREAD_MUTEX_ERRORCHECK. So, this assert is actually redundant as well. Now that we are left without any assertions, I start wondering about one of the original missions: enforce qemu_cond_signal/broadcast to be called under a mutex. What about extending those services with a mutex argument and applying the assert there? Could become a static-inline wrapper so that the argument is optimized away if assert() is inactive. Jan signature.asc Description: OpenPGP digital signature
[Qemu-devel] Re: [PATCH v2 upstream 19/22] move blocking of signals to qemu_signalfd_init
On 2011-02-26 16:40, Paolo Bonzini wrote: Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cpus.c | 87 ++- 1 files changed, 36 insertions(+), 51 deletions(-) diff --git a/cpus.c b/cpus.c index 32e9352..8c440f1 100644 --- a/cpus.c +++ b/cpus.c @@ -346,11 +346,37 @@ static void sigfd_handler(void *opaque) } } -static int qemu_signalfd_init(sigset_t mask) +static int qemu_signalfd_init(void) { int sigfd; +sigset_t set; -sigfd = qemu_signalfd(mask); +#ifdef CONFIG_IOTHREAD +/* SIGUSR2 used by posix-aio-compat.c */ +sigemptyset(set); +sigaddset(set, SIGUSR2); +pthread_sigmask(SIG_UNBLOCK, set, NULL); Didn't you want to rename the function for the sake of non-signalfd blocks like above? + +sigemptyset(set); +sigaddset(set, SIGIO); +sigaddset(set, SIGALRM); +sigaddset(set, SIG_IPI); +sigaddset(set, SIGBUS); +pthread_sigmask(SIG_BLOCK, set, NULL); +#else +sigemptyset(set); This line is shared and can be moved out of the #ifdef. +sigaddset(set, SIGBUS); +if (kvm_enabled()) { +/* + * We need to process timer signals synchronously to avoid a race + * between exit_request check and KVM vcpu entry. + */ +sigaddset(set, SIGIO); +sigaddset(set, SIGALRM); +} +#endif + +sigfd = qemu_signalfd(set); if (sigfd == -1) { fprintf(stderr, failed to create signalfd\n); return -errno; @@ -438,6 +464,12 @@ static void qemu_event_increment(void) static void qemu_kvm_eat_signals(CPUState *env) { } + +static int qemu_signalfd_init(void) +{ +return 0; +} + #endif /* _WIN32 */ #ifndef CONFIG_IOTHREAD @@ -471,39 +503,14 @@ static void qemu_kvm_init_cpu_signals(CPUState *env) #endif } -#ifndef _WIN32 -static sigset_t block_synchronous_signals(void) -{ -sigset_t set; - -sigemptyset(set); -sigaddset(set, SIGBUS); -if (kvm_enabled()) { -/* - * We need to process timer signals synchronously to avoid a race - * between exit_request check and KVM vcpu entry. - */ -sigaddset(set, SIGIO); -sigaddset(set, SIGALRM); -} - -return set; -} -#endif - int qemu_init_main_loop(void) { -#ifndef _WIN32 -sigset_t blocked_signals; int ret; -blocked_signals = block_synchronous_signals(); - -ret = qemu_signalfd_init(blocked_signals); +ret = qemu_signalfd_init(); if (ret) { return ret; } -#endif qemu_init_sigbus(); @@ -651,35 +658,13 @@ static void qemu_tcg_init_cpu_signals(void) pthread_sigmask(SIG_UNBLOCK, set, NULL); } -static sigset_t block_io_signals(void) -{ -sigset_t set; - -/* SIGUSR2 used by posix-aio-compat.c */ -sigemptyset(set); -sigaddset(set, SIGUSR2); -pthread_sigmask(SIG_UNBLOCK, set, NULL); - -sigemptyset(set); -sigaddset(set, SIGIO); -sigaddset(set, SIGALRM); -sigaddset(set, SIG_IPI); -sigaddset(set, SIGBUS); -pthread_sigmask(SIG_BLOCK, set, NULL); - -return set; -} - int qemu_init_main_loop(void) { int ret; -sigset_t blocked_signals; qemu_init_sigbus(); -blocked_signals = block_io_signals(); - -ret = qemu_signalfd_init(blocked_signals); +ret = qemu_signalfd_init(); if (ret) { return ret; } Beside the minor nits, this looks good. Jan signature.asc Description: OpenPGP digital signature
[Qemu-devel] Re: [PATCH v2 uq/master 00/22] Win32 iothread support
On 2011-02-26 16:39, Paolo Bonzini wrote: After gathering the comments about the two series I sent separately, here is the full series for Win32 iothread support, ready to be applied to uq/master. Patches 1 to 5 are generic Win32 improvements, including the qemu-thread implementation. Because of complex dependencies, I think it's better if this part is also routed through uq/master. Agreed. This also ensures that the non-Win32 parts receive autotest blessing before being applied. Patches 6 to 8 are generic threading improvements, including using PTHREAD_MUTEX_ERRORCHECK as suggested by Jan. Patches 9 to 17 eliminate polling, replacing condition variable timedwait with wait. Patch 18 removes a redundant condition from the TCG cpu_exec_all function. Patches 19 to 21 add all necessary stubs to make iothread compile with Win32, except the IPI calls. These are provided by patch 22. Tested on Wine and Linux, not on real Windows. The series introduces a dependency on Windows 2K or newer. I don't think either 95/98/ME or Windows NT 3.x are reasonable host systems for QEMU, anyway. I incorporated all suggestions from Jan, including his renaming patch for qemu_*_is_self, and included Aurelien's sh4 tweak to cpu_halted. #ifndef _WIN32 Except for the minor remarks, looks very good to me. #else Can't asses. #endif Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 11:10 AM, Avi Kivity wrote: On 02/24/2011 07:58 PM, Anthony Liguori wrote: If you move the cdrom to a different IDE channel, you have to update the stateful non-config file. Whereas if you do $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img $ qemu -cdrom ~/foo-media-tray.img the cd-rom tray state will be tracked in the image file. Yeah, but how do you move it? There is no need to move the file at all. Simply point the new drive at the media tray. If you do a remove/add through QMP, then the config file will reflect things just fine. If all access to the state file is through QMP then it becomes more palatable. A bit on that later. If you want to do it outside of QEMU, then you can just ignore the config file and manage all of the state yourself. But it's never going to work as well (it will be racy) and you're pushing a tremendous amount of knowledge that ultimately belongs in QEMU (what state needs to persist) to something that isn't QEMU which means it's probably not going to be done correctly. I know you're a big fan of the omnipotent management tool but my experience has been that we need to help the management tooling folks more by expecting less from them. I thought that's what I'm doing by separating the state out. It's easy for management to assemble configuration from their database and convert it into a centralized representation (like a qemu command line). It's a lot harder to disassemble a central state representation and move it back to the database. Using QMP is better than directly accessing the state file since qemu does the disassembly for you (provided the command references the device using its normal path, not some random key). The file just becomes a way to survive a crash, and all management needs to know about is to make it available and back it up. But it means that everything must be done via QMP, including assembly of the machine, otherwise the state file can become stale. Separating the state out to the device is even easier, since management is already expected to take care of disk images. All that's needed is to create the media tray image once, then you can forget about it completely. Again the question is who is the authoritative source of the configuration. Is it the management tool or is it qemu? QEMU. No question about it. At any point in time, we are the authoritative source of what the guest's configuration is. There's no doubt about it. A management tool can try to keep up with us, but ultimately we are the only ones that know for sure. We have all of this information internally. Just persisting it is not a major architectural change. It's something we should have been doing (arguably) from the very beginning. That's a huge divergence from how management tools are written. Currently they contain the required guest configuration, a representation of what's the current live configuration, and they issue monitor commands to move the live configuration towards the required configuration (or just generate a qemu command line). What you're describing is completely different, I'm not even sure what it is. The management tool already has to keep track of (the optional parts of) the guest device tree. It cannot start reading the stateful non-config file at random points in time. So all that is left is the guest controlled portions of the device tree, which are pretty rare, and random events like live-copy migration. I think that introducing a new authoritative source of information will create a lot of problems. QEMU has always been the authoritative source. Nothing new has been introduced. We never persisted the machine's configuration which meant management tools had to try to aggressively keep up with us which is intrinsically error prone. Fixing this will only improve existing management tools. If you look at management tools, they believe they are the authoritative source of configuration information (not guest state, which is more or less ignored). Right, but we should make it easy, not hard. Yeah, I fail to see how this makes it hard. We conveniently are saying, hey, this is all the state that needs to be persisted. We'll persist it for you if you want, otherwise, we'll expose it in a central location. The state-in-a-file is just a blob. Don't expect the tool to parse it and reassociate the various bits to its own representation. Exposing it via QMP commands is a lot better though. If the tool wants to ignore it and guess based on various combinations of other commands, more power to it. I don't see why it doesn't work. Please explain. 1) guest eject 2) qemu posts eject event 3) qemu acknowledges eject to the guest 4) management tool sees eject event and updates guest config There's a race between 3 4. It can only be addressed by interposing 4 between 2 and 3 OR making qemu persist this state between 2 and 3 such that the management tool can reliably query it. If it is my cd-rom tray block
Re: [Qemu-devel] Re: KVM call agenda for Jan 25
On Sat, Feb 26, 2011 at 9:50 PM, Dushyant Bansal cs5070...@cse.iitd.ac.in wrote: Disk block size is usually 512 bytes and in qemu-img, sector size is also 512B. And, this change would copy n sectors even if only one of them actually contains data (while cp checks and copies one block(=sector) at a time). Therefore, it will end up writing more data than cp. cp(1) from GNU coreutils 8.5 works in units of 32 KB on my system. It reads 32 KB and checks for all zeroes. If there are all zeroes, it seeks ahead 32 KB in the output file. Otherwise it writes the entire 32 KB. You can check what cp(1) is doing using strace(1). virtual size: 10G (10737418240 bytes) disk size: 569M convert- original time 0m52.522s convert- modified (resultant disk size: 5.3G) time 2m12.744s cp time 0m51.724s (same disk size) --- virtual size: 10G (10737418240 bytes) disk size: 3.6G convert- original time 1m52.249s convert- modified (resultant disk size: 7.1G) time 3m2.891s cp time 1m55.320s (same disk size) --- In these results, we can see that resultant disk size has increased. If I'm reading this correctly then qemu-img convert is within a few seconds of cp(1) for you? Stefan
Re: [Qemu-devel] Re: [PATCH] Split machine creation from the main loop
On 02/24/2011 07:25 PM, Anthony Liguori wrote: Is it really necessary? What's blocking us from initializing chardevs early? Well We initialize all chardevs at once right now and what set of chardevs there are depends on the machine (by the way defaults are applied). You could initialize chardevs in two stages although that requires quite a bit of additional complexity. We could initialize chardevs on demand - that should resolve any dependencies? It would be a pity to divorce the monitor from chardevs, they're really flexible. Couple considerations: 1) chardevs don't support multiple simultaneous connections. I view this as a blocker for QMP. What do you mean by that? Something like ,server which keeps on listening after it a connection is established? 2) Because chardevs don't support multiple connections, we can't reasonably hook on things like connect/disconnect which means that fd's sent via SCM_RIGHTs have to be handled in a very special way. By going outside of the chardev layer, we can let fd's via SCM_RIGHTS queue up naturally and have getfd/setfd refer to the fd at the top of the queue. It makes it quite a bit easier to work with (I believe Daniel had actually requested this a while ago). I really don't follow... what's the connection between SCM_RIGHTS and multiple connections? 3) By treating QMP as a special case, we don't have to treat chardevs overall as a special case. This feels more right to me although I can't say I have a strong opinion formed yet. 2) Make qemu_machine_init() take no parameters and just reference global state. 3) Teach all QMP functions to behave themselves if called before qemu_machine_init() 4) Introduce QMP function to call qemu_machine_init() An alternative is to remove all guest-visible content from qemu_machine_init(). So machine-init() would take no parameters and only build the static devices (power supply?). Everything else would be hot-plugged (perhaps some would fail if the machine was started - cold-plug only). All that qemu_machine_init() is is guest-visible content. That's the point of refactoring this. Sorry, poorly phrased. Configurable guest visible content. (6) can be started right now. (1) comes with the QAPI merge. (2) is pretty easy to do after applying this patch. (3) is probably something that can be done shortly after (1). (4) and (5) really require everything but (6) to be in place before we can meaningful do it. I think we can lay out much of the ground work for this in 0.15 and I think we can have a total conversion realistically for 0.16. That means that by EOY, we could invoke QEMU with no options and do everything through QMP. It's something that I've agitated for a long while, but when I see all the work needed, I'm not sure it's cost effective. There's a lot of secondary benefits that come from doing this. QMP becomes a much stronger interface. A lot of operations that right now are only specifiable by the command line become dynamic which mitigates reboots in the long term. Only the hot-pluggable ones. It also lays the ground work for a fully decoupled device model whereas the only interface between the devices and the outside world is a subset of QMP (think seccomp()). Whether creating a machine with no command line options is high value is probably irrelevant. I think we want to go in this direction regardless. I agree it's a good thing. -- error compiling committee.c: too many arguments to function
[Qemu-devel] [Bug 725991] [NEW] cpu-exec.c:766: handle_cpu_signal: Assertion failed.
Public bug reported: qemu-user 0.14.0 Host: amd64 (Phenom X6) host, Gentoo Linux Guest: armv6l Gentoo Linux Full error message: cpu-exec.c:766: handle_cpu_signal: Assertion `({ unsigned long __guest = (unsigned long)(address) - guest_base; __guest (1ul 32); })' failed. Happened in gcc process, no reliable steps to reproduce. ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/725991 Title: cpu-exec.c:766: handle_cpu_signal: Assertion failed. Status in QEMU: New Bug description: qemu-user 0.14.0 Host: amd64 (Phenom X6) host, Gentoo Linux Guest: armv6l Gentoo Linux Full error message: cpu-exec.c:766: handle_cpu_signal: Assertion `({ unsigned long __guest = (unsigned long)(address) - guest_base; __guest (1ul 32); })' failed. Happened in gcc process, no reliable steps to reproduce.
[Qemu-devel] Re: [PATCH 1/3] kvm-unit-tests: add x86 port io accessors
On 02/24/2011 11:48 PM, Anthony Liguori wrote: Signed-off-by: Anthony Liguorialigu...@us.ibm.com diff --git a/lib/x86/io.h b/lib/x86/io.h new file mode 100644 index 000..bd6341c --- /dev/null +++ b/lib/x86/io.h @@ -0,0 +1,40 @@ +#ifndef IO_H +#define IO_H + +static inline unsigned char inb(unsigned short port) +{ +unsigned char value; +asm volatile(inb %w1, %0 : =a (value) : Nd (port)); +return value; +} Are those %[wb] really needed? gcc should do the right thing based on the argument type. Might as well put all of that into processor.h. -- error compiling committee.c: too many arguments to function
[Qemu-devel] Re: [PATCH 2/3] kvm-unit-tests: do not set level sensitive irq when initializing the PIC
On 02/24/2011 11:48 PM, Anthony Liguori wrote: I'm not sure if this was intentional but the QEMU i8259 does not support this flag. I haven't observed any issues with this but I'll happily admit that I'm not very aware of what I'm doing here. Signed-off-by: Anthony Liguorialigu...@us.ibm.com static u32 xapic_read(unsigned reg) { return *(volatile u32 *)(g_apic + reg); @@ -133,7 +129,7 @@ void ioapic_write_redir(unsigned line, ioapic_redir_entry_t e) void enable_apic(void) { printf(enabling apic\n); -xapic_write(0xf0, 0x1ff); /* spurious vector register */ +xapic_write(0xf0, 0x1f7); /* spurious vector register */ } Not sure what you're doing here. You're changing the APIC Spurious Vector from 0xff to 0xf7? This has nothing to do with the i8259 or level triggeredness as far as I can tell - it just enables the APIC (bit 8) and selects a vector for reporting spurious interrupts (0xff happens to be the reset value). -- error compiling committee.c: too many arguments to function
[Qemu-devel] Re: [PATCH 3/3] kvm-unit-tests: make I/O more friendly to existing QEMU hardware
On 02/24/2011 11:48 PM, Anthony Liguori wrote: Use the serial port for printf() and use the Bochs bios exit port if the testdev port isn't available. This unconditionally switches to use the serial port but tries to use the testdev exit port since that lets you pass an exit status. The only issue I can see is that the serial port is much slower than the testdev port (since we can use rep outsb there). That only matters for access.flat in full output mode (not default). So please keep the old code under #if 0 so we can easily switch back to it. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 03:55 AM, Dor Laor wrote: What about a simpler approach were QMP events will be written to a event-log-file (or even named pipe). The management tool can just use a small daemon that does nothing other than write QMP events to a log. There's no real need for this code to live in QEMU. Since events are posted, even if we wrote it in QEMU, the event wouldn't be guaranteed on disk by the time the event invocation returns. Regards, Anthony Liguori
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 03:10 AM, Avi Kivity wrote: On 02/24/2011 07:58 PM, Anthony Liguori wrote: If you move the cdrom to a different IDE channel, you have to update the stateful non-config file. Whereas if you do $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img $ qemu -cdrom ~/foo-media-tray.img the cd-rom tray state will be tracked in the image file. Yeah, but how do you move it? There is no need to move the file at all. Simply point the new drive at the media tray. No, I was asking, how do you move the cdrom to a different IDE channel. Are you using QMP? Are you changing the command line arguments? If you do a remove/add through QMP, then the config file will reflect things just fine. If all access to the state file is through QMP then it becomes more palatable. A bit on that later. As I think I've mentioned before, I hadn't really thought about an opaque state file but I'm not necessary opposed to it. I don't see an obvious advantage to making it opaque but I agree it should be accessible via QMP. If you want to do it outside of QEMU, then you can just ignore the config file and manage all of the state yourself. But it's never going to work as well (it will be racy) and you're pushing a tremendous amount of knowledge that ultimately belongs in QEMU (what state needs to persist) to something that isn't QEMU which means it's probably not going to be done correctly. I know you're a big fan of the omnipotent management tool but my experience has been that we need to help the management tooling folks more by expecting less from them. I thought that's what I'm doing by separating the state out. It's easy for management to assemble configuration from their database and convert it into a centralized representation (like a qemu command line). It's a lot harder to disassemble a central state representation and move it back to the database. Using QMP is better than directly accessing the state file since qemu does the disassembly for you (provided the command references the device using its normal path, not some random key). The file just becomes a way to survive a crash, and all management needs to know about is to make it available and back it up. But it means that everything must be done via QMP, including assembly of the machine, otherwise the state file can become stale. Separating the state out to the device is even easier, since management is already expected to take care of disk images. All that's needed is to create the media tray image once, then you can forget about it completely. Except that instead of having one state file, we might have a dozen additional device state files. Again the question is who is the authoritative source of the configuration. Is it the management tool or is it qemu? QEMU. No question about it. At any point in time, we are the authoritative source of what the guest's configuration is. There's no doubt about it. A management tool can try to keep up with us, but ultimately we are the only ones that know for sure. We have all of this information internally. Just persisting it is not a major architectural change. It's something we should have been doing (arguably) from the very beginning. That's a huge divergence from how management tools are written. This is one of the reasons why management tooling around QEMU needs quite a bit of improving. There is simply no way a management tool can do a good job of being an authoritative source of configuration. The races we're discussion is a good example of why. But beyond those races, QEMU is the only entity that knows with certainty what bits of information are important to persist in order to preserve a guest across shutdown/restart. The fact that we've punted this problem for so long has only ensured that management tools are either intrinsically broken or only support the most minimal subset of functionality we actually support. Currently they contain the required guest configuration, a representation of what's the current live configuration, and they issue monitor commands to move the live configuration towards the required configuration (or just generate a qemu command line). What you're describing is completely different, I'm not even sure what it is. Management tools shouldn't have to think about how the monitor commands they issue impact the invocation options of QEMU. The management tool already has to keep track of (the optional parts of) the guest device tree. It cannot start reading the stateful non-config file at random points in time. So all that is left is the guest controlled portions of the device tree, which are pretty rare, and random events like live-copy migration. I think that introducing a new authoritative source of information will create a lot of problems. QEMU has always been the authoritative source. Nothing new has been introduced. We never persisted
Re: [Qemu-devel] Re: [PATCH 1/3] kvm-unit-tests: add x86 port io accessors
On 02/27/2011 06:44 AM, Avi Kivity wrote: On 02/24/2011 11:48 PM, Anthony Liguori wrote: Signed-off-by: Anthony Liguorialigu...@us.ibm.com diff --git a/lib/x86/io.h b/lib/x86/io.h new file mode 100644 index 000..bd6341c --- /dev/null +++ b/lib/x86/io.h @@ -0,0 +1,40 @@ +#ifndef IO_H +#define IO_H + +static inline unsigned char inb(unsigned short port) +{ +unsigned char value; +asm volatile(inb %w1, %0 : =a (value) : Nd (port)); +return value; +} Are those %[wb] really needed? gcc should do the right thing based on the argument type. It's just a little extra type safety. Might as well put all of that into processor.h. Seems reasonable. Regards, Anthony Liguori
[Qemu-devel] [PATCH] simpletrace: Thread-safe tracing
Trace events outside the global mutex cannot be used with the simple trace backend since it is not thread-safe. There is no check to prevent them being enabled so people sometimes learn this the hard way. This patch restructures the simple trace backend with a ring buffer suitable for multiple concurrent writers. A writeout thread empties the trace buffer when threshold fill levels are reached. Should the writeout thread be unable to keep up with trace generation, records will simply be dropped. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- docs/tracing.txt |5 - simpletrace.c| 297 +++--- simpletrace.h|8 ++ vl.c | 16 +-- 4 files changed, 207 insertions(+), 119 deletions(-) diff --git a/docs/tracing.txt b/docs/tracing.txt index a6cc56f..f15069c 100644 --- a/docs/tracing.txt +++ b/docs/tracing.txt @@ -141,11 +141,6 @@ source tree. It may not be as powerful as platform-specific or third-party trace backends but it is portable. This is the recommended trace backend unless you have specific needs for more advanced backends. -Warning: the simple backend is not thread-safe so only enable trace events -that are executed while the global mutex is held. Much of QEMU meets this -requirement but some utility functions like qemu_malloc() or thread-related -code cannot be safely traced using the simple backend. - Monitor commands * info trace diff --git a/simpletrace.c b/simpletrace.c index 9ea0d1f..9403ea2 100644 --- a/simpletrace.c +++ b/simpletrace.c @@ -12,6 +12,9 @@ #include stdint.h #include stdio.h #include time.h +#include signal.h +#include pthread.h +#include qerror.h #include qemu-timer.h #include trace.h @@ -24,6 +27,9 @@ /** Trace file version number, bump if format changes */ #define HEADER_VERSION 0 +/** Trace record is valid */ +#define TRACE_RECORD_VALID ((uint64_t)1 63) + /** Trace buffer entry */ typedef struct { uint64_t event; @@ -37,126 +43,130 @@ typedef struct { } TraceRecord; enum { -TRACE_BUF_LEN = 64 * 1024 / sizeof(TraceRecord), +TRACE_BUF_LEN = 4096, +TRACE_BUF_FLUSH_THRESHOLD = TRACE_BUF_LEN / 4, }; +/* + * Trace records are written out by a dedicated thread. The thread waits for + * records to become available, writes them out, and then waits again. + */ +static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER; +static pthread_cond_t trace_available_cond = PTHREAD_COND_INITIALIZER; +static pthread_cond_t trace_empty_cond = PTHREAD_COND_INITIALIZER; +static bool trace_available; +static bool trace_writeout_enabled; + static TraceRecord trace_buf[TRACE_BUF_LEN]; static unsigned int trace_idx; static FILE *trace_fp; static char *trace_file_name = NULL; -static bool trace_file_enabled = false; -void st_print_trace_file_status(FILE *stream, int (*stream_printf)(FILE *stream, const char *fmt, ...)) +/** + * Read a trace record from the trace buffer + * + * @idx Trace buffer index + * @record Trace record to fill + * + * Returns false if the record is not valid. + */ +static bool get_trace_record(unsigned int idx, TraceRecord *record) { -stream_printf(stream, Trace file \%s\ %s.\n, - trace_file_name, trace_file_enabled ? on : off); -} +if (!(trace_buf[idx].event TRACE_RECORD_VALID)) { +return false; +} -static bool write_header(FILE *fp) -{ -static const TraceRecord header = { -.event = HEADER_EVENT_ID, -.timestamp_ns = HEADER_MAGIC, -.x1 = HEADER_VERSION, -}; +__sync_synchronize(); /* read memory barrier before accessing record */ -return fwrite(header, sizeof header, 1, fp) == 1; +*record = trace_buf[idx]; +record-event = ~TRACE_RECORD_VALID; +return true; } /** - * set_trace_file : To set the name of a trace file. - * @file : pointer to the name to be set. - * If NULL, set to the default name-pid set at config time. + * Kick writeout thread + * + * @waitWhether to wait for writeout thread to complete */ -bool st_set_trace_file(const char *file) +static void flush_trace_file(bool wait) { -st_set_trace_file_enabled(false); +pthread_mutex_lock(trace_lock); +trace_available = true; +pthread_cond_signal(trace_available_cond); -free(trace_file_name); - -if (!file) { -if (asprintf(trace_file_name, CONFIG_TRACE_FILE, getpid()) 0) { -trace_file_name = NULL; -return false; -} -} else { -if (asprintf(trace_file_name, %s, file) 0) { -trace_file_name = NULL; -return false; -} +if (wait) { +pthread_cond_wait(trace_empty_cond, trace_lock); } -st_set_trace_file_enabled(true); -return true; +pthread_mutex_unlock(trace_lock); } -static void flush_trace_file(void) +static void wait_for_trace_records_available(void) { -/* If the trace file is not open yet,
[Qemu-devel] Re: [PATCH v2 upstream 07/22] add assertions on the owner of a QemuMutex
On 02/27/2011 10:33 AM, Jan Kiszka wrote: Now that we are left without any assertions, I start wondering about one of the original missions: enforce qemu_cond_signal/broadcast to be called under a mutex. What about extending those services with a mutex argument and applying the assert there? That is one of the patches in my queue that I haven't submitted yet. :) Paolo
[Qemu-devel] Re: [PATCH v2 upstream 19/22] move blocking of signals to qemu_signalfd_init
On 02/27/2011 10:41 AM, Jan Kiszka wrote: +#ifdef CONFIG_IOTHREAD +/* SIGUSR2 used by posix-aio-compat.c */ +sigemptyset(set); +sigaddset(set, SIGUSR2); +pthread_sigmask(SIG_UNBLOCK,set, NULL); Didn't you want to rename the function for the sake of non-signalfd blocks like above? Right. + +sigemptyset(set); +sigaddset(set, SIGIO); +sigaddset(set, SIGALRM); +sigaddset(set, SIG_IPI); +sigaddset(set, SIGBUS); +pthread_sigmask(SIG_BLOCK,set, NULL); +#else +sigemptyset(set); This line is shared and can be moved out of the #ifdef. It's shared but for different purposes (UNBLOCK for iothread, BLOCK/signalfd for !iothread), so I decided not to hoist it out. Paolo
[Qemu-devel] Re: [PATCH] simpletrace: Thread-safe tracing
On 02/27/2011 03:58 PM, Stefan Hajnoczi wrote: + * Trace records are written out by a dedicated thread. The thread waits for + * records to become available, writes them out, and then waits again. + */ +static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER; +static pthread_cond_t trace_available_cond = PTHREAD_COND_INITIALIZER; +static pthread_cond_t trace_empty_cond = PTHREAD_COND_INITIALIZER; +static bool trace_available; +static bool trace_writeout_enabled; Please use QemuThread. + +pthread_attr_init(attr); +pthread_attr_setdetachstate(attr, PTHREAD_CREATE_DETACHED); -tp = find_trace_event_by_name(tname); -if (tp) { -tp-state = tstate; -return true; +sigfillset(set); +pthread_sigmask(SIG_SETMASK, set, oldset); +ret = pthread_create(thread, attr, writeout_thread, NULL); +pthread_sigmask(SIG_SETMASK, oldset, NULL); This is also taken care of by QemuThread. Paolo
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 04:00 PM, Anthony Liguori wrote: On 02/27/2011 03:10 AM, Avi Kivity wrote: On 02/24/2011 07:58 PM, Anthony Liguori wrote: If you move the cdrom to a different IDE channel, you have to update the stateful non-config file. Whereas if you do $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img $ qemu -cdrom ~/foo-media-tray.img the cd-rom tray state will be tracked in the image file. Yeah, but how do you move it? There is no need to move the file at all. Simply point the new drive at the media tray. No, I was asking, how do you move the cdrom to a different IDE channel. Are you using QMP? Are you changing the command line arguments? Yes. If we're doing hot-move (not really relevant to ide-cd) then you'd use QMP. If you're editing a virtual machine that is down, or scheduling a change for the next reboot, then you're using command line arguments (or cold-plugging into a stopped guest). Requiring management to remember the old configuration and issue delta commands to move the device for the cold-plug case is increased complexity IMO. If you do a remove/add through QMP, then the config file will reflect things just fine. If all access to the state file is through QMP then it becomes more palatable. A bit on that later. As I think I've mentioned before, I hadn't really thought about an opaque state file but I'm not necessary opposed to it. I don't see an obvious advantage to making it opaque but I agree it should be accessible via QMP. The advantage is that we keep the management tool talking to one interface (I don't think we should prevent users from interpreting it, just make it unnecessary). I thought that's what I'm doing by separating the state out. It's easy for management to assemble configuration from their database and convert it into a centralized representation (like a qemu command line). It's a lot harder to disassemble a central state representation and move it back to the database. Using QMP is better than directly accessing the state file since qemu does the disassembly for you (provided the command references the device using its normal path, not some random key). The file just becomes a way to survive a crash, and all management needs to know about is to make it available and back it up. But it means that everything must be done via QMP, including assembly of the machine, otherwise the state file can become stale. Separating the state out to the device is even easier, since management is already expected to take care of disk images. All that's needed is to create the media tray image once, then you can forget about it completely. Except that instead of having one state file, we might have a dozen additional device state files. That is fine. We already have one state file per block device. QEMU. No question about it. At any point in time, we are the authoritative source of what the guest's configuration is. There's no doubt about it. A management tool can try to keep up with us, but ultimately we are the only ones that know for sure. We have all of this information internally. Just persisting it is not a major architectural change. It's something we should have been doing (arguably) from the very beginning. That's a huge divergence from how management tools are written. This is one of the reasons why management tooling around QEMU needs quite a bit of improving. There is simply no way a management tool can do a good job of being an authoritative source of configuration. The races we're discussion is a good example of why. What we're discussing is not configuration. It is non-volatile state. Configuration comes from the user; state comes from the guest (the management tool may edit state; but the guest cannot edit the configuration). I agree 100% the management tool cannot be the authoritative source of state. My position is: - the management tool should be 100% in control of configuration (how the guest is put together from its components) - qemu should be 100% in control of state (memory, disk state, NVRAM in various components, cd-rom eject state, explosive bolts for payload separation, self-destruct mechanism, etc.) - the management tool should have access to state using the same identifiers it used to create the devices that contain the state - it is preferable to store state in the device so that when the configuration changes, state is maintained (like hot-unplug of a network card with NVRAM followed by hot-plug of the same card). - the angular momentum of the planet we (presumably) are on won't change, whatever we do [1] But beyond those races, QEMU is the only entity that knows with certainty what bits of information are important to persist in order to preserve a guest across shutdown/restart. The fact that we've punted this problem for so long has only ensured that management tools are either intrinsically broken or
Re: [Qemu-devel] Re: [PATCH 1/3] kvm-unit-tests: add x86 port io accessors
On 02/27/2011 04:01 PM, Anthony Liguori wrote: On 02/27/2011 06:44 AM, Avi Kivity wrote: On 02/24/2011 11:48 PM, Anthony Liguori wrote: Signed-off-by: Anthony Liguorialigu...@us.ibm.com diff --git a/lib/x86/io.h b/lib/x86/io.h new file mode 100644 index 000..bd6341c --- /dev/null +++ b/lib/x86/io.h @@ -0,0 +1,40 @@ +#ifndef IO_H +#define IO_H + +static inline unsigned char inb(unsigned short port) +{ +unsigned char value; +asm volatile(inb %w1, %0 : =a (value) : Nd (port)); +return value; +} Are those %[wb] really needed? gcc should do the right thing based on the argument type. It's just a little extra type safety. Yeah, but those constraints aren't really documented. Linux does use them in a handful of places so they're unlikely to go away though. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 03:49 PM, Anthony Liguori wrote: On 02/27/2011 03:55 AM, Dor Laor wrote: What about a simpler approach were QMP events will be written to a event-log-file (or even named pipe). The management tool can just use a small daemon that does nothing other than write QMP events to a log. There's no real need for this code to live in QEMU. IIUC in case the management daemon will run qemu using named pipe for qmp it will happen automatically. If you agree to this approach it will simplify the more complex config file option (although it is nice to have as independent option for single hosts managed by simpler mgmts) Since events are posted, even if we wrote it in QEMU, the event wouldn't be guaranteed on disk by the time the event invocation returns. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH] simpletrace: Thread-safe tracing
On 02/27/2011 04:58 PM, Stefan Hajnoczi wrote: Trace events outside the global mutex cannot be used with the simple trace backend since it is not thread-safe. There is no check to prevent them being enabled so people sometimes learn this the hard way. This patch restructures the simple trace backend with a ring buffer suitable for multiple concurrent writers. A writeout thread empties the trace buffer when threshold fill levels are reached. Should the writeout thread be unable to keep up with trace generation, records will simply be dropped. It would be good to have an indication of the fact that records were dropped in the file. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Stupid question?
On 02/27/2011 08:22 AM, Amos Kong wrote: On Sun, Feb 27, 2011 at 6:22 AM, Dushyant Bansal cs5070...@cse.iitd.ac.in wrote: On Sunday 27 February 2011 03:45 AM, Frans de Boer wrote: Hi all, This is the only QEMU list, so I put my question her. How can I copy the contents of a *.raw image to a real HD partition or vice versa. I often use a virtual image using qemu/kvm to test some OS's and want to migrate them to a HD partition if the time is right. Also, I like to preserve old partitions for future use/reference. Any suggestions? Regards, Frans. You can mount it using losetup. http://blog.piotrj.org/2009/03/mounting-raw-kvmqemu-image.html losetup is fine, can we directly use dd ? dd words for a raw file; qemu-img works for all qemu supported formats. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Stupid question?
kpartx is easier than losetup for getting at individual partitions inside a raw image. http://linux.die.net/man/8/kpartx Stefan
Re: [Qemu-devel] Re: [PATCH] simpletrace: Thread-safe tracing
On Sun, Feb 27, 2011 at 3:13 PM, Paolo Bonzini pbonz...@redhat.com wrote: On 02/27/2011 03:58 PM, Stefan Hajnoczi wrote: + * Trace records are written out by a dedicated thread. The thread waits for + * records to become available, writes them out, and then waits again. + */ +static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER; +static pthread_cond_t trace_available_cond = PTHREAD_COND_INITIALIZER; +static pthread_cond_t trace_empty_cond = PTHREAD_COND_INITIALIZER; +static bool trace_available; +static bool trace_writeout_enabled; Please use QemuThread. The tracing code itself should use avoid core QEMU code. Otherwise we can't trace QemuThread - we'd have an infinite loop. Stefan
Re: [Qemu-devel] [PATCH] simpletrace: Thread-safe tracing
On Sun, Feb 27, 2011 at 4:14 PM, Avi Kivity a...@redhat.com wrote: On 02/27/2011 04:58 PM, Stefan Hajnoczi wrote: Trace events outside the global mutex cannot be used with the simple trace backend since it is not thread-safe. There is no check to prevent them being enabled so people sometimes learn this the hard way. This patch restructures the simple trace backend with a ring buffer suitable for multiple concurrent writers. A writeout thread empties the trace buffer when threshold fill levels are reached. Should the writeout thread be unable to keep up with trace generation, records will simply be dropped. It would be good to have an indication of the fact that records were dropped in the file. Good idea. Trace files begin with a record that has a special ID. I'll look at extending this either by picking another special ID or by reusing the header record. Stefan
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 10:02 AM, Dor Laor wrote: On 02/27/2011 03:49 PM, Anthony Liguori wrote: On 02/27/2011 03:55 AM, Dor Laor wrote: What about a simpler approach were QMP events will be written to a event-log-file (or even named pipe). The management tool can just use a small daemon that does nothing other than write QMP events to a log. There's no real need for this code to live in QEMU. IIUC in case the management daemon will run qemu using named pipe for qmp it will happen automatically. No, the event model is changing (it was always intended to change though). Events will need explicit registration so it's necessary to have bidirectional communication. So you can't just do -qmp foo -chardev file:event.log,id=foo. You won't actually see most of the events. Regards, Anthony Liguori If you agree to this approach it will simplify the more complex config file option (although it is nice to have as independent option for single hosts managed by simpler mgmts) Since events are posted, even if we wrote it in QEMU, the event wouldn't be guaranteed on disk by the time the event invocation returns. Regards, Anthony Liguori
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/27/2011 09:31 AM, Avi Kivity wrote: On 02/27/2011 04:00 PM, Anthony Liguori wrote: On 02/27/2011 03:10 AM, Avi Kivity wrote: On 02/24/2011 07:58 PM, Anthony Liguori wrote: If you move the cdrom to a different IDE channel, you have to update the stateful non-config file. Whereas if you do $ qemu-img create -f cd-tray -b ~/foo.img ~/foo-media-tray.img $ qemu -cdrom ~/foo-media-tray.img the cd-rom tray state will be tracked in the image file. Yeah, but how do you move it? There is no need to move the file at all. Simply point the new drive at the media tray. No, I was asking, how do you move the cdrom to a different IDE channel. Are you using QMP? Are you changing the command line arguments? Yes. If we're doing hot-move (not really relevant to ide-cd) then you'd use QMP. If you're editing a virtual machine that is down, or scheduling a change for the next reboot, then you're using command line arguments (or cold-plugging into a stopped guest). Requiring management to remember the old configuration and issue delta commands to move the device for the cold-plug case is increased complexity IMO. If you do a remove/add through QMP, then the config file will reflect things just fine. If all access to the state file is through QMP then it becomes more palatable. A bit on that later. As I think I've mentioned before, I hadn't really thought about an opaque state file but I'm not necessary opposed to it. I don't see an obvious advantage to making it opaque but I agree it should be accessible via QMP. The advantage is that we keep the management tool talking to one interface (I don't think we should prevent users from interpreting it, just make it unnecessary). I thought that's what I'm doing by separating the state out. It's easy for management to assemble configuration from their database and convert it into a centralized representation (like a qemu command line). It's a lot harder to disassemble a central state representation and move it back to the database. Using QMP is better than directly accessing the state file since qemu does the disassembly for you (provided the command references the device using its normal path, not some random key). The file just becomes a way to survive a crash, and all management needs to know about is to make it available and back it up. But it means that everything must be done via QMP, including assembly of the machine, otherwise the state file can become stale. Separating the state out to the device is even easier, since management is already expected to take care of disk images. All that's needed is to create the media tray image once, then you can forget about it completely. Except that instead of having one state file, we might have a dozen additional device state files. That is fine. We already have one state file per block device. QEMU. No question about it. At any point in time, we are the authoritative source of what the guest's configuration is. There's no doubt about it. A management tool can try to keep up with us, but ultimately we are the only ones that know for sure. We have all of this information internally. Just persisting it is not a major architectural change. It's something we should have been doing (arguably) from the very beginning. That's a huge divergence from how management tools are written. This is one of the reasons why management tooling around QEMU needs quite a bit of improving. There is simply no way a management tool can do a good job of being an authoritative source of configuration. The races we're discussion is a good example of why. What we're discussing is not configuration. It is non-volatile state. Configuration comes from the user; state comes from the guest (the management tool may edit state; but the guest cannot edit the configuration). I agree 100% the management tool cannot be the authoritative source of state. My position is: - the management tool should be 100% in control of configuration (how the guest is put together from its components) - qemu should be 100% in control of state (memory, disk state, NVRAM in various components, cd-rom eject state, explosive bolts for payload separation, self-destruct mechanism, etc.) There simply is not such a clean separation between the two because things that the guest does affects the configuration of the guest. Hot plug, removable media eject, persistent device settings (whether it's CMOS or EEPROM) all disrupt this model. If you really wanted to have this separation, you'd have to be very strict about making all guest settings not be specified in config. You would need to do: qemu-img create -f e1000-eprom -o macaddr=12:23:45:67:78:90 e1000.0.rom qemu-img create -f e1000-eprom -o macaddr=12:23:45:67:78:91 e1000.1.rom qemu -device e1000,id=e1000.0,eeprom=e1000.0.rom -device e1000,id=e1000.1,eeprom=e1000.1.rom And now I need a tool that lets
[Qemu-devel] [PATCH 1/3] w32: Add new directory hierarchy for MinGW extensions
This is a first step which will reduce the number of conditional compilations caused by MinGW. Include files will be added to hosts/w32/include. Signed-off-by: Stefan Weil w...@mail.berlios.de --- configure|1 + hosts/w32/README |1 + 2 files changed, 2 insertions(+), 0 deletions(-) create mode 100644 hosts/w32/README diff --git a/configure b/configure index 3036faf..47779b6 100755 --- a/configure +++ b/configure @@ -474,6 +474,7 @@ if test $mingw32 = yes ; then QEMU_CFLAGS=-DWIN32_LEAN_AND_MEAN -DWINVER=0x501 $QEMU_CFLAGS # enable C99/POSIX format strings (needs mingw32-runtime 3.15 or later) QEMU_CFLAGS=-D__USE_MINGW_ANSI_STDIO=1 $QEMU_CFLAGS + QEMU_CFLAGS=-I\$(SRC_PATH)/hosts/w32/include $QEMU_CFLAGS LIBS=-lwinmm -lws2_32 -liberty -liphlpapi $LIBS prefix=c:/Program Files/Qemu mandir=\${prefix} diff --git a/hosts/w32/README b/hosts/w32/README new file mode 100644 index 000..d5aafe9 --- /dev/null +++ b/hosts/w32/README @@ -0,0 +1 @@ +This directory contains extensions of MinGW which are needed by QEMU. -- 1.7.2.3
[Qemu-devel] [PATCH 2/3] w32: Add macro timersub to sys/time.h
timersub is needed by the latest vnc code. Signed-off-by: Stefan Weil w...@mail.berlios.de --- hosts/w32/include/sys/time.h | 24 1 files changed, 24 insertions(+), 0 deletions(-) create mode 100644 hosts/w32/include/sys/time.h diff --git a/hosts/w32/include/sys/time.h b/hosts/w32/include/sys/time.h new file mode 100644 index 000..94056ff --- /dev/null +++ b/hosts/w32/include/sys/time.h @@ -0,0 +1,24 @@ +/* + * Extensions of MinGW sys/time.h + * + * Copyright (C) 2011 Stefan Weil + * + * This work is licensed under the terms of the GNU LGPL, version 2.1 or later. + * See the COPYING.LIB file in the top-level directory. + * + */ + +#include_next sys/time.h + +#ifndef timersub +/* This is a copy from GNU C Library (GNU LGPL 2.1), sys/time.h. */ +# define timersub(a, b, result) \ + do {\ +(result)-tv_sec = (a)-tv_sec - (b)-tv_sec; \ +(result)-tv_usec = (a)-tv_usec - (b)-tv_usec; \ +if ((result)-tv_usec 0) { \ + --(result)-tv_sec; \ + (result)-tv_usec += 100; \ +} \ + } while (0) +#endif -- 1.7.2.3
[Qemu-devel] [PATCH 3/3] osdep: Remove conditional compilation (fixes w32 compilation)
sys/time.h also exists for MinGW, and it is needed because it declares struct timeval (needed by latest vnc code). Signed-off-by: Stefan Weil w...@mail.berlios.de --- osdep.h |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/osdep.h b/osdep.h index 27eedcf..5a667b4 100644 --- a/osdep.h +++ b/osdep.h @@ -8,9 +8,7 @@ #include sys/signal.h #endif -#ifndef _WIN32 #include sys/time.h -#endif #ifndef glue #define xglue(x, y) x ## y -- 1.7.2.3
[Qemu-devel] Re: [PATCH] simpletrace: Thread-safe tracing
On 02/27/2011 06:02 PM, Stefan Hajnoczi wrote: On Sun, Feb 27, 2011 at 3:13 PM, Paolo Bonzinipbonz...@redhat.com wrote: On 02/27/2011 03:58 PM, Stefan Hajnoczi wrote: + * Trace records are written out by a dedicated thread. The thread waits for + * records to become available, writes them out, and then waits again. + */ +static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER; +static pthread_cond_t trace_available_cond = PTHREAD_COND_INITIALIZER; +static pthread_cond_t trace_empty_cond = PTHREAD_COND_INITIALIZER; +static bool trace_available; +static bool trace_writeout_enabled; Please use QemuThread. The tracing code itself should use avoid core QEMU code. Otherwise we can't trace QemuThread - we'd have an infinite loop. Hmm, right... they'll use stdio to trace Win32 then... :) I was actually thinking more of the code duplication. But do you really need tracing at such a low level? I'd expect tracing wrappers like qemu_lock_mutex_iothread, not mutexes in general. Paolo
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). qemu-kvm 0.14 startup line /usr/bin/kvm -name spaceball,process=spaceball -m 1024 -kernel /boot/bzImage-2.6.37.2-guest -append root=/dev/vda ro -smp 1 -netdev type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice port=5957,disable-ticketing -monitor telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile /var/run/kvm/spaceball.pid host is running vanilla 2.6.37.1 on amd64. Here is the bt # gdb /usr/bin/qemu-system-x86_64 GNU gdb (Gentoo 7.2 p1) 7.2 Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-pc-linux-gnu. For bug reporting instructions, please see: http://bugs.gentoo.org/... Reading symbols from /usr/bin/qemu-system-x86_64...done. (gdb) set args -name spaceball,process=spaceball -m 1024 -kernel /boot/bzImage-2.6.37.2-guest -append root=/dev/vda ro -smp 1 -netdev type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice port=5957,disable-ticketing -monitor telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile /var/run/kvm/spaceball.pid (gdb) run Starting program: /usr/bin/qemu-system-x86_64 -name spaceball,process=spaceball -m 1024 -kernel /boot/bzImage-2.6.37.2-guest -append root=/dev/vda ro -smp 1 -netdev type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice port=5957,disable-ticketing -monitor telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile /var/run/kvm/spaceball.pid [Thread debugging using libthread_db enabled] do_spice_init: starting 0.6.0 spice_server_add_interface: SPICE_INTERFACE_KEYBOARD spice_server_add_interface: SPICE_INTERFACE_MOUSE [New Thread 0x74802710 (LWP 30294)] spice_server_add_interface: SPICE_INTERFACE_QXL [New Thread 0x7fffaacae710 (LWP 30295)] red_worker_main: begin handle_dev_destroy_surfaces: handle_dev_destroy_surfaces: handle_dev_input: start [New Thread 0x7fffaa4ad710 (LWP 30298)] [New Thread 0x7fffa9cac710 (LWP 30299)] [New Thread 0x7fffa94ab710 (LWP 30300)] [New Thread 0x7fffa8caa710 (LWP 30301)] [New Thread 0x7fffa3fff710 (LWP 30302)] [New Thread 0x7fffa37fe710 (LWP 30303)] [New Thread 0x7fffa2ffd710 (LWP 30304)] [New Thread 0x7fffa27fc710 (LWP 30305)] [New Thread 0x7fffa1ffb710 (LWP 30306)] [New Thread 0x7fffa17fa710 (LWP 30307)] reds_handle_main_link: reds_show_new_channel: channel 1:0, connected successfully, over Non Secure link reds_main_handle_message: net test: latency 5.636000 ms, bitrate 11027768 bps (10.516899 Mbps) reds_show_new_channel: channel 2:0, connected successfully, over Non Secure link red_dispatcher_set_peer: handle_dev_input: connect handle_new_display_channel: jpeg disabled handle_new_display_channel: zlib-over-glz disabled reds_show_new_channel: channel 4:0, connected successfully, over Non Secure link red_dispatcher_set_cursor_peer: handle_dev_input: cursor connect reds_show_new_channel: channel 3:0, connected successfully, over Non Secure link inputs_link: [New Thread 0x7fffa07f8710 (LWP 30312)] [New Thread 0x7fff9fff7710 (LWP 30313)] [New Thread 0x7fff9f7f6710 (LWP 30314)] [New Thread 0x7fff9eff5710 (LWP 30315)] [New Thread 0x7fff9e7f4710 (LWP 30316)] [New Thread 0x7fff9dff3710 (LWP 30317)] [New Thread 0x7fff9d7f2710 (LWP 30318)] qemu-system-x86_64: /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed. Program received signal SIGABRT, Aborted. [Switching to Thread 0x74802710 (LWP 30294)] 0x75daa165 in raise () from /lib/libc.so.6 (gdb) (gdb) (gdb) (gdb) (gdb) bt #0 0x75daa165 in raise () from /lib/libc.so.6 #1 0x75dab580 in abort () from /lib/libc.so.6 #2 0x75da3201 in __assert_fail () from /lib/libc.so.6 #3 0x00436f7e in kvm_mutex_unlock () at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724 #4 qemu_mutex_unlock_iothread ()
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On 2011-02-27 20:03, Alon Levy wrote: On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. My concerns regarding other side effects of juggling with global mutex in spice code remain. Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: On 2011-02-27 20:03, Alon Levy wrote: On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. The trees the patch commit message refers to are qemu and qemu-kvm. qemu doesn't even have cpu_single_env. It didn't talk about two qemu-kvm trees. My concerns regarding other side effects of juggling with global mutex in spice code remain. I know there used to be a mutex in spice code and during the upstreaming process it got ditched in favor of the qemu global io mutex. I would have rather deferred this to Gerd since he wrote this, but he is not available atm. Jan
[Qemu-devel] [PATCH] w32: Add support for curses
MinGW optionally includes pdcurses, so add support for it. Signed-off-by: Stefan Weil w...@mail.berlios.de --- configure |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/configure b/configure index 47779b6..ca15632 100755 --- a/configure +++ b/configure @@ -1562,7 +1562,11 @@ fi ## # curses probe -curses_list=-lncurses -lcurses +if test $mingw32 = yes ; then +curses_list=-lpdcurses +else +curses_list=-lncurses -lcurses +fi if test $curses != no ; then curses_found=no -- 1.7.2.3
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On 2011-02-27 20:16, Alon Levy wrote: On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: On 2011-02-27 20:03, Alon Levy wrote: On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. The trees the patch commit message refers to are qemu and qemu-kvm. The same did I. qemu doesn't even have cpu_single_env. Really? Check again. :) It didn't talk about two qemu-kvm trees. My concerns regarding other side effects of juggling with global mutex in spice code remain. I know there used to be a mutex in spice code and during the upstreaming process it got ditched in favor of the qemu global io mutex. I would have rather deferred this to Gerd since he wrote this, but he is not available atm. It's not necessarily bad to drop the io mutex, but it is more tricky than it may appear on first glance. Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Sun, Feb 27, 2011 at 08:27:01PM +0100, Jan Kiszka wrote: On 2011-02-27 20:16, Alon Levy wrote: On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: On 2011-02-27 20:03, Alon Levy wrote: On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. The trees the patch commit message refers to are qemu and qemu-kvm. The same did I. qemu doesn't even have cpu_single_env. Really? Check again. :) Sorry, grepped the wrong repo. I'll send this to qemu-devel too then. It didn't talk about two qemu-kvm trees. My concerns regarding other side effects of juggling with global mutex in spice code remain. I know there used to be a mutex in spice code and during the upstreaming process it got ditched in favor of the qemu global io mutex. I would have rather deferred this to Gerd since he wrote this, but he is not available atm. It's not necessarily bad to drop the io mutex, but it is more tricky than it may appear on first glance. Jan
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Sun, Feb 27, 2011 at 08:27:01PM +0100, Jan Kiszka wrote: On 2011-02-27 20:16, Alon Levy wrote: On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: On 2011-02-27 20:03, Alon Levy wrote: On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. The trees the patch commit message refers to are qemu and qemu-kvm. The same did I. qemu doesn't even have cpu_single_env. Really? Check again. :) It didn't talk about two qemu-kvm trees. My concerns regarding other side effects of juggling with global mutex in spice code remain. I know there used to be a mutex in spice code and during the upstreaming process it got ditched in favor of the qemu global io mutex. I would have rather deferred this to Gerd since he wrote this, but he is not available atm. It's not necessarily bad to drop the io mutex, but it is more tricky than it may appear on first glance. The problem with not dropping it is that we may be in vga mode and create updates synthtically (i.e. qemu created and not driver created) that access the framebuffer and need to be locked so the framebuffer isn't updated at the same time. We drop the mutex only when we are about to call the dispatcher, which basically waits on red_worker (a libspice-server thread) to do some work. red_worker may in turn callback into qxl in qemu, which may try to acquire the lock. (the many may's here are just reflections of the codepaths). Jan
[Qemu-devel] [PATCH] spice/qxl: locking fix for qemu-kvm
From: Gerd Hoffmann kra...@redhat.com qxl needs to release the qemu lock before calling some libspice functions (and re-aquire it later). In upstream qemu qxl can just use qemu_mutex_{unlock,lock}_iothread. In qemu-kvm this doesn't work, qxl needs additionally save+restore the cpu_single_env pointer on unlock+lock. Signed-off-by: Gerd Hoffmann kra...@redhat.com --- This patch fixes the segmentation faults reported from gentoo and fedora on cpu_single_env asserts. --- hw/qxl.c | 37 + ui/spice-display.c | 12 ++-- ui/spice-display.h |6 ++ 3 files changed, 41 insertions(+), 14 deletions(-) diff --git a/hw/qxl.c b/hw/qxl.c index fe4212b..117f7c8 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -125,6 +125,27 @@ static void qxl_reset_memslots(PCIQXLDevice *d); static void qxl_reset_surfaces(PCIQXLDevice *d); static void qxl_ring_set_dirty(PCIQXLDevice *qxl); +/* qemu-kvm locking ... */ +void qxl_unlock_iothread(SimpleSpiceDisplay *ssd) +{ +if (cpu_single_env) { +assert(ssd-env == NULL); +ssd-env = cpu_single_env; +cpu_single_env = NULL; +} +qemu_mutex_unlock_iothread(); +} + +void qxl_lock_iothread(SimpleSpiceDisplay *ssd) +{ +qemu_mutex_lock_iothread(); +if (ssd-env) { +assert(cpu_single_env == NULL); +cpu_single_env = ssd-env; +ssd-env = NULL; +} +} + static inline uint32_t msb_mask(uint32_t val) { uint32_t mask; @@ -662,10 +683,10 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) dprint(d, 1, %s: start%s\n, __FUNCTION__, loadvm ? (loadvm) : ); -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(d-ssd); d-ssd.worker-reset_cursor(d-ssd.worker); d-ssd.worker-reset_image_cache(d-ssd.worker); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(d-ssd); qxl_reset_surfaces(d); qxl_reset_memslots(d); @@ -795,9 +816,9 @@ static void qxl_reset_surfaces(PCIQXLDevice *d) { dprint(d, 1, %s:\n, __FUNCTION__); d-mode = QXL_MODE_UNDEFINED; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(d-ssd); d-ssd.worker-destroy_surfaces(d-ssd.worker); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(d-ssd); memset(d-guest_surfaces.cmds, 0, sizeof(d-guest_surfaces.cmds)); } @@ -866,9 +887,9 @@ static void qxl_destroy_primary(PCIQXLDevice *d) dprint(d, 1, %s\n, __FUNCTION__); d-mode = QXL_MODE_UNDEFINED; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(d-ssd); d-ssd.worker-destroy_primary_surface(d-ssd.worker, 0); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(d-ssd); } static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm) @@ -938,10 +959,10 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) case QXL_IO_UPDATE_AREA: { QXLRect update = d-ram-update_area; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(d-ssd); d-ssd.worker-update_area(d-ssd.worker, d-ram-update_surface, update, NULL, 0, 0); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(d-ssd); break; } case QXL_IO_NOTIFY_CMD: diff --git a/ui/spice-display.c b/ui/spice-display.c index 020b423..defe652 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -186,18 +186,18 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd) surface.mem= (intptr_t)ssd-buf; surface.group_id = MEMSLOT_GROUP_HOST; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(ssd); ssd-worker-create_primary_surface(ssd-worker, 0, surface); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(ssd); } void qemu_spice_destroy_host_primary(SimpleSpiceDisplay *ssd) { dprint(1, %s:\n, __FUNCTION__); -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(ssd); ssd-worker-destroy_primary_surface(ssd-worker, 0); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(ssd); } void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason) @@ -207,9 +207,9 @@ void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason) if (running) { ssd-worker-start(ssd-worker); } else { -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(ssd); ssd-worker-stop(ssd-worker); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(ssd); } ssd-running = running; } diff --git a/ui/spice-display.h b/ui/spice-display.h index aef0464..df74828 100644 --- a/ui/spice-display.h +++ b/ui/spice-display.h @@ -43,6 +43,9 @@ typedef struct SimpleSpiceDisplay { QXLRect dirty; int notify; int running; + +/* qemu-kvm locking ... */ +void *env; } SimpleSpiceDisplay; typedef struct SimpleSpiceUpdate { @@ -52,6 +55,9 @@ typedef struct SimpleSpiceUpdate { uint8_t *bitmap; } SimpleSpiceUpdate; +void
Re: [Qemu-devel] GSoC 2011 project ideas
Hi there, El 23/02/2011, a las 20:42, Luiz Capitulino escribió: Hi there, Google will begin accepting mentoring organizations applications next week, but we count only with three projects so far. Although there doesn't seem to exist a hard deadline associated with the ideas page, nor with the number of projects, we had more than 20 projects suggestions last year and a number of volunteering mentors. Is there any problem to repeat previously proposed and not chosen projects? So, if you have a project idea and/or wish to be a mentor, please add an entry here: http://wiki.qemu.org/Google_Summer_of_Code_2011 Regards, Natalia Portillo
[Qemu-devel] [PATCH v2] move eeprom init from reset function to init function
When we hot plug network pci devices, the mac address is still set to 00:00:00:00:00:00 on the guest machine. The reason is that the kernel does not reset the device and we init eeprom in reset function. move eeprom init from reset function into init function, as it is read only, and does not need init again in reset function. Signed-off-by: Wen Congyang we...@cn.fujitsu.com --- hw/pcnet.c | 27 ++- hw/rtl8139.c | 24 2 files changed, 26 insertions(+), 25 deletions(-) diff --git a/hw/pcnet.c b/hw/pcnet.c index db52dc5..4499031 100644 --- a/hw/pcnet.c +++ b/hw/pcnet.c @@ -1557,19 +1557,6 @@ uint32_t pcnet_bcr_readw(PCNetState *s, uint32_t rap) void pcnet_h_reset(void *opaque) { PCNetState *s = opaque; -int i; -uint16_t checksum; - -/* Initialize the PROM */ - -memcpy(s-prom, s-conf.macaddr.a, 6); -s-prom[12] = s-prom[13] = 0x00; -s-prom[14] = s-prom[15] = 0x57; - -for (i = 0,checksum = 0; i 16; i++) -checksum += s-prom[i]; -*(uint16_t *)s-prom[12] = cpu_to_le16(checksum); - s-bcr[BCR_MSRDA] = 0x0005; s-bcr[BCR_MSWRA] = 0x0005; @@ -1736,6 +1723,20 @@ void pcnet_common_cleanup(PCNetState *d) int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) { +int i; +uint16_t checksum; + +/* Initialize the PROM */ + +memcpy(s-prom, s-conf.macaddr.a, 6); +s-prom[12] = s-prom[13] = 0x00; +s-prom[14] = s-prom[15] = 0x57; + +for (i = 0, checksum = 0; i 16; i++) { +checksum += s-prom[i]; +} +*(uint16_t *)s-prom[12] = cpu_to_le16(checksum); + s-poll_timer = qemu_new_timer(vm_clock, pcnet_poll_timer, s); qemu_macaddr_default_if_unset(s-conf.macaddr); diff --git a/hw/rtl8139.c b/hw/rtl8139.c index a22530c..8356d5a 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -1189,18 +1189,6 @@ static void rtl8139_reset(DeviceState *d) rtl8139_update_irq(s); -/* prepare eeprom */ -s-eeprom.contents[0] = 0x8129; -#if 1 -// PCI vendor and device ID should be mirrored here -s-eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; -s-eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; -#endif - -s-eeprom.contents[7] = s-conf.macaddr.a[0] | s-conf.macaddr.a[1] 8; -s-eeprom.contents[8] = s-conf.macaddr.a[2] | s-conf.macaddr.a[3] 8; -s-eeprom.contents[9] = s-conf.macaddr.a[4] | s-conf.macaddr.a[5] 8; - /* mark all status registers as owned by host */ for (i = 0; i 4; ++i) { @@ -3392,6 +3380,18 @@ static int pci_rtl8139_init(PCIDevice *dev) qemu_macaddr_default_if_unset(s-conf.macaddr); +/* prepare eeprom */ +s-eeprom.contents[0] = 0x8129; +#if 1 +/* PCI vendor and device ID should be mirrored here */ +s-eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; +s-eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; +#endif + +s-eeprom.contents[7] = s-conf.macaddr.a[0] | s-conf.macaddr.a[1] 8; +s-eeprom.contents[8] = s-conf.macaddr.a[2] | s-conf.macaddr.a[3] 8; +s-eeprom.contents[9] = s-conf.macaddr.a[4] | s-conf.macaddr.a[5] 8; + s-nic = qemu_new_nic(net_rtl8139_info, s-conf, dev-qdev.info-name, dev-qdev.id, s); qemu_format_nic_info_str(s-nic-nc, s-conf.macaddr.a); -- 1.7.1
[Qemu-devel] [PATCH] fix build errors when we enable acpi_piix4 debug
I enable acpi_piix4 debug, and got the following build errors: # make CClibhw64/acpi_piix4.o cc1: warnings being treated as errors /home/wency/source/qemu/hw/acpi_piix4.c: In function ‘pm_ioport_write’: /home/wency/source/qemu/hw/acpi_piix4.c:193: error: format ‘%04x’ expects type ‘unsigned int’, but argument 2 has type ‘uint64_t’ /home/wency/source/qemu/hw/acpi_piix4.c:193: error: format ‘%04x’ expects type ‘unsigned int’, but argument 3 has type ‘uint64_t’ /home/wency/source/qemu/hw/acpi_piix4.c: In function ‘pm_ioport_read’: /home/wency/source/qemu/hw/acpi_piix4.c:219: error: format ‘%04x’ expects type ‘unsigned int’, but argument 2 has type ‘uint64_t’ make[1]: *** [acpi_piix4.o] Error 1 make: *** [subdir-libhw64] Error 2 Signed-off-by: Wen Congyang we...@cn.fujitsu.com --- hw/acpi_piix4.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c index 5bbc2b5..b5a2762 100644 --- a/hw/acpi_piix4.c +++ b/hw/acpi_piix4.c @@ -190,7 +190,8 @@ static void pm_ioport_write(IORange *ioport, uint64_t addr, unsigned width, default: break; } -PIIX4_DPRINTF(PM writew port=0x%04x val=0x%04x\n, addr, val); +PIIX4_DPRINTF(PM writew port=0x%04x val=0x%04x\n, (unsigned int)addr, + (unsigned int)val); } static void pm_ioport_read(IORange *ioport, uint64_t addr, unsigned width, @@ -216,7 +217,7 @@ static void pm_ioport_read(IORange *ioport, uint64_t addr, unsigned width, val = 0; break; } -PIIX4_DPRINTF(PM readw port=0x%04x val=0x%04x\n, addr, val); +PIIX4_DPRINTF(PM readw port=0x%04x val=0x%04x\n, (unsigned int)addr, val); *data = val; } -- 1.7.1
Re: [Qemu-devel] [PATCH] `qdev_free` when unplug a pci device
Hi Markus Armbruster At 02/23/2011 04:30 PM, Markus Armbruster Write: Isaku Yamahata yamah...@valinux.co.jp writes: snip I don't think this patch is correct. Let me explain. Device hot unplug is *not* guaranteed to succeed. For some buses, such as USB, it always succeeds immediately, i.e. when the device_del monitor command finishes, the device is gone. Live is good. But for PCI, device_del merely initiates the ACPI unplug rain dance. It doesn't wait for the dance to complete. Why? The dance can take an unpredictable amount of time, including forever. Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI slot, and the unplug has not yet completed (race condition), or it failed. Yes, Virginia, PCI hotplug *can* fail. When unplug succeeds, the qdev is automatically destroyed. pciej_write() does that for PIIX4. Looks like pcie_cap_slot_event() does it for PCIE. I got a similar problem. When I unplug a pci device by hand, it works as expected, and I can hotplug it again. But when I use a srcipt to do the same thing, sometimes it failed. I think I may find another bug. Steps to reproduce this bug: 1. cat ./test-e1000.sh # RHEL6RC is domain name #! /bin/bash while true; do virsh attach-interface RHEL6RC network default --mac 52:54:00:1f:db:c7 --model e1000 if [[ $? -ne 0 ]]; then break fi virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7 if [[ $? -ne 0 ]]; then break fi sleep 5 done 2. ./test-e1000.sh Interface attached successfully Interface detached successfully Interface attached successfully Interface detached successfully error: Failed to attach interface error: operation failed: adding e1000,netdev=hostnet1,id=net1,mac=52:54:00:1f:db:c7,bus=pci.0,addr=0x8 device failed: Duplicate ID 'net1' for device I use ftrace to trace whether sci interrupt is passed to guest OS, here is log: # cat trace | grep 'irq=9' idle-0 [000] 118342.634772: irq_handler_entry: irq=9 name=acpi idle-0 [000] 118342.634831: irq_handler_exit: irq=9 ret=handled idle-0 [000] 118342.693216: irq_handler_entry: irq=9 name=acpi idle-0 [000] 118342.693254: irq_handler_exit: irq=9 ret=handled idle-0 [000] 118347.737689: irq_handler_entry: irq=9 name=acpi idle-0 [000] 118347.737738: irq_handler_exit: irq=9 ret=handled According to the log, we only receive 3 sci interrupt, and one is lost. I enable piix4's debug and 1 line printf into piix4_device_hotplug: printf(slot: %d, up: %d, down: %d, state: %d\n, slot, s-pci0_status.up, s-pci0_status.down, state); here is the log: PM: mapping to 0xb000 PM readw port=0x0004 val=0x ... gpe write afe2 == 0 gpe write afe0 == 255 gpe write afe3 == 0 gpe write afe1 == 255 PM readw port=0x val=0x0001 PM readw port=0x0002 val=0x gpe read afe0 == 0 gpe read afe2 == 0 gpe read afe1 == 0 gpe read afe3 == 0 PM writew port=0x val=0x0020 PM readw port=0x0002 val=0x PM writew port=0x0002 val=0x0020 PM readw port=0x0002 val=0x0020 gpe write afe2 == 255 gpe write afe3 == 255 ... slot: 6, up: 0, down: 0, state: 1 we attach interface the first time PM readw port=0x val=0x0001 PM readw port=0x0002 val=0x0120 gpe read afe0 == 2 gpe read afe2 == ff gpe read afe2 == ff gpe write afe2 == 253 gpe read afe1 == 0 gpe read afe3 == ff pcihotplug read ae00 == 40 pcihotplug read ae04 == 0 ... gpe write afe0 == 2 gpe write afe2 == 255 slot: 6, up: 64, down: 0, state: 0 = we detach interface the first time PM readw port=0x val=0x0001 PM readw port=0x0002 val=0x0120 gpe read afe0 == 2 gpe read afe2 == ff gpe read afe2 == ff gpe write afe2 == 253 gpe read afe1 == 0 gpe read afe3 == ff pcihotplug read ae00 == 0 pcihotplug read ae04 == 40 ... pciej write ae08 == 64 === we will call qdev_free() pciej write ae08 == 64 gpe write afe0 == 2 gpe write afe2 == 255 slot: 7, up: 0, down: 64, state: 1 === we attach interface the second time. PM readw port=0x val=0x0001 PM readw port=0x0002 val=0x0120 gpe read afe0 == 2 gpe read afe2 == ff gpe read afe2 == ff gpe write afe2 == 253 gpe read afe1 == 0 gpe read afe3 == ff pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug
Re: [Qemu-devel] Re: [PATCH] Split machine creation from the main loop
On 02/27/2011 05:33 AM, Avi Kivity wrote: On 02/24/2011 07:25 PM, Anthony Liguori wrote: Is it really necessary? What's blocking us from initializing chardevs early? Well We initialize all chardevs at once right now and what set of chardevs there are depends on the machine (by the way defaults are applied). You could initialize chardevs in two stages although that requires quite a bit of additional complexity. We could initialize chardevs on demand - that should resolve any dependencies? I think that potentially screws up the way -global works. There's some deep black magic involved in how -global, defaults, and device initialization interact. It would be a pity to divorce the monitor from chardevs, they're really flexible. Couple considerations: 1) chardevs don't support multiple simultaneous connections. I view this as a blocker for QMP. What do you mean by that? Something like ,server which keeps on listening after it a connection is established? ,server won't allow multiple simultaneous connections. CharDriverStates simply don't have a connection semantic. There can only be one thing connected to it at a time. This is why we don't use CharDriverState for VNC. We should have another abstraction for connection based backend. I'll take a go at this when I'm ready to try to get those patches in. Just to be clear though, there is a CharDriverState version of the new QMP server. This would be a second option for creating a QMP server and it takes a different command line sytnax. 2) Because chardevs don't support multiple connections, we can't reasonably hook on things like connect/disconnect which means that fd's sent via SCM_RIGHTs have to be handled in a very special way. By going outside of the chardev layer, we can let fd's via SCM_RIGHTS queue up naturally and have getfd/setfd refer to the fd at the top of the queue. It makes it quite a bit easier to work with (I believe Daniel had actually requested this a while ago). I really don't follow... what's the connection between SCM_RIGHTS and multiple connections? Monitors have a single fd. That fd is associated with the monitor and lives beyond the length of the connection to the monitor (recall that chardevs don't have a notion of connection life cycle). This means if a management tool forgets to do a closefd on an unused fd, there's no easy way for QEMU to automatically clean that up. IOW, a crashed management tool == fd leak in QEMU. (6) can be started right now. (1) comes with the QAPI merge. (2) is pretty easy to do after applying this patch. (3) is probably something that can be done shortly after (1). (4) and (5) really require everything but (6) to be in place before we can meaningful do it. I think we can lay out much of the ground work for this in 0.15 and I think we can have a total conversion realistically for 0.16. That means that by EOY, we could invoke QEMU with no options and do everything through QMP. It's something that I've agitated for a long while, but when I see all the work needed, I'm not sure it's cost effective. There's a lot of secondary benefits that come from doing this. QMP becomes a much stronger interface. A lot of operations that right now are only specifiable by the command line become dynamic which mitigates reboots in the long term. Only the hot-pluggable ones. Yup, but it forces us to treat options that cannot change at runtime as special cases which I think is a nice plus. Customers don't like having their guests rebooted during a scheduled downtime so we really ought to try to have as many things tunable at runtime as possible. Regards, Anthony Liguori
Re: [Qemu-devel] Re: KVM call agenda for Jan 25
Stefan Hajnoczi stefa...@gmail.com writes: On Sat, Feb 26, 2011 at 9:50 PM, Dushyant Bansal cs5070...@cse.iitd.ac.in wrote: Disk block size is usually 512 bytes and in qemu-img, sector size is also 512B. And, this change would copy n sectors even if only one of them actually contains data (while cp checks and copies one block(=sector) at a time). Therefore, it will end up writing more data than cp. cp(1) from GNU coreutils 8.5 works in units of 32 KB on my system. It reads 32 KB and checks for all zeroes. If there are all zeroes, it seeks ahead 32 KB in the output file. Otherwise it writes the entire 32 KB. The latest version[*] of cp does better: cp now copies sparse files efficiently on file systems with FIEMAP support (ext4, btrfs, xfs, ocfs2). Before, it had to read 2^20 bytes when copying a 1MiB sparse file. Now, it copies bytes only for the non-sparse sections of a file. Similarly, to induce a hole in the output file, it had to detect a long sequence of zero bytes. Now, it knows precisely where each hole in an input file is, and can reproduce them efficiently in the output file. mv also benefits when it resorts to copying, e.g., between file systems. But beware of kernel bugs[**]. You can check what cp(1) is doing using strace(1). Often enlightening :) [...] [*] http://lists.gnu.org/archive/html/coreutils-announce/2011-02/msg0.html [**] http://lwn.net/Articles/429345/ Subscribera-only, should become public soon. In the meantime, try http://lwn.net/Articles/429349/
Re: [Qemu-devel] [PATCH V6 3/4] qmp, nmi: convert do_inject_nmi() to QObject
Anthony Liguori anth...@codemonkey.ws writes: On 02/25/2011 03:54 AM, Markus Armbruster wrote: Anthony Liguorialigu...@linux.vnet.ibm.com writes: On 02/24/2011 10:20 AM, Markus Armbruster wrote: Anthony Liguorialigu...@linux.vnet.ibm.com writes: On 02/24/2011 02:33 AM, Markus Armbruster wrote: Anthony Liguorianth...@codemonkey.wswrites: [...] Please describe all expected errors. Quoting qmp-commands.hx: 3. Errors, in special, are not documented. Applications should NOT check for specific errors classes or data (it's strongly recommended to only check for the error key) Indeed, not a single error is documented there. This is intentional. Yeah, but we're not 0.14 anymore and for 0.15, we need to document errors. If you are suggesting I send a patch to remove that section, I'm more than happy to. Two separate issues here: 1. Are we ready to commit to the current design of errors, and 2. Is it fair to reject Lai's patch now because he doesn't document his errors. I'm not commenting on 1. here. Regarding 2.: rejecting a patch because it doesn't document an aspect that current master intentionally leaves undocumented is not how you treat contributors. At least not if you want any other than certified masochists who enjoy pain, and professionals who get adequately compensated for it. Lead by example, not by fiat. http://repo.or.cz/w/qemu/aliguori.git/blob/refs/heads/glib:/qmp-schema.json I am in the process of documenting the errors of every command. It's a royal pain but I'm going to document everything we have right now. It's actually the last bit of work I need to finish before sending QAPI out. So for new commands being added, it would be hugely helpful for the authors to document the errors such that I don't have to reverse engineer all of the possible error conditions. The moment this lands in master, you can begin to demand error descriptions from contributors. Until then, I'll NAK error descriptions in qmp-commands.txt. We left them undocumented there for good reasons: No, it was always a bad reason. Good documentation is necessary to build good commands. Errors are a huge part of the semantics of a command. We cannot properly assess a command unless it's behavior in error conditions is well defined. The decision to declare QMP stable was made at the KVM Forum after quite some deliberation. We explictly excluded errors. We didn't do that because errors are not worthy of specification (that would be silly). We did it because there was too much unhappiness about the current state of errors to close the debate on them at that time. Luiz mindfully documented that decision in qmp-commands.txt: 3. Errors, in special, are not documented. Applications should NOT check for specific errors classes or data (it's strongly recommended to only check for the error key) Nobody denies that errors need to be specified, and this clause needs to go. But I must insist on proper review. No sneaking in of error specification through the commit backdoor, please. Once we have an error design in place that has a reasonable hope to stand the test of time, and have errors documented for at least some of the commands here, we can start to require proper error documentation for new commands. But not now. I won't NAK non-normative error descriptions, say in commit messages, or in comments. And I won't object to you asking for them. But I feel you really shouldn't make it a condition for committing patches. Especially not for simple patches that have been on list for months. I'm strongly committed to making QMP a first class interface in QEMU for 0.15. I feel documentation is a crucial part to making that happen. I'm not asking for test cases even though that's something that we'll need for 0.15 because there's not enough infrastructure in master yet to do that in a reasonable way. I realize I'm likely going to have to write that test case and I'm happy to do that. But there's no reason that we shouldn't require thorough documentation for all new QMP commands moving forward including error semantics. This is a critical part of having a first class API and no additional infrastructure is needed in master to do this. Broken record time, again. Nobody denies that errors need to be specified. Fair goal for 0.15. But I must insist on proper review. No sneaking in of error specification through the commit backdoor, please. I won't NAK non-normative error descriptions, say in commit messages, or in comments. And I won't object to you asking for them. But I feel you really shouldn't make it a condition for committing patches. Especially not for simple patches that have been on list for months.