Re: [PATCH 1/3] KVM: x86 emulator: Make one byte instruction emulation separate
On 03/10/2011 09:35 AM, Takuya Yoshikawa wrote:
> x86_emulate_insn() is too long and has many confusing goto statements.
>
> This patch is the first part of a work which tries to split it into a
> few meaningful functions: it just encapsulates the switch statement for
> one-byte instruction emulation as emulate_onebyte_insn().

I, too, dislike the switch statements. So I started on a path to eliminate them completely - take a look at struct opcode::execute. If present, it is executed instead of the code in the switch statements.

The plan is to migrate all of the contents of the switch statements into ->execute() callbacks. This way, all of the information about an instruction is present in the decode tables.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] fix regression caused by e48672fa25e879f7ae21785c7efd187738139593
On 03/10/2011 12:36 AM, Nikola Ciprich wrote:
> commit 387b9f97750444728962b236987fbe8ee8cc4f8c moved
> kvm_request_guest_time_update(vcpu), breaking 32-bit SMP guests using
> kvm-clock. Fix this by moving the (new) clock update function to its
> proper place.

Applied, thanks.
Re: FreeBSD boot hangs on qemu-kvm on AMD host
On 03/09/2011 07:11 PM, Michael Tokarev wrote:
> 09.03.2011 19:34, Avi Kivity wrote:
> []
>> Sorry, I misread. So kvm.git works, we just have to identify what
>> patch fixed it.
>
> 2.6.38-rc8 does not work - verified by Dominik. But kvm.git works.
>
>> There's some chance it's e5d135f80b98b0. If you apply it, also apply
>> the following commit 2607b0533353c since it fixes a bug that tends to
>> be exposed by the previous patch.
>
> So, the two mentioned patches on top of 2.6.37.3 make the problem go
> away -- the guest runs flawlessly with the two patches applied. I had
> to apply the kvm_host.h change manually since there's one extra
> #define (KVM_REQ_APF_HALT) which is not present in 2.6.37.
>
> So.. -stable candidates, after applying them for 2.6.38? :)

Certainly. Thanks for testing.
Re: [PATCH 1/3] KVM: x86 emulator: Make one byte instruction emulation separate
On Thu, 10 Mar 2011 11:05:38 +0200 Avi Kivity <a...@redhat.com> wrote:
> On 03/10/2011 09:35 AM, Takuya Yoshikawa wrote:
>> x86_emulate_insn() is too long and has many confusing goto statements.
>> This patch is the first part of a work which tries to split it into a
>> few meaningful functions: it just encapsulates the switch statement
>> for one-byte instruction emulation as emulate_onebyte_insn().
>
> I, too, dislike the switch statements. So I started on a path to
> eliminate them completely - take a look at struct opcode::execute. If
> present, it is executed instead of the code in the switch statements.
>
> The plan is to migrate all of the contents of the switch statements
> into ->execute() callbacks. This way, all of the information about an
> instruction is present in the decode tables.

I see. I'm looking forward to the completion of the plan!

Thanks,
Takuya

--
Takuya Yoshikawa <yoshikawa.tak...@oss.ntt.co.jp>
Re: [PATCH 1/3] KVM: x86 emulator: Make one byte instruction emulation separate
On 03/10/2011 11:26 AM, Takuya Yoshikawa wrote:
>> The plan is to migrate all of the contents of the switch statements
>> into ->execute() callbacks. This way, all of the information about an
>> instruction is present in the decode tables.
>
> I see. I'm looking forward to the completion of the plan!

I don't know if anyone is working on it, so feel free to send patches!
Re: [PATCH 1/3] KVM: x86 emulator: Make one byte instruction emulation separate
On Thu, 10 Mar 2011 11:27:30 +0200 Avi Kivity <a...@redhat.com> wrote:
> On 03/10/2011 11:26 AM, Takuya Yoshikawa wrote:
> I don't know if anyone is working on it, so feel free to send patches!

Yes, I'm interested in it. So I will take a look and try!

I was doing some live migration tests using Kemari for some time. Now it's time to restart KVM code reading!

Thanks,
Takuya
Re: [PATCH v2 4/4] KVM: MMU: cleanup pte write path
On 03/09/2011 09:43 AM, Xiao Guangrong wrote:
> This patch does:
> - call vcpu->arch.mmu.update_pte directly
> - use gfn_to_pfn_atomic in update_pte path
>
> The suggestion is from Avi.
>
> -    mmu_guess_page_from_pte_write(vcpu, gpa, gentry);
> +    mmu_seq = vcpu->kvm->mmu_notifier_seq;
> +    smp_rmb();

smp_rmb() should come before, no? But the problem was present in the original code, too.

> +    spin_lock(&vcpu->kvm->mmu_lock);
>     if (atomic_read(&vcpu->kvm->arch.invlpg_counter) != invlpg_counter)
>         gentry = 0;
> @@ -3365,7 +3345,8 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
>         if (gentry &&
>             !((sp->role.word ^ vcpu->arch.mmu.base_role.word) & mask.word))
> -            mmu_pte_write_new_pte(vcpu, sp, spte, &gentry);
> +            mmu_pte_write_new_pte(vcpu, sp, spte, &gentry,
> +                                  mmu_seq);

Okay, we're only iterating over indirect pages, so we won't call nonpaging_update_pte().
Re: [PATCH v2 1/4] KVM: fix rcu usage in init_rmode_* functions
On 03/09/2011 09:41 AM, Xiao Guangrong wrote:
> fix:
>
> [ 3494.671786] stack backtrace:
> [ 3494.671789] Pid: 10527, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23
> [ 3494.671790] Call Trace:
> [ 3494.671796] [] ? lockdep_rcu_dereference+0x9d/0xa5
> [ 3494.671826] [] ? kvm_memslots+0x6b/0x73 [kvm]
> [ 3494.671834] [] ? gfn_to_memslot+0x16/0x4f [kvm]
> [ 3494.671843] [] ? gfn_to_hva+0x16/0x27 [kvm]
> [ 3494.671851] [] ? kvm_write_guest_page+0x31/0x83 [kvm]
> [ 3494.671861] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm]
> [ 3494.671867] [] ? vmx_set_tss_addr+0x83/0x122 [kvm_intel]
>
> and:
>
> [ 8328.789599] stack backtrace:
> [ 8328.789601] Pid: 18736, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23
> [ 8328.789603] Call Trace:
> [ 8328.789609] [] ? lockdep_rcu_dereference+0x9d/0xa5
> [ 8328.789621] [] ? kvm_memslots+0x6b/0x73 [kvm]
> [ 8328.789628] [] ? gfn_to_memslot+0x16/0x4f [kvm]
> [ 8328.789635] [] ? gfn_to_hva+0x16/0x27 [kvm]
> [ 8328.789643] [] ? kvm_write_guest_page+0x31/0x83 [kvm]
> [ 8328.789699] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm]
> [ 8328.789713] [] ? vmx_create_vcpu+0x316/0x3c8 [kvm_intel]

Applied all four patches, thanks.
Re: VNC and SDL/VGA simultaneously?
On 03/09/2011 11:31 PM, Erik Rull wrote:
> Hi all,
>
> is it possible to parameterize qemu in a way where the VNC port and the
> VGA output are available in parallel?

Not really, though it should be possible to do it with some effort.

> My system screen remains dark if I run it with the -vnc :0 option, and
> vnc is unavailable when SDL/VGA is available.

What's your use case?
[PATCH 1/2] sockets: add qemu_socketpair()
Signed-off-by: Corentin Chary <corentin.ch...@gmail.com>
---
 osdep.c       |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu_socket.h |    1 +
 2 files changed, 84 insertions(+), 0 deletions(-)

diff --git a/osdep.c b/osdep.c
index 327583b..93bfbe0 100644
--- a/osdep.c
+++ b/osdep.c
@@ -147,6 +147,89 @@ int qemu_socket(int domain, int type, int protocol)
     return ret;
 }

+#ifdef _WIN32
+int qemu_socketpair(int domain, int type, int protocol, int socks[2])
+{
+    union {
+        struct sockaddr_in inaddr;
+        struct sockaddr addr;
+    } a;
+    int listener;
+    socklen_t addrlen = sizeof(a.inaddr);
+    int reuse = 1;
+
+    if (domain == AF_UNIX) {
+        domain = AF_INET;
+    }
+
+    if (socks == 0) {
+        return EINVAL;
+    }
+
+    listener = qemu_socket(domain, type, protocol);
+    if (listener < 0) {
+        return listener;
+    }
+
+    memset(&a, 0, sizeof(a));
+    a.inaddr.sin_family = AF_INET;
+    a.inaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+    a.inaddr.sin_port = 0;
+
+    socks[0] = socks[1] = -1;
+
+    if (setsockopt(listener, SOL_SOCKET, SO_REUSEADDR,
+                   (char *) &reuse, (socklen_t) sizeof(reuse)) == -1) {
+        goto error;
+    }
+    if (bind(listener, &a.addr, sizeof(a.inaddr)) < 0) {
+        goto error;
+    }
+
+    memset(&a, 0, sizeof(a));
+    if (getsockname(listener, &a.addr, &addrlen) < 0) {
+        goto error;
+    }
+
+    a.inaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+    a.inaddr.sin_family = AF_INET;
+
+    if (listen(listener, 1) < 0) {
+        goto error;
+    }
+
+    socks[0] = qemu_socket(AF_INET, SOCK_STREAM, 0);
+    if (socks[0] < 0) {
+        goto error;
+    }
+    if (connect(socks[0], &a.addr, sizeof(a.inaddr)) < 0) {
+        goto error;
+    }
+
+    socks[1] = qemu_accept(listener, NULL, NULL);
+    if (socks[1] < 0) {
+        goto error;
+    }
+
+    closesocket(listener);
+    return 0;
+
+error:
+    if (listener != -1)
+        closesocket(listener);
+    if (socks[0] != -1)
+        closesocket(socks[0]);
+    if (socks[1] != -1)
+        closesocket(socks[1]);
+    return -1;
+}
+#else
+int qemu_socketpair(int domain, int type, int protocol, int socks[2])
+{
+    return socketpair(domain, type, protocol, socks);
+}
+#endif
+
 /*
  * Accept a connection and set FD_CLOEXEC
  */
diff --git a/qemu_socket.h b/qemu_socket.h
index 180e4db..d7eb9a5 100644
--- a/qemu_socket.h
+++ b/qemu_socket.h
@@ -34,6 +34,7 @@ int inet_aton(const char *cp, struct in_addr *ia);

 /* misc helpers */
 int qemu_socket(int domain, int type, int protocol);
+int qemu_socketpair(int domain, int type, int protocol, int socks[2]);
 int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
 void socket_set_nonblock(int fd);
 int send_all(int fd, const void *buf, int len1);
--
1.7.3.4
[PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
The threaded VNC servers messed up with QEMU fd handlers without any kind of locking, and that can cause some nasty race conditions.

Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(), which will wait for the current job queue to finish, can be called with the iothread lock held.

Instead, we now store the data in a temporary buffer, and use a socket pair to notify the main thread that new data is available. The data is not directly sent through this socket because it would make the code more complex without any performance benefit.

vnc_[un]lock_output() is still needed to access VncState members like abort, csock or jobs_buffer.

Thanks to Jan Kiszka for helping me solve this issue.

Signed-off-by: Corentin Chary <corentin.ch...@gmail.com>
---
 ui/vnc-jobs-async.c |   50 +++++++++++++++++++++++++++----------------
 ui/vnc-jobs-sync.c  |    4 ++++
 ui/vnc-jobs.h       |    1 +
 ui/vnc.c            |   16 ++++++++++++++++
 ui/vnc.h            |    2 ++
 5 files changed, 55 insertions(+), 18 deletions(-)

diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..543b5a9 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@

 #include "vnc.h"
 #include "vnc-jobs.h"
+#include "qemu_socket.h"

 /*
  * Locking:
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
         qemu_cond_wait(&queue->cond, &queue->mutex);
     }
     vnc_unlock_queue(queue);
+    vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+    bool flush;
+
+    vnc_lock_output(vs);
+    if (vs->jobs_buffer.offset) {
+        vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+        buffer_reset(&vs->jobs_buffer);
+    }
+    flush = vs->csock != -1 && vs->abort != true;
+    vnc_unlock_output(vs);
+
+    if (flush) {
+        vnc_flush(vs);
+    }
 }

 /*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
     VncState vs;
     int n_rectangles;
     int saved_offset;
-    bool flush;

     vnc_lock_queue(queue);
     while (QTAILQ_EMPTY(&queue->jobs) && !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

     vnc_lock_output(job->vs);
     if (job->vs->csock == -1 || job->vs->abort == true) {
+        vnc_unlock_output(job->vs);
         goto disconnected;
     }
     vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

         if (job->vs->csock == -1) {
             vnc_unlock_display(job->vs->vd);
-            /* output mutex must be locked before going to
-             * disconnected:
-             */
-            vnc_lock_output(job->vs);
             goto disconnected;
         }

@@ -254,24 +269,23 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
     vs.output.buffer[saved_offset] = (n_rectangles >> 8) & 0xFF;
     vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;

-    /* Switch back buffers */
     vnc_lock_output(job->vs);
-    if (job->vs->csock == -1) {
-        goto disconnected;
-    }
-
-    vnc_write(job->vs, vs.output.buffer, vs.output.offset);
+    if (job->vs->csock != -1) {
+        int notify = !job->vs->jobs_buffer.offset;

-disconnected:
-    /* Copy persistent encoding data */
-    vnc_async_encoding_end(job->vs, &vs);
-    flush = (job->vs->csock != -1 && job->vs->abort != true);
-    vnc_unlock_output(job->vs);
+        buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+        buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+                      vs.output.offset);
+        /* Copy persistent encoding data */
+        vnc_async_encoding_end(job->vs, &vs);

-    if (flush) {
-        vnc_flush(job->vs);
+        if (notify) {
+            send(job->vs->jobs_socks[1], (char []){1}, 1, 0);
+        }
     }
+    vnc_unlock_output(job->vs);

+disconnected:
     vnc_lock_queue(queue);
     QTAILQ_REMOVE(&queue->jobs, job, next);
     vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs-sync.c b/ui/vnc-jobs-sync.c
index 49b77af..3a4d7df 100644
--- a/ui/vnc-jobs-sync.c
+++ b/ui/vnc-jobs-sync.c
@@ -36,6 +36,10 @@ void vnc_jobs_join(VncState *vs)
 {
 }

+void vnc_jobs_consume_buffer(VncState *vs)
+{
+}
+
 VncJob *vnc_job_new(VncState *vs)
 {
     vs->job.vs = vs;
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..7c529b7 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -37,6 +37,7 @@ void vnc_job_push(VncJob *job);
 bool vnc_has_job(VncState *vs);
 void vnc_jobs_clear(VncState *vs);
 void vnc_jobs_join(VncState *vs);
+void vnc_jobs_consume_buffer(VncState *vs);

 #ifdef CONFIG_VNC_THREAD

diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..48a81a2 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,6 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
 #ifdef CONFIG_VNC_THREAD
     qemu_mutex_destroy(&vs->output_mutex);
+    qemu_set_fd_handler2(vs->jobs_socks[0], NULL, NULL, NULL, vs);
+    close(vs->jobs_socks[0]);
+    close(vs->jobs_socks[1]);
+    buffer_free(&vs->jobs_buffer);
 #endif

     for (i = 0; i < VNC_STAT_ROWS; ++i) {
Re: [PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
On 03/10/2011 01:59 PM, Corentin Chary wrote:
> Instead, we now store the data in a temporary buffer, and use a socket
> pair to notify the main thread that new data is available.

You can use a bottom half for this instead of a special socket. Signaling a bottom half is async-signal- and thread-safe.

Paolo
Re: [PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
On 03/10/2011 07:06 AM, Paolo Bonzini wrote:
> On 03/10/2011 01:59 PM, Corentin Chary wrote:
>> Instead, we now store the data in a temporary buffer, and use a socket
>> pair to notify the main thread that new data is available.
>
> You can use a bottom half for this instead of a special socket.
> Signaling a bottom half is async-signal- and thread-safe.

Bottom halves are thread safe? I don't think so. They probably should be, but they aren't today.

Regards,

Anthony Liguori
Re: [PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
On 10.03.2011 13:59, Corentin Chary wrote:
> The threaded VNC servers messed up with QEMU fd handlers without any
> kind of locking, and that can cause some nasty race conditions.
>
> Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(),
> which will wait for the current job queue to finish, can be called
> with the iothread lock held.
>
> Instead, we now store the data in a temporary buffer, and use a socket
> pair to notify the main thread that new data is available. The data is
> not directly sent through this socket because it would make the code
> more complex without any performance benefit.
>
> vnc_[un]lock_output() is still needed to access VncState members like
> abort, csock or jobs_buffer.

is this compatible with qemu-kvm?

thanks,
peter

> Thanks to Jan Kiszka for helping me solve this issue.
>
> Signed-off-by: Corentin Chary <corentin.ch...@gmail.com>
> [...]
Re: [PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
On Thu, Mar 10, 2011 at 1:45 PM, Anthony Liguori <aligu...@us.ibm.com> wrote:
> On 03/10/2011 07:06 AM, Paolo Bonzini wrote:
>> On 03/10/2011 01:59 PM, Corentin Chary wrote:
>>> Instead, we now store the data in a temporary buffer, and use a
>>> socket pair to notify the main thread that new data is available.
>>
>> You can use a bottom half for this instead of a special socket.
>> Signaling a bottom half is async-signal- and thread-safe.
>
> Bottom halves are thread safe? I don't think so.

The bottom halves API is not thread safe, but calling qemu_bh_schedule_idle() in another thread *seems* to be safe (here, it would be protected from qemu_bh_delete() by vnc_lock_output()).

--
Corentin Chary
http://xf.iksaif.net
Re: [PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
On 03/10/2011 02:45 PM, Anthony Liguori wrote:
> On 03/10/2011 07:06 AM, Paolo Bonzini wrote:
>> On 03/10/2011 01:59 PM, Corentin Chary wrote:
>>> Instead, we now store the data in a temporary buffer, and use a
>>> socket pair to notify the main thread that new data is available.
>>
>> You can use a bottom half for this instead of a special socket.
>> Signaling a bottom half is async-signal- and thread-safe.
>
> Bottom halves are thread safe? I don't think so. They probably should
> be but they aren't today.

Creating a new bottom half is not thread-safe, but scheduling one is. Assuming that you never use qemu_bh_schedule_idle, qemu_bh_schedule boils down to:

    if (bh->scheduled)
        return;
    bh->scheduled = 1;
    /* stop the currently executing CPU to execute the BH ASAP */
    qemu_notify_event();

You may have a spurious wakeup if two threads race on the same bottom half (including the signaling thread racing with the iothread), but overall you can safely treat them as thread-safe.

Paolo
Re: [PATCH 2/2] vnc: don't mess up with iohandlers in the vnc thread
On 03/10/2011 02:54 PM, Corentin Chary wrote:
>>> You can use a bottom half for this instead of a special socket.
>>> Signaling a bottom half is async-signal- and thread-safe.
>>
>> Bottom halves are thread safe? I don't think so.
>
> The bottom halves API is not thread safe, but calling
> qemu_bh_schedule_idle()

Not _idle please.

> in another thread *seems* to be safe (here, it would be protected from
> qemu_bh_delete() by vnc_lock_output()).

If it weren't protected against qemu_bh_delete, you would already have the same race between writing to the socket and closing it in another thread.

Paolo
[PATCH v5] vnc: don't mess up with iohandlers in the vnc thread
The threaded VNC servers messed up with QEMU fd handlers without any kind of locking, and that can cause some nasty race conditions.

Using qemu_mutex_lock_iothread() won't work because vnc_dpy_cpy(), which will wait for the current job queue to finish, can be called with the iothread lock held.

Instead, we now store the data in a temporary buffer, and use a bottom half to notify the main thread that new data is available.

vnc_[un]lock_output() is still needed to access VncState members like abort, csock or jobs_buffer.

Signed-off-by: Corentin Chary <corentin.ch...@gmail.com>
---
 ui/vnc-jobs-async.c |   48 +++++++++++++++++++++++++++---------------
 ui/vnc-jobs.h       |    1 +
 ui/vnc.c            |   12 ++++++++++++
 ui/vnc.h            |    2 ++
 4 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/ui/vnc-jobs-async.c b/ui/vnc-jobs-async.c
index f596247..4438776 100644
--- a/ui/vnc-jobs-async.c
+++ b/ui/vnc-jobs-async.c
@@ -28,6 +28,7 @@

 #include "vnc.h"
 #include "vnc-jobs.h"
+#include "qemu_socket.h"

 /*
  * Locking:
@@ -155,6 +156,24 @@ void vnc_jobs_join(VncState *vs)
         qemu_cond_wait(&queue->cond, &queue->mutex);
     }
     vnc_unlock_queue(queue);
+    vnc_jobs_consume_buffer(vs);
+}
+
+void vnc_jobs_consume_buffer(VncState *vs)
+{
+    bool flush;
+
+    vnc_lock_output(vs);
+    if (vs->jobs_buffer.offset) {
+        vnc_write(vs, vs->jobs_buffer.buffer, vs->jobs_buffer.offset);
+        buffer_reset(&vs->jobs_buffer);
+    }
+    flush = vs->csock != -1 && vs->abort != true;
+    vnc_unlock_output(vs);
+
+    if (flush) {
+        vnc_flush(vs);
+    }
 }

 /*
@@ -197,7 +216,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
     VncState vs;
     int n_rectangles;
     int saved_offset;
-    bool flush;

     vnc_lock_queue(queue);
     while (QTAILQ_EMPTY(&queue->jobs) && !queue->exit) {
@@ -213,6 +231,7 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

     vnc_lock_output(job->vs);
     if (job->vs->csock == -1 || job->vs->abort == true) {
+        vnc_unlock_output(job->vs);
         goto disconnected;
     }
     vnc_unlock_output(job->vs);
@@ -233,10 +252,6 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)

         if (job->vs->csock == -1) {
             vnc_unlock_display(job->vs->vd);
-            /* output mutex must be locked before going to
-             * disconnected:
-             */
-            vnc_lock_output(job->vs);
             goto disconnected;
         }

@@ -254,24 +269,19 @@ static int vnc_worker_thread_loop(VncJobQueue *queue)
     vs.output.buffer[saved_offset] = (n_rectangles >> 8) & 0xFF;
     vs.output.buffer[saved_offset + 1] = n_rectangles & 0xFF;

-    /* Switch back buffers */
     vnc_lock_output(job->vs);
-    if (job->vs->csock == -1) {
-        goto disconnected;
+    if (job->vs->csock != -1) {
+        buffer_reserve(&job->vs->jobs_buffer, vs.output.offset);
+        buffer_append(&job->vs->jobs_buffer, vs.output.buffer,
+                      vs.output.offset);
+        /* Copy persistent encoding data */
+        vnc_async_encoding_end(job->vs, &vs);
+
+        qemu_bh_schedule(job->vs->bh);
     }
-
-    vnc_write(job->vs, vs.output.buffer, vs.output.offset);
-
-disconnected:
-    /* Copy persistent encoding data */
-    vnc_async_encoding_end(job->vs, &vs);
-    flush = (job->vs->csock != -1 && job->vs->abort != true);
     vnc_unlock_output(job->vs);

-    if (flush) {
-        vnc_flush(job->vs);
-    }
-
+disconnected:
     vnc_lock_queue(queue);
     QTAILQ_REMOVE(&queue->jobs, job, next);
     vnc_unlock_queue(queue);
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index b8dab81..4c661f9 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -40,6 +40,7 @@ void vnc_jobs_join(VncState *vs);

 #ifdef CONFIG_VNC_THREAD

+void vnc_jobs_consume_buffer(VncState *vs);
 void vnc_start_worker_thread(void);
 bool vnc_worker_thread_running(void);
 void vnc_stop_worker_thread(void);
diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..f28873b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1011,7 +1011,10 @@ static void vnc_disconnect_finish(VncState *vs)
 #ifdef CONFIG_VNC_THREAD
     qemu_mutex_destroy(&vs->output_mutex);
+    qemu_bh_delete(vs->bh);
+    buffer_free(&vs->jobs_buffer);
 #endif
+
     for (i = 0; i < VNC_STAT_ROWS; ++i) {
         qemu_free(vs->lossy_rect[i]);
     }
@@ -1226,6 +1229,14 @@ static long vnc_client_read_plain(VncState *vs)
     return ret;
 }

+#ifdef CONFIG_VNC_THREAD
+static void vnc_jobs_bh(void *opaque)
+{
+    VncState *vs = opaque;
+
+    vnc_jobs_consume_buffer(vs);
+}
+#endif

 /*
  * First function called whenever there is more data to be read from
@@ -2525,6 +2536,7 @@ static void vnc_connect(VncDisplay *vd, int csock)

 #ifdef CONFIG_VNC_THREAD
     qemu_mutex_init(&vs->output_mutex);
+    vs->bh = qemu_bh_new(vnc_jobs_bh, vs);
 #endif

     QTAILQ_INSERT_HEAD(&vd->clients, vs, next);
diff --git a/ui/vnc.h b/ui/vnc.h
index 8a1e7b9..bca5f87 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -283,6 +283,8 @@ struct VncState
     VncJob job;
 #else
Re: Network performance with small packets - continued
On Thursday, March 10, 2011 12:54:58 am Michael S. Tsirkin wrote:
> On Wed, Mar 09, 2011 at 05:25:11PM -0600, Tom Lendacky wrote:
>> As for which CPU the interrupt gets pinned to, that doesn't matter -
>> see below.
>
> So what hurts us the most is that the IRQ jumps between the VCPUs?

Yes, it appears that allowing the IRQ to run on more than one vCPU hurts. Without the publish-last-used-index patch, vhost keeps injecting an irq for every received packet until the guest eventually turns off notifications. Because the irq injections end up overlapping, we get contention on the irq_desc_lock_class lock.

Here are some results using the baseline setup with irqbalance running:

  Txn Rate: 107,714.53 Txn/Sec, Pkt Rate: 214,006 Pkts/Sec
  Exits: 121,050.45 Exits/Sec
  TxCPU: 9.61%  RxCPU: 99.45%
  Virtio1-input Interrupts/Sec (CPU0/CPU1): 13,975/0
  Virtio1-output Interrupts/Sec (CPU0/CPU1): 0/0

About a 24% increase over baseline. Irqbalance essentially pinned the virtio irq to CPU0, preventing the irq lock contention and resulting in nice gains.
Re: Network performance with small packets - continued
On Thu, Mar 10, 2011 at 09:23:42AM -0600, Tom Lendacky wrote:
> On Thursday, March 10, 2011 12:54:58 am Michael S. Tsirkin wrote:
>> So what hurts us the most is that the IRQ jumps between the VCPUs?
>
> Yes, it appears that allowing the IRQ to run on more than one vCPU
> hurts. Without the publish-last-used-index patch, vhost keeps injecting
> an irq for every received packet until the guest eventually turns off
> notifications.

Are you sure you see that? If yes, publish-used should help a lot.

> Because the irq injections end up overlapping, we get contention on the
> irq_desc_lock_class lock.
>
> Here are some results using the baseline setup with irqbalance running:
>
>   Txn Rate: 107,714.53 Txn/Sec, Pkt Rate: 214,006 Pkts/Sec
>   Exits: 121,050.45 Exits/Sec
>   TxCPU: 9.61%  RxCPU: 99.45%
>   Virtio1-input Interrupts/Sec (CPU0/CPU1): 13,975/0
>   Virtio1-output Interrupts/Sec (CPU0/CPU1): 0/0
>
> About a 24% increase over baseline. Irqbalance essentially pinned the
> virtio irq to CPU0, preventing the irq lock contention and resulting in
> nice gains.

OK, so we probably want some form of delayed free for TX on top, and that should get us nice results already.
Re: Network performance with small packets - continued
On Thursday, March 10, 2011 09:34:22 am Michael S. Tsirkin wrote: On Thu, Mar 10, 2011 at 09:23:42AM -0600, Tom Lendacky wrote: On Thursday, March 10, 2011 12:54:58 am Michael S. Tsirkin wrote: On Wed, Mar 09, 2011 at 05:25:11PM -0600, Tom Lendacky wrote: As for which CPU the interrupt gets pinned to, that doesn't matter - see below. So what hurts us the most is that the IRQ jumps between the VCPUs? Yes, it appears that allowing the IRQ to run on more than one vCPU hurts. Without the publish last used index patch, vhost keeps injecting an irq for every received packet until the guest eventually turns off notifications. Are you sure you see that? If yes publish used should help a lot. I definitely see that. I ran lockstat in the guest and saw the contention on the lock when the irq was able to run on either vCPU. Once the irq was pinned the contention disappeared. The publish used index patch should eliminate the extra irq injections and then the pinning or use of irqbalance shouldn't be required. I'm getting a kernel oops during boot with the publish last used patches that I pulled from the mailing list - I had to make some changes in order to get them to apply and compile and might not have done the right things. Can you re-spin that patchset against kvm.git? Because the irq injections end up overlapping we get contention on the irq_desc_lock_class lock. Here are some results using the baseline setup with irqbalance running. Txn Rate: 107,714.53 Txn/Sec, Pkt Rate: 214,006 Pkts/Sec Exits: 121,050.45 Exits/Sec TxCPU: 9.61% RxCPU: 99.45% Virtio1-input Interrupts/Sec (CPU0/CPU1): 13,975/0 Virtio1-output Interrupts/Sec (CPU0/CPU1): 0/0 About a 24% increase over baseline. Irqbalance essentially pinned the virtio irq to CPU0 preventing the irq lock contention and resulting in nice gains. OK, so we probably want some form of delayed free for TX on top, and that should get us nice results already. 
-- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: VNC and SDL/VGA simultaneously?
Avi Kivity wrote: On 03/09/2011 11:31 PM, Erik Rull wrote: Hi all, is it possible to parameterize qemu in a way where the VNC port and the VGA output are available in parallel? Not really, though it should be possible to do it with some effort. My system screen remains dark if I run it with the -vnc :0 option and vnc is unavailable when SDL/VGA is available. What's your use case? I want to make remote support possible for a guest system that has only a local network connection to the host and has a graphical console running where the operator works. (So real VNC is not available to the rest of the world) It would also be okay if switching between VNC and SDL/VGA were possible at runtime so that the remote support can do the work and then switch back to the operator's screen. Best regards, Erik -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
virtio_net: remove recv refill work?
Hello Rusty, What's the reason to use refill work in the receiving path?

static void refill_work(struct work_struct *work)
{
	struct virtnet_info *vi;
	bool still_empty;

	vi = container_of(work, struct virtnet_info, refill.work);
	napi_disable(&vi->napi);
	still_empty = !try_fill_recv(vi, GFP_KERNEL);
	napi_enable(&vi->napi);

	/* In theory, this can happen: if we don't get any buffers in
	 * we will *never* try to fill again. */
	if (still_empty)
		schedule_delayed_work(&vi->refill, HZ/2);
}

It looks more expensive than only refilling the buffers in the recv path. It could cause more guest exits, maybe more RX interrupts. Can we move the refill work in recv path only? If we can remove it, I will run some test to compare the difference with the change. Thanks Shirley -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: virtio_net: remove recv refill work?
Never mind, the refill work is needed only for OOM. It can't be replaced. thanks Shirley -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: VNC and SDL/VGA simultaneously?
We tried before with both spice (QXL) and VNC enabled at the same time for the same VM. It works only partially - I mean the VNC session can hold for some time. I use gtk-vnc and it looks like the qemu vnc implementation sends some special packets that cause gtk-vnc to break. BR Bitman Zhou On Thu, 2011-03-10 at 20:15 +0100, Erik Rull wrote: Avi Kivity wrote: On 03/09/2011 11:31 PM, Erik Rull wrote: Hi all, is it possible to parameterize qemu in a way where the VNC port and the VGA output are available in parallel? Not really, though it should be possible to do it with some effort. My system screen remains dark if I run it with the -vnc :0 option and vnc is unavailable when SDL/VGA is available. What's your use case? I want to make remote support possible for a guest system that has only a local network connection to the host and has a graphical console running where the operator works. (So real VNC is not available to the rest of the world) It would also be okay if switching between VNC and SDL/VGA were possible at runtime so that the remote support can do the work and then switch back to the operator's screen. Best regards, Erik -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Network performance with small packets
On Tue, 08 Mar 2011 20:21:18 -0600, Andrew Theurer haban...@linux.vnet.ibm.com wrote: On Tue, 2011-03-08 at 13:57 -0800, Shirley Ma wrote: On Wed, 2011-02-09 at 11:07 +1030, Rusty Russell wrote: I've finally read this thread... I think we need to get more serious with our stats gathering to diagnose these kinds of performance issues. This is a start; it should tell us what is actually happening to the virtio ring(s) without significant performance impact... Should we also add a similar stat on the vhost vq as well for monitoring vhost_signal/vhost_notify? Tom L has started using Rusty's patches and found some interesting results, sent yesterday: http://marc.info/?l=kvm&m=129953710930124&w=2 Hmm, I'm not subscribed to kvm@ any more, so I didn't get this, so replying here: Also, it looks like vhost is sending a lot of notifications for packets it has received before the guest can get scheduled to disable notifications and begin processing the packets, resulting in some lock contention in the guest (and high interrupt rates). Yes, this is a virtio design flaw, but one that should be fixable. We have room at the end of the ring, in which we can put a last_used count. Then we can tell if wakeups are redundant, before the guest updates the flag. Here's an old patch where I played with implementing this: virtio: put last_used and last_avail index into ring itself. Generally, the other end of the virtio ring doesn't need to see where you're up to in consuming the ring. However, to completely understand what's going on from the outside, this information must be exposed. For example, if you want to save and restore a virtio_ring, but you're not the consumer because the kernel is using it directly. Fortunately, we have room to expand: the ring is always a whole number of pages and there's hundreds of bytes of padding after the avail ring and the used ring, whatever the number of descriptors (which must be a power of 2). 
We add a feature bit so the guest can tell the host that it's writing out the current value there, if it wants to use that. Signed-off-by: Rusty Russell ru...@rustcorp.com.au --- drivers/virtio/virtio_ring.c | 23 +++ include/linux/virtio_ring.h | 12 +++- 2 files changed, 26 insertions(+), 9 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -71,9 +71,6 @@ struct vring_virtqueue /* Number we've added since last sync. */ unsigned int num_added; - /* Last used index we've seen. */ - u16 last_used_idx; - /* How to notify other side. FIXME: commonalize hcalls! */ void (*notify)(struct virtqueue *vq); @@ -278,12 +275,13 @@ static void detach_buf(struct vring_virt static inline bool more_used(const struct vring_virtqueue *vq) { - return vq->last_used_idx != vq->vring.used->idx; + return vring_last_used(&vq->vring) != vq->vring.used->idx; } static void *vring_get_buf(struct virtqueue *_vq, unsigned int *len) { struct vring_virtqueue *vq = to_vvq(_vq); + struct vring_used_elem *u; void *ret; unsigned int i; @@ -300,8 +298,11 @@ static void *vring_get_buf(struct virtqu return NULL; } - i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id; - *len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len; + u = &vq->vring.used->ring[vring_last_used(&vq->vring) % vq->vring.num]; + i = u->id; + *len = u->len; + /* Make sure we don't reload i after doing checks. */ + rmb(); if (unlikely(i >= vq->vring.num)) { BAD_RING(vq, "id %u out of range\n", i); @@ -315,7 +316,8 @@ static void *vring_get_buf(struct virtqu /* detach_buf clears data, so grab it now. 
*/ ret = vq->data[i]; detach_buf(vq, i); - vq->last_used_idx++; + vring_last_used(&vq->vring)++; + END_USE(vq); return ret; } @@ -402,7 +404,6 @@ struct virtqueue *vring_new_virtqueue(un vq->vq.name = name; vq->notify = notify; vq->broken = false; - vq->last_used_idx = 0; vq->num_added = 0; list_add_tail(&vq->vq.list, &vdev->vqs); #ifdef DEBUG @@ -413,6 +414,10 @@ struct virtqueue *vring_new_virtqueue(un vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC); + /* We publish indices whether they offer it or not: if not, it's junk +* space anyway. But calling this acknowledges the feature. */ + virtio_has_feature(vdev, VIRTIO_RING_F_PUBLISH_INDICES); + /* No callback? Tell other side not to bother us. */ if (!callback) vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT; @@ -443,6 +448,8 @@ void vring_transport_features(struct vir switch (i) { case VIRTIO_RING_F_INDIRECT_DESC: break; +
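[Editor's note: to make the trick in Rusty's patch concrete - the used ring is followed by padding up to the next page boundary, so a u16 placed directly after the last used element sits in memory the other side already maps. A standalone sketch of that layout arithmetic; the struct shapes mirror <linux/virtio_ring.h>, but this is an illustration, not the kernel code:]

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal mirror of the used-ring layout: a flags/idx header followed by
 * 'num' used elements, 'num' a power of 2. */
struct vring_used_elem {
    uint32_t id;
    uint32_t len;
};

struct vring_used {
    uint16_t flags;
    uint16_t idx;
    struct vring_used_elem ring[]; /* num entries, then padding to page end */
};

/* Byte offset, within struct vring_used, of the first padding byte after
 * the 'num' used entries -- the spot the patch reuses to publish the
 * consumer's last_used index as a u16. */
size_t last_used_offset(unsigned int num)
{
    return offsetof(struct vring_used, ring) +
           num * sizeof(struct vring_used_elem);
}
```

[The VIRTIO_RING_F_PUBLISH_INDICES feature bit in the patch then just tells the other side that this slot is actually being written.]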
Re: [PATCH v2 4/4] KVM: MMU: cleanup pte write path
On 03/10/2011 06:21 PM, Avi Kivity wrote: On 03/09/2011 09:43 AM, Xiao Guangrong wrote: This patch does: - call vcpu->arch.mmu.update_pte directly - use gfn_to_pfn_atomic in update_pte path The suggestion is from Avi. -mmu_guess_page_from_pte_write(vcpu, gpa, gentry); +mmu_seq = vcpu->kvm->mmu_notifier_seq; +smp_rmb(); smp_rmb() should come before, no? But the problem was present in the original code, too. Um, I think smp_rmb() is used to keep the read of mmu_notifier_seq from being reordered after gfn_to_pfn in the original code, like this:

CPU A:                        CPU B:
gfn_to_pfn                    invalidate_page
                              mmu_notifier_seq++
read mmu_notifier_seq

Then CPU A will get the invalid pfn. But after this cleanup patch, we use gfn_to_pfn_atomic under the protection of mmu_lock, so I think the mmu_seq code can be removed. Subject: [PATCH] KVM: MMU: remove mmu_seq verification in kvm_mmu_pte_write The mmu_seq verification can be removed since we get the pfn under the protection of mmu_lock Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com --- arch/x86/include/asm/kvm_host.h |2 +- arch/x86/kvm/mmu.c | 16 +--- arch/x86/kvm/paging_tmpl.h |4 +--- 3 files changed, 7 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c8af099..18a95e9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -256,7 +256,7 @@ struct kvm_mmu { struct kvm_mmu_page *sp); void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva); void (*update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, - u64 *spte, const void *pte, unsigned long mmu_seq); + u64 *spte, const void *pte); hpa_t root_hpa; int root_level; int shadow_root_level; diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 22fae75..2841805 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1206,7 +1206,7 @@ static void nonpaging_invlpg(struct kvm_vcpu *vcpu, gva_t gva) static void nonpaging_update_pte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, u64 *spte, -const void *pte, 
unsigned long mmu_seq) +const void *pte) { WARN_ON(1); } @@ -3163,9 +3163,8 @@ static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu, } static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu, - struct kvm_mmu_page *sp, - u64 *spte, - const void *new, unsigned long mmu_seq) + struct kvm_mmu_page *sp, u64 *spte, + const void *new) { if (sp->role.level != PT_PAGE_TABLE_LEVEL) { ++vcpu->kvm->stat.mmu_pde_zapped; @@ -3173,7 +3172,7 @@ static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu, } ++vcpu->kvm->stat.mmu_pte_updated; - vcpu->arch.mmu.update_pte(vcpu, sp, spte, new, mmu_seq); + vcpu->arch.mmu.update_pte(vcpu, sp, spte, new); } static bool need_remote_flush(u64 old, u64 new) @@ -3229,7 +3228,6 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, struct kvm_mmu_page *sp; struct hlist_node *node; LIST_HEAD(invalid_list); - unsigned long mmu_seq; u64 entry, gentry, *spte; unsigned pte_size, page_offset, misaligned, quadrant, offset; int level, npte, invlpg_counter, r, flooded = 0; @@ -3271,9 +3269,6 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, break; } - mmu_seq = vcpu->kvm->mmu_notifier_seq; - smp_rmb(); - spin_lock(&vcpu->kvm->mmu_lock); if (atomic_read(&vcpu->kvm->arch.invlpg_counter) != invlpg_counter) gentry = 0; @@ -3345,8 +3340,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, if (gentry && !((sp->role.word ^ vcpu->arch.mmu.base_role.word) & mask.word)) - mmu_pte_write_new_pte(vcpu, sp, spte, gentry, - mmu_seq); + mmu_pte_write_new_pte(vcpu, sp, spte, gentry); if (!remote_flush && need_remote_flush(entry, *spte)) remote_flush = true; ++spte; diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index 7514050..3dee563 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -325,7 +325,7 @@ no_present: } static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, - u64 *spte, const void *pte, unsigned long mmu_seq) +
Re: [PATCH] kvm: ppc: Fix breakage of kvm_arch_pre_run/process_irqchip_events
On 17.02.2011, at 22:01, Jan Kiszka wrote: On 2011-02-07 12:19, Jan Kiszka wrote: We do not check them, and the only arch with non-empty implementations always returns 0 (this is also true for qemu-kvm). Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de --- kvm.h |5 ++--- target-i386/kvm.c |8 ++-- target-ppc/kvm.c |6 ++ target-s390x/kvm.c |6 ++ 4 files changed, 8 insertions(+), 17 deletions(-) ... diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 93ecc57..bd4012a 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -256,14 +256,12 @@ int kvm_arch_pre_run(CPUState *env, struct kvm_run *run) return 0; } -int kvm_arch_post_run(CPUState *env, struct kvm_run *run) +void kvm_arch_post_run(CPUState *env, struct kvm_run *run) { -return 0; } -int kvm_arch_process_irqchip_events(CPUState *env) +void kvm_arch_process_irqchip_events(CPUState *env) { -return 0; } Oops. Do we already have a built-bot for KVM-enabled PPC (and s390) targets somewhere? Just before leaving for vacation I prepared a machine for each and gave stefan access to them. Looks like they're not officially running though - will try to look at this asap. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: TX from KVM guest virtio_net to vhost issues
On Wed, 09 Mar 2011 13:46:36 -0800, Shirley Ma mashi...@us.ibm.com wrote: Since we have had lots of performance discussions about virtio_net and vhost communication, I think it's better to have a common understanding of the code first; then we can seek the right directions to improve it. We also need to collect more statistics data on both virtio and vhost. Let's look at TX first: from virtio_net (guest) to vhost (host), the send vq is shared between guest virtio_net and host vhost; it uses memory barriers to sync the changes. At the start: The guest virtio_net TX send completion interrupt (for freeing used skbs) is disabled. It is enabled only when the send vq is overrun and the guest needs to wait for vhost to consume more available skbs. Host vhost notification is enabled in the beginning (for consuming available skbs); it is disabled whenever the send vq is not empty. Once the send vq is empty, the notification is re-enabled by vhost. In guest start_xmit(), the guest first frees used skbs, then sends available skbs to vhost; ideally the guest never enables TX send completion interrupts to free used skbs if vhost keeps posting used skbs in the send vq. vhost handle_tx() is woken up by the guest whenever the send vq has an skb to send; once the send vq is not empty, vhost exits handle_tx() without enabling notification. Ideally, if the guest keeps xmitting skbs into the send vq, the notification is never enabled. I don't see issues with this implementation. However, in our TCP_STREAM small message size test, we found that somehow the guest couldn't see more used skbs to free, which caused frequent TX send queue overruns. So it seems like the guest is outrunning the host? In our TCP_RR small message size multiple streams test, we found that vhost couldn't see more xmit skbs in the send vq, thus it enabled notification too often. And here the host is outrunning the guest? What's the possible cause here in xmit? How are guest and vhost being scheduled? 
Is it possible for guest virtio_net to cooperate with vhost for ideal performance, with both keeping pace with the send vq without many notifications and exits? We tend to blame the scheduler for all kinds of things, until we find the real cause :) Nailing things to different CPUs might help us determine if it really is the scheduler... But if one side outruns the other, it does a lot of unnecessary work notifying/interrupting it over and over again before the host/guest gets a chance to shut notifications/interrupts off. Hence the last_used publishing patch (Xen does this right, I screwed it up). Long weekend here, and I'm otherwise committed. But if no one has cleaned up that patch by early next week, I'll try to do so and see if we can make a real difference. Cheers, Rusty. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
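[Editor's note: the point of publishing last_used is that the producer can check, with ring-index arithmetic alone, whether the consumer has already seen the work and the wakeup would be redundant. The wraparound-safe comparison below is the one that later became vring_need_event() in the virtio spec's VIRTIO_RING_F_EVENT_IDX mechanism; it is shown here as an illustration of the idea, not as code from the thread:]

```c
#include <stdint.h>

/* The producer has advanced its index from 'old' to 'new_idx'.  Notify the
 * consumer only if the index it published ('event_idx', e.g. a last_used
 * count stored in the ring padding) was passed in that interval.  The u16
 * casts make the comparison safe across index wraparound. */
int need_notify(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}
```

[With such a check on the vhost side, an irq is skipped whenever the guest has not yet consumed up to the previous notification point, which is exactly the redundant-wakeup pattern described above.]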
Re: [PATCH] kvm: ppc: Fix breakage of kvm_arch_pre_run/process_irqchip_events
On Fri, Mar 11, 2011 at 5:55 AM, Alexander Graf ag...@suse.de wrote: On 17.02.2011, at 22:01, Jan Kiszka wrote: On 2011-02-07 12:19, Jan Kiszka wrote: We do not check them, and the only arch with non-empty implementations always returns 0 (this is also true for qemu-kvm). Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de --- kvm.h | 5 ++--- target-i386/kvm.c | 8 ++-- target-ppc/kvm.c | 6 ++ target-s390x/kvm.c | 6 ++ 4 files changed, 8 insertions(+), 17 deletions(-) ... diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 93ecc57..bd4012a 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -256,14 +256,12 @@ int kvm_arch_pre_run(CPUState *env, struct kvm_run *run) return 0; } -int kvm_arch_post_run(CPUState *env, struct kvm_run *run) +void kvm_arch_post_run(CPUState *env, struct kvm_run *run) { - return 0; } -int kvm_arch_process_irqchip_events(CPUState *env) +void kvm_arch_process_irqchip_events(CPUState *env) { - return 0; } Oops. Do we already have a built-bot for KVM-enabled PPC (and s390) targets somewhere? Just before leaving for vacation I prepared a machine for each and gave stefan access to them. Looks like they're not officially running though - will try to look at this asap. They are in the process of being added to the buildbot by Daniel Gollub. However, the ppc box is unable to build qemu.git because it hits ENOMEM while compiling. I doubled swap size but that didn't fix the issue so I need to investigate more. At least s390 should be good to go soon and I will send an update when it is up and running. Stefan -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Autotest] [PATCH 1/7] KVM test: Move test utilities to client/tools
On Wed, Mar 09, 2011 at 06:21:04AM -0300, Lucas Meneghel Rodrigues wrote: The programs cd_hash, html_report, scan_results can be used by other users of autotest, so move them to the tools directory inside the client directory. Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com --- client/tools/cd_hash.py | 48 ++ client/tools/html_report.py | 1727 ++ client/tools/scan_results.py | 97 +++ 3 files changed, 1872 insertions(+), 0 deletions(-) create mode 100644 client/tools/__init__.py create mode 100755 client/tools/cd_hash.py create mode 100755 client/tools/html_report.py create mode 100755 client/tools/scan_results.py diff --git a/client/tools/__init__.py b/client/tools/__init__.py new file mode 100644 index 000..e69de29 diff --git a/client/tools/cd_hash.py b/client/tools/cd_hash.py new file mode 100755 index 000..04f8cbe --- /dev/null +++ b/client/tools/cd_hash.py @@ -0,0 +1,48 @@ +#!/usr/bin/python +""" +Program that calculates several hashes for a given CD image. + +@copyright: Red Hat 2008-2009 +""" + +import os, sys, optparse, logging +import common +import kvm_utils Is it allowed to execute these tools individually? And kvm_utils.py has been dropped. # client/tools/scan_results.py (ok) # client/tools/cd_hash.py (failed) Traceback (most recent call last): File "client/tools/cd_hash.py", line 9, in <module> import common ImportError: No module named common +from autotest_lib.client.common_lib import logging_manager +from autotest_lib.client.bin import utils + + +if __name__ == "__main__": +parser = optparse.OptionParser("usage: %prog [options] [filenames]") +options, args = parser.parse_args() + +logging_manager.configure_logging(kvm_utils.KvmLoggingConfig()) + +if args: +filenames = args +else: +parser.print_help() +sys.exit(1) diff --git a/client/tools/html_report.py b/client/tools/html_report.py I've executed a series of tests, but no result.html was produced. 
new file mode 100755 index 000..8b4b109 --- /dev/null +++ b/client/tools/html_report.py @@ -0,0 +1,1727 @@ +#!/usr/bin/python +""" +Script used to parse the test results and generate an HTML report. + +@copyright: (c)2005-2007 Matt Kruse (javascripttoolbox.com) +@copyright: Red Hat 2008-2009 +@author: Dror Russo (dru...@redhat.com) + -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/15] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
On 04.03.2011, at 11:20, Jan Kiszka wrote: Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de Looks good, haven't tested it though. Do you have a git tree for all this? Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: ppc: Fix breakage of kvm_arch_pre_run/process_irqchip_events
On 11.03.2011, at 07:26, Stefan Hajnoczi wrote: On Fri, Mar 11, 2011 at 5:55 AM, Alexander Graf ag...@suse.de wrote: On 17.02.2011, at 22:01, Jan Kiszka wrote: On 2011-02-07 12:19, Jan Kiszka wrote: We do not check them, and the only arch with non-empty implementations always returns 0 (this is also true for qemu-kvm). Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de --- kvm.h |5 ++--- target-i386/kvm.c |8 ++-- target-ppc/kvm.c |6 ++ target-s390x/kvm.c |6 ++ 4 files changed, 8 insertions(+), 17 deletions(-) ... diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 93ecc57..bd4012a 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -256,14 +256,12 @@ int kvm_arch_pre_run(CPUState *env, struct kvm_run *run) return 0; } -int kvm_arch_post_run(CPUState *env, struct kvm_run *run) +void kvm_arch_post_run(CPUState *env, struct kvm_run *run) { -return 0; } -int kvm_arch_process_irqchip_events(CPUState *env) +void kvm_arch_process_irqchip_events(CPUState *env) { -return 0; } Oops. Do we already have a built-bot for KVM-enabled PPC (and s390) targets somewhere? Just before leaving for vacation I prepared a machine for each and gave stefan access to them. Looks like they're not officially running though - will try to look at this asap. They are in the process of being added to the buildbot by Daniel Gollub. However, the ppc box is unable to build qemu.git because it hits ENOMEM while compiling. I doubled swap size but that didn't fix the issue so I need to investigate more. At least s390 should be good Hrm, yeah. I hit that one too just before I left. Maybe I should just search for another RAM bar :). to go soon and I will send an update when it is up and running. Thanks! Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/15] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
On 2011-03-11 07:50, Alexander Graf wrote: On 04.03.2011, at 11:20, Jan Kiszka wrote: Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de Looks good, haven't tested it though. Do you have a git tree for all this? See git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream for the latest version. Jan
Re: [PATCH 12/15] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
On 11.03.2011, at 08:13, Jan Kiszka wrote: On 2011-03-11 07:50, Alexander Graf wrote: On 04.03.2011, at 11:20, Jan Kiszka wrote: Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de Looks good, haven't tested it though. Do you have a git tree for all this? See git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream With the following patch s390x-softmmu compiles and runs the bootloader code just fine, breaks in early Linux boot code though. I haven't quite figured out why yet. diff --git a/Makefile.target b/Makefile.target index 220589e..21106c6 100644 --- a/Makefile.target +++ b/Makefile.target @@ -209,7 +209,7 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o # Inter-VM PCI shared memory -obj-$(CONFIG_KVM) += ivshmem.o +obj-i386-$(CONFIG_KVM) += ivshmem.o # Hardware support obj-i386-y += vga.o diff --git a/exec.c b/exec.c index 0b7a7b2..10e6528 100644 --- a/exec.c +++ b/exec.c @@ -2963,7 +2963,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) RAMBlock *block; ram_addr_t offset; int flags; -void *area, *vaddr; +void *area = NULL, *vaddr; QLIST_FOREACH(block, &ram_list.blocks, next) { offset = addr - block->offset; -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/15] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
On 2011-03-11 08:26, Alexander Graf wrote: On 11.03.2011, at 08:13, Jan Kiszka wrote: On 2011-03-11 07:50, Alexander Graf wrote: On 04.03.2011, at 11:20, Jan Kiszka wrote: Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de Looks good, haven't tested it though. Do you have a git tree for all this? See git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream With the following patch s390x-softmmu compiles and runs the bootloader code just fine, breaks in early Linux boot code though. I haven't quite figured out why yet. diff --git a/Makefile.target b/Makefile.target index 220589e..21106c6 100644 --- a/Makefile.target +++ b/Makefile.target @@ -209,7 +209,7 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o # Inter-VM PCI shared memory -obj-$(CONFIG_KVM) += ivshmem.o +obj-i386-$(CONFIG_KVM) += ivshmem.o Looks like s390 hasn't been built for a while - or what makes this workaround necessary? # Hardware support obj-i386-y += vga.o diff --git a/exec.c b/exec.c index 0b7a7b2..10e6528 100644 --- a/exec.c +++ b/exec.c @@ -2963,7 +2963,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) RAMBlock *block; ram_addr_t offset; int flags; -void *area, *vaddr; +void *area = NULL, *vaddr; QLIST_FOREACH(block, &ram_list.blocks, next) { offset = addr - block->offset; Yeah, we should abort() on mem_path != 0 for unsupported targets. Jan
Re: [PATCH 12/15] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
On 11.03.2011, at 08:13, Jan Kiszka wrote: On 2011-03-11 07:50, Alexander Graf wrote: On 04.03.2011, at 11:20, Jan Kiszka wrote: Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de Looks good, haven't tested it though. Do you have a git tree for all this? See git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream ppc64 book3s works just fine. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/15] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
On 11.03.2011, at 08:33, Jan Kiszka wrote: On 2011-03-11 08:26, Alexander Graf wrote: On 11.03.2011, at 08:13, Jan Kiszka wrote: On 2011-03-11 07:50, Alexander Graf wrote: On 04.03.2011, at 11:20, Jan Kiszka wrote: Make the return code of kvm_arch_handle_exit directly usable for kvm_cpu_exec. This is straightforward for x86 and ppc, just s390 would require more work. Avoid this for now by pushing the return code translation logic into s390's kvm_arch_handle_exit. Signed-off-by: Jan Kiszka jan.kis...@siemens.com CC: Alexander Graf ag...@suse.de Looks good, haven't tested it though. Do you have a git tree for all this? See git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream With the following patch s390x-softmmu compiles and runs the bootloader code just fine, breaks in early Linux boot code though. I haven't quite figured out why yet. diff --git a/Makefile.target b/Makefile.target index 220589e..21106c6 100644 --- a/Makefile.target +++ b/Makefile.target @@ -209,7 +209,7 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o # Inter-VM PCI shared memory -obj-$(CONFIG_KVM) += ivshmem.o +obj-i386-$(CONFIG_KVM) += ivshmem.o Looks like s390 hasn't been built for a while - or what makes this workaround necessary? It's been broken for quite a while, yes. I always fixed it locally in my trees, thinking I'll get around to submitting a _proper_ patch upstream some day. Well, some day is a very long time span :). # Hardware support obj-i386-y += vga.o diff --git a/exec.c b/exec.c index 0b7a7b2..10e6528 100644 --- a/exec.c +++ b/exec.c @@ -2963,7 +2963,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) RAMBlock *block; ram_addr_t offset; int flags; -void *area, *vaddr; +void *area = NULL, *vaddr; QLIST_FOREACH(block, &ram_list.blocks, next) { offset = addr - block->offset; Yeah, we should abort() on mem_path != 0 for unsupported targets. Yes, that would work too :). 
Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html