Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-28 Thread Avi Kivity

On 07/28/2009 03:48 AM, Glauber Costa wrote:

On Mon, Jul 27, 2009 at 06:43:47PM +0300, Avi Kivity wrote:
   

On 07/22/2009 01:13 AM, Glauber Costa wrote:
 

qemu CPUState already provides stop and stopped states. And they
mean exactly that. There is no need for us to provide our own.


   

This patch (known as dd0e1c1a589 in qemu-kvm.git) breaks reboot.  My
test case is FC6 i386 -smp 2, running the reboot command in rc.local.
In about 15 minutes qemu hangs hard.  Please check what's gone wrong.
 

I found out that doing kill -38your_pid  makes it run again, so we're likely
hanging somewhere while holding qemu_mutex. The state of the process is D,
so we're holding qemu_mutex, and then calling something that can block.
   


Sounds like we call a vcpu ioctl from the iothread (or from a different 
vcpu thread).



It's hard for me to believe that this patch introduced it. At best, it might 
have
made it more likely. Also, I also verified that it sometimes takes a while until
it happen for the first time. Are you sure this is the first patch that makes 
it happen?
   


I haven't been able to reproduce it before this patch.  Maybe this patch 
doesn't introduce it, only exposes it.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-28 Thread Gleb Natapov
On Tue, Jul 28, 2009 at 09:17:05AM +0300, Avi Kivity wrote:
 On 07/28/2009 03:48 AM, Glauber Costa wrote:
 On Mon, Jul 27, 2009 at 06:43:47PM +0300, Avi Kivity wrote:

 On 07/22/2009 01:13 AM, Glauber Costa wrote:
  
 qemu CPUState already provides stop and stopped states. And they
 mean exactly that. There is no need for us to provide our own.



 This patch (known as dd0e1c1a589 in qemu-kvm.git) breaks reboot.  My
 test case is FC6 i386 -smp 2, running the reboot command in rc.local.
 In about 15 minutes qemu hangs hard.  Please check what's gone wrong.
  
 I found out that doing kill -38your_pid  makes it run again, so we're 
 likely
 hanging somewhere while holding qemu_mutex. The state of the process is D,
 so we're holding qemu_mutex, and then calling something that can block.


 Sounds like we call a vcpu ioctl from the iothread (or from a different  
 vcpu thread).

 It's hard for me to believe that this patch introduced it. At best, it might 
 have
 made it more likely. Also, I also verified that it sometimes takes a while 
 until
 it happen for the first time. Are you sure this is the first patch that 
 makes it happen?


 I haven't been able to reproduce it before this patch.  Maybe this patch  
 doesn't introduce it, only exposes it.

What are backtraces of all threads when it happens?

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-28 Thread Avi Kivity

On 07/28/2009 09:24 AM, Gleb Natapov wrote:


What are backtraces of all threads when it happens?
   


I wasn't able to attach with gdb.  But I thought you reproduced it?

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-28 Thread Gleb Natapov
On Tue, Jul 28, 2009 at 09:28:26AM +0300, Avi Kivity wrote:
 On 07/28/2009 09:24 AM, Gleb Natapov wrote:

 What are backtraces of all threads when it happens?


 I wasn't able to attach with gdb.  But I thought you reproduced it?

Glauber may be yes. Me not even tried :)

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-28 Thread Avi Kivity

On 07/28/2009 09:29 AM, Gleb Natapov wrote:

On Tue, Jul 28, 2009 at 09:28:26AM +0300, Avi Kivity wrote:
   

On 07/28/2009 09:24 AM, Gleb Natapov wrote:
 

What are backtraces of all threads when it happens?

   

I wasn't able to attach with gdb.  But I thought you reproduced it?

 

Glauber may be yes. Me not even tried :)

   


Ah sorry.  I only read the first two letters of any name.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-28 Thread Avi Kivity

On 07/28/2009 09:17 AM, Avi Kivity wrote:
I found out that doing kill -38your_pid  makes it run again, so 
we're likely
hanging somewhere while holding qemu_mutex. The state of the process 
is D,

so we're holding qemu_mutex, and then calling something that can block.


Sounds like we call a vcpu ioctl from the iothread (or from a 
different vcpu thread).


That's indeed the case.  We reload the local apic state from the 
iothread instead of the vcpu thread.  Please write a patch to fix this.


It's hard for me to believe that this patch introduced it. At best, 
it might have
made it more likely. Also, I also verified that it sometimes takes a 
while until
it happen for the first time. Are you sure this is the first patch 
that makes it happen?


I haven't been able to reproduce it before this patch.  Maybe this 
patch doesn't introduce it, only exposes it.




It does.  The root problem is that env-stopped is cleared during reset, 
so pause_all_threads() doesn't work:


uint32_t stop;   /* Stop request */ \
uint32_t stopped; /* Artificially stopped */\
...
/* from this point: preserved by CPU reset */   \

This kind of bug is incredibly hard to find - you now owe Gleb a solar 
mass worth of beer.  IMO we shouldn't be coding like this, please patch 
upstream to explicitly clear what needs clearing.


I'm now testing the simple fix (moving the variables after the memset 
point).


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-27 Thread Avi Kivity

On 07/22/2009 01:13 AM, Glauber Costa wrote:

qemu CPUState already provides stop and stopped states. And they
mean exactly that. There is no need for us to provide our own.

   


This patch (known as dd0e1c1a589 in qemu-kvm.git) breaks reboot.  My 
test case is FC6 i386 -smp 2, running the reboot command in rc.local.  
In about 15 minutes qemu hangs hard.  Please check what's gone wrong.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/6] reuse env stop and stopped states

2009-07-27 Thread Glauber Costa
On Mon, Jul 27, 2009 at 06:43:47PM +0300, Avi Kivity wrote:
 On 07/22/2009 01:13 AM, Glauber Costa wrote:
 qemu CPUState already provides stop and stopped states. And they
 mean exactly that. There is no need for us to provide our own.



 This patch (known as dd0e1c1a589 in qemu-kvm.git) breaks reboot.  My  
 test case is FC6 i386 -smp 2, running the reboot command in rc.local.   
 In about 15 minutes qemu hangs hard.  Please check what's gone wrong.
I found out that doing kill -38 your_pid makes it run again, so we're likely
hanging somewhere while holding qemu_mutex. The state of the process is D,
so we're holding qemu_mutex, and then calling something that can block.

It's hard for me to believe that this patch introduced it. At best, it might 
have
made it more likely. Also, I also verified that it sometimes takes a while until
it happen for the first time. Are you sure this is the first patch that makes 
it happen?


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/6] reuse env stop and stopped states

2009-07-21 Thread Glauber Costa
qemu CPUState already provides stop and stopped states. And they
mean exactly that. There is no need for us to provide our own.

Signed-off-by: Glauber Costa glom...@redhat.com
---
 cpu-defs.h |2 --
 qemu-kvm.c |   30 --
 vl.c   |2 +-
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/cpu-defs.h b/cpu-defs.h
index 7570096..fce366f 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -142,8 +142,6 @@ struct qemu_work_item;
 struct KVMCPUState {
 pthread_t thread;
 int signalled;
-int stop;
-int stopped;
 int created;
 void *vcpu_ctx;
 struct qemu_work_item *queued_work_first, *queued_work_last;
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 0f5f14f..8eeace4 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -91,7 +91,7 @@ static int kvm_debug(void *opaque, void *data,
 
 if (handle) {
kvm_debug_cpu_requested = env;
-   env-kvm_cpu_state.stopped = 1;
+   env-stopped = 1;
 }
 return handle;
 }
@@ -977,7 +977,7 @@ int handle_halt(kvm_vcpu_context_t vcpu)
 int handle_shutdown(kvm_context_t kvm, CPUState *env)
 {
 /* stop the current vcpu from going back to guest mode */
-env-kvm_cpu_state.stopped = 1;
+env-stopped = 1;
 
 qemu_system_reset_request();
 return 1;
@@ -1815,7 +1815,7 @@ int kvm_cpu_exec(CPUState *env)
 
 static int is_cpu_stopped(CPUState *env)
 {
-return !vm_running || env-kvm_cpu_state.stopped;
+return !vm_running || env-stopped;
 }
 
 static void flush_queued_work(CPUState *env)
@@ -1861,9 +1861,9 @@ static void kvm_main_loop_wait(CPUState *env, int timeout)
 cpu_single_env = env;
 flush_queued_work(env);
 
-if (env-kvm_cpu_state.stop) {
-   env-kvm_cpu_state.stop = 0;
-   env-kvm_cpu_state.stopped = 1;
+if (env-stop) {
+   env-stop = 0;
+   env-stopped = 1;
pthread_cond_signal(qemu_pause_cond);
 }
 
@@ -1875,7 +1875,7 @@ static int all_threads_paused(void)
 CPUState *penv = first_cpu;
 
 while (penv) {
-if (penv-kvm_cpu_state.stop)
+if (penv-stop)
 return 0;
 penv = (CPUState *)penv-next_cpu;
 }
@@ -1889,11 +1889,11 @@ static void pause_all_threads(void)
 
 while (penv) {
 if (penv != cpu_single_env) {
-penv-kvm_cpu_state.stop = 1;
+penv-stop = 1;
 pthread_kill(penv-kvm_cpu_state.thread, SIG_IPI);
 } else {
-penv-kvm_cpu_state.stop = 0;
-penv-kvm_cpu_state.stopped = 1;
+penv-stop = 0;
+penv-stopped = 1;
 cpu_exit(penv);
 }
 penv = (CPUState *)penv-next_cpu;
@@ -1910,8 +1910,8 @@ static void resume_all_threads(void)
 assert(!cpu_single_env);
 
 while (penv) {
-penv-kvm_cpu_state.stop = 0;
-penv-kvm_cpu_state.stopped = 0;
+penv-stop = 0;
+penv-stopped = 0;
 pthread_kill(penv-kvm_cpu_state.thread, SIG_IPI);
 penv = (CPUState *)penv-next_cpu;
 }
@@ -2676,12 +2676,6 @@ int kvm_log_stop(target_phys_addr_t phys_addr, 
target_phys_addr_t len)
 return 0;
 }
 
-void qemu_kvm_cpu_stop(CPUState *env)
-{
-if (kvm_enabled())
-env-kvm_cpu_state.stopped = 1;
-}
-
 int kvm_set_boot_cpu_id(uint32_t id)
 {
return kvm_set_boot_vcpu_id(kvm_context, id);
diff --git a/vl.c b/vl.c
index b3df596..6ef7690 100644
--- a/vl.c
+++ b/vl.c
@@ -3553,7 +3553,7 @@ void qemu_system_reset_request(void)
 reset_requested = 1;
 }
 if (cpu_single_env) {
-qemu_kvm_cpu_stop(cpu_single_env);
+cpu_single_env-stopped = 1;
 }
 qemu_notify_event();
 }
-- 
1.6.2.2

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html