Re: [Qemu-devel] [PATCH v1] cpus: track calls to resume/pause_all_vcpus()

2018-04-09 Thread David Hildenbrand
On 09.04.2018 15:12, Paolo Bonzini wrote:
> On 09/04/2018 15:07, David Hildenbrand wrote:
>> If we have parallel calls to resume/pause_all_vcpus() we can get
>> into trouble because the qemu mutex is temporarily dropped while
>> waiting for all threads to stop. This can happen e.g. for s390x, where
>> resume/pause_all_vcpus() can be triggered by a VCPU.
> 

I'm also using resume/pause_all_vcpus() now in a prototype to
temporarily get all VCPUs out of KVM, that's how I noticed that this is
shaky :)

> Why does s390 need to do pause_all_vcpus()/resume_all_vcpus() instead of
> just asking the main thread to do it (similar to qemu_system_reset), is
> it because diag 308 must be synchronous?

Christian implemented it back then to (quoting from another mail)

"I did this to prevent a "still running CPU to restart an already
stopped one"."

The problem is that another VCPU could just be about to send a SIGP
START/RESTART to a VCPU. Without the pause_all_vcpus(), the SIGP could
be delayed and executed just after the "soft reset", therefore resulting
in more than 1 VCPU running.
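
To make the ordering problem concrete, here is a toy model. It is not
QEMU code: all names (MODEL_NR_CPUS, model_running_after(), the enum)
are made up for illustration, and it only models *when* the pending
SIGP START takes effect relative to the soft reset, not how
pause_all_vcpus() enforces that ordering.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 3   /* hypothetical: CPU 0 resets, CPU 1 sends, CPU 2 is the target */

enum model_order {
    MODEL_SIGP_BEFORE_RESET,  /* the SIGP START completes before the reset */
    MODEL_SIGP_AFTER_RESET    /* the SIGP START is delayed past the reset */
};

/* Returns how many CPUs are running once both events have happened. */
int model_running_after(enum model_order order)
{
    /* CPU 2 starts out stopped; CPU 1 is about to START it. */
    bool running[MODEL_NR_CPUS] = { true, true, false };
    const int resetter = 0;
    const int target = 2;
    int count = 0;

    if (order == MODEL_SIGP_BEFORE_RESET) {
        running[target] = true;
    }

    /* Soft reset: only the resetting CPU is supposed to keep running. */
    for (int i = 0; i < MODEL_NR_CPUS; i++) {
        running[i] = (i == resetter);
    }

    if (order == MODEL_SIGP_AFTER_RESET) {
        /* The delayed order restarts a CPU the reset just stopped. */
        running[target] = true;
    }

    for (int i = 0; i < MODEL_NR_CPUS; i++) {
        count += running[i] ? 1 : 0;
    }
    return count;
}
```

In this model the "before" ordering leaves one CPU running and the
"after" ordering leaves two, which is the case described above.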

> 
> One disadvantage of the current approach is that diag 308 does not obey
> -no-reboot.

Both calls are used for kdump+kexec. "kdump on s390 uses a load normal
reset to bring the system in a defined state by doing a subsystem
reset", so like a "soft reboot". I don't think that we want to apply
"-no-reboot" here.

> 
> Paolo
> 


-- 

Thanks,

David / dhildenb



Re: [Qemu-devel] [PATCH v1] cpus: track calls to resume/pause_all_vcpus()

2018-04-09 Thread Paolo Bonzini
On 09/04/2018 15:07, David Hildenbrand wrote:
> If we have parallel calls to resume/pause_all_vcpus() we can get
> into trouble because the qemu mutex is temporarily dropped while
> waiting for all threads to stop. This can happen e.g. for s390x, where
> resume/pause_all_vcpus() can be triggered by a VCPU.

Why does s390 need to do pause_all_vcpus()/resume_all_vcpus() instead of
just asking the main thread to do it (similar to qemu_system_reset), is
it because diag 308 must be synchronous?

One disadvantage of the current approach is that diag 308 does not obey
-no-reboot.

Paolo



[Qemu-devel] [PATCH v1] cpus: track calls to resume/pause_all_vcpus()

2018-04-09 Thread David Hildenbrand
If we have parallel calls to resume/pause_all_vcpus() we can get
into trouble because the qemu mutex is temporarily dropped while
waiting for all threads to stop. This can happen e.g. for s390x, where
resume/pause_all_vcpus() can be triggered by a VCPU.

Pause/Resume exactly once, when we leave/hit "0".

Signed-off-by: David Hildenbrand 
---
 cpus.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)
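
The counting idea can be tried in isolation with the standalone sketch
below. It is only a model under stated assumptions, not the patch:
model_pause_all(), model_resume_all(), model_pause_depth and the two
counters are hypothetical names, and the real stop/kick/resume work is
replaced by counters. Like the patch, the depth starts at 1 so that the
first resume corresponds to the initial vm_start().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real state; not QEMU code. */
int model_pause_depth = 1;   /* starts paused, waiting for the first resume */
int model_real_stops;        /* how often the VCPUs were actually stopped */
int model_real_resumes;      /* how often the VCPUs were actually resumed */

bool model_vcpus_running(void)
{
    return model_pause_depth == 0;
}

void model_pause_all(void)
{
    assert(model_pause_depth >= 0);
    if (model_pause_depth++ == 0) {
        /* 0 -> 1: the only transition that really stops the VCPUs */
        model_real_stops++;
    }
}

void model_resume_all(void)
{
    assert(model_pause_depth > 0);
    if (--model_pause_depth == 0) {
        /* 1 -> 0: the only transition that really resumes the VCPUs */
        model_real_resumes++;
    }
}
```

With two overlapping callers (pause, pause, resume, resume) after the
initial resume, the model stops the VCPUs once and keeps them paused
until the second resume brings the depth back to 0.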

diff --git a/cpus.c b/cpus.c
index 2e6701795b..7c7e0245c5 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1778,17 +1778,26 @@ static bool all_vcpus_paused(void)
     return true;
 }
 
+/* wait for the initial vm_start() call */
+static int vcpus_paused = 1;
+
 void pause_all_vcpus(void)
 {
     CPUState *cpu;
 
-    qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false);
-    CPU_FOREACH(cpu) {
-        if (qemu_cpu_is_self(cpu)) {
-            qemu_cpu_stop(cpu, true);
-        } else {
-            cpu->stop = true;
-            qemu_cpu_kick(cpu);
+    assert(qemu_mutex_iothread_locked());
+    assert(vcpus_paused >= 0);
+
+    vcpus_paused++;
+    if (vcpus_paused == 1) {
+        qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false);
+        CPU_FOREACH(cpu) {
+            if (qemu_cpu_is_self(cpu)) {
+                qemu_cpu_stop(cpu, true);
+            } else {
+                cpu->stop = true;
+                qemu_cpu_kick(cpu);
+            }
         }
     }
 
@@ -1820,6 +1829,14 @@ void resume_all_vcpus(void)
 {
     CPUState *cpu;
 
+    assert(vcpus_paused >= 0);
+    assert(qemu_mutex_iothread_locked());
+
+    vcpus_paused--;
+    if (vcpus_paused > 0) {
+        return;
+    }
+
     qemu_clock_enable(QEMU_CLOCK_VIRTUAL, true);
     CPU_FOREACH(cpu) {
         cpu_resume(cpu);
-- 
2.14.3