Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-06 Thread Alex Bennée

Paolo Bonzini  writes:

> On 05/07/2017 18:14, Peter Maydell wrote:
>>>   - Guest resets board, writing to some hw address (e.g.
>>> arm_sysctl_write)
>>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>   - We exit iowrite and drop the BQL
>>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>   - we start writing new values to CPU env while still in TCG code
>>>   - CHAOS!
>>>
>>> The general solution for this is to ensure these sorts of tasks are done
>>> with safe work in the CPU's context when we know nothing else is running.
>>> It seems this is probably best done by modifying
>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>> as safe work - I don't think the vl.c thread should ever be messing
>>> about with calling cpu_reset directly.
>> My first thought is that qemu_system_reset() should absolutely
>> stop every CPU (or other runnable thing like a DMA agent) in the
>> system. The semantics are basically "like a power cycle", so
>> that should include a complete stop of the world. (Is this
>> what vm_stop() does? Dunno...)
>
> I agree, it should do vm_stop() as the first thing and, if applicable,
> vm_start() as the last thing, similar to e.g. savevm.

OK, I did some more digging and basically the problem is that
cpu_stop_current() does the wrong thing. It can set cpu->stopped while
still in the vCPU thread, which means that when the vl.c thread calls
pause_all_vcpus() it thinks the thread is paused when in fact it isn't,
leading to the chaos. I think the fix is to tighten up our usage of these
two functions. So my current plan is:

* pause_all_vcpus() should never be called from vCPU/HW emulation

One case in kvm_apic has been fixed by Pranith. The other case in s390
should be converted to use async safe work (async_safe_run_on_cpu()).
Once this is done we can assert that pause_all_vcpus() is not called from
a vCPU thread and keep it for QMP, HMP and gdb-type operations.
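
A minimal sketch of the kind of check described above, written as a
standalone toy model rather than QEMU code. It assumes a thread-local
pointer that is non-NULL only on a vCPU thread; ToyCPU, toy_current_cpu,
toy_in_vcpu_thread() and toy_pause_all_vcpus() are hypothetical names, and
the function returns an error instead of aborting so the check can be
exercised:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vCPU; not QEMU's CPUState. */
typedef struct ToyCPU {
    int index;
    bool stopped;
} ToyCPU;

/* Assumption: set by each vCPU thread on entry, NULL everywhere else. */
_Thread_local ToyCPU *toy_current_cpu;

bool toy_in_vcpu_thread(void)
{
    return toy_current_cpu != NULL;
}

/*
 * Hypothetical guarded pause: refuses to run from a vCPU thread.
 * Returns 0 on success, -1 if the caller is a vCPU thread.
 */
int toy_pause_all_vcpus(ToyCPU *cpus, size_t n)
{
    if (toy_in_vcpu_thread()) {
        return -1;  /* a real implementation would likely assert() here */
    }
    for (size_t i = 0; i < n; i++) {
        cpus[i].stopped = true;
    }
    return 0;
}
```

Called from a monitor or gdbstub style thread the pause goes through; called
from a thread that has set toy_current_cpu it is rejected and no flag is
touched.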

* vm_stop() is probably being misused by vCPU threads

There are more uses than pause_all_vcpus here but they all seem to be
for error handling bail-out type things.

* cpu_stop_current() is probably superfluous now

It certainly shouldn't be called directly from the vCPU code
(rtas_power_off) and once we know pause_all_vcpus() can't be called
directly at least one call is gone. I think the current_cpu handling is
a relic of the days of single-threaded handling when it was a global.

Does that sound reasonable?

--
Alex Bennée



Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Alex Bennée

Peter Maydell  writes:

> On 5 July 2017 at 20:30, Alex Bennée  wrote:
>>
>> Peter Maydell  writes:
>>
>>> On 5 July 2017 at 17:01, Alex Bennée  wrote:
>>>> An interesting bug was reported on #qemu today. It was bisected to
>>>> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
>>>> with taskset -c 0. Originally the fingers were pointed at mttcg but it
>>>> occurs in both single and multi-threaded modes.
>>>>
>>>> I think the problem is qemu_system_reset_request() is certainly racy
>>>> when resetting a running CPU. AFAICT:
>>>>
>>>>   - Guest resets board, writing to some hw address (e.g.
>>>> arm_sysctl_write)
>>>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>>   - We exit iowrite and drop the BQL
>>>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>>   - we start writing new values to CPU env while still in TCG code
>>>>   - CHAOS!
>>>>
>>>> The general solution for this is to ensure these sorts of tasks are done
>>>> with safe work in the CPU's context when we know nothing else is running.
>>>> It seems this is probably best done by modifying
>>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>>> as safe work - I don't think the vl.c thread should ever be messing
>>>> about with calling cpu_reset directly.
>>>
>>> My first thought is that qemu_system_reset() should absolutely
>>> stop every CPU (or other runnable thing like a DMA agent) in the
>>> system.
>>
>> Are all these reset calls system wide though?
>
> It's called 'system_reset' because it resets the entire system...
>
>> After all with PSCI you
>> can bring individual cores up and down. I appreciate the vexpress stuff
>> pre-dates those well defined semantics though.
>
> It's individual core reset that's a more ad-hoc afterthought,
> really.
>
>> vm_stop certainly tries to deal with things gracefully as well as send
>> qapi events, drain IO queues and the rest of it. My only concern is it
>> handles two cases - external vm_stops and those from the current CPU.
>>
>> I think it may be cleaner for CPU originated halts to use the
>> async_safe_run_on_cpu() mechanism.
>
> System reset already has an async component to it -- you call
> qemu_system_reset_request(), which just says "schedule a system
> reset as soon as convenient". qemu_system_reset() is the thing
> that runs later and actually does the job (from the io thread,
> not the CPU thread).
>
> Looking more closely at the vl.c code, it looks like it
> calls pause_all_vcpus() before calling qemu_system_reset():
> shouldn't that be pausing all the TCG CPUs?

Looking deeper, it seems cpu_stop_current() is doing the wrong thing.
Because it sets cpu->stopped, the pause_all_vcpus() in the vl.c thread
doesn't wait.

I suspect it should really be doing a cpu_loop_exit(). I'll see if I can
work up a patch.
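
To illustrate the failure mode, here is a deliberately simplified,
single-threaded toy model (not QEMU code; ToyVCPU and every toy_* name are
made up for this sketch). It only shows why a "stopped" flag set by the
vCPU itself can fool a pausing thread that trusts that flag alone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy vCPU state; field names are illustrative, not QEMU's CPUState. */
typedef struct ToyVCPU {
    bool stop_requested; /* someone asked this vCPU to stop */
    bool stopped;        /* flag the pausing thread polls */
    bool executing;      /* true while still inside guest code */
} ToyVCPU;

/* Buggy pattern: the vCPU marks itself stopped while still executing. */
void toy_stop_current_buggy(ToyVCPU *cpu)
{
    cpu->stop_requested = true;
    cpu->stopped = true;  /* set too early */
}

/* Deferred pattern: only request the stop; the run loop acknowledges it. */
void toy_stop_current_deferred(ToyVCPU *cpu)
{
    cpu->stop_requested = true;
}

/* Called by the vCPU run loop once it has left guest code. */
void toy_vcpu_leave_guest(ToyVCPU *cpu)
{
    cpu->executing = false;
    if (cpu->stop_requested) {
        cpu->stopped = true;
    }
}

/* What the pausing thread checks: it trusts the stopped flag alone. */
bool toy_all_vcpus_paused(const ToyVCPU *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].stopped) {
            return false;
        }
    }
    return true;
}
```

With the buggy variant toy_all_vcpus_paused() reports true while executing
is still set; with the deferred variant it only reports true after
toy_vcpu_leave_guest() has run. A real fix would also need proper locking
or atomics, which this model leaves out.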

>
> thanks
> -- PMM


--
Alex Bennée



Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Alex Bennée

Peter Maydell  writes:

> On 5 July 2017 at 20:30, Alex Bennée  wrote:
>>
>> Peter Maydell  writes:
>>
>>> On 5 July 2017 at 17:01, Alex Bennée  wrote:
>>>> An interesting bug was reported on #qemu today. It was bisected to
>>>> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
>>>> with taskset -c 0. Originally the fingers were pointed at mttcg but it
>>>> occurs in both single and multi-threaded modes.
>>>>
>>>> I think the problem is qemu_system_reset_request() is certainly racy
>>>> when resetting a running CPU. AFAICT:
>>>>
>>>>   - Guest resets board, writing to some hw address (e.g.
>>>> arm_sysctl_write)
>>>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>>   - We exit iowrite and drop the BQL
>>>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>>   - we start writing new values to CPU env while still in TCG code
>>>>   - CHAOS!
>>>>
>>>> The general solution for this is to ensure these sorts of tasks are done
>>>> with safe work in the CPU's context when we know nothing else is running.
>>>> It seems this is probably best done by modifying
>>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>>> as safe work - I don't think the vl.c thread should ever be messing
>>>> about with calling cpu_reset directly.
>>>
>>> My first thought is that qemu_system_reset() should absolutely
>>> stop every CPU (or other runnable thing like a DMA agent) in the
>>> system.
>>
>> Are all these reset calls system wide though?
>
> It's called 'system_reset' because it resets the entire system...
>
>> After all with PSCI you
>> can bring individual cores up and down. I appreciate the vexpress stuff
>> pre-dates those well defined semantics though.
>
> It's individual core reset that's a more ad-hoc afterthought,
> really.
>
>> vm_stop certainly tries to deal with things gracefully as well as send
>> qapi events, drain IO queues and the rest of it. My only concern is it
>> handles two cases - external vm_stops and those from the current CPU.
>>
>> I think it may be cleaner for CPU originated halts to use the
>> async_safe_run_on_cpu() mechanism.
>
> System reset already has an async component to it -- you call
> qemu_system_reset_request(), which just says "schedule a system
> reset as soon as convenient". qemu_system_reset() is the thing
> that runs later and actually does the job (from the io thread,
> not the CPU thread).
>
> Looking more closely at the vl.c code, it looks like it
> calls pause_all_vcpus() before calling qemu_system_reset():
> shouldn't that be pausing all the TCG CPUs?

Hmm, it should, but it doesn't seem to have in this backtrace:

#0  0x5593fdd3 in arm_cpu_reset (s=0x569abb90) at /home/alex/lsrc/qemu/qemu.git/target/arm/cpu.c:119
#1  0x55bcc74a in cpu_reset (cpu=0x569abb90) at qom/cpu.c:268
#2  0x5589d82a in do_cpu_reset (opaque=0x569abb90) at /home/alex/lsrc/qemu/qemu.git/hw/arm/boot.c:570
#3  0x55a257e4 in qemu_devices_reset () at hw/core/reset.c:69
#4  0x559697a8 in qemu_system_reset (reason=SHUTDOWN_CAUSE_GUEST_RESET) at vl.c:1713
#5  0x55969c0d in main_loop_should_exit () at vl.c:1885
#6  0x55969cda in main_loop () at vl.c:1922
#7  0x55971aca in main (argc=16, argv=0x7fffd918, envp=0x7fffd9a0) at vl.c:4749

Thread 4 (Thread 0x7fff731ff700 (LWP 10098)):
#0  0x7fffdf4f5a15 in do_futex_wait (private=0, abstime=0x7fff731fc670, expected=0, futex_word=0x7fff64cbb5b8) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  0x7fffdf4f5a15 in do_futex_wait (sem=sem@entry=0x7fff64cbb5b8, abstime=abstime@entry=0x7fff731fc670) at sem_waitcommon.c:111
#2  0x7fffdf4f5adf in __new_sem_wait_slow (sem=0x7fff64cbb5b8, abstime=0x7fff731fc670) at sem_waitcommon.c:181
#3  0x7fffdf4f5b92 in sem_timedwait (sem=, abstime=) at sem_timedwait.c:36
#4  0x55d27488 in qemu_sem_timedwait (sem=0x7fff64cbb5b8, ms=1) at util/qemu-thread-posix.c:271
#5  0x55d20aad in worker_thread (opaque=0x7fff64cbb550) at util/thread-pool.c:92
#6  0x7fffdf4ed6ba in start_thread (arg=0x7fff731ff700) at pthread_create.c:333
#7  0x7fffdf2233dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7fff7ebff700 (LWP 10097)):
#0  0x7fffdf4f630a in __lll_unlock_wake () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
#1  0x7fffdf4f14ff in __GI___pthread_mutex_unlock (decr=1, mutex=0x5641ae20 ) at pthread_mutex_unlock.c:55
#2  0x7fffdf4f14ff in __GI___pthread_mutex_unlock (mutex=0x5641ae20 ) at pthread_mutex_unlock.c:314
#3  0x55d27091 in qemu_mutex_unlock (mutex=0x5641ae20 ) at util/qemu-thread-posix.c:88
#4  0x557aa911 in qemu_mutex_unlock_iothread () at /home/alex/lsrc/qemu/qemu.git/cpus.c:1589
#5  0x557d791a in

Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Peter Maydell
On 5 July 2017 at 20:30, Alex Bennée  wrote:
>
> Peter Maydell  writes:
>
>> On 5 July 2017 at 17:01, Alex Bennée  wrote:
>>> An interesting bug was reported on #qemu today. It was bisected to
>>> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
>>> with taskset -c 0. Originally the fingers were pointed at mttcg but it
>>> occurs in both single and multi-threaded modes.
>>>
>>> I think the problem is qemu_system_reset_request() is certainly racy
>>> when resetting a running CPU. AFAICT:
>>>
>>>   - Guest resets board, writing to some hw address (e.g.
>>> arm_sysctl_write)
>>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>   - We exit iowrite and drop the BQL
>>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>   - we start writing new values to CPU env while still in TCG code
>>>   - CHAOS!
>>>
>>> The general solution for this is to ensure these sorts of tasks are done
>>> with safe work in the CPU's context when we know nothing else is running.
>>> It seems this is probably best done by modifying
>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>> as safe work - I don't think the vl.c thread should ever be messing
>>> about with calling cpu_reset directly.
>>
>> My first thought is that qemu_system_reset() should absolutely
>> stop every CPU (or other runnable thing like a DMA agent) in the
>> system.
>
> Are all these reset calls system wide though?

It's called 'system_reset' because it resets the entire system...

> After all with PSCI you
> can bring individual cores up and down. I appreciate the vexpress stuff
> pre-dates those well defined semantics though.

It's individual core reset that's a more ad-hoc afterthought,
really.

> vm_stop certainly tries to deal with things gracefully as well as send
> qapi events, drain IO queues and the rest of it. My only concern is it
> handles two cases - external vm_stops and those from the current CPU.
>
> I think it may be cleaner for CPU originated halts to use the
> async_safe_run_on_cpu() mechanism.

System reset already has an async component to it -- you call
qemu_system_reset_request(), which just says "schedule a system
reset as soon as convenient". qemu_system_reset() is the thing
that runs later and actually does the job (from the io thread,
not the CPU thread).
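
The split between "request" and "do" can be sketched as below. This is a
toy model under stated assumptions, not the vl.c code: ToyResetCause,
toy_reset_requested and all toy_* functions are hypothetical names, and the
"reset" just bumps a counter where real code would pause vCPUs and reset
devices:

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative causes; QEMU's real enum is ShutdownCause. */
typedef enum ToyResetCause {
    TOY_CAUSE_NONE = 0,
    TOY_CAUSE_GUEST_RESET,
    TOY_CAUSE_HOST_REQUEST,
} ToyResetCause;

/* Written by any thread, consumed by the main loop; zero means no request. */
atomic_int toy_reset_requested;
unsigned toy_resets_done;

/* Safe to call from a vCPU thread: records the request and returns. */
void toy_system_reset_request(ToyResetCause cause)
{
    atomic_store(&toy_reset_requested, cause);
}

/* Stand-in for the real work done later from the main loop. */
void toy_system_reset(ToyResetCause cause)
{
    assert(cause != TOY_CAUSE_NONE);
    toy_resets_done++;
}

/*
 * One main-loop iteration: atomically take any pending request and act
 * on it. Returns the cause that was handled, or TOY_CAUSE_NONE.
 */
ToyResetCause toy_main_loop_poll(void)
{
    ToyResetCause cause = atomic_exchange(&toy_reset_requested,
                                          TOY_CAUSE_NONE);
    if (cause != TOY_CAUSE_NONE) {
        toy_system_reset(cause);
    }
    return cause;
}
```

The requesting thread never touches CPU state itself; nothing is reset
until the polling thread picks the request up.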

Looking more closely at the vl.c code, it looks like it
calls pause_all_vcpus() before calling qemu_system_reset():
shouldn't that be pausing all the TCG CPUs?

thanks
-- PMM



Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Alex Bennée

Paolo Bonzini  writes:

> On 05/07/2017 18:14, Peter Maydell wrote:
>>>   - Guest resets board, writing to some hw address (e.g.
>>> arm_sysctl_write)
>>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>   - We exit iowrite and drop the BQL
>>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>   - we start writing new values to CPU env while still in TCG code
>>>   - CHAOS!
>>>
>>> The general solution for this is to ensure these sorts of tasks are done
>>> with safe work in the CPU's context when we know nothing else is running.
>>> It seems this is probably best done by modifying
>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>> as safe work - I don't think the vl.c thread should ever be messing
>>> about with calling cpu_reset directly.
>> My first thought is that qemu_system_reset() should absolutely
>> stop every CPU (or other runnable thing like a DMA agent) in the
>> system. The semantics are basically "like a power cycle", so
>> that should include a complete stop of the world. (Is this
>> what vm_stop() does? Dunno...)
>
> I agree, it should do vm_stop() as the first thing and, if applicable,
> vm_start() as the last thing, similar to e.g. savevm.

Why not use our async_safe_run_on_cpu() mechanism for it? Certainly I
wouldn't expect the vCPU hitting its own reset button to need to be
graceful about it.

>
> In fact, the above bug probably has existed forever in KVM.
>
> Paolo


--
Alex Bennée



Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Alex Bennée

Peter Maydell  writes:

> On 5 July 2017 at 17:01, Alex Bennée  wrote:
>> An interesting bug was reported on #qemu today. It was bisected to
>> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
>> with taskset -c 0. Originally the fingers were pointed at mttcg but it
>> occurs in both single and multi-threaded modes.
>>
>> I think the problem is qemu_system_reset_request() is certainly racy
>> when resetting a running CPU. AFAICT:
>>
>>   - Guest resets board, writing to some hw address (e.g.
>> arm_sysctl_write)
>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>   - We exit iowrite and drop the BQL
>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>   - we start writing new values to CPU env while still in TCG code
>>   - CHAOS!
>>
>> The general solution for this is to ensure these sorts of tasks are done
>> with safe work in the CPU's context when we know nothing else is running.
>> It seems this is probably best done by modifying
>> qemu_system_reset_request to queue work up on current_cpu and execute it
>> as safe work - I don't think the vl.c thread should ever be messing
>> about with calling cpu_reset directly.
>
> My first thought is that qemu_system_reset() should absolutely
> stop every CPU (or other runnable thing like a DMA agent) in the
> system.

Are all these reset calls system wide though? After all with PSCI you
can bring individual cores up and down. I appreciate the vexpress stuff
pre-dates those well defined semantics though.

> The semantics are basically "like a power cycle", so
> that should include a complete stop of the world. (Is this
> what vm_stop() does? Dunno...)

vm_stop certainly tries to deal with things gracefully as well as send
qapi events, drain IO queues and the rest of it. My only concern is it
handles two cases - external vm_stops and those from the current CPU.

I think it may be cleaner for CPU originated halts to use the
async_safe_run_on_cpu() mechanism. It has clear semantics with respect
to the behaviour of other CPUs. If you queue work with
async_safe_run_on_cpu and do a cpu_loop_exit you can guarantee all vCPUs
have stopped and the work has been serviced before the originating vCPU
executes its next instruction.
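
As a rough illustration of those semantics, here is a single-threaded toy
scheduler (not QEMU's implementation; ToyMachine, toy_queue_safe_work() and
the other toy_* names are invented for this sketch). Safe work is only
serviced between instructions, when no toy vCPU is inside guest code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NCPUS 2

typedef struct ToyCPU {
    unsigned pc;    /* toy program counter */
    bool in_guest;  /* true only while an instruction is executing */
} ToyCPU;

typedef void (*toy_safe_fn)(ToyCPU *cpus, size_t n);

typedef struct ToyMachine {
    ToyCPU cpus[TOY_NCPUS];
    toy_safe_fn pending;  /* at most one queued safe-work item */
    unsigned safe_runs;   /* how many safe-work items have been serviced */
} ToyMachine;

/* Queue work to run when no vCPU is inside guest code. */
void toy_queue_safe_work(ToyMachine *m, toy_safe_fn fn)
{
    m->pending = fn;
}

/* Example safe work: reset every vCPU's pc, like a system reset. */
void toy_reset_all(ToyCPU *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(!cpus[i].in_guest);  /* nothing may be executing */
        cpus[i].pc = 0;
    }
}

/* Execute one instruction; reaching pc 3 requests a reset, once only. */
static void toy_step(ToyMachine *m, ToyCPU *cpu)
{
    cpu->in_guest = true;
    cpu->pc++;
    if (cpu->pc == 3 && m->safe_runs == 0) {
        toy_queue_safe_work(m, toy_reset_all);
    }
    cpu->in_guest = false;  /* stands in for leaving the execution loop */
}

/* Round-robin scheduler: safe work is serviced between instructions. */
void toy_run(ToyMachine *m, unsigned rounds)
{
    for (unsigned r = 0; r < rounds; r++) {
        for (size_t i = 0; i < TOY_NCPUS; i++) {
            toy_step(m, &m->cpus[i]);
            if (m->pending) {
                toy_safe_fn fn = m->pending;
                m->pending = NULL;
                fn(m->cpus, TOY_NCPUS);
                m->safe_runs++;
            }
        }
    }
}
```

The originating vCPU's next instruction only runs after the queued reset
has been serviced, and the assert in toy_reset_all() would fire if any vCPU
were still executing at that point.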

>
> thanks
> -- PMM


--
Alex Bennée



Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread G 3


On Jul 5, 2017, at 12:42 PM, qemu-devel-requ...@nongnu.org wrote:


> Hi,
>
> An interesting bug was reported on #qemu today. It was bisected to
> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
> with taskset -c 0. Originally the fingers were pointed at mttcg but it
> occurs in both single and multi-threaded modes.
>
> I think the problem is qemu_system_reset_request() is certainly racy
> when resetting a running CPU. AFAICT:
>
>   - Guest resets board, writing to some hw address (e.g.
> arm_sysctl_write)
>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>   - We exit iowrite and drop the BQL
>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>   - we start writing new values to CPU env while still in TCG code
>   - CHAOS!
>
> The general solution for this is to ensure these sorts of tasks are done
> with safe work in the CPU's context when we know nothing else is running.
> It seems this is probably best done by modifying
> qemu_system_reset_request to queue work up on current_cpu and execute it
> as safe work - I don't think the vl.c thread should ever be messing
> about with calling cpu_reset directly.


Maybe vl.c should be changed so it registers a request to reset the
emulator instead.

So instead of cpu_reset() we do request_cpu_reset().



> Looking at the calls most of these are made by device code but I see KVM
> also does it. I just wanted to check this was a reasonable approach and
> wouldn't upset anything else.
>
> Any thoughts?


I think the problem with the QEMU monitor command "stop" (which causes
the emulator to crash) is related to this issue as well.





Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Paolo Bonzini


On 05/07/2017 18:14, Peter Maydell wrote:
>>   - Guest resets board, writing to some hw address (e.g.
>> arm_sysctl_write)
>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>   - We exit iowrite and drop the BQL
>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>   - we start writing new values to CPU env while still in TCG code
>>   - CHAOS!
>>
>> The general solution for this is to ensure these sorts of tasks are done
>> with safe work in the CPU's context when we know nothing else is running.
>> It seems this is probably best done by modifying
>> qemu_system_reset_request to queue work up on current_cpu and execute it
>> as safe work - I don't think the vl.c thread should ever be messing
>> about with calling cpu_reset directly.
> My first thought is that qemu_system_reset() should absolutely
> stop every CPU (or other runnable thing like a DMA agent) in the
> system. The semantics are basically "like a power cycle", so
> that should include a complete stop of the world. (Is this
> what vm_stop() does? Dunno...)

I agree, it should do vm_stop() as the first thing and, if applicable,
vm_start() as the last thing, similar to e.g. savevm.
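
A minimal sketch of that bracketing, as a toy model only: ToyVM and the
toy_* functions are hypothetical and do not mirror the real vm_stop() or
vm_start() signatures. "If applicable" is modelled as "only resume if the
VM was running beforehand":

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyVM {
    bool running;     /* are vCPUs (and other agents) allowed to run? */
    unsigned resets;  /* number of resets performed */
} ToyVM;

void toy_vm_stop(ToyVM *vm)
{
    vm->running = false;
}

void toy_vm_start(ToyVM *vm)
{
    vm->running = true;
}

/* Device and CPU reset must only happen with the world stopped. */
void toy_devices_reset(ToyVM *vm)
{
    assert(!vm->running);
    vm->resets++;
}

/* Stop the world, reset, and resume only if the VM was running before. */
void toy_system_reset(ToyVM *vm)
{
    bool was_running = vm->running;

    toy_vm_stop(vm);
    toy_devices_reset(vm);
    if (was_running) {
        toy_vm_start(vm);
    }
}
```

A VM that was already paused (for example by the monitor) stays paused
after the reset, while a running one resumes.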

In fact, the above bug probably has existed forever in KVM.

Paolo



Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

2017-07-05 Thread Peter Maydell
On 5 July 2017 at 17:01, Alex Bennée  wrote:
> An interesting bug was reported on #qemu today. It was bisected to
> 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
> with taskset -c 0. Originally the fingers were pointed at mttcg but it
> occurs in both single and multi-threaded modes.
>
> I think the problem is qemu_system_reset_request() is certainly racy
> when resetting a running CPU. AFAICT:
>
>   - Guest resets board, writing to some hw address (e.g.
> arm_sysctl_write)
>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>   - We exit iowrite and drop the BQL
>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>   - we start writing new values to CPU env while still in TCG code
>   - CHAOS!
>
> The general solution for this is to ensure these sorts of tasks are done
> with safe work in the CPU's context when we know nothing else is running.
> It seems this is probably best done by modifying
> qemu_system_reset_request to queue work up on current_cpu and execute it
> as safe work - I don't think the vl.c thread should ever be messing
> about with calling cpu_reset directly.

My first thought is that qemu_system_reset() should absolutely
stop every CPU (or other runnable thing like a DMA agent) in the
system. The semantics are basically "like a power cycle", so
that should include a complete stop of the world. (Is this
what vm_stop() does? Dunno...)

thanks
-- PMM