Re: [Qemu-devel] [PULL 00/18] Record/replay core for 2.5-rc1
> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > On 05/11/2015 15:00, Peter Maydell wrote: > > On 5 November 2015 at 12:13, Paolo Bonziniwrote: > >> The following changes since commit > >> 6c5f30cad290c745f910481d0e890b3f4fad1f00: > >> > >> Merge remote-tracking branch > >> 'remotes/juanquintela/tags/migration/20151104' into staging > (2015-11-05 10:10:57 +) > >> > >> are available in the git repository at: > >> > >> > >> git://github.com/bonzini/qemu.git tags/for-upstream-replay > >> > >> for you to fetch changes up to 88dcb341cd6163a47346de83d4206591da6a9959: > >> > >> replay: recording of the user input (2015-11-05 12:40:48 +0100) > >> > >> > >> So here it is, let's see what happens. > > > > Hi. This got through most of the merge tests, but running linux-user > > tests failed: Thanks for the report! > > /home/petmay01/linaro/qemu-for-merges/build/all-linux-static/i386-linux-user/qemu-i386 > > -L ./gnemul/qemu-i386 i386/ls -l dummyfile > > qemu-i386: /home/petmay01/linaro/qemu-for-merges/translate-all.c:146: > > tb_unlock: Assertion `have_tb_lock' failed. > > > > Seems to assert the first time we call tb_unlock(). Here's a backtrace: > > Program received signal SIGABRT, Aborted. > > 0x601fa5d9 in raise () > > (gdb) bt > > #0 0x601fa5d9 in raise () > > #1 0x6019a608 in abort () > > #2 0x60194584 in __assert_fail_base () > > #3 0x601945de in __assert_fail () > > #4 0x60003d5b in tb_unlock () at > > /home/petmay01/linaro/qemu-for-merges/translate-all.c:146 > > #5 0x600073fa in tb_find_slow (cpu=0x62605710, pc=4135504272, > > cs_base=0, flags=4194483) > > at /home/petmay01/linaro/qemu-for-merges/cpu-exec.c:279 > > #6 0x60007581 in tb_find_fast (cpu=0x62605710) at > > /home/petmay01/linaro/qemu-for-merges/cpu-exec.c:316 > > #7 0x60007868 in cpu_x86_exec (cpu=0x62605710) at > > /home/petmay01/linaro/qemu-for-merges/cpu-exec.c:423 > > #8 0x600372b6 in cpu_loop (env=0x6260d990) at > > /home/petmay01/linaro/qemu-for-merges/linux-user/main.c:281 > > #9 0x60039370 in main (argc=6, argv=0x7fffe438, > > envp=0x7fffe470) > > at /home/petmay01/linaro/qemu-for-merges/linux-user/main.c:4666 > > > > thanks > > -- PMM > > > > Pavel, can you look at it? I'll take care of squashing the change. Here is the patch: diff --git a/replay/replay-user.c b/replay/replay-user.c index eeaa41d..7d4f9fb 100755 --- a/replay/replay-user.c +++ b/replay/replay-user.c @@ -18,7 +18,7 @@ bool replay_exception(void) bool replay_has_exception(void) { -return true; +return false; } bool replay_interrupt(void) @@ -28,7 +28,7 @@ bool replay_interrupt(void) bool replay_has_interrupt(void) { -return true; +return false; } Pavel Dovgalyuk
Re: [Qemu-devel] [RFH PATCH 0/4] record/replay fixups and doubts
Hi, Paolo! Will you pull these patches into 2.5? Pavel Dovgalyuk > -Original Message- > From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > Sent: Tuesday, October 06, 2015 11:01 PM > To: qemu-devel@nongnu.org > Cc: pavel.dovga...@ispras.ru > Subject: [RFH PATCH 0/4] record/replay fixups and doubts > > These are some comments I have about the record/replay code. I can > integrate these in your patches myself, but I need an ack/tested-by and > in some case more answers... Please take a look. > > Paolo Bonzini (4): > replay: generalize ptimer event to bottom halves > more replay fixes > why is runstate_is_running needed? > events doubts > > Makefile.objs | 2 ++ > Makefile.target | 1 - > cpu-exec.c | 2 +- > cpus.c | 2 +- > exec.c | 2 +- > hw/bt/hci.c | 4 ++-- > hw/core/ptimer.c| 8 ++-- > include/qapi/qmp/qerror.h | 2 +- > {replay => include/sysemu}/replay.h | 4 ++-- > qapi/common.json| 6 +- > qemu-timer.c| 23 +-- > replay/Makefile.objs| 11 +-- > replay/replay-events.c | 24 +++- > replay/replay-input.c | 2 +- > replay/replay-internal.c| 4 ++-- > replay/replay-internal.h| 2 +- > replay/replay-time.c| 2 +- > replay/replay.c | 2 +- > stubs/Makefile.objs | 1 + > {replay => stubs}/replay-user.c | 6 +- > stubs/replay.c | 10 +- > ui/input.c | 2 +- > vl.c| 6 +++--- > 23 files changed, 59 insertions(+), 69 deletions(-) > rename {replay => include/sysemu}/replay.h (97%) > mode change 100755 => 100644 replay/Makefile.objs > mode change 100755 => 100644 replay/replay-events.c > mode change 100755 => 100644 replay/replay-input.c > mode change 100755 => 100644 replay/replay-internal.c > mode change 100755 => 100644 replay/replay-internal.h > mode change 100755 => 100644 replay/replay-time.c > mode change 100755 => 100644 replay/replay.c > rename {replay => stubs}/replay-user.c (90%) > > -- > 2.5.0
Re: [Qemu-devel] [RFH PATCH 0/4] record/replay fixups and doubts
There is one more fix. Sometimes replay cannot continue after stopping/restarting of the virtual machine. This happens because warp on stopped machine and on running machine behaves differently. Timers deadline calculation depends on enabled flag of the virtual timer. The following patch fixes the problem - it disables warp when machine is stopped. index 5130806..7a337d9 100644 --- a/cpus.c +++ b/cpus.c @@ -411,7 +411,7 @@ void qemu_clock_warp(QEMUClockType type) } /* warp clock deterministically in record/replay mode */ -if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP)) { +if (!runstate_is_running() || !replay_checkpoint(CHECKPOINT_CLOCK_WARP)) { return; } Pavel Dovgalyuk > -Original Message- > From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > Sent: Tuesday, October 06, 2015 11:01 PM > To: qemu-devel@nongnu.org > Cc: pavel.dovga...@ispras.ru > Subject: [RFH PATCH 0/4] record/replay fixups and doubts > > These are some comments I have about the record/replay code. I can > integrate these in your patches myself, but I need an ack/tested-by and > in some case more answers... Please take a look. > > Paolo Bonzini (4): > replay: generalize ptimer event to bottom halves > more replay fixes > why is runstate_is_running needed? > events doubts > > Makefile.objs | 2 ++ > Makefile.target | 1 - > cpu-exec.c | 2 +- > cpus.c | 2 +- > exec.c | 2 +- > hw/bt/hci.c | 4 ++-- > hw/core/ptimer.c| 8 ++-- > include/qapi/qmp/qerror.h | 2 +- > {replay => include/sysemu}/replay.h | 4 ++-- > qapi/common.json| 6 +- > qemu-timer.c| 23 +-- > replay/Makefile.objs| 11 +-- > replay/replay-events.c | 24 +++- > replay/replay-input.c | 2 +- > replay/replay-internal.c| 4 ++-- > replay/replay-internal.h| 2 +- > replay/replay-time.c| 2 +- > replay/replay.c | 2 +- > stubs/Makefile.objs | 1 + > {replay => stubs}/replay-user.c | 6 +- > stubs/replay.c | 10 +- > ui/input.c | 2 +- > vl.c| 6 +++--- > 23 files changed, 59 insertions(+), 69 deletions(-) > rename {replay => include/sysemu}/replay.h (97%) > mode change 100755 => 100644 replay/Makefile.objs > mode change 100755 => 100644 replay/replay-events.c > mode change 100755 => 100644 replay/replay-input.c > mode change 100755 => 100644 replay/replay-internal.c > mode change 100755 => 100644 replay/replay-internal.h > mode change 100755 => 100644 replay/replay-time.c > mode change 100755 => 100644 replay/replay.c > rename {replay => stubs}/replay-user.c (90%) > > -- > 2.5.0
Re: [Qemu-devel] [PATCH 4/4] events doubts
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > > It is not clear what separates REPLAY_ASYNC_EVENT_BH from other async > events. It seems to be an ordering issue, but then why do input events > not have to be looked up in the queue? It would be much simpler if they > are all handled the same way. There are two kinds of events: - read from the log and injected immediately (user input, network input) - read from the log and wait for corresponding event in the queue (BH) We cannot inject BH event immediately because we do not have any information about callback and to preserve consistency - BH cannot be processed before it is scheduled by qemu core. Pavel Dovgalyuk > --- > replay/replay-events.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/replay/replay-events.c b/replay/replay-events.c > index 402f644..d6c61f6 100644 > --- a/replay/replay-events.c > +++ b/replay/replay-events.c > @@ -203,13 +203,15 @@ static Event *replay_read_event(int checkpoint) > return NULL; > } > > -/* Events that has not to be in the queue */ > +/* Read event-specific data */ > switch (read_event_kind) { > case REPLAY_ASYNC_EVENT_BH: > if (read_id == -1) { > read_id = replay_get_qword(); > } > break; > + > +/* Events that do not have to be in the queue - ### WHY? */ > case REPLAY_ASYNC_EVENT_INPUT: > event = g_malloc0(sizeof(Event)); > event->event_kind = read_event_kind; > @@ -220,6 +222,7 @@ static Event *replay_read_event(int checkpoint) > event->event_kind = read_event_kind; > event->opaque = 0; > return event; > + > default: > error_report("Unknown ID %d of replay event", read_event_kind); > exit(1); > @@ -239,8 +242,6 @@ static Event *replay_read_event(int checkpoint) > return NULL; > } > > -/* Read event-specific data */ > - > return event; > } > > -- > 2.5.0
Re: [Qemu-devel] [PATCH 3/4] why is runstate_is_running needed?
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > Sent: Tuesday, October 06, 2015 11:01 PM > To: qemu-devel@nongnu.org > Cc: pavel.dovga...@ispras.ru > Subject: [PATCH 3/4] why is runstate_is_running needed? > > It doesn't seem correct to call it for all checkpoints, but why > is it right for timerlist_run_timers? Because replaying shouldn't proceed when machine is stopped. These checks could be also useful for creating snapshots in record mode, but I don't remember exact reasons of adding them. I'll check your changes for the current version. > --- > qemu-timer.c | 9 +++-- > stubs/replay.c | 5 - > 2 files changed, 3 insertions(+), 11 deletions(-) > > diff --git a/qemu-timer.c b/qemu-timer.c > index 3c6e4c3..f16e422 100644 > --- a/qemu-timer.c > +++ b/qemu-timer.c > @@ -488,20 +488,17 @@ bool timerlist_run_timers(QEMUTimerList *timer_list) > break; > default: > case QEMU_CLOCK_VIRTUAL: > -if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running()) > -|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) { > +if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) { > goto out; > } > break; > case QEMU_CLOCK_HOST: > -if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running()) > -|| !replay_checkpoint(CHECKPOINT_CLOCK_HOST)) { > +if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) { > goto out; > } > break; > case QEMU_CLOCK_VIRTUAL_RT: > -if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running()) > -|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) { > +if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) { > goto out; > } > break; > diff --git a/stubs/replay.c b/stubs/replay.c > index 71fa7d5..42d01b5 100755 > --- a/stubs/replay.c > +++ b/stubs/replay.c > @@ -22,11 +22,6 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint) > return true; > } > > -int runstate_is_running(void) > -{ > -abort(); > -} > - > bool replay_events_enabled(void) > { > return false; > -- > 2.5.0 > Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH 1/4] replay: generalize ptimer event to bottom halves
This one is ok. Pavel Dovgalyuk > -Original Message- > From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > Sent: Tuesday, October 06, 2015 11:01 PM > To: qemu-devel@nongnu.org > Cc: pavel.dovga...@ispras.ru > Subject: [PATCH 1/4] replay: generalize ptimer event to bottom halves > > Make the code a bit more type safe and follow the same scheme as > replay_input_event and replay_input_sync_event. > > Signed-off-by: Paolo Bonzini> --- > hw/core/ptimer.c | 6 +- > replay/replay-events.c | 15 ++- > replay/replay-internal.h | 2 +- > replay/replay.h | 4 ++-- > 4 files changed, 14 insertions(+), 13 deletions(-) > > diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c > index c56078d..86d544f 100644 > --- a/hw/core/ptimer.c > +++ b/hw/core/ptimer.c > @@ -28,11 +28,7 @@ struct ptimer_state > static void ptimer_trigger(ptimer_state *s) > { > if (s->bh) { > -if (replay_mode != REPLAY_MODE_NONE) { > -replay_add_ptimer_event(s->bh, replay_get_current_step()); > -} else { > -qemu_bh_schedule(s->bh); > -} > +replay_bh_schedule_event(s->bh); > } > } > > diff --git a/replay/replay-events.c b/replay/replay-events.c > index 23f3b12..06dd4ca 100755 > --- a/replay/replay-events.c > +++ b/replay/replay-events.c > @@ -37,7 +37,7 @@ static bool events_enabled; > static void replay_run_event(Event *event) > { > switch (event->event_kind) { > -case REPLAY_ASYNC_EVENT_PTIMER: > +case REPLAY_ASYNC_EVENT_BH: > aio_bh_call(event->opaque); > break; > case REPLAY_ASYNC_EVENT_INPUT: > @@ -129,9 +129,14 @@ static void replay_add_event(ReplayAsyncEventKind > event_kind, > replay_mutex_unlock(); > } > > -void replay_add_ptimer_event(void *bh, uint64_t id) > +void replay_bh_schedule_event(QEMUBH *bh) > { > -replay_add_event(REPLAY_ASYNC_EVENT_PTIMER, bh, NULL, id); > +if (replay_mode != REPLAY_MODE_NONE) { > +uint64_t id = replay_get_current_step(); > +replay_add_event(REPLAY_ASYNC_EVENT_BH, bh, NULL, id); > +} else { > +qemu_bh_schedule(bh); > +} > } > > void replay_add_input_event(struct InputEvent *event) > @@ -154,7 +159,7 @@ static void replay_save_event(Event *event, int > checkpoint) > > /* save event-specific data */ > switch (event->event_kind) { > -case REPLAY_ASYNC_EVENT_PTIMER: > +case REPLAY_ASYNC_EVENT_BH: > replay_put_qword(event->id); > break; > case REPLAY_ASYNC_EVENT_INPUT: > @@ -200,7 +205,7 @@ static Event *replay_read_event(int checkpoint) > > /* Events that has not to be in the queue */ > switch (read_event_kind) { > -case REPLAY_ASYNC_EVENT_PTIMER: > +case REPLAY_ASYNC_EVENT_BH: > if (read_id == -1) { > read_id = replay_get_qword(); > } > diff --git a/replay/replay-internal.h b/replay/replay-internal.h > index 04d2e1b..77e0d29 100755 > --- a/replay/replay-internal.h > +++ b/replay/replay-internal.h > @@ -41,7 +41,7 @@ enum ReplayEvents { > /* Asynchronous events IDs */ > > enum ReplayAsyncEventKind { > -REPLAY_ASYNC_EVENT_PTIMER, > +REPLAY_ASYNC_EVENT_BH, > REPLAY_ASYNC_EVENT_INPUT, > REPLAY_ASYNC_EVENT_INPUT_SYNC, > REPLAY_ASYNC_COUNT > diff --git a/replay/replay.h b/replay/replay.h > index cbb4e11..abb4688 100755 > --- a/replay/replay.h > +++ b/replay/replay.h > @@ -110,8 +110,8 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint); > void replay_disable_events(void); > /*! Returns true when saving events is enabled */ > bool replay_events_enabled(void); > -/*! Adds ptimer event to the queue */ > -void replay_add_ptimer_event(void *bh, uint64_t id); > +/*! Adds bottom half event to the queue */ > +void replay_bh_schedule_event(QEMUBH *bh); > /*! Adds input event to the queue */ > void replay_input_event(QemuConsole *src, InputEvent *evt); > /*! Adds input sync event to the queue */ > -- > 2.5.0 >
Re: [Qemu-devel] [PATCH 2/4] more replay fixes
This one is ok too. Pavel Dovgalyuk > -Original Message- > From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > Sent: Tuesday, October 06, 2015 11:01 PM > To: qemu-devel@nongnu.org > Cc: pavel.dovga...@ispras.ru > Subject: [PATCH 2/4] more replay fixes > > 1) Compile files once > > 2) Move include file from replay/replay.h to include/sysemu/replay.h. > > 3) Fix Error usage > > 4) cleanup timerlistgroup_deadline_ns a bit and allow clock jump > notifiers to run > > 5) move replay-user.c to stubs/ > --- > Makefile.objs | 2 ++ > Makefile.target | 1 - > cpu-exec.c | 2 +- > cpus.c | 2 +- > exec.c | 2 +- > hw/bt/hci.c | 4 ++-- > hw/core/ptimer.c| 2 +- > include/qapi/qmp/qerror.h | 2 +- > {replay => include/sysemu}/replay.h | 0 > qapi/common.json| 6 +- > qemu-timer.c| 14 ++ > replay/Makefile.objs| 11 +-- > replay/replay-events.c | 2 +- > replay/replay-input.c | 2 +- > replay/replay-internal.c| 4 ++-- > replay/replay-internal.h| 0 > replay/replay-time.c| 2 +- > replay/replay.c | 2 +- > stubs/Makefile.objs | 1 + > {replay => stubs}/replay-user.c | 6 +- > stubs/replay.c | 9 +++-- > ui/input.c | 2 +- > vl.c| 6 +++--- > 23 files changed, 40 insertions(+), 44 deletions(-) > rename {replay => include/sysemu}/replay.h (100%) > mode change 100755 => 100644 replay/Makefile.objs > mode change 100755 => 100644 replay/replay-events.c > mode change 100755 => 100644 replay/replay-input.c > mode change 100755 => 100644 replay/replay-internal.c > mode change 100755 => 100644 replay/replay-internal.h > mode change 100755 => 100644 replay/replay-time.c > mode change 100755 => 100644 replay/replay.c > rename {replay => stubs}/replay-user.c (90%) > > diff --git a/Makefile.objs b/Makefile.objs > index bc43e5c..ba4b45e 100644 > --- a/Makefile.objs > +++ b/Makefile.objs > @@ -58,6 +58,8 @@ common-obj-y += audio/ > common-obj-y += hw/ > common-obj-y += accel.o > > +common-obj-y += replay/ > + > common-obj-y += ui/ > common-obj-y += bt-host.o bt-vhci.o > bt-host.o-cflags := $(BLUEZ_CFLAGS) > diff --git a/Makefile.target b/Makefile.target > index ca8f351..962d004 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -88,7 +88,6 @@ obj-y = exec.o translate-all.o cpu-exec.o > obj-y += translate-common.o > obj-y += cpu-exec-common.o > obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o > -obj-y += replay/ > obj-$(CONFIG_TCG_INTERPRETER) += tci.o > obj-y += tcg/tcg-common.o > obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o > diff --git a/cpu-exec.c b/cpu-exec.c > index 2b83e18..0850f8c 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -30,7 +30,7 @@ > #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY) > #include "hw/i386/apic.h" > #endif > -#include "replay/replay.h" > +#include "sysemu/replay.h" > > /* -icount align implementation. */ > > diff --git a/cpus.c b/cpus.c > index 5130806..7e846e3 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -42,7 +42,7 @@ > #include "qemu/seqlock.h" > #include "qapi-event.h" > #include "hw/nmi.h" > -#include "replay/replay.h" > +#include "sysemu/replay.h" > > #ifndef _WIN32 > #include "qemu/compatfd.h" > diff --git a/exec.c b/exec.c > index dba9258..38f968a 100644 > --- a/exec.c > +++ b/exec.c > @@ -50,7 +50,7 @@ > #include "qemu/rcu_queue.h" > #include "qemu/main-loop.h" > #include "translate-all.h" > -#include "replay/replay.h" > +#include "sysemu/replay.h" > > #include "exec/memory-internal.h" > #include "exec/ram_addr.h" > diff --git a/hw/bt/hci.c b/hw/bt/hci.c > index 93dd1dc..2151d01 100644 > --- a/hw/bt/hci.c > +++ b/hw/bt/hci.c > @@ -24,7 +24,7 @@ > #include "sysemu/bt.h" > #include "hw/bt.h" > #include "qapi/qmp/qerror.h" > -#include "replay/replay.h" > +#include "sysemu/replay.h" > > struct bt_hci_s { > uint8_t *(*evt_packet)(void *opaque); > @@ -2193,7 +2193,7 @@ struct HCIInfo *bt_new_hci(struct bt_scatternet_s *net) > > s->device.handle_destroy = bt_hci_destroy; > > -error_set(>replay_blocker, ERROR_CLASS_REPLAY_NOT_SUPPORTED, "bt > hci"); > +error_setg(>replay_blocker, QERR_REPLAY_NOT_SUPPORTED, "-bt hci"); > replay_add_blocker(s->replay_blocker); > > return >info; > diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c > index 86d544f..edf077c 100644 > --- a/hw/core/ptimer.c > +++ b/hw/core/ptimer.c > @@ -9,7 +9,7 @@ > #include "qemu/timer.h" > #include "hw/ptimer.h" > #include "qemu/host-utils.h" > -#include "replay/replay.h" > +#include "sysemu/replay.h" > > struct ptimer_state > { >
Re: [Qemu-devel] [PATCH 3/4] why is runstate_is_running needed?
I checked this patch. Let's leave it without runstate_is_running() call. If it will be needed later, we'll find it out. Pavel Dovgalyuk > -Original Message- > From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > Sent: Tuesday, October 06, 2015 11:01 PM > To: qemu-devel@nongnu.org > Cc: pavel.dovga...@ispras.ru > Subject: [PATCH 3/4] why is runstate_is_running needed? > > It doesn't seem correct to call it for all checkpoints, but why > is it right for timerlist_run_timers? > --- > qemu-timer.c | 9 +++-- > stubs/replay.c | 5 - > 2 files changed, 3 insertions(+), 11 deletions(-) > > diff --git a/qemu-timer.c b/qemu-timer.c > index 3c6e4c3..f16e422 100644 > --- a/qemu-timer.c > +++ b/qemu-timer.c > @@ -488,20 +488,17 @@ bool timerlist_run_timers(QEMUTimerList *timer_list) > break; > default: > case QEMU_CLOCK_VIRTUAL: > -if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running()) > -|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) { > +if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) { > goto out; > } > break; > case QEMU_CLOCK_HOST: > -if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running()) > -|| !replay_checkpoint(CHECKPOINT_CLOCK_HOST)) { > +if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) { > goto out; > } > break; > case QEMU_CLOCK_VIRTUAL_RT: > -if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running()) > -|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) { > +if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) { > goto out; > } > break; > diff --git a/stubs/replay.c b/stubs/replay.c > index 71fa7d5..42d01b5 100755 > --- a/stubs/replay.c > +++ b/stubs/replay.c > @@ -22,11 +22,6 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint) > return true; > } > > -int runstate_is_running(void) > -{ > -abort(); > -} > - > bool replay_events_enabled(void) > { > return false; > -- > 2.5.0 >
Re: [Qemu-devel] [PATCH 4/4] events doubts
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > On 07/10/2015 10:21, Pavel Dovgaluk wrote: > > There are two kinds of events: > > - read from the log and injected immediately (user input, network input) > > - read from the log and wait for corresponding event in the queue (BH) > > > > We cannot inject BH event immediately because we do not have any information > > about callback > > Actually we do (indirectly, through aio_bh_call). But that may not be > the central issue, because... > > > and to preserve consistency - BH cannot be processed before > > it is scheduled by qemu core. > > ... you are processing them differently anyway between record mode > (where the BH is scheduled by the core) and replay (where the BH is > called directly). In record it also called through replay. It is scheduled into the replay events queue and called at checkpoint, where queue is flushed. > In fact, I don't understand what introduces the difference between > record and replay that requires special handling of ptimers' bottom > halves. In both cases, the ptimer triggers at the desired time (based > on checkpoints) and then the bottom half is called as soon as possible. > Why is a separate async event necessary? We want to preserve order of all events that affect virtual machine behavior, not only instructions execution. These events include processing of interrupts, exceptions, and bottom halves. That is why bottom halves are bind to checkpoints and recorded into the log. > Because we only care about bottom halves from ptimers, their order > should be the same for both record and replay. > > If bottom halves async events could be removed, that would simplify a > lot the code, and it would make it a lot easier to understand for me. I added ptimer handling because replay didn't work when I removed BH queuing. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH 4/4] events doubts
> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > On 07/10/2015 11:50, Pavel Dovgaluk wrote: > >> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > >> Bonzini > >> On 07/10/2015 10:21, Pavel Dovgaluk wrote: > >>> There are two kinds of events: > >>> - read from the log and injected immediately (user input, network input) > >>> - read from the log and wait for corresponding event in the queue (BH) > >>> > >>> We cannot inject BH event immediately because we do not have any > >>> information > >>> about callback > >> > >> Actually we do (indirectly, through aio_bh_call). But that may not be > >> the central issue, because... > >> > >>> and to preserve consistency - BH cannot be processed before > >>> it is scheduled by qemu core. > >> > >> ... you are processing them differently anyway between record mode > >> (where the BH is scheduled by the core) and replay (where the BH is > >> called directly). > > > > In record it also called through replay. It is scheduled into the replay > > events queue and called at checkpoint, where queue is flushed. > > > >> In fact, I don't understand what introduces the difference between > >> record and replay that requires special handling of ptimers' bottom > >> halves. In both cases, the ptimer triggers at the desired time (based > >> on checkpoints) and then the bottom half is called as soon as possible. > >> Why is a separate async event necessary? > > > > We want to preserve order of all events that affect virtual machine > > behavior, > > not only instructions execution. These events include processing of > > interrupts, exceptions, and bottom halves. > > That is why bottom halves are bind to checkpoints and recorded into the log. > > > >> Because we only care about bottom halves from ptimers, their order > >> should be the same for both record and replay. > >> > >> If bottom halves async events could be removed, that would simplify a > >> lot the code, and it would make it a lot easier to understand for me. > > > > I added ptimer handling because replay didn't work when I removed BH > > queuing. > > Ok, got it. I still want to understand exactly the need for the init > and reset checkpoints, and the placement of qemu_clock_warp calls, but > apart from that the patches are good to go for 2.5. Thanks for your > persistence! Init checkpoint is needed to separate initialization events (mostly coming from block devices) from execution ones. Reset checkpoint is used for synchronization of machine reset call. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH 4/4] events doubts
> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > On 07/10/2015 12:42, Pavel Dovgaluk wrote: > >> > Ok, got it. I still want to understand exactly the need for the init > >> > and reset checkpoints, and the placement of qemu_clock_warp calls, but > >> > apart from that the patches are good to go for 2.5. Thanks for your > >> > persistence! > > Init checkpoint is needed to separate initialization events (mostly coming > > from block > > devices) from execution ones. > > Reset checkpoint is used for synchronization of machine reset call. > > Why do they need to be separate on startup? Does initialization hang? > My reasoning was that QEMU_CLOCK_VIRTUAL is zero anyway at both init and > reset. I'm not sure about current (only core functions) version, but full version requires this because of loading virtual machine state and block devices initialization. Events from these actions interfere with each other and with machine execution. Pavel Dovgalyuk
Re: [Qemu-devel] [PULL 50/52] typedef: add typedef for QemuOpts
Thank you! What about other patches from rr series? Pavel Dovgalyuk > -Original Message- > From: Paolo Bonzini [mailto:pbonz...@redhat.com] > Sent: Friday, September 25, 2015 7:52 PM > To: qemu-devel@nongnu.org > Cc: Pavel Dovgalyuk; Pavel Dovgalyuk > Subject: [PULL 50/52] typedef: add typedef for QemuOpts > > From: Pavel Dovgalyuk> > This patch moves typedefs for QemuOpts and related types > to qemu/typedefs.h file. > > Reviewed-by: Paolo Bonzini > > Signed-off-by: Pavel Dovgalyuk > Message-Id: <20150917162501.8676.85435.st...@pasha-isp.def.inno> > Signed-off-by: Paolo Bonzini > --- > include/qemu/option.h | 5 + > include/qemu/typedefs.h | 3 +++ > 2 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/include/qemu/option.h b/include/qemu/option.h > index 57e51c9..71f5f27 100644 > --- a/include/qemu/option.h > +++ b/include/qemu/option.h > @@ -30,6 +30,7 @@ > #include "qemu/queue.h" > #include "qapi/error.h" > #include "qapi/qmp/qdict.h" > +#include "qemu/typedefs.h" > > const char *get_opt_name(char *buf, int buf_size, const char *p, char delim); > const char *get_opt_value(char *buf, int buf_size, const char *p); > @@ -44,10 +45,6 @@ void parse_option_size(const char *name, const char *value, > bool has_help_option(const char *param); > bool is_valid_option_list(const char *param); > > -typedef struct QemuOpt QemuOpt; > -typedef struct QemuOpts QemuOpts; > -typedef struct QemuOptsList QemuOptsList; > - > enum QemuOptType { > QEMU_OPT_STRING = 0, /* no parsing (use string as-is) > */ > QEMU_OPT_BOOL,/* on/off > */ > diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h > index ce82c64..3a835ff 100644 > --- a/include/qemu/typedefs.h > +++ b/include/qemu/typedefs.h > @@ -67,6 +67,9 @@ typedef struct Property Property; > typedef struct QEMUBH QEMUBH; > typedef struct QemuConsole QemuConsole; > typedef struct QEMUFile QEMUFile; > +typedef struct QemuOpt QemuOpt; > +typedef struct QemuOpts QemuOpts; > +typedef struct QemuOptsList QemuOptsList; > typedef struct QEMUSGList QEMUSGList; > typedef struct QEMUSizedBuffer QEMUSizedBuffer; > typedef struct QEMUTimerListGroup QEMUTimerListGroup; > -- > 2.5.0 >
Re: [Qemu-devel] [PATCH v18 13/21] icount: improve counting for record/replay
> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > On 17/09/2015 18:24, Pavel Dovgalyuk wrote: > > #endif > > > > +/* CPU thread can infinitely wait for event after > > + missing the warp */ > > +qemu_clock_warp(QEMU_CLOCK_VIRTUAL); > > qemu_clock_run_all_timers(); > > It is still not clear to me why the call in timerlist_rearm is not > sufficient. Can you explain this (again probably)? Sometimes tcg thread halts in qemu_tcg_wait_io_event function, waiting for any external event. Virtual clock does not run, because warp is not called. warp call in main_loop_wait proceeds virtual clock and allows tcg thread to run further. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v18 11/21] replay: recording and replaying clock ticks
> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > On 17/09/2015 18:24, Pavel Dovgalyuk wrote: > > +if ((now < last || now > (last + get_max_clock_jump())) > > +&& replay_mode == REPLAY_MODE_NONE) { > > notifier_list_notify(>reset_notifiers, ); > > This seems wrong. You will have different timers in the record and the > replay if you do not invoke e.g. rtc_notify_clock_reset. That will "work" - record and replay will be equivalent. But they will differ from normal mode. It'll better remove it from the patch. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v18 13/21] icount: improve counting for record/replay
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > On 23/09/2015 09:22, Pavel Dovgaluk wrote: > > Sometimes tcg thread halts in qemu_tcg_wait_io_event function, > > waiting for any external event. Virtual clock does not run, because > > warp is not called. warp call in main_loop_wait proceeds virtual > > clock and allows tcg thread to run further. > > Ok, this makes sense! > > Would this work too as a replacement for this patch? No, it doesn't help. It seems that tcg is waiting within qemu_cond_wait function without leaving it. > > diff --git a/cpus.c b/cpus.c > index fbbd17f..9480acc 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -926,6 +926,7 @@ static void qemu_tcg_wait_io_event(CPUState *cpu) > } > > while (iothread_requesting_mutex) { > +qemu_clock_warp(QEMU_CLOCK_VIRTUAL); > qemu_cond_wait(_io_proceeded_cond, _global_mutex); > } Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v18 00/21] Deterministic replay core
Hi! Paolo, have you reviewed these patches? Pavel Dovgalyuk > -Original Message- > From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] > Sent: Thursday, September 17, 2015 7:23 PM > To: qemu-devel@nongnu.org > Cc: edgar.igles...@xilinx.com; peter.mayd...@linaro.org; > igor.rubi...@gmail.com; > ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; > batuz...@ispras.ru; > maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; > hi...@cert.org; > alex.ben...@linaro.org; fred.kon...@greensocs.com > Subject: [PATCH v18 00/21] Deterministic replay core > > This set of patches is related to the reverse execution and deterministic > replay of qemu execution. This implementation of deterministic replay can > be used for deterministic debugging of guest code through gdb remote > interface. > > Core set of patches does not include support for reverse debugging commands > of gdb, block devices' operations, USB replay support. > > These patches include only core function of the replay, > excluding the support for replaying serial, audio, network, and USB devices' > operations. Reverse debugging and monitor commands were also excluded to > be submitted later as separate patches. > > Execution recording writes non-deterministic events log, which can be later > used for replaying the execution anywhere and for unlimited number of times. > It also supports checkpointing for faster rewinding during reverse debugging. > Execution replaying reads the log and replays all non-deterministic events > including external input, hardware clocks, and interrupts. > > Full version of deterministic replay has the following features: > * Deterministically replays whole system execution and all contents of the > memory, >state of the hadrware devices, clocks, and screen of the VM. > * Writes execution log into the file for latter replaying for multiple times >on different machines. > * Supports i386, x86_64, ARM, PowerPC, and MIPS hardware platforms. > * Performs deterministic replay of all operations with keyboard and mouse >input devices. > * Supports auto-checkpointing for convenient reverse debugging. > > Usage of the record/replay core: > * First, record the execution, by adding the following string to the command > line: >'-icount shift=7,rr=record,rrfile=replay.bin -net none'. >Block devices' images are not actually changed in the recording mode, >because all of the changes are written to the temporary overlay file. > * Then you can replay it for the multiple times by using another command >line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none' > * '-net none' option should also be specified if network replay patches >are not applied. > * Do not add any disk images to VM, because they are not supported by >the core patches. > > Papers with description of deterministic replay implementation: > http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html > http://dl.acm.org/citation.cfm?id=2786805.2803179 > > Public repository with current version of the patches: > https://github.com/Dovgalyuk/qemu/tree/rr-17 > > Modifications of qemu include: > * wrappers for clock and time functions to save their return values in the > log > * saving different asynchronous events (e.g. system shutdown) into the log > * synchronization of the threads from thread pool > * recording/replaying user input (mouse and keyboard) > * adding internal events for cpu and io synchronization > > v18 changes: > * Patches were updated to match upstream version > * Added missed replay-user.c file > > v17 changes: > * Removed useless stub functions (as suggested by Paolo Bonzini) > * Refined checkpoint-related code (as suggested by Paolo Bonzini) > * Improved icount processing (as suggested by Paolo Bonzini) > * Added checkpoint for suspend event (as suggested by Paolo Bonzini) > * Fixed linux-user configurations build > * Minor fixes > > v16 changes: > * Several warnings were fixed > > v15 changes: > * Tested record/replay with MIPS and PowerPC guests > * Published the patches on github > * Fixed replay mutex operation in icount mode > * Fixed timers processing in record/replay mode > > v14 changes: > * Minor fixes > > v13 changes: > * Introduced "ptimer trigger" event (as suggested by Paolo Bonzini) > > v12 changes: > * Removed block patches from the core patches set. > > v11 changes: > * Fixed instructions event processing. > * Added some mutex protection calls for replay. > * Fixed replaying read operations for qcow2. > * Fixed rtc reads on initializations stage. > * Eliminated some warnings in replay module. > * Fixed misprints in documentation for replay (as suggested by Eric Blake) > > v10 changes: > * Fixed queue processing for bottom halves (as suggested by Paolo Bonzini) > * Rewritten several replay functions (as suggested by Paolo Bonzini) > * Some minor fixes. > > v9 changes: > *
Re: [Qemu-devel] [PATCH v7 04/11] target-mips: improve exception handling
> From: Leon Alrae [mailto:leon.al...@imgtec.com] > On 28/08/2015 10:08, Pavel Dovgaluk wrote: > >> From: Aurelien Jarno [mailto:aurel...@aurel32.net] > >> On 2015-08-13 14:12, Leon Alrae wrote: > >>> On 10/07/2015 10:57, Pavel Dovgalyuk wrote: > >>>> @@ -2364,14 +2363,12 @@ static void gen_st_cond (DisasContext *ctx, > >>>> uint32_t opc, int rt, > >>>> #if defined(TARGET_MIPS64) > >>>> case OPC_SCD: > >>>> case R6_OPC_SCD: > >>>> -save_cpu_state(ctx, 1); > >>>> op_st_scd(t1, t0, rt, ctx); > >>>> opn = "scd"; > >>>> break; > >>>> #endif > >>>> case OPC_SC: > >>>> case R6_OPC_SC: > >>>> -save_cpu_state(ctx, 1); > >>>> op_st_sc(t1, t0, rt, ctx); > >>>> opn = "sc"; > >>>> break; > >>> > >>> Wouldn't we be better off assuming that conditional stores in linux-user > >>> always take an exception (we generate fake EXCP_SC exception) and avoid > >>> retranslation? After applying these changes I observed significant impact > >>> on > >>> performance in linux-user multithreaded apps, for instance c11-atomic-exec > >>> test before the change took just 2 seconds to finish, whereas now more > >>> than 30... > >> > >> This really show the impact of retranslation and why we should avoid > >> it when not necessary. Coming back to the issue here, the fact that we > >> go through retranslation is actually due to the fact that > >> helper_raise_exception has been changed to go through retranslation. > >> > >> Given the code path between user-mode and softmmu is quite different, > >> we definitely need a different code path wrt exception and retranslation > >> for the two cases. That said if we want deterministic code execution > >> (the original purpose of this patch), I don't see how we can do without > >> forcing retranslation. Pavel, do you have an idea for that? > > > > There is only one case when we can execute without retranslation - > > when the instruction is the last instruction in translation block. > > Then we can setup PC and flags before this last instruction. > > If the exception happens, we can just break the execution. > > The drawback of this method is breaking translation blocks into > > the smaller parts. > > c11-atomic-exec.4 test execution time in linux-user: > > * no changes: > real0m3.039s > user0m2.976s > sys 0m1.908s > > * tb_lock + patch: > real1m1.167s > user0m57.240s > sys 0m36.678s > > * tb_lock + patch + SC-without-retranslation: > real0m3.016s > user0m2.988s > sys 0m1.848s > > I had to add tb_lock() to cpu_restore_state() in the first place, otherwise > all of my multithreaded user mode tests crash QEMU with this patch. > > SC-without-retranslation (the diff below) seems to improve the situation, > and if I understand correctly we retain deterministic code execution. > Therefore if there are no objections I'll apply this patch + SC correction > to mips-next. diff below implements exactly what I meant. Pavel Dovgalyuk > > diff --git a/target-mips/translate.c b/target-mips/translate.c > index 99b99c5..006cb96 100644 > --- a/target-mips/translate.c > +++ b/target-mips/translate.c > @@ -2060,7 +2060,7 @@ static inline void op_st_##insn(TCGv arg1, TCGv arg2, > int rt, > DisasContext *ctx) > tcg_gen_movi_tl(t0, rt | ((almask << 3) & 0x20)); > \ > tcg_gen_st_tl(t0, cpu_env, offsetof(CPUMIPSState, llreg)); > \ > tcg_gen_st_tl(arg1, cpu_env, offsetof(CPUMIPSState, llnewval)); > \ > -gen_helper_0e0i(raise_exception, EXCP_SC); > \ > +generate_exception_end(ctx, EXCP_SC); > \ > gen_set_label(l2); > \ > tcg_gen_movi_tl(t0, 0); > \ > gen_store_gpr(t0, rt); > \
Re: [Qemu-devel] [PATCH v17 00/21] Deterministic replay core
Paolo, Are these patches good enough? Pavel Dovgalyuk > -Original Message- > From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] > Sent: Monday, September 07, 2015 11:40 AM > To: qemu-devel@nongnu.org > Cc: edgar.igles...@xilinx.com; peter.mayd...@linaro.org; > igor.rubi...@gmail.com; > ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; > batuz...@ispras.ru; > maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; > hi...@cert.org; > alex.ben...@linaro.org; fred.kon...@greensocs.com > Subject: [PATCH v17 00/21] Deterministic replay core > > This set of patches is related to the reverse execution and deterministic > replay of qemu execution. This implementation of deterministic replay can > be used for deterministic debugging of guest code through gdb remote > interface. > > Core set of patches does not include support for reverse debugging commands > of gdb, block devices' operations, USB replay support. > > These patches include only core function of the replay, > excluding the support for replaying serial, audio, network, and USB devices' > operations. Reverse debugging and monitor commands were also excluded to > be submitted later as separate patches. > > Execution recording writes non-deterministic events log, which can be later > used for replaying the execution anywhere and for unlimited number of times. > It also supports checkpointing for faster rewinding during reverse debugging. > Execution replaying reads the log and replays all non-deterministic events > including external input, hardware clocks, and interrupts. > > Full version of deterministic replay has the following features: > * Deterministically replays whole system execution and all contents of the > memory, >state of the hadrware devices, clocks, and screen of the VM. > * Writes execution log into the file for latter replaying for multiple times >on different machines. > * Supports i386, x86_64, ARM, PowerPC, and MIPS hardware platforms. > * Performs deterministic replay of all operations with keyboard and mouse >input devices. > * Supports auto-checkpointing for convenient reverse debugging. > > Usage of the record/replay core: > * First, record the execution, by adding the following string to the command > line: >'-icount shift=7,rr=record,rrfile=replay.bin -net none'. >Block devices' images are not actually changed in the recording mode, >because all of the changes are written to the temporary overlay file. > * Then you can replay it for the multiple times by using another command >line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none' > * '-net none' option should also be specified if network replay patches >are not applied. > * Do not add any disk images to VM, because they are not supported by >the core patches. > > Papers with description of deterministic replay implementation: > http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html > http://dl.acm.org/citation.cfm?id=2786805.2803179 > > Public repository with current version of the patches: > https://github.com/Dovgalyuk/qemu/tree/rr-17 > > Modifications of qemu include: > * wrappers for clock and time functions to save their return values in the > log > * saving different asynchronous events (e.g. system shutdown) into the log > * synchronization of the threads from thread pool > * recording/replaying user input (mouse and keyboard) > * adding internal events for cpu and io synchronization > > v17 changes: > * Removed useless stub functions (as suggested by Paolo Bonzini) > * Refined checkpoint-related code (as suggested by Paolo Bonzini) > * Improved icount processing (as suggested by Paolo Bonzini) > * Added checkpoint for suspend event (as suggested by Paolo Bonzini) > * Fixed linux-user configurations build > * Minor fixes > > v16 changes: > * Several warnings were fixed > > v15 changes: > * Tested record/replay with MIPS and PowerPC guests > * Published the patches on github > * Fixed replay mutex operation in icount mode > * Fixed timers processing in record/replay mode > > v14 changes: > * Minor fixes > > v13 changes: > * Introduced "ptimer trigger" event (as suggested by Paolo Bonzini) > > v12 changes: > * Removed block patches from the core patches set. > > v11 changes: > * Fixed instructions event processing. > * Added some mutex protection calls for replay. > * Fixed replaying read operations for qcow2. > * Fixed rtc reads on initializations stage. > * Eliminated some warnings in replay module. > * Fixed misprints in documentation for replay (as suggested by Eric Blake) > > v10 changes: > * Fixed queue processing for bottom halves (as suggested by Paolo Bonzini) > * Rewritten several replay functions (as suggested by Paolo Bonzini) > * Some minor fixes. > > v9 changes: > * Replaced fwrite/fread with putc/getc (as suggested by Paolo Bonzini) > * Stopping virtual machine in case of
Re: [Qemu-devel] [PATCH v7 04/11] target-mips: improve exception handling
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-08-13 14:12, Leon Alrae wrote: On 10/07/2015 10:57, Pavel Dovgalyuk wrote: @@ -2364,14 +2363,12 @@ static void gen_st_cond (DisasContext *ctx, uint32_t opc, int rt, #if defined(TARGET_MIPS64) case OPC_SCD: case R6_OPC_SCD: -save_cpu_state(ctx, 1); op_st_scd(t1, t0, rt, ctx); opn = scd; break; #endif case OPC_SC: case R6_OPC_SC: -save_cpu_state(ctx, 1); op_st_sc(t1, t0, rt, ctx); opn = sc; break; Wouldn't we be better off assuming that conditional stores in linux-user always take an exception (we generate fake EXCP_SC exception) and avoid retranslation? After applying these changes I observed significant impact on performance in linux-user multithreaded apps, for instance c11-atomic-exec test before the change took just 2 seconds to finish, whereas now more than 30... This really show the impact of retranslation and why we should avoid it when not necessary. Coming back to the issue here, the fact that we go through retranslation is actually due to the fact that helper_raise_exception has been changed to go through retranslation. Given the code path between user-mode and softmmu is quite different, we definitely need a different code path wrt exception and retranslation for the two cases. That said if we want deterministic code execution (the original purpose of this patch), I don't see how we can do without forcing retranslation. Pavel, do you have an idea for that? There is only one case when we can execute without retranslation - when the instruction is the last instruction in translation block. Then we can setup PC and flags before this last instruction. If the exception happens, we can just break the execution. The drawback of this method is breaking translation blocks into the smaller parts. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v16 00/21] Deterministic replay core
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini unfortunately I do have some more review comments; that can happen when going back to the code after a few months, and it's also a good thing because it means that the code _is_ actually getting cleaner. Thanks for review. However, I am fairly sure that v17 is going to be the good one and will be in 2.5 if I get it by mid September when I'll be back from vacation. In particular: * patch 3 seems to be unnecessary (for now at least) Done. * replay_next_event_is is modified in patch 8 (cpu: replay instructions sequence) after it's introduced in patch 6 (replay: introduce icount event). Please fold the change in patch 6. Done. * replay_add_event is not used; please remove it, rename replay_add_event_internal to replay_add_event, and add an assertion that the rr mode is the right one (e.g. RECORD only?) Done. * a couple of comments say grteater instead of greater Done. * the replay_save_clock and replay_read_clock stubs should abort Done. * please inline replay_input_event into qemu_input_event_send and replay_input_sync_event into qemu_input_event_sync, so that the corresponding *_impl functions can be static; this also means moving the prototypes of replay_add_input_event and replay_add_input_sync_event to replay/replay.h (I acknowledge this might undo a request from a previous review of mine; I don't remember) I can't. *_impl functions are called to process input events, when they are waiting in the events queue. * most stubs are unnecessary (replay_get_current_step, replay_checkpoint, qemu_system_shutdown_request, qemu_input_event_send_impl, qemu_input_event_sync_impl) replay_checkpiont is required by qemu-timer.c * please squash this in replay: checkpoints Ok. And a few questions. The first three are the if the answer is yes, please do this kind to questions, the others can have more open/subjective answers: * does it make sense to call replay_check_error from replay_finish_event, and remove the call from replay_read_next_clock? No, read errors are checked just after file read operations. replay_finish_event usually is not coupled with any reads. * should qemu_clock_use_for_deadline always return false in replay mode? The clocks are all deterministic, so it doesn't make sense to take them into account in the poll() deadline. It could be host clock or virtual_rt. In both cases qemu_soonest_timeout will write some clock reads into the events log. In play we should read these clocks to avoid inconsistency. * now that qemu_clock_warp has to be called in main_loop_wait, should the icount_warp_timer have a dummy callback? icount_warp_rt is then only called from qemu_clock_warp. If so, this (adding the call to qemu_clock_warp in main_loop_wait, making the icount_warp_timer dummy, removing the now-unnecessary argument of icount_warp_rt) should be a separate patch before replay: checkpoints It seems that icount_warp_rt may be called without timer. I'll prepare a patch. * can you explain why both CHECKPOINT_INIT and CHECKPOINT_RESET are needed? What events are typically found in each of them? 'Init' and 'reset' checkpoints is used to split initialization and resetting functions. Some of them interact with log file (e.g., write clock values). These saved events may be used by other function than should be in normal execution. Checkpoints prevent reading more clock values than needed by polling functions, called when initializing the hardware. * would it make sense to test replay_mode != REPLAY_MODE_NONE !runstate_is_running() in replay_checkpoint, for all checkpoints, like if (replay_mode != REPLAY_MODE_NONE !runstate_is_running()) { return false; } It could be useful in case when machine is stopped just before checkpoint. I'll fix this. * do we need an event for suspend? Maybe, but the difference probably won't exhibit itself on a diskless system. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v6 09/10] target-i386: exception handling for other helper functions
From: Richard Henderson [mailto:rth7...@gmail.com] On Behalf Of Richard Henderson On 07/07/2015 02:31 PM, Pavel Dovgalyuk wrote: diff --git a/target-i386/misc_helper.c b/target-i386/misc_helper.c index 52c5d65..c8e7ee9 100644 --- a/target-i386/misc_helper.c +++ b/target-i386/misc_helper.c @@ -220,7 +220,7 @@ void helper_rdtsc(CPUX86State *env) uint64_t val; if ((env-cr[4] CR4_TSD_MASK) ((env-hflags HF_CPL_MASK) != 0)) { -raise_exception(env, EXCP0D_GPF); +raise_exception_ra(env, EXCP0D_GPF, GETPC()); } cpu_svm_check_intercept_param(env, SVM_EXIT_RDTSC, 0); @@ -238,13 +238,13 @@ void helper_rdtscp(CPUX86State *env) void helper_rdpmc(CPUX86State *env) { if ((env-cr[4] CR4_PCE_MASK) ((env-hflags HF_CPL_MASK) != 0)) { -raise_exception(env, EXCP0D_GPF); +raise_exception_ra(env, EXCP0D_GPF, GETPC()); } cpu_svm_check_intercept_param(env, SVM_EXIT_RDPMC, 0); /* currently unimplemented */ qemu_log_mask(LOG_UNIMP, x86: unimplemented rdpmc\n); -raise_exception_err(env, EXCP06_ILLOP, 0); +raise_exception_err_ra(env, EXCP06_ILLOP, 0, GETPC()); } #if defined(CONFIG_USER_ONLY) @@ -589,7 +589,7 @@ void helper_hlt(CPUX86State *env, int next_eip_addend) void helper_monitor(CPUX86State *env, target_ulong ptr) { if ((uint32_t)env-regs[R_ECX] != 0) { -raise_exception(env, EXCP0D_GPF); +raise_exception_ra(env, EXCP0D_GPF, GETPC()); } /* XXX: store address? */ cpu_svm_check_intercept_param(env, SVM_EXIT_MONITOR, 0); Likewise. Like what? @@ -601,7 +601,7 @@ void helper_mwait(CPUX86State *env, int next_eip_addend) X86CPU *cpu; if ((uint32_t)env-regs[R_ECX] != 0) { -raise_exception(env, EXCP0D_GPF); +raise_exception_ra(env, EXCP0D_GPF, GETPC()); } cpu_svm_check_intercept_param(env, SVM_EXIT_MWAIT, 0); env-eip += next_eip_addend; Similar to lcall/ljmp, this can be switched to not use an addend. Not exactly. cpu_svm_check_intercept_param could call helper_vmexit, which uses env-eip. diff --git a/target-i386/svm_helper.c b/target-i386/svm_helper.c index f1fabf5..cc8c2ec 100644 --- a/target-i386/svm_helper.c +++ b/target-i386/svm_helper.c @@ -354,7 +354,7 @@ void helper_vmrun(CPUX86State *env, int aflag, int next_eip_addend) void helper_vmmcall(CPUX86State *env) { cpu_svm_check_intercept_param(env, SVM_EXIT_VMMCALL, 0); -raise_exception(env, EXCP06_ILLOP); +raise_exception_ra(env, EXCP06_ILLOP, GETPC()); } void helper_vmload(CPUX86State *env, int aflag) @@ -457,7 +457,7 @@ void helper_skinit(CPUX86State *env) { cpu_svm_check_intercept_param(env, SVM_EXIT_SKINIT, 0); /* XXX: not implemented */ -raise_exception(env, EXCP06_ILLOP); +raise_exception_ra(env, EXCP06_ILLOP, GETPC()); } Either these are missing a change to translate.c, or they're pointless changes. Probably the later. Then it is better to remove these changes from the patch? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v5 08/11] target-i386: exception handling for seg_helper functions
From: Richard Henderson [mailto:rth7...@gmail.com] On Behalf Of Richard Henderson On 07/06/2015 09:26 AM, Pavel Dovgalyuk wrote: This patch fixes exception handling for seg_helper functions. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru No, you don't want to discriminately change every call. That was my original point about not needing to change seg_helper.c or smm_helper.c. Further, any such changes would go along with the changes in translate.c to remove the state saving there. I would only change those that are normal memory operations, like fp loads etc. The segmentation changes are rare. The task state helpers require state saving anyway, so requiring a TCG search is a pessimization. I can refine the patch, but the most of the changes should remain. E.g., lcall helpers can cause an exception or not. TB ends in both cases. But icount and PC values in these two situations should be different. And lcall helpers use most of the seg functions I changed in the patch. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v5 08/11] target-i386: exception handling for seg_helper functions
From: Richard Henderson [mailto:rth7...@gmail.com] On Behalf Of Richard Henderson On 07/06/2015 09:26 AM, Pavel Dovgalyuk wrote: This patch fixes exception handling for seg_helper functions. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru No, you don't want to discriminately change every call. That was my original point about not needing to change seg_helper.c or smm_helper.c. Further, any such changes would go along with the changes in translate.c to remove the state saving there. I would only change those that are normal memory operations, like fp loads etc. The segmentation changes are rare. The task state helpers require state saving anyway, so requiring a TCG search is a pessimization. Then we'll have to stop the translation after every instruction that uses these function. Is this a limited subset? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v15 00/21] Deterministic replay core
Paolo, Are there any chances for upstreaming these patches? Pavel Dovgalyuk -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Wednesday, July 01, 2015 2:52 PM To: qemu-devel@nongnu.org Cc: peter.mayd...@linaro.org; peter.crosthwa...@xilinx.com; igor.rubi...@gmail.com; ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; batuz...@ispras.ru; maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; hi...@cert.org; alex.ben...@linaro.org; fred.kon...@greensocs.com Subject: [PATCH v15 00/21] Deterministic replay core This set of patches is related to the reverse execution and deterministic replay of qemu execution. This implementation of deterministic replay can be used for deterministic debugging of guest code through gdb remote interface. Core set of patches does not include support for reverse debugging commands of gdb, block devices' operations, USB replay support. These patches include only core function of the replay, excluding the support for replaying serial, audio, network, and USB devices' operations. Reverse debugging and monitor commands were also excluded to be submitted later as separate patches. Execution recording writes non-deterministic events log, which can be later used for replaying the execution anywhere and for unlimited number of times. It also supports checkpointing for faster rewinding during reverse debugging. Execution replaying reads the log and replays all non-deterministic events including external input, hardware clocks, and interrupts. Full version of deterministic replay has the following features: * Deterministically replays whole system execution and all contents of the memory, state of the hadrware devices, clocks, and screen of the VM. * Writes execution log into the file for latter replaying for multiple times on different machines. * Supports i386, x86_64, ARM, PowerPC, and MIPS hardware platforms. * Performs deterministic replay of all operations with keyboard and mouse input devices. * Supports auto-checkpointing for convenient reverse debugging. Usage of the record/replay core: * First, record the execution, by adding the following string to the command line: '-icount shift=7,rr=record,rrfile=replay.bin -net none'. Block devices' images are not actually changed in the recording mode, because all of the changes are written to the temporary overlay file. * Then you can replay it for the multiple times by using another command line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none' * '-net none' option should also be specified if network replay patches are not applied. * Do not add any disk images to VM, because they are not supported by the core patches. Paper with short description of deterministic replay implementation: http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html Public repository with the current version of the patches: https://github.com/Dovgalyuk/qemu/tree/rr-15 Modifications of qemu include: * wrappers for clock and time functions to save their return values in the log * saving different asynchronous events (e.g. system shutdown) into the log * synchronization of the threads from thread pool * recording/replaying user input (mouse and keyboard) * adding internal events for cpu and io synchronization v15 changes: * Tested record/replay with MIPS and PowerPC guests * Published the patches on github * Fixed replay mutex operation in icount mode * Fixed timers processing in record/replay mode v14 changes: * Minor fixes v13 changes: * Introduced ptimer trigger event (as suggested by Paolo Bonzini) v12 changes: * Removed block patches from the core patches set. v11 changes: * Fixed instructions event processing. * Added some mutex protection calls for replay. * Fixed replaying read operations for qcow2. * Fixed rtc reads on initializations stage. * Eliminated some warnings in replay module. * Fixed misprints in documentation for replay (as suggested by Eric Blake) v10 changes: * Fixed queue processing for bottom halves (as suggested by Paolo Bonzini) * Rewritten several replay functions (as suggested by Paolo Bonzini) * Some minor fixes. v9 changes: * Replaced fwrite/fread with putc/getc (as suggested by Paolo Bonzini) * Stopping virtual machine in case of replay file end (as suggested by Paolo Bonzini) * Removed one of the replay mutexes (as suggested by Paolo Bonzini) * Fixed RCU queue for bottom halves (as suggested by Paolo Bonzini) * Updated command line options' names (as suggested by Paolo Bonzini) * Added design document for record/replay (as suggested by Paolo Bonzini) * Simplified checkpoints for the timers * Added cloning InputEvent objects for replay (as suggested by Paolo Bonzini) * Added replay blockers instead of checking the command line (as
Re: [Qemu-devel] [PATCH v15 00/21] Deterministic replay core
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 06/07/2015 13:54, Pavel Dovgaluk wrote: Paolo, Are there any chances for upstreaming these patches? I'm sorry. It looks like no one really feels competent enough. What about committing them at the beginning of 2.5? I'll do another round of locking review before then. Ok, let's try 2.5. These patches are better to include before record/replay, because they fix PC and icount recovering after exceptions: https://lists.gnu.org/archive/html/qemu-devel/2015-07/msg01212.html Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v5 05/11] target-i386: exception handling for FPU instructions
From: Richard Henderson [mailto:rth7...@gmail.com] On Behalf Of Richard Henderson On 07/06/2015 09:26 AM, Pavel Dovgalyuk wrote: @@ -1117,33 +1131,33 @@ void helper_fxsave(CPUX86State *env, target_ulong ptr, int data64) for (i = 0; i 8; i++) { fptag |= (env-fptags[i] i); } -cpu_stw_data(env, ptr, env-fpuc); -cpu_stw_data(env, ptr + 2, fpus); -cpu_stw_data(env, ptr + 4, fptag ^ 0xff); +cpu_stw_data_ra(env, ptr, env-fpuc, GETPC()); +cpu_stw_data_ra(env, ptr + 2, fpus, GETPC()); +cpu_stw_data_ra(env, ptr + 4, fptag ^ 0xff, GETPC()); helper_fxsave and helper_fxrstor ought to have do_* versions just like you did for fstenv/fldenv. (I'm working on a patch set that adds xsave/xrstor support and I'll need to re-use these functions.) Why do_* functions should be in this series? These changes will look orphaned. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v4 4/5] target-i386: fix memory operations in helpers
From: Richard Henderson [mailto:rth7...@gmail.com] On Behalf Of Richard Henderson On 06/29/2015 08:23 AM, Pavel Dovgalyuk wrote: This patch passes TB return address into softmmu functions that are invoked from target helpers. This allows correct PC and icount recovering while handling MMU faults. Signed-off-by: Pavel Dovgalyukpavel.dovga...@ispras.ru --- target-i386/cc_helper.c |2 target-i386/cpu.h |5 target-i386/excp_helper.c | 21 + target-i386/fpu_helper.c | 146 + target-i386/helper.c |4 target-i386/int_helper.c | 32 +- target-i386/mem_helper.c | 39 +- target-i386/misc_helper.c | 12 - target-i386/ops_sse.h |2 target-i386/seg_helper.c | 712 +++-- target-i386/svm_helper.c |4 target-i386/translate.c | 25 -- 12 files changed, 503 insertions(+), 501 deletions(-) This patch is too big. It really needs to be split into several patches. (1) Introduce raise_exception_ra that accepts the GETPC argument, so that (a) you can stage the changes in and (b) most of the seg_helper changes from do_interrupt et al aren't needed. (2) Stage in fixes for each of the (groups of) helpers callable from translate.c. E.g. fld, fst in one group, division in another. Ok, I'll split this patch. And while this patch set fixes icount, do you have any evidence that we ever got incorrect PC values handling mmu faults? Indeed. I described it here: http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02960.html One of the applications used maskmov_xmm, which caused MMU fault. Then execution of TB restarted instead of continuing from maskmov. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v3 2/3] target-mips: improve exceptions handling
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-18 16:28, Pavel Dovgalyuk wrote: This patch improves exception handling in MIPS. Instructions generate several types of exceptions. When exception is generated, it breaks the execution of the current translation block. Implementation of the exceptions handling does not correctly restore icount for the instruction which caused the exception. In most cases icount will be decreased by the value equal to the size of TB. This patch passes pointer to the translation block internals to the exception handler. It allows correct restoring of the icount value. v3 changes: This patch stops translation when instruction which always generates exception is translated. This improves the performance of the patched version compared to original one. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- target-mips/cpu.h| 28 +++ target-mips/helper.h |1 target-mips/msa_helper.c |5 - target-mips/op_helper.c | 183 ++ target-mips/translate.c | 379 ++ 5 files changed, 302 insertions(+), 294 deletions(-) diff --git a/target-mips/cpu.h b/target-mips/cpu.h index f9d2b4c..70ba39a 100644 --- a/target-mips/cpu.h +++ b/target-mips/cpu.h @@ -1015,4 +1015,32 @@ static inline void cpu_mips_store_cause(CPUMIPSState *env, target_ulong val) } #endif +static inline void QEMU_NORETURN do_raise_exception_err(CPUMIPSState *env, +uint32_t exception, +int error_code, +uintptr_t pc) +{ +CPUState *cs = CPU(mips_env_get_cpu(env)); + +if (exception EXCP_SC) { +qemu_log(%s: %d %d\n, __func__, exception, error_code); +} +cs-exception_index = exception; +env-error_code = error_code; + +if (pc) { +/* now we have a real cpu fault */ +cpu_restore_state(cs, pc); +} + +cpu_loop_exit(cs); What about creating a cpu_loop_exit_restore(cs, pc) (maybe with a better name?) in the common code, if we now have to repeat this pattern for every target? Ok. diff --git a/target-mips/translate.c b/target-mips/translate.c index fd063a2..f87d5ac 100644 --- a/target-mips/translate.c +++ b/target-mips/translate.c @@ -1677,15 +1677,21 @@ generate_exception_err (DisasContext *ctx, int excp, int err) gen_helper_raise_exception_err(cpu_env, texcp, terr); tcg_temp_free_i32(terr); tcg_temp_free_i32(texcp); +ctx-bstate = BS_STOP; } static inline void generate_exception (DisasContext *ctx, int excp) { -save_cpu_state(ctx, 1); gen_helper_0e0i(raise_exception, excp); } +static inline void +generate_exception_end(DisasContext *ctx, int excp) +{ +generate_exception_err(ctx, excp, 0); +} + This sets error_code to 0, which is different than leaving it unchanged. This might be ok, but have you checked there is no side effect? Previous version called do_raise_exception, which passes 0 as error_code. /* Addresses computation */ static inline void gen_op_addr_add (DisasContext *ctx, TCGv ret, TCGv arg0, TCGv arg1) { @@ -1731,7 +1737,7 @@ static inline void check_cp1_enabled(DisasContext *ctx) static inline void check_cop1x(DisasContext *ctx) { if (unlikely(!(ctx-hflags MIPS_HFLAG_COP1X))) -generate_exception(ctx, EXCP_RI); +generate_exception_end(ctx, EXCP_RI); I don't think it is correct. Before triggering such an exception, we were saving the CPU state, and not going through retranslation. With this change, we don't save the CPU state, but we don't go through retranslation either. The rule is to either go through retranslation, or to save the CPU state before a possible exception. generate_exception_end saves CPU state and stops the translation through calling of generate_exception_err function. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v4 0/5] Fix exceptions handling for MIPS and i386
Forgot to fix the subject. These patches also fix exceptions handling for PowerPC. Pavel Dovgalyuk -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Monday, June 29, 2015 10:23 AM To: qemu-devel@nongnu.org Cc: rth7...@gmail.com; ag...@suse.de; pavel.dovga...@ispras.ru; pbonz...@redhat.com; leon.al...@imgtec.com; aurel...@aurel32.net Subject: [PATCH v4 0/5] Fix exceptions handling for MIPS and i386 QEMU targets ISAs contain instruction that can break the execution flow with exceptions. When exception breaks the execution of the translation block it may corrupt PC and icount values. This set of patches fixes exception handling for MIPS and i386 targets. Incorrect execution for i386 is causes by exceptions raised by MMU functions. MMU helper functions are called from generated code and other helper functions. In both cases they try to get function's return address for restoring virtual CPU state. When MMU helper is called from some other helper function (like helper_maskmov_xmm) through cpu_st* function, the return address will point to that helper. That is why CPU state cannot be restored in the case of MMU fault. This bug can occur when maskmov instruction is located in the middle of the translation block. Execution sequence for this example: TB start: PC1: instr1 instr2 PC2: maskmov page fault page fault processing PC1: instr1 instr2 maskmov At the start of TB execution guest PC points to instr1. When page fault occurs QEMU tries to restore guest PC (which should be equal to PC2). It reads host PC from the call stack and checks whether it points to TB or not. Bug in ldst helpers implementation provides incorrect host PC, which is not located within the TB. That's why QEMU cannot recover guest PC and it remains the same (PC1). After page fault processing QEMU restarts TB and executes instr1 and instr2 for the second time, because guest PC was not recovered. Bugs in MIPS helper functions do not break the execution in regular TCG mode, because PC value is updated before calling the functions that can raise an exception. But icount value cannot be updated this way. Therefore exceptions make execution in icount mode non-determinisic. In icount mode every translation block looks as follows: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of these instructions initiates an exception, icount should be restored and adjusted number of instructions should be subtracted from icount instead of initial n. tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to exception raising helper into TB), then PC is not passed as retaddr and correct icount is not recovered. In such cases icount will be decreased by the value equal to the size of TB. This behavior leads to incorrect values of virtual clock and non-deterministic execution of the code. These patches passes pointer to the translation block code to the exception handler. It allows correct restoring of PC and icount values. v4 changes: * Fixed exceptions handling for PowerPC * Fixed passing of mmu_idx into helpers (as suggested by Aurelien Jarno) * Added cpu_loop_exit_restore function (as suggested by Aurelien Jarno) * Fixed exceptions handling in compare helper functions for MIPS (as suggested by Aurelien Jarno) * Removed several CPU state saving calls for MIPS (as suggested by Aurelien Jarno) v3 changes: * Modified exception handling for syscall (as suggested by Aurelien Jarno) * Removed redundant calls to save_cpu_state (as suggested by Aurelien Jarno) * Removed helper_call* functions from softmmu (as suggested by Paolo Bonzini) v2 changes: * Added softmmu functions to pass TB return value into memory operations handlers * Fixed memory operations handling for MIPS * Disabled updates of the PC that are overridden with cpu_restore_state * Fixed memory operations and exceptions invoked by i386 helpers --- Pavel Dovgalyuk (5): softmmu: add helper function to pass through retaddr cpu-exec: introduce loop exit with restore function target-mips: improve exceptions handling target-i386: fix memory operations in helpers target-ppc: exceptions handling in icount mode cpu-exec.c |9 include/exec/cpu_ldst_template.h | 59 +++ include/exec/exec-all.h |1 softmmu_template.h |6 target-i386/cc_helper.c |2 target-i386/cpu.h|5 target-i386/excp_helper.c| 21 + target-i386/fpu_helper.c | 146 target-i386/helper.c |4 target-i386/int_helper.c | 32 +- target-i386/mem_helper.c | 39 +-
Re: [Qemu-devel] [PATCH v3 1/3] softmmu: add helper function to pass through retaddr
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-18 16:28, Pavel Dovgalyuk wrote: This patch introduces several helpers to pass return address which points to the TB. Correct return address allows correct restoring of the guest PC and icount. These functions should be used when helpers embedded into TB invoke memory operations. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- include/exec/cpu_ldst_template.h | 48 -- softmmu_template.h |6 - tcg/tcg.h| 23 ++ 3 files changed, 63 insertions(+), 14 deletions(-) diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h index 95ab750..35df753 100644 --- a/include/exec/cpu_ldst_template.h +++ b/include/exec/cpu_ldst_template.h @@ -54,15 +54,21 @@ #ifdef SOFTMMU_CODE_ACCESS #define ADDR_READ addr_code #define MMUSUFFIX _cmmu +#define URETSUFFIX SUFFIX +#define SRETSUFFIX SUFFIX #else #define ADDR_READ addr_read #define MMUSUFFIX _mmu +#define URETSUFFIX USUFFIX +#define SRETSUFFIX glue(s, SUFFIX) #endif /* generic load/store macros */ static inline RES_TYPE -glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) +glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env, + target_ulong ptr, + uintptr_t retaddr) { int page_index; RES_TYPE res; @@ -74,7 +80,8 @@ glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) mmu_idx = CPU_MMU_INDEX; if (unlikely(env-tlb_table[mmu_idx][page_index].ADDR_READ != (addr (TARGET_PAGE_MASK | (DATA_SIZE - 1) { -res = glue(glue(helper_ld, SUFFIX), MMUSUFFIX)(env, addr, mmu_idx); +res = glue(glue(helper_ret_ld, URETSUFFIX), MMUSUFFIX)(env, addr, +mmu_idx, retaddr); helper_ret_ld doesn't take a int for the mmu_idx, but rather a TCGMemOpIdx (we should add more type checking). They differ by the fact that TCGMemOpIdx also encodes the size of the operation (plus the alignement, but that's out of scope here). This currently doesn't make a difference, but it's better to pass the correct value in case we change the code later. Ok. } else { uintptr_t hostaddr = addr + env-tlb_table[mmu_idx][page_index].addend; res = glue(glue(ld, USUFFIX), _p)((uint8_t *)hostaddr); @@ -82,9 +89,17 @@ glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) return res; } +static inline RES_TYPE +glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) +{ +return glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(env, ptr, 0); +} + #if DATA_SIZE = 2 static inline int -glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) +glue(glue(glue(cpu_lds, SUFFIX), MEMSUFFIX), _ra)(CPUArchState *env, + target_ulong ptr, + uintptr_t retaddr) { int res, page_index; target_ulong addr; @@ -95,14 +110,20 @@ glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) mmu_idx = CPU_MMU_INDEX; if (unlikely(env-tlb_table[mmu_idx][page_index].ADDR_READ != (addr (TARGET_PAGE_MASK | (DATA_SIZE - 1) { -res = (DATA_STYPE)glue(glue(helper_ld, SUFFIX), - MMUSUFFIX)(env, addr, mmu_idx); +res = (DATA_STYPE)glue(glue(helper_ret_ld, SRETSUFFIX), + MMUSUFFIX)(env, addr, mmu_idx, retaddr); } else { uintptr_t hostaddr = addr + env-tlb_table[mmu_idx][page_index].addend; res = glue(glue(lds, SUFFIX), _p)((uint8_t *)hostaddr); } return res; } + +static inline int +glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) +{ +return glue(glue(glue(cpu_lds, SUFFIX), MEMSUFFIX), _ra)(env, ptr, 0); +} #endif #ifndef SOFTMMU_CODE_ACCESS @@ -110,8 +131,9 @@ glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) /* generic store macro */ static inline void -glue(glue(cpu_st, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr, - RES_TYPE v) +glue(glue(glue(cpu_st, SUFFIX), MEMSUFFIX), _ra)(CPUArchState *env, + target_ulong ptr, + RES_TYPE v, uintptr_t retaddr) { int page_index; target_ulong addr; @@ -122,13 +144,21 @@ glue(glue(cpu_st, SUFFIX), MEMSUFFIX)(CPUArchState *env,
Re: [Qemu-devel] [RFC v2 07/34] exec-all: Move cpu_can_do_io to qom/cpu.h
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 31/05/2015 08:11, Peter Crosthwaite wrote: +/* If not executing code then assume we are ok. */ +if (cpu-current_tb == NULL) { +return true; +} +return cpu-can_do_io != 0; For what it's worth, I think the if here is dead. Pavel? cpu_can_do_io can be called from cpus.c and translate-all.c In both cases these calls could be made outside the generated code. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC v2 07/34] exec-all: Move cpu_can_do_io to qom/cpu.h
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 24/06/2015 13:40, Pavel Dovgaluk wrote: +/* If not executing code then assume we are ok. */ +if (cpu-current_tb == NULL) { +return true; +} +return cpu-can_do_io != 0; For what it's worth, I think the if here is dead. Pavel? cpu_can_do_io can be called from cpus.c and translate-all.c In both cases these calls could be made outside the generated code. Yes, but doesn't your commit 626cf8f (icount: set can_do_io outside TB execution, 2014-12-08) cause cpu-can_do_io == 0 to imply cpu-current_tb != NULL? I see. You are right, as far I can understand the control flow. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-18 12:02, Paolo Bonzini wrote: TCG can then use them to fill in an array stored inside the TranslationBlock, together with the host PC. Since the gen_opc_pc, gen_opc_instr_start, gen_opc_icount arrays are inside tcg_ctx, it may be a good idea to store the checkpoint information compressed in a byte array (e.g. as a series of ULEB128 values---the host and target PCs can even be stored as deltas from the last value). Either as deltas to the last value or as delta from the start of the TB. What I am worried about is the size of the checkpoint information, even if we do some compression, we might have one per guest instruction. I have implemented a naive version of that without compression, storing the checkpoint data at the end of the generated code, and it's about 30% of the size of the TB for MIPS. It's probably smaller on architectures storing only the PC. Also it's size is quite variable. That's why it's probably not a good idea to store it directly in the TranslationBlock. I don't like storing it directly in the generated code either, especially given this part is supposed to be executable. As a first step, gen_intermediate_code_pc and tcg_gen_code_search_pc can then be merged into a single target-independent function that uncompresses the byte array up to the required host PC into tcg_ctx. Later you can optimize them to remove the tcg_ctx arrays altogether. So the patches could be something like this: 1) SPARC: put the jump target information directly in gen_opc_* without using gen_opc_jump_pc (not trivial) 2) a few targets: instead of gen_opc_* arrays, use a new generic member of tcg_ctx (similar to how csbase is used generically), e.g. tcg_ctx.gen_opc_target1[] and tcg_ctx.gen_opc_target2[]. 3) all targets: always fill in tcg_ctx.gen_*, even if search_pc is false 4) TCG: add support for a checkpoint operation, make it fill in tcg_ctx.gen_* 5) all targets: change explicit filling of tcg_ctx.gen_* to use the checkpoint operation 6) TCG/translate-all: convert gen_intermediate_code_pc as outlined above That's sounds like a plan when I have more time ;-) Doesn't this approach still require my fixes to work correctly? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-17 15:41, Pavel Dovgalyuk wrote: In icount mode every translation block looks as follows: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of these instructions initiates an exception, icount should be restored and adjusted number of instructions should be subtracted from icount instead of initial n. tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to exception raising helper into TB), then PC is not passed as retaddr and correct icount is not recovered. In such cases icount will be decreased by the value equal to the size of TB. Looking at how icount work, I see it's basically a variable in the CPU state (icount_decr.u16.low), which is already accessed from the TB. Couldn't we adjust it using additional code before generating an exception, when in icount mode. For example for MIPS, we can add some code before generate_exception which use the value from s-gen_opc_icount[j] to adjust the variable icount_decr.u16.low. It is possible, but it will incur additional overhead, because we will have to update icount every time the exception might be generated. We'll have to update icount value before and after every helper call, that can cause an exception: icount -= n ... instr_k icount += n - k helper icount -= n - k ... And this overhead will slowdown the code even if no exception occur. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
From: Peter Maydell [mailto:peter.mayd...@linaro.org] On 18 June 2015 at 08:12, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote: From: Aurelien Jarno [mailto:aurel...@aurel32.net] Looking at how icount work, I see it's basically a variable in the CPU state (icount_decr.u16.low), which is already accessed from the TB. Couldn't we adjust it using additional code before generating an exception, when in icount mode. For example for MIPS, we can add some code before generate_exception which use the value from s-gen_opc_icount[j] to adjust the variable icount_decr.u16.low. It is possible, but it will incur additional overhead, because we will have to update icount every time the exception might be generated. We'll have to update icount value before and after every helper call, that can cause an exception: icount -= n ... instr_k icount += n - k helper icount -= n - k ... And this overhead will slowdown the code even if no exception occur. Right, this is a tradeoff: in some cases it's faster to assume no exception and handle state resync by doing a retranslate. In some cases it's faster to assume there will be an exception and do a manual sync. Guest load/store is obviously in the first category. Guest doing an instruction which always takes an exception (like syscall insns) is in the second category. For other cases there's a choice. We need to support both approaches; obviously you can argue for any particular case whether it should be approach 1 or approach 2. Syscall and non-implemented instructions are in third category - they always take an exception. In this case the translation should be stopped without any additional actions. By the way, I implemented this 'third category' approach for mips and measured the performance. It does not show any performance degradation when compared to original unfixed version. All other exception-generating helpers and instructions use approach 1. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-18 10:12, Pavel Dovgaluk wrote: From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-17 15:41, Pavel Dovgalyuk wrote: In icount mode every translation block looks as follows: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of these instructions initiates an exception, icount should be restored and adjusted number of instructions should be subtracted from icount instead of initial n. tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to exception raising helper into TB), then PC is not passed as retaddr and correct icount is not recovered. In such cases icount will be decreased by the value equal to the size of TB. Looking at how icount work, I see it's basically a variable in the CPU state (icount_decr.u16.low), which is already accessed from the TB. Couldn't we adjust it using additional code before generating an exception, when in icount mode. For example for MIPS, we can add some code before generate_exception which use the value from s-gen_opc_icount[j] to adjust the variable icount_decr.u16.low. It is possible, but it will incur additional overhead, because we will have to update icount every time the exception might be generated. We'll have to update icount value before and after every helper call, that can cause an exception: icount -= n ... instr_k icount += n - k helper icount -= n - k ... And this overhead will slowdown the code even if no exception occur. That's where I might disagree. Retranslation seems a very good idea on the paper, but in practice it doesn't seems to always bring the performance improvement it should. In addition it seems to be highly dependent on the target. Just to give some numbers, on MIPS (as your patch originally concerns this architecture), 40% of code generation is actually due to retranslation. The problem is that over the time we have improved a lot the code generation (liveness analysis, better register allocation, constant propagation, ...) and thus we have increased the code generation time. While it clearly has some benefits when this code is actually executed, it's not the case when the code is simply retranslated. In short we spend more time to find the CPU state corresponding to an exception than before. ... All of that to say that I am worried for the performances to see more paths through the retranslation code, especially on MIPS as it seems to be costly. That said I haven't really look in details at other targets, nor hosts. I fixed syscalls, exceptions that occur without any conditions, and removed redundant calls to save_cpu_state. Then I measured the performance without enabling icount. And Linux boots even faster than with original version. I'll submit this version for review soon. Now to come back about your patches, we might want to simply fix icount first, even if it has some performance impact, and deal with the retranslation issue separately, as it concerns more than just icount. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 1/3] softmmu: add helper function to pass through retaddr
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 17/06/2015 14:42, Pavel Dovgalyuk wrote: This patch introduces several helpers to pass return address which points to the TB. Correct return address allows correct restoring of the guest PC and icount. These functions should be used when helpers embedded into TB invoke memory operations. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- include/exec/cpu_ldst_template.h | 42 +++--- include/exec/exec-all.h | 27 softmmu_template.h | 18 3 files changed, 79 insertions(+), 8 deletions(-) diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h index 95ab750..1847816 100644 --- a/include/exec/cpu_ldst_template.h +++ b/include/exec/cpu_ldst_template.h @@ -62,7 +62,9 @@ /* generic load/store macros */ static inline RES_TYPE -glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) +glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env, + target_ulong ptr, + uintptr_t retaddr) Would it make sense to call these helper_cpu_ld##USUFFIX##MEMSUFFIX? diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 856e698..b3aefde 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -350,6 +350,33 @@ struct MemoryRegion *iotlb_to_region(CPUState *cpu, void tlb_fill(CPUState *cpu, target_ulong addr, int is_write, int mmu_idx, uintptr_t retaddr); +uint8_t helper_call_ldb_cmmu(CPUArchState *env, target_ulong addr, + int mmu_idx, uintptr_t retaddr); Here we already have helper_ret_ldb_cmmu, so the new function is only needed if DATA_SIZE != 1. +uint16_t helper_call_ldw_cmmu(CPUArchState *env, target_ulong addr, + int mmu_idx, uintptr_t retaddr); What about helper_ret_ldw_cmmu for consistency with the DATA_SIZE == 1 case? tcg.h breaks these definitions: /* Temporary aliases until backends are converted. */ #ifdef TARGET_WORDS_BIGENDIAN # define helper_ret_ldsw_mmu helper_be_ldsw_mmu # define helper_ret_lduw_mmu helper_be_lduw_mmu # define helper_ret_ldsl_mmu helper_be_ldsl_mmu # define helper_ret_ldul_mmu helper_be_ldul_mmu # define helper_ret_ldq_mmu helper_be_ldq_mmu # define helper_ret_stw_mmu helper_be_stw_mmu # define helper_ret_stl_mmu helper_be_stl_mmu # define helper_ret_stq_mmu helper_be_stq_mmu #else Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 1/3] softmmu: add helper function to pass through retaddr
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 18/06/2015 11:24, Pavel Dovgaluk wrote: +uint16_t helper_call_ldw_cmmu(CPUArchState *env, target_ulong addr, + int mmu_idx, uintptr_t retaddr); What about helper_ret_ldw_cmmu for consistency with the DATA_SIZE == 1 case? tcg.h breaks these definitions: /* Temporary aliases until backends are converted. */ #ifdef TARGET_WORDS_BIGENDIAN # define helper_ret_ldsw_mmu helper_be_ldsw_mmu # define helper_ret_lduw_mmu helper_be_lduw_mmu # define helper_ret_ldsl_mmu helper_be_ldsl_mmu # define helper_ret_ldul_mmu helper_be_ldul_mmu # define helper_ret_ldq_mmu helper_be_ldq_mmu # define helper_ret_stw_mmu helper_be_stw_mmu # define helper_ret_stl_mmu helper_be_stl_mmu # define helper_ret_stq_mmu helper_be_stq_mmu #else Isn't this exactly the same as your helper_call_ldw_cmmu? Yes, but I can't compile it yet. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-17 15:41, Pavel Dovgalyuk wrote: This set of patches fixes exception handling for MIPS and i386 targets. These targets contain instructions that break correct execution in icount/TCG modes (MIPS) and in regular TCG mode (i386). Just to be clear, this is not something specific to MIPS and i386. Every target which call cpu_loop_exit without doing a retranslation is affected by the icount bug. True. But I haven't checked other platforms yet. Incorrect execution for i386 is causes by exceptions raised by MMU functions. MMU helper functions are called from generated code and other helper functions. In both cases they try to get function's return address for restoring virtual CPU state. When MMU helper is called from some other helper function (like helper_maskmov_xmm) through cpu_st* function, the return address will point to that helper. That is why CPU state cannot be restored in the case of MMU fault. This bug can occur when maskmov instruction is located in the middle of the translation block. Execution sequence for this example: TB start: PC1: instr1 instr2 PC2: maskmov page fault page fault processing PC1: instr1 instr2 maskmov At the start of TB execution guest PC points to instr1. When page fault occurs QEMU tries to restore guest PC (which should be equal to PC2). It reads host PC from the call stack and checks whether it points to TB or not. Bug in ldst helpers implementation provides incorrect host PC, which is not located within the TB. That's why QEMU cannot recover guest PC and it remains the same (PC1). After page fault processing QEMU restarts TB and executes instr1 and instr2 for the second time, because guest PC was not recovered. Also just to be clear, the way you propose is just one way to fix the issue. The other way (which is used for all similar instructions) is to call just before the helper: gen_update_cc_op(s); gen_jmp_im(pc_start - s-cs_base); This would work only for regular execution, not for icount one. Bugs in MIPS helper functions do not break the execution in regular TCG mode, because PC value is updated before calling the functions that can raise an exception. But icount value cannot be updated this way. Therefore exceptions make execution in icount mode non-determinisic. In icount mode every translation block looks as follows: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of these instructions initiates an exception, icount should be restored and adjusted number of instructions should be subtracted from icount instead of initial n. tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to exception raising helper into TB), then PC is not passed as retaddr and correct icount is not recovered. In such cases icount will be decreased by the value equal to the size of TB. This behavior leads to incorrect values of virtual clock and non-deterministic execution of the code. These patches passes pointer to the translation block code to the exception handler. It allows correct restoring of PC and icount values. While I think it's the correct way for load/stores or FPU exceptions, I am not convinced we should do that in all cases. Retranslation has a cost, and when the exception is likely to occur, it's better to save the CPU state instead of going for a retranslation. Actually we had to rollback such a change on SH4, as it has some performances issues. See commit 1012740098d0307ce5d957ebbe9a7f020da7f574. Ok, I'll double check the patch to make translation to stop when exception is likely to occur after current instruction. One way to fix that would be to reduce the cost of retranslation, for example by using a kind of exception table that we generate at the same time of the TB. What exactly do you mean? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v2 1/3] softmmu: add helper function to pass through retaddr
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 17/06/2015 14:42, Pavel Dovgalyuk wrote: This patch introduces several helpers to pass return address which points to the TB. Correct return address allows correct restoring of the guest PC and icount. These functions should be used when helpers embedded into TB invoke memory operations. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- include/exec/cpu_ldst_template.h | 42 +++--- include/exec/exec-all.h | 27 softmmu_template.h | 18 3 files changed, 79 insertions(+), 8 deletions(-) diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h index 95ab750..1847816 100644 --- a/include/exec/cpu_ldst_template.h +++ b/include/exec/cpu_ldst_template.h @@ -62,7 +62,9 @@ /* generic load/store macros */ static inline RES_TYPE -glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr) +glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env, + target_ulong ptr, + uintptr_t retaddr) Would it make sense to call these helper_cpu_ld##USUFFIX##MEMSUFFIX? I don't want to use 'helper' prefix, because helper functions are usually called directly from TB. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] MIPS: exceptions handling in icount mode
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-15 07:53, Pavel Dovgaluk wrote: From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-10 11:33, Pavel Dovgalyuk wrote: This patch fixes exception handling in MIPS. MIPS instructions generate several types of exceptions. When exception is generated, it breaks the execution of the current translation block. Implementation of the exceptions handling in MIPS does not correctly restore icount for the instruction which caused the exception. In most cases icount will be decreased by the value equal to the size of TB. I don't think it is correct. There is no real point of always doing retranslation for an exception triggered from the helpers, especially when the CPU state has been saved before anyway? As you know, icount is processed as follows: TB: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of the instructions initiates an exception, then icount should be restored and adjusted number of instructions should be subtracted instead of initial n. E.g., tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to helper_raise_exception_err into TB), then PC is not passed as retaddr and correct icount is not recovered. This behavior leads to incorrect values of timers and non-deterministic execution of the code. Ok, this therefore doesn't looks something MIPS specific, but rather a flaw in the icount design. Instead of fixing blindly one target, we should try to fix it globally, or if not possible at least agree on a way to fix that for all target and provide the infrastructure for that (for example provide load/store functions which accept a return address). Paolo any opinion on that? Also retranslation has a cost (actually on MIPS we spend more time in *retranslation* than in translation due to the way the MMU works), we should avoid it if we already have to save the CPU state for another reason. At least your patch should remove the code saving the CPU state when possible if the helper uses retranslation instead. Ok, I'll remove CPU saving where it is not actually used because of the exception. This patch passes pointer to the translation block internals to the exception handler. It allows correct restoring of the icount value. Your patch doesn't do that for all the helpers, for example all the memory access helpers. It probably improves the situation but therefore doesn't fix it. Right, I missed these helpers. I'll add them in the next version. Except we currently don't provide a way in the softmmu code to provide a return address... Softmmu passes return address into tlb_fill function, which passes it to raise_exception. I also fixed HELPER_LD_ATOMIC and HELPER_ST_ATOMIC to recover PC value. Do you mean something different? Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] MIPS: exceptions handling in icount mode
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-15 07:53, Pavel Dovgaluk wrote: From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-10 11:33, Pavel Dovgalyuk wrote: This patch fixes exception handling in MIPS. MIPS instructions generate several types of exceptions. When exception is generated, it breaks the execution of the current translation block. Implementation of the exceptions handling in MIPS does not correctly restore icount for the instruction which caused the exception. In most cases icount will be decreased by the value equal to the size of TB. I don't think it is correct. There is no real point of always doing retranslation for an exception triggered from the helpers, especially when the CPU state has been saved before anyway? As you know, icount is processed as follows: TB: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of the instructions initiates an exception, then icount should be restored and adjusted number of instructions should be subtracted instead of initial n. E.g., tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to helper_raise_exception_err into TB), then PC is not passed as retaddr and correct icount is not recovered. This behavior leads to incorrect values of timers and non-deterministic execution of the code. Ok, this therefore doesn't looks something MIPS specific, but rather a flaw in the icount design. Instead of fixing blindly one target, we should try to fix it globally, or if not possible at least agree on a way to fix that for all target and provide the infrastructure for that (for example provide load/store functions which accept a return address). Paolo any opinion on that? Recovering from is a tricky mechanism. It can break the correct execution if used inaccurately even when icount is disabled. I already posted a patch for maskmov instruction in i386: http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02960.html Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] MIPS: exceptions handling in icount mode
From: Aurelien Jarno [mailto:aurel...@aurel32.net] On 2015-06-10 11:33, Pavel Dovgalyuk wrote: This patch fixes exception handling in MIPS. MIPS instructions generate several types of exceptions. When exception is generated, it breaks the execution of the current translation block. Implementation of the exceptions handling in MIPS does not correctly restore icount for the instruction which caused the exception. In most cases icount will be decreased by the value equal to the size of TB. I don't think it is correct. There is no real point of always doing retranslation for an exception triggered from the helpers, especially when the CPU state has been saved before anyway? As you know, icount is processed as follows: TB: if icount n then exit icount -= n instr1 instr2 ... instrn exit When one of the instructions initiates an exception, then icount should be restored and adjusted number of instructions should be subtracted instead of initial n. E.g., tlb_fill function passes retaddr to raise_exception, which allows restoring current instructions in TB and correct icount calculation. When exception triggered with other function (e.g. by embedding call to helper_raise_exception_err into TB), then PC is not passed as retaddr and correct icount is not recovered. This behavior leads to incorrect values of timers and non-deterministic execution of the code. This patch passes pointer to the translation block internals to the exception handler. It allows correct restoring of the icount value. Your patch doesn't do that for all the helpers, for example all the memory access helpers. It probably improves the situation but therefore doesn't fix it. Right, I missed these helpers. I'll add them in the next version. From my point of view, it looks like the problem is actually elsewhere in the common icount code. Do we know if it works correctly on other emulated architectures? This problem can be solved only if someone passes PC to cpu_restore_state function. It cannot be done without altering exceptions processing of the targets. Other architectures have bugs in icount processing in case of the exceptions. I will fix them and submit as separate patches. Also do you have a quick example to reproduce the issue? I have no quick example, because I've got this situation in the middle of booting Linux process. It is hard to catch, because in most cases this bug does not affect the guest behavior. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- target-mips/cpu.h| 28 + target-mips/msa_helper.c |5 +++- target-mips/op_helper.c | 52 +++--- target-mips/translate.c |2 ++ 4 files changed, 45 insertions(+), 42 deletions(-) [ snip ] diff --git a/target-mips/translate.c b/target-mips/translate.c index fd063a2..9c2ff7c 100644 --- a/target-mips/translate.c +++ b/target-mips/translate.c @@ -1675,6 +1675,7 @@ generate_exception_err (DisasContext *ctx, int excp, int err) TCGv_i32 terr = tcg_const_i32(err); save_cpu_state(ctx, 1); gen_helper_raise_exception_err(cpu_env, texcp, terr); +ctx-bstate = BS_STOP; tcg_temp_free_i32(terr); tcg_temp_free_i32(texcp); } @@ -1684,6 +1685,7 @@ generate_exception (DisasContext *ctx, int excp) { save_cpu_state(ctx, 1); gen_helper_0e0i(raise_exception, excp); +ctx-bstate = BS_STOP; } Why do we need to stop the translation here? The exception might be conditional (for example for ADDU or SUBU). Right, it would be better to add another helper, which recovers the PC. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH v14 21/21] replay: ptimer
Broken description. This one is correct: replay: recording of the user input This records user input (keyboard and mouse events) in record mode and replays these input events in replay mode. Pavel Dovgalyuk -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Monday, May 25, 2015 3:09 PM To: qemu-devel@nongnu.org Cc: peter.mayd...@linaro.org; peter.crosthwa...@xilinx.com; ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; batuz...@ispras.ru; maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; alex.ben...@linaro.org; fred.kon...@greensocs.com Subject: [PATCH v14 21/21] replay: ptimer This patch adds deterministic replay for hardware periodic countdown timers. ptimer uses bottom halves layer to execute such an asynchronous callback. We put this callback into the replay queue instead of bottom halves one. When checkpoint is met by main loop thread, the replay queue is processed and callback is executed. Binding callback moment to one of the checkpoints makes it deterministic. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- include/ui/input.h |2 + replay/Makefile.objs |1 replay/replay-events.c | 31 + replay/replay-input.c| 159 ++ replay/replay-internal.h | 13 replay/replay.h |4 + ui/input.c | 27 +--- 7 files changed, 229 insertions(+), 8 deletions(-) create mode 100755 replay/replay-input.c diff --git a/include/ui/input.h b/include/ui/input.h index 5d5ac00..d06a12d 100644 --- a/include/ui/input.h +++ b/include/ui/input.h @@ -33,7 +33,9 @@ void qemu_input_handler_bind(QemuInputHandlerState *s, const char *device_id, int head, Error **errp); void qemu_input_event_send(QemuConsole *src, InputEvent *evt); +void qemu_input_event_send_impl(QemuConsole *src, InputEvent *evt); void qemu_input_event_sync(void); +void qemu_input_event_sync_impl(void); InputEvent *qemu_input_event_new_key(KeyValue *key, bool down); void qemu_input_event_send_key(QemuConsole *src, KeyValue *key, bool down); diff --git a/replay/Makefile.objs b/replay/Makefile.objs index 257c320..3936296 100755 --- a/replay/Makefile.objs +++ b/replay/Makefile.objs @@ -2,3 +2,4 @@ obj-y += replay.o obj-y += replay-internal.o obj-y += replay-events.o obj-y += replay-time.o +obj-y += replay-input.o diff --git a/replay/replay-events.c b/replay/replay-events.c index 1e14cce..6cf13f2 100755 --- a/replay/replay-events.c +++ b/replay/replay-events.c @@ -14,6 +14,7 @@ #include replay.h #include replay-internal.h #include block/aio.h +#include ui/input.h typedef struct Event { ReplayAsyncEventKind event_kind; @@ -39,6 +40,13 @@ static void replay_run_event(Event *event) case REPLAY_ASYNC_EVENT_PTIMER: aio_bh_call(event-opaque); break; +case REPLAY_ASYNC_EVENT_INPUT: +qemu_input_event_send_impl(NULL, (InputEvent *)event-opaque); +qapi_free_InputEvent((InputEvent *)event-opaque); +break; +case REPLAY_ASYNC_EVENT_INPUT_SYNC: +qemu_input_event_sync_impl(); +break; default: error_report(Replay: invalid async event ID (%d) in the queue, event-event_kind); @@ -132,6 +140,16 @@ void replay_add_ptimer_event(void *bh, uint64_t id) replay_add_event_internal(REPLAY_ASYNC_EVENT_PTIMER, bh, NULL, id); } +void replay_add_input_event(struct InputEvent *event) +{ +replay_add_event_internal(REPLAY_ASYNC_EVENT_INPUT, event, NULL, 0); +} + +void replay_add_input_sync_event(void) +{ +replay_add_event_internal(REPLAY_ASYNC_EVENT_INPUT_SYNC, NULL, NULL, 0); +} + static void replay_save_event(Event *event, int checkpoint) { if (replay_mode != REPLAY_MODE_PLAY) { @@ -145,6 +163,9 @@ static void replay_save_event(Event *event, int checkpoint) case REPLAY_ASYNC_EVENT_PTIMER: replay_put_qword(event-id); break; +case REPLAY_ASYNC_EVENT_INPUT: +replay_save_input_event(event-opaque); +break; } } } @@ -185,6 +206,16 @@ static Event *replay_read_event(int checkpoint) read_id = replay_get_qword(); } break; +case REPLAY_ASYNC_EVENT_INPUT: +event = g_malloc0(sizeof(Event)); +event-event_kind = read_event_kind; +event-opaque = replay_read_input_event(); +return event; +case REPLAY_ASYNC_EVENT_INPUT_SYNC: +event = g_malloc0(sizeof(Event)); +event-event_kind = read_event_kind; +event-opaque = 0; +return event; default: error_report(Unknown ID %d of replay event, read_event_kind); exit(1); diff --git a/replay/replay-input.c b/replay/replay-input.c new file
Re: [Qemu-devel] [RFC PATCH v13 00/21] Deterministic replay core
Ping? -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Wednesday, May 06, 2015 5:03 PM To: qemu-devel@nongnu.org Cc: peter.mayd...@linaro.org; peter.crosthwa...@xilinx.com; ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; batuz...@ispras.ru; maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; alex.ben...@linaro.org; fred.kon...@greensocs.com Subject: [RFC PATCH v13 00/21] Deterministic replay core This set of patches is related to the reverse execution and deterministic replay of qemu execution. This implementation of deterministic replay can be used for deterministic debugging of guest code through gdb remote interface. Core set of patches does not include support for reverse debugging commands of gdb, block devices' operations, USB replay support. These patches include only core function of the replay, excluding the support for replaying serial, audio, network, and USB devices' operations. Reverse debugging and monitor commands were also excluded to be submitted later as separate patches. Execution recording writes non-deterministic events log, which can be later used for replaying the execution anywhere and for unlimited number of times. It also supports checkpointing for faster rewinding during reverse debugging. Execution replaying reads the log and replays all non-deterministic events including external input, hardware clocks, and interrupts. Full version of deterministic replay has the following features: * Deterministically replays whole system execution and all contents of the memory, state of the hadrware devices, clocks, and screen of the VM. * Writes execution log into the file for latter replaying for multiple times on different machines. * Supports i386, x86_64, and ARM hardware platforms. * Performs deterministic replay of all operations with keyboard and mouse input devices. * Supports auto-checkpointing for convenient reverse debugging. Usage of the record/replay core: * First, record the execution, by adding the following string to the command line: '-icount shift=7,rr=record,rrfile=replay.bin -net none'. Block devices' images are not actually changed in the recording mode, because all of the changes are written to the temporary overlay file. * Then you can replay it for the multiple times by using another command line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none' * '-net none' option should also be specified if network replay patches are not applied. * Do not add any disk images to VM, because they are not supported by the core patches. Paper with short description of deterministic replay implementation: http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html Modifications of qemu include: * wrappers for clock and time functions to save their return values in the log * saving different asynchronous events (e.g. system shutdown) into the log * synchronization of the threads from thread pool * recording/replaying user input (mouse and keyboard) * adding internal events for cpu and io synchronization v13 changes: * Introduced ptimer trigger event (as suggested by Paolo Bonzini) Can anyone review the remaining patches? Pavel Dovgalyuk
Re: [Qemu-devel] when does a target frontend need to use gen_io_start()/gen_io_end() ?
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 12/05/2015 17:32, Peter Maydell wrote: In order for -icount to work, it's important for the target translate.c code to correctly bracket any generated code which can do I/O with gen_io_start()/gen_io_end() calls. But does anybody know exactly what the criteria are here for this? It would be nice if we could document this in a comment in gen_icount.h -- I'm happy to write one up if somebody will just tell me what the right answer is :-) It's any instruction that can cause an icount read, typically through QEMU_CLOCK_VIRTUAL or cpu_get_ticks(). Doesn't this mean that ARM has incorrect implementation of icount? MMIO is common for this platform, but none of memory accesses are surrounded with gen_io_start()/gen_io_end(). Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v13 15/21] bottom halves: introduce bh call function
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 06/05/2015 16:04, Pavel Dovgalyuk wrote: This patch introduces aio_bh_call function. It is used to execute bottom halves as callbacks without adding them to the queue. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- async.c |8 +++- include/block/aio.h |5 + 2 files changed, 12 insertions(+), 1 deletions(-) diff --git a/async.c b/async.c index 46d9e63..734c76c 100644 --- a/async.c +++ b/async.c @@ -59,6 +59,12 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) return bh; } +void aio_bh_call(void *opaque) +{ +QEMUBH *bh = (QEMUBH *)opaque; This can have a QEMUBH * argument. I can fix it when committing. +bh-cb(bh-opaque); +} + /* Multiple occurrences of aio_bh_poll cannot be called concurrently */ int aio_bh_poll(AioContext *ctx) { @@ -82,7 +88,7 @@ int aio_bh_poll(AioContext *ctx) if (!bh-idle) ret = 1; bh-idle = 0; -bh-cb(bh-opaque); +aio_bh_call(bh); } } diff --git a/include/block/aio.h b/include/block/aio.h index d2bb423..b498704 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -157,6 +157,11 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque); void aio_notify(AioContext *ctx); /** + * aio_bh_call: Executes callback function of the specified BH. + */ +void aio_bh_call(void *opaque); + +/** * aio_bh_poll: Poll bottom halves for an AioContext. * * These are internal functions used by the QEMU main loop. Works for me! I am not sure why patch 16 works though. :) I'll ponder on it, in the meanwhile some extra explanation from you would be welcome... As you know, there is a replay queue for different asynchronous events like user input or thread pool callback. ptimer uses bottom halves layer to execute such an asynchronous callback. We put this callback into the replay queue instead of bottom halves one. When checkpoint is met by main loop thread, the replay queue is processed and callback is executed. Binding callback moment to one of the checkpoints makes it deterministic. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v12 16/21]
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 05/05/2015 12:22, Pavel Dovgaluk wrote: This patch is the reduced version of prior bottom halves patch. dma-helpers.c is also related to block devices, so it's better not to change it now. Ok. Perhaps you can add a replay event for ptimers instead of touching bottom halves? It seems reasonable. Then I'll have to use BH as a simple callback by implementing run bh function. This BH will be added to replay queue instead of adding it to BH queue. Pavel Dovgalyuk -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Tuesday, May 05, 2015 1:19 PM To: qemu-devel@nongnu.org Cc: peter.mayd...@linaro.org; peter.crosthwa...@xilinx.com; ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; batuz...@ispras.ru; maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; alex.ben...@linaro.org; fred.kon...@greensocs.com Subject: [RFC PATCH v12 16/21] Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- async.c | 24 +++- dma-helpers.c|4 +++- hw/timer/arm_timer.c |2 +- include/block/aio.h | 18 ++ include/qemu/main-loop.h |1 + main-loop.c |5 + replay/replay-events.c | 16 replay/replay-internal.h |1 + replay/replay.h |2 ++ stubs/replay.c |4 10 files changed, 74 insertions(+), 3 deletions(-) diff --git a/async.c b/async.c index bd975c9..d092aa2 100644 --- a/async.c +++ b/async.c @@ -27,6 +27,7 @@ #include block/thread-pool.h #include qemu/main-loop.h #include qemu/atomic.h +#include replay/replay.h /***/ /* bottom halves (can be seen as timers which expire ASAP) */ @@ -39,6 +40,8 @@ struct QEMUBH { bool scheduled; bool idle; bool deleted; +bool replay; +uint64_t id; }; QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) @@ -56,6 +59,21 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) return bh; } +QEMUBH *aio_bh_new_replay(AioContext *ctx, QEMUBHFunc *cb, void *opaque, + uint64_t id) +{ +QEMUBH *bh = aio_bh_new(ctx, cb, opaque); +bh-replay = true; +bh-id = id; +return bh; +} + +void aio_bh_call(void *opaque) +{ +QEMUBH *bh = (QEMUBH *)opaque; +bh-cb(bh-opaque); +} + /* Multiple occurrences of aio_bh_poll cannot be called concurrently */ int aio_bh_poll(AioContext *ctx) { @@ -78,7 +96,11 @@ int aio_bh_poll(AioContext *ctx) if (!bh-idle) ret = 1; bh-idle = 0; -bh-cb(bh-opaque); +if (!bh-replay) { +aio_bh_call(bh); +} else { +replay_add_bh_event(bh, bh-id); +} } } diff --git a/dma-helpers.c b/dma-helpers.c index 6918572..357d7e9 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -13,6 +13,7 @@ #include qemu/range.h #include qemu/thread.h #include qemu/main-loop.h +#include replay/replay.h /* #define DEBUG_IOMMU */ @@ -96,7 +97,8 @@ static void continue_after_map_failure(void *opaque) { DMAAIOCB *dbs = (DMAAIOCB *)opaque; -dbs-bh = qemu_bh_new(reschedule_dma, dbs); +dbs-bh = qemu_bh_new_replay(reschedule_dma, dbs, + replay_get_current_step()); qemu_bh_schedule(dbs-bh); } diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c index 1452910..97784a0 100644 --- a/hw/timer/arm_timer.c +++ b/hw/timer/arm_timer.c @@ -168,7 +168,7 @@ static arm_timer_state *arm_timer_init(uint32_t freq) s-freq = freq; s-control = TIMER_CTRL_IE; -bh = qemu_bh_new(arm_timer_tick, s); +bh = qemu_bh_new_replay(arm_timer_tick, s, 0); s-timer = ptimer_init(bh); vmstate_register(NULL, -1, vmstate_arm_timer, s); return s; diff --git a/include/block/aio.h b/include/block/aio.h index 82cdf78..ed76b43 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -35,6 +35,8 @@ struct BlockAIOCB { const AIOCBInfo *aiocb_info; BlockDriverState *bs; BlockCompletionFunc *cb; +bool replay; +uint64_t replay_step; void *opaque; int refcnt; }; @@ -144,6 +146,17 @@ void aio_context_release(AioContext *ctx); QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque); /** + * aio_bh_new_replay: Allocate a new bottom half structure for replay. + * + * This function calls aio_bh_new function and also fills replay parameters + * of the BH structure. BH created with this function in record/replay mode + * are executed through the replay queue
Re: [Qemu-devel] [RFC PATCH v11 00/23] Deterministic replay core
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 27/04/2015 09:32, Pavel Dovgalyuk wrote: This set of patches is related to the reverse execution and deterministic replay of qemu execution. This implementation of deterministic replay can be used for deterministic debugging of guest code through gdb remote interface. These patches include only core function of the replay, excluding the support for replaying serial, audio, network, and USB devices' operations. Reverse debugging and monitor commands were also excluded to be submitted later as separate patches. Execution recording writes non-deterministic events log, which can be later used for replaying the execution anywhere and for unlimited number of times. It also supports checkpointing for faster rewinding during reverse debugging. Execution replaying reads the log and replays all non-deterministic events including external input, hardware clocks, and interrupts. Deterministic replay has the following features: * Deterministically replays whole system execution and all contents of the memory, state of the hadrware devices, clocks, and screen of the VM. * Writes execution log into the file for latter replaying for multiple times on different machines. * Supports i386, x86_64, and ARM hardware platforms. * Performs deterministic replay of all operations with keyboard and mouse input devices. * Supports auto-checkpointing for convenient reverse debugging. Usage of the record/replay: * First, record the execution, by adding the following string to the command line: '-icount shift=7,rr=record,rrfile=replay.bin -net none'. Block devices' images are not actually changed in the recording mode, because all of the changes are written to the temporary overlay file. * Then you can replay it for the multiple times by using another command line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none' * '-net none' option should also be specified if network replay patches are not applied. Paper with short description of deterministic replay implementation: http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html Modifications of qemu include: * wrappers for clock and time functions to save their return values in the log * saving different asynchronous events (e.g. system shutdown) into the log * synchronization of the bottom halves execution * synchronization of the threads from thread pool * recording/replaying user input (mouse and keyboard) * adding internal events for cpu and io synchronization v11 changes: * Fixed instructions event processing. * Added some mutex protection calls for replay. * Fixed replaying read operations for qcow2. * Fixed rtc reads on initializations stage. * Eliminated some warnings in replay module. * Fixed misprints in documentation for replay (as suggested by Eric Blake) This has the same problem as before, namely that the block changes are too intrusive and, likely, no one is going to review them. I strongly suggest dropping them and only supporting synchronous I/O devices for now. Ok, I'll remove them in the next iteration. Instead, I would like to see patches for the other sources of non-determinism, especially character devices. I'll release these patches after applying the core, because they need reworking for the newest versions of QEMU and replay. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v12 16/21]
This patch is the reduced version of prior bottom halves patch. Pavel Dovgalyuk -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Tuesday, May 05, 2015 1:19 PM To: qemu-devel@nongnu.org Cc: peter.mayd...@linaro.org; peter.crosthwa...@xilinx.com; ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; batuz...@ispras.ru; maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; alex.ben...@linaro.org; fred.kon...@greensocs.com Subject: [RFC PATCH v12 16/21] Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- async.c | 24 +++- dma-helpers.c|4 +++- hw/timer/arm_timer.c |2 +- include/block/aio.h | 18 ++ include/qemu/main-loop.h |1 + main-loop.c |5 + replay/replay-events.c | 16 replay/replay-internal.h |1 + replay/replay.h |2 ++ stubs/replay.c |4 10 files changed, 74 insertions(+), 3 deletions(-) diff --git a/async.c b/async.c index bd975c9..d092aa2 100644 --- a/async.c +++ b/async.c @@ -27,6 +27,7 @@ #include block/thread-pool.h #include qemu/main-loop.h #include qemu/atomic.h +#include replay/replay.h /***/ /* bottom halves (can be seen as timers which expire ASAP) */ @@ -39,6 +40,8 @@ struct QEMUBH { bool scheduled; bool idle; bool deleted; +bool replay; +uint64_t id; }; QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) @@ -56,6 +59,21 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) return bh; } +QEMUBH *aio_bh_new_replay(AioContext *ctx, QEMUBHFunc *cb, void *opaque, + uint64_t id) +{ +QEMUBH *bh = aio_bh_new(ctx, cb, opaque); +bh-replay = true; +bh-id = id; +return bh; +} + +void aio_bh_call(void *opaque) +{ +QEMUBH *bh = (QEMUBH *)opaque; +bh-cb(bh-opaque); +} + /* Multiple occurrences of aio_bh_poll cannot be called concurrently */ int aio_bh_poll(AioContext *ctx) { @@ -78,7 +96,11 @@ int aio_bh_poll(AioContext *ctx) if (!bh-idle) ret = 1; bh-idle = 0; -bh-cb(bh-opaque); +if (!bh-replay) { +aio_bh_call(bh); +} else { +replay_add_bh_event(bh, bh-id); +} } } diff --git a/dma-helpers.c b/dma-helpers.c index 6918572..357d7e9 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -13,6 +13,7 @@ #include qemu/range.h #include qemu/thread.h #include qemu/main-loop.h +#include replay/replay.h /* #define DEBUG_IOMMU */ @@ -96,7 +97,8 @@ static void continue_after_map_failure(void *opaque) { DMAAIOCB *dbs = (DMAAIOCB *)opaque; -dbs-bh = qemu_bh_new(reschedule_dma, dbs); +dbs-bh = qemu_bh_new_replay(reschedule_dma, dbs, + replay_get_current_step()); qemu_bh_schedule(dbs-bh); } diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c index 1452910..97784a0 100644 --- a/hw/timer/arm_timer.c +++ b/hw/timer/arm_timer.c @@ -168,7 +168,7 @@ static arm_timer_state *arm_timer_init(uint32_t freq) s-freq = freq; s-control = TIMER_CTRL_IE; -bh = qemu_bh_new(arm_timer_tick, s); +bh = qemu_bh_new_replay(arm_timer_tick, s, 0); s-timer = ptimer_init(bh); vmstate_register(NULL, -1, vmstate_arm_timer, s); return s; diff --git a/include/block/aio.h b/include/block/aio.h index 82cdf78..ed76b43 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -35,6 +35,8 @@ struct BlockAIOCB { const AIOCBInfo *aiocb_info; BlockDriverState *bs; BlockCompletionFunc *cb; +bool replay; +uint64_t replay_step; void *opaque; int refcnt; }; @@ -144,6 +146,17 @@ void aio_context_release(AioContext *ctx); QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque); /** + * aio_bh_new_replay: Allocate a new bottom half structure for replay. + * + * This function calls aio_bh_new function and also fills replay parameters + * of the BH structure. BH created with this function in record/replay mode + * are executed through the replay queue only at checkpoints and instructions + * executions. + */ +QEMUBH *aio_bh_new_replay(AioContext *ctx, QEMUBHFunc *cb, void *opaque, + uint64_t id); + +/** * aio_notify: Force processing of pending events. * * Similar to signaling a condition variable, aio_notify forces @@ -159,6 +172,11 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque); void aio_notify(AioContext *ctx); /** + * aio_bh_call: Executes callback function of the specified BH. + */ +void aio_bh_call(void *opaque); + +/** * aio_bh_poll: Poll
Re: [Qemu-devel] [RFC PATCH v10 18/24] replay: replay aio requests
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 27/02/2015 14:11, Pavel Dovgalyuk wrote: This patch adds identifier to aio requests. ID is used for creating bottom halves and identifying them while replaying. The patch also introduces several functions that make possible replaying of the aio requests. Out of curiosity, why did you use this approach instead of using a RR-specific block device backend (as you did for network, I think)? The backend could just store the data that is read in the RR file (or in a separate file that can be easily mmap-ed), and use a timer to trigger it at the right QEMU_CLOCK_VIRTUAL tick. I'm sure you considered something like this. Did you still get non-determinism? We considered this approach. But it requires too much data to be written. E.g. when loading an OS guest VM reads gigabytes of data. Writing all this data to file incurs significant slowdown. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v9 00/23] Deterministic replay core
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 18/02/2015 12:55, Pavel Dovgalyuk wrote: This set of patches is related to the reverse execution and deterministic replay of qemu execution. This implementation of deterministic replay can be used for deterministic debugging of guest code through gdb remote interface. These patches include only core function of the replay, excluding the support for replaying serial, audio, network, and USB devices' operations. Reverse debugging and monitor commands were also excluded to be submitted later as separate patches. Execution recording writes non-deterministic events log, which can be later used for replaying the execution anywhere and for unlimited number of times. It also supports checkpointing for faster rewinding during reverse debugging. Execution replaying reads the log and replays all non-deterministic events including external input, hardware clocks, and interrupts. Deterministic replay has the following features: * Deterministically replays whole system execution and all contents of the memory, state of the hadrware devices, clocks, and screen of the VM. * Writes execution log into the file for latter replaying for multiple times on different machines. * Supports i386, x86_64, and ARM hardware platforms. * Performs deterministic replay of all operations with keyboard and mouse input devices. * Supports auto-checkpointing for convenient reverse debugging. Usage of the record/replay: * First, record the execution, by adding the following string to the command line: '-icount shift=7,rr=record,rrfile=replay.bin -net none'. Block devices' images are not actually changed in the recording mode, because all of the changes are written to the temporary overlay file. * Then you can replay it for the multiple times by using another command line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none' * '-net none' option should also be specified if network replay patches are not applied. Paper with short description of deterministic replay implementation: http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html Modifications of qemu include: * wrappers for clock and time functions to save their return values in the log * saving different asynchronous events (e.g. system shutdown) into the log * synchronization of the bottom halves execution * synchronization of the threads from thread pool * recording/replaying user input (mouse and keyboard) * adding internal events for cpu and io synchronization v9 changes: * Replaced fwrite/fread with putc/getc (as suggested by Paolo Bonzini) * Stopping virtual machine in case of replay file end (as suggested by Paolo Bonzini) * Removed one of the replay mutexes (as suggested by Paolo Bonzini) * Fixed RCU queue for bottom halves (as suggested by Paolo Bonzini) * Updated command line options' names (as suggested by Paolo Bonzini) * Added design document for record/replay (as suggested by Paolo Bonzini) * Simplified checkpoints for the timers * Added cloning InputEvent objects for replay (as suggested by Paolo Bonzini) * Added replay blockers instead of checking the command line (as suggested by Paolo Bonzini) * Some functions renaming and extracting. I haven't yet reviewed patch 23 completely, and I have to think (a lot :)) more about block devices. In the meanwhile I understand the replay code much better so I had some suggestions. Do you have more comments? I'm ready to submit a new version. In general, the handling of replay_has_unread_data / replay_data_kind is a bit messy. It would be nice if you could call replay_fetch_data_kind() only when replay_has_unread_data == 0. Or, even, remove replay_has_unread_data altogether: just call replay_fetch_data_kind() when you'd set it to zero. That would simplify a lot the code for readers. Fixed. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v9 22/23] replay: command line options
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 18/02/2015 12:57, Pavel Dovgalyuk wrote: @@ -2751,6 +2757,7 @@ int main(int argc, char **argv, char **envp) { int i; int snapshot, linux_boot; +const char *icount_option = NULL; const char *initrd_filename; const char *kernel_filename, *kernel_cmdline; const char *boot_order; @@ -3770,6 +3777,8 @@ int main(int argc, char **argv, char **envp) } } +replay_configure(icount_opts); Is it possible to call this together with configure_icount (or even from configure_icount)? I think that it is the best place for replay_configure. It sets replay_mode which may be checked while configuring the virtual machine. That's why this function is called before any initialization actions. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 16/02/2015 14:37, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 16/02/2015 14:27, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 16/02/2015 13:26, Pavel Dovgaluk wrote: I think in this case there are no events at all - just reading timers values that were made while recording. We have to replay these reads by waking iothread. I think the right place for this is in replay_read_next_clock then. It doesn't fit. Log file is not read until all instructions are executed. And the next read from the file should be performed by iothread which should be notified and waked up. I still don't understand. If you're getting EXCP_INTERRUPT it means: 1) that cpu_signal was called No, it isn't. That is the branch when icount is expired. And when it is expired in replay mode we have to wake up iothread, because nobody will care about this. Then it's done here in qemu_tcg_cpu_thread_fn: Do you mean that I should put iothread notification right here? It already notifies the iothread. Or that this code duplicates my patch? If the second one then I guess that it doesn't help and I need to make additional checks about it. Yes. You can modify your patch to do: int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); if (deadline == 0) { qemu_clock_notify(QEMU_CLOCK_VIRTUAL); } instead of qemu_notify_event(), and remove these lines from qemu_tcg_cpu_thread_fn. I tried this one. But there is one problem. Expiring of the virtual timers is not the only reason of icount expiration in replay mode. It may be caused by host timers deadline or poll timeout in record mode. In this case qemu_clock_notify(QEMU_CLOCK_VIRTUAL) will not be called in replay mode and we'll waste time for iothread sleeping. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 17/02/2015 09:43, Pavel Dovgaluk wrote: int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); if (deadline == 0) { qemu_clock_notify(QEMU_CLOCK_VIRTUAL); } instead of qemu_notify_event(), and remove these lines from qemu_tcg_cpu_thread_fn. I tried this one. But there is one problem. Expiring of the virtual timers is not the only reason of icount expiration in replay mode. It may be caused by host timers deadline or poll timeout in record mode. In this case qemu_clock_notify(QEMU_CLOCK_VIRTUAL) will not be called in replay mode and we'll waste time for iothread sleeping. Sure, but unconditional qemu_notify_event() is also wrong. So, to sum up: - it's okay to move code from qemu_tcg_cpu_thread_fn to cpu-exec.c - it's okay to add more qemu_clock_notify calls than just qemu_clock_notify(QEMU_CLOCK_VIRTUAL), each with its own condition - it's better if all these, after being moved to cpu-exec.c, are also extracted in a separate function - it's not okay to do an unconditional qemu_notify_event() in cpu-exec.c, even if it's under if (replay_mode != NONE). How can I wake up iothread if there are no pending timers? deadline will (almost) never become zero in my case, because there is another kind of event in the log (not the timer one). Should I fetch the event and call qemu_notify_event() from replay module? Thanks for your understanding! :) Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 16/02/2015 14:27, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 16/02/2015 13:26, Pavel Dovgaluk wrote: I think in this case there are no events at all - just reading timers values that were made while recording. We have to replay these reads by waking iothread. I think the right place for this is in replay_read_next_clock then. It doesn't fit. Log file is not read until all instructions are executed. And the next read from the file should be performed by iothread which should be notified and waked up. I still don't understand. If you're getting EXCP_INTERRUPT it means: 1) that cpu_signal was called No, it isn't. That is the branch when icount is expired. And when it is expired in replay mode we have to wake up iothread, because nobody will care about this. Then it's done here in qemu_tcg_cpu_thread_fn: Do you mean that I should put iothread notification right here? Or that this code duplicates my patch? If the second one then I guess that it doesn't help and I need to make additional checks about it. if (use_icount) { int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL); if (deadline == 0) { qemu_clock_notify(QEMU_CLOCK_VIRTUAL); } } If you need to move the 4 lines inside the if elsewhere, that I guess it's okay. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 02/02/2015 13:42, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 02/02/2015 13:28, Pavel Dovgaluk wrote: cpu-exception_index = EXCP_INTERRUPT; next_tb = 0; +qemu_notify_event(); Why is this needed? It is needed to wake up iothread in replay mode. Otherwise it waits for additional time and replay becomes too slow. What event (something from a timerlist?) is ready, that has not been notified to the iothread yet? qemu_notify_event() should never be needed in common case. It's probably missing somewhere else. I think in this case there are no events at all - just reading timers values that were made while recording. We have to replay these reads by waking iothread. I think the right place for this is in replay_read_next_clock then. It doesn't fit. Log file is not read until all instructions are executed. And the next read from the file should be performed by iothread which should be notified and waked up. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 16/02/2015 13:26, Pavel Dovgaluk wrote: I think in this case there are no events at all - just reading timers values that were made while recording. We have to replay these reads by waking iothread. I think the right place for this is in replay_read_next_clock then. It doesn't fit. Log file is not read until all instructions are executed. And the next read from the file should be performed by iothread which should be notified and waked up. I still don't understand. If you're getting EXCP_INTERRUPT it means: 1) that cpu_signal was called No, it isn't. That is the branch when icount is expired. And when it is expired in replay mode we have to wake up iothread, because nobody will care about this. 2) in turn this means that qemu_cpu_kick was called 3) in turn this means that the iothread is trying to run via qemu_mutex_lock_iothread (the iothread_requesting_mutex stuff). So why do you need an explicit qemu_notify_event? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 21/21] replay: recording of the user input
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 22/01/2015 09:53, Pavel Dovgalyuk wrote: +if (replay_mode != REPLAY_MODE_PLAY) { +evt = qemu_input_event_new_key(key, down); +if (QTAILQ_EMPTY(kbd_queue)) { +qemu_input_event_send(src, evt); +qemu_input_event_sync(); +if (replay_mode != REPLAY_MODE_RECORD) { +qapi_free_InputEvent(evt); +} This is wrong. You have different lifetimes for different modes. Please make a copy of the event in the implementation of record mode. What is the correct way for cloning the QAPI type? I should invent the cloning visitor or just create a switch for correct cloning of the InputEvent union? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 20/21] replay: command line options
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 09/02/2015 13:15, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:pbonz...@redhat.com] break; case QEMU_OPTION_audio_help: AUD_help (); @@ -3244,6 +3254,7 @@ int main(int argc, char **argv, char **envp) if (!opts) { exit(1); } +not_compatible_replay_param++; Why not compatible? Replay for audio adapter will be added in latter patches. Trying to record/replay machine with audio using current set of patches will break the replay. For this case you can try adding a mechanism similar to migration blockers (replay blockers). Do you mean adding flag to vmsd similar to unmigratable? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 16/21] replay: bottom halves
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:53, Pavel Dovgalyuk wrote: This patch introduces bottom half event for replay queue. It saves the events into the queue and process them at the checkpoints and instructions execution. Which bottom halves must _not_ go through aio/qemu_bh_new_replay? The block layer and migration internal ones. They should not be replayed because are used for simulator purposes and not for VM execution. +QEMUBH *aio_bh_new_replay(AioContext *ctx, QEMUBHFunc *cb, void *opaque, + uint64_t id) id is superfluous, it can always be replay_get_current_step(). It seems to me that the device models can always use qemu_bh_new_replay. There are few if any uses of qemu_bh_new outside device models. Perhaps convert those to aio_bh_new, and make qemu_bh_new always do special replay treatment? I've created some kind of design document. Then I'll probably leave these changes until that document review. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 21/21] replay: recording of the user input
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 22/01/2015 09:53, Pavel Dovgalyuk wrote: +void replay_input_event(QemuConsole *src, InputEvent *evt) +{ +if (replay_mode == REPLAY_MODE_PLAY) { +/* Nothing */ +} else if (replay_mode == REPLAY_MODE_RECORD) { +replay_add_input_event(evt); Does replay_add_input_event ultimately call qemu_input_event_send_impl? No, it just adds event to the queue. +} else { +qemu_input_event_send_impl(src, evt); +} +} + Perhaps make this and replay_input_sync_event return a bool and in the caller do: if (replay_input_event(src, evt)) { qemu_input_event_send_impl(src, evt): } No, we can't. qemu_input_event_send_impl is called when the queue is saved to the log. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] win64: perform correct setjmp calls
From: Stefan Weil [mailto:s...@weilnetz.de] Am 09.02.2015 um 09:07 schrieb Pavel Dovgaluk: From: Stefan Weil [mailto:s...@weilnetz.de] Am 09.02.2015 um 08:55 schrieb Pavel Dovgalyuk: On w64, setjmp is implemented by _setjmp which needs a second parameter. This parameter should be NULL to allow using longjump from generated code. This patch replaces all usages of setjmp.h with new header files which replaces setjmp with _setjmp function on win64 platform. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru Please have a look at include/sysemu/os-win32.h. I think that your patch is not needed because the current code already uses _setjmp. Right, but some of the files (e.g. include/qom/cpu.h) include setjmp.h directly. Then we have the following for compiling cpu-exec.c: cpu-exec.c: ... os-win32.h ... setjmp.h ... In this situation cpu-exec will call incorrect setjmp function. Pavel Dovgalyuk It won't call the wrong setjmp function, at least not in my tests. cpu-exec.c gets the setjmp declaration from os-win32.h. Without it, QEMU would be unusable because it would crash very soon during the emulation. Do you see problems caused by a wrong setjmp with latest QEMU? If yes: which build environment do you use (host, compiler, version of MinGW*)? Yes, I've got the problems. To verify the correctness of the call I disassembled cpu-exec.o and found that setjmp is called with second parameter not equal to zero. I'm using Cygwin environment and packages from here: http://win-builds.org/ My host OS is Windows 7x64 and the compiler is x86_64-w64-mingw32-gcc.exe (GCC) 4.8.3 Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] win64: perform correct setjmp calls
From: Stefan Weil [mailto:s...@weilnetz.de] Am 09.02.2015 um 08:55 schrieb Pavel Dovgalyuk: On w64, setjmp is implemented by _setjmp which needs a second parameter. This parameter should be NULL to allow using longjump from generated code. This patch replaces all usages of setjmp.h with new header files which replaces setjmp with _setjmp function on win64 platform. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru Please have a look at include/sysemu/os-win32.h. I think that your patch is not needed because the current code already uses _setjmp. Right, but some of the files (e.g. include/qom/cpu.h) include setjmp.h directly. Then we have the following for compiling cpu-exec.c: cpu-exec.c: ... os-win32.h ... setjmp.h ... In this situation cpu-exec will call incorrect setjmp function. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 19/21] replay: initialization and deinitialization
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:53, Pavel Dovgalyuk wrote: This patch introduces the functions for enabling the record/replay and for freeing the resources when simulator closes. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru }; /* Asynchronous events IDs */ diff --git a/replay/replay.c b/replay/replay.c index 7c4a801..fa738c2 100755 --- a/replay/replay.c +++ b/replay/replay.c @@ -15,9 +15,18 @@ #include qemu/timer.h #include sysemu/sysemu.h +/* Current version of the replay mechanism. + Increase it when file format changes. */ +#define REPLAY_VERSION 0xe02002 Any meaning? Perhaps make it 'QRR\0' or something identifiable? Version id has no special meaning, but has to be distinct from other RR versions. When format of the log file changes we have to update version id to prevent reading incompatible files. +replay_file = fopen(fname, fmode); +if (replay_file == NULL) { +fprintf(stderr, Replay: open %s: %s\n, fname, strerror(errno)); +exit(1); +} + +replay_filename = g_strdup(fname); + +replay_mode = mode; +replay_has_unread_data = 0; +replay_data_kind = -1; +replay_state.instructions_count = 0; +replay_state.current_step = 0; + +/* skip file header for RECORD and check it for PLAY */ +if (replay_mode == REPLAY_MODE_RECORD) { +fseek(replay_file, HEADER_SIZE, SEEK_SET); Why can't you write the header here? We don't write header here to detect broken log files at the replay stage. +replay_save_instructions(); + +/* finalize the file */ +if (replay_file) { +if (replay_mode == REPLAY_MODE_RECORD) { +uint64_t offset = 0; +/* write end event */ +replay_put_event(EVENT_END); + +/* write header */ +fseek(replay_file, 0, SEEK_SET); +replay_put_dword(REPLAY_VERSION); +replay_put_qword(offset); Why is this done here? You're writing a constant zero. Right, this offset is the field reserved for snapshots table. It will be introduced in other patches. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 20/21] replay: command line options
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:53, Pavel Dovgalyuk wrote: This patch introduces command line options for enabling recording or replaying virtual machine behavior. -record option starts recording of the execution and saves it into the log, specified with fname parameter. -replay option is intended for replaying previously saved log. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru break; case QEMU_OPTION_snapshot: snapshot = 1; @@ -3105,6 +3114,7 @@ int main(int argc, char **argv, char **envp) #endif case QEMU_OPTION_bt: add_device_config(DEV_BT, optarg); +not_compatible_replay_param++; Could it be enough to add a migration blocker? Record/replay core does not use migration subsystem. That is why it should check the hardware by itself. break; case QEMU_OPTION_audio_help: AUD_help (); @@ -3244,6 +3254,7 @@ int main(int argc, char **argv, char **envp) if (!opts) { exit(1); } +not_compatible_replay_param++; Why not compatible? Replay for audio adapter will be added in latter patches. Trying to record/replay machine with audio using current set of patches will break the replay. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 11/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 03/02/2015 11:51, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:52, Pavel Dovgalyuk wrote: Clock ticks are considered as the sources of non-deterministic data for virtual machine. This patch implements saving the clock values when they are acquired (virtual, host clock, rdtsc, and some other timers). When replaying the execution corresponding values are read from log and transfered to the module, which wants to read the values. Such a design required the clock polling to be synchronized. Sometimes it is not true - e.g. when timeouts for timer lists are checked. In this case we use a cached value of the clock, passing it to the client code. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- cpus.c |3 +- include/qemu/timer.h | 10 + qemu-timer.c |7 ++-- replay/Makefile.objs |1 + replay/replay-internal.h | 13 +++ replay/replay-time.c | 84 ++ replay/replay.h | 25 ++ stubs/replay.c |9 + 8 files changed, 147 insertions(+), 5 deletions(-) create mode 100755 replay/replay-time.c diff --git a/cpus.c b/cpus.c index 8787277..01d89aa 100644 --- a/cpus.c +++ b/cpus.c @@ -353,7 +353,8 @@ static void icount_warp_rt(void *opaque) seqlock_write_lock(timers_state.vm_clock_seqlock); if (runstate_is_running()) { -int64_t clock = cpu_get_clock_locked(); +int64_t clock = REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT, + cpu_get_clock_locked()); int64_t warp_delta; warp_delta = clock - vm_clock_warp_start; diff --git a/include/qemu/timer.h b/include/qemu/timer.h index 0666920..0c2472c 100644 --- a/include/qemu/timer.h +++ b/include/qemu/timer.h @@ -4,6 +4,7 @@ #include qemu/typedefs.h #include qemu-common.h #include qemu/notify.h +#include replay/replay.h /* timers */ @@ -760,6 +761,8 @@ int64_t cpu_icount_to_ns(int64_t icount); /***/ /* host CPU ticks (if available) */ +#define cpu_get_real_ticks cpu_get_real_ticks_impl + #if defined(_ARCH_PPC) static inline int64_t cpu_get_real_ticks(void) @@ -913,6 +916,13 @@ static inline int64_t cpu_get_real_ticks (void) } #endif +#undef cpu_get_real_ticks + +static inline int64_t cpu_get_real_ticks(void) cpu_get_real_ticks should never be used. Please instead wrap cpu_get_ticks() with REPLAY_CLOCK. I don't quite understand this comment. Do you mean that I should move REPLAY_CLOCK to the cpu_get_real_ticks usages instead of it's implementation? Only to the cpu_get_ticks usage. The others are okay. cpu_get_ticks cannot call cpu_get_real_ticks in icount mode. And other functions can. Then we should put REPLAY_CLOCK into those functions? +/*! Reads next clock value from the file. +If clock kind read from the file is different from the parameter, +the value is not used. +If the parameter is -1, the clock value is read to the cache anyway. */ In what case could the clock kind not match? It was used in full version which had to skip clock from the log while loading the VM state. So can it be removed for now? I think it can. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 12/21] replay: recording and replaying different timers
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:52, Pavel Dovgalyuk wrote: This patch introduces functions for recording and replaying realtime sources, that do not use qemu-clock interface. These include return value of time() function in time_t and struct tm forms. Patch also adds warning to get_timedate function to prevent its usage in recording mode, because it may lead to non-determinism. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- hw/timer/mc146818rtc.c |3 + hw/timer/pl031.c |3 + include/qemu-common.h|1 replay/replay-internal.h |4 + replay/replay-time.c | 132 ++ replay/replay.h |8 +++ vl.c | 17 +- 7 files changed, 163 insertions(+), 5 deletions(-) diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c index f18d128..92295fb 100644 --- a/hw/timer/mc146818rtc.c +++ b/hw/timer/mc146818rtc.c @@ -28,6 +28,7 @@ #include qapi/visitor.h #include qapi-event.h #include qmp-commands.h +#include replay/replay.h #ifdef TARGET_I386 #include hw/i386/apic.h @@ -703,7 +704,7 @@ static void rtc_set_date_from_host(ISADevice *dev) RTCState *s = MC146818_RTC(dev); struct tm tm; -qemu_get_timedate(tm, 0); What about just making qemu_get_timedate use qemu_clock_get_ns(QEMU_CLOCK_HOST) instead of time()? This would just work using the infrastructure of the previous patch. I can get rid of these calls, but localtime() function used in qemu_get_timedate() will not work deterministically. Should we save its' result then? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 11/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:52, Pavel Dovgalyuk wrote: Clock ticks are considered as the sources of non-deterministic data for virtual machine. This patch implements saving the clock values when they are acquired (virtual, host clock, rdtsc, and some other timers). When replaying the execution corresponding values are read from log and transfered to the module, which wants to read the values. Such a design required the clock polling to be synchronized. Sometimes it is not true - e.g. when timeouts for timer lists are checked. In this case we use a cached value of the clock, passing it to the client code. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- cpus.c |3 +- include/qemu/timer.h | 10 + qemu-timer.c |7 ++-- replay/Makefile.objs |1 + replay/replay-internal.h | 13 +++ replay/replay-time.c | 84 ++ replay/replay.h | 25 ++ stubs/replay.c |9 + 8 files changed, 147 insertions(+), 5 deletions(-) create mode 100755 replay/replay-time.c diff --git a/cpus.c b/cpus.c index 8787277..01d89aa 100644 --- a/cpus.c +++ b/cpus.c @@ -353,7 +353,8 @@ static void icount_warp_rt(void *opaque) seqlock_write_lock(timers_state.vm_clock_seqlock); if (runstate_is_running()) { -int64_t clock = cpu_get_clock_locked(); +int64_t clock = REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT, + cpu_get_clock_locked()); int64_t warp_delta; warp_delta = clock - vm_clock_warp_start; diff --git a/include/qemu/timer.h b/include/qemu/timer.h index 0666920..0c2472c 100644 --- a/include/qemu/timer.h +++ b/include/qemu/timer.h @@ -4,6 +4,7 @@ #include qemu/typedefs.h #include qemu-common.h #include qemu/notify.h +#include replay/replay.h /* timers */ @@ -760,6 +761,8 @@ int64_t cpu_icount_to_ns(int64_t icount); /***/ /* host CPU ticks (if available) */ +#define cpu_get_real_ticks cpu_get_real_ticks_impl + #if defined(_ARCH_PPC) static inline int64_t cpu_get_real_ticks(void) @@ -913,6 +916,13 @@ static inline int64_t cpu_get_real_ticks (void) } #endif +#undef cpu_get_real_ticks + +static inline int64_t cpu_get_real_ticks(void) cpu_get_real_ticks should never be used. Please instead wrap cpu_get_ticks() with REPLAY_CLOCK. I don't quite understand this comment. Do you mean that I should move REPLAY_CLOCK to the cpu_get_real_ticks usages instead of it's implementation? +{ +return REPLAY_CLOCK(REPLAY_CLOCK_REAL_TICKS, cpu_get_real_ticks_impl()); +} + #ifdef CONFIG_PROFILER static inline int64_t profile_getclock(void) { diff --git a/qemu-timer.c b/qemu-timer.c index 98d9d1b..bc981a2 100644 --- a/qemu-timer.c +++ b/qemu-timer.c @@ -25,6 +25,7 @@ #include sysemu/sysemu.h #include monitor/monitor.h #include ui/console.h +#include replay/replay.h #include hw/hw.h @@ -566,15 +567,15 @@ int64_t qemu_clock_get_ns(QEMUClockType type) return cpu_get_clock(); } case QEMU_CLOCK_HOST: -now = get_clock_realtime(); +now = REPLAY_CLOCK(REPLAY_CLOCK_HOST, get_clock_realtime()); last = clock-last; clock-last = now; -if (now last) { +if (now last replay_mode == REPLAY_MODE_NONE) { notifier_list_notify(clock-reset_notifiers, now); } return now; case QEMU_CLOCK_VIRTUAL_RT: -return cpu_get_clock(); +return REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT, cpu_get_clock()); } } diff --git a/replay/Makefile.objs b/replay/Makefile.objs index 56da09c..257c320 100755 --- a/replay/Makefile.objs +++ b/replay/Makefile.objs @@ -1,3 +1,4 @@ obj-y += replay.o obj-y += replay-internal.o obj-y += replay-events.o +obj-y += replay-time.o diff --git a/replay/replay-internal.h b/replay/replay-internal.h index 1666d6e..e906ec3 100755 --- a/replay/replay-internal.h +++ b/replay/replay-internal.h @@ -22,7 +22,10 @@ enum ReplayEvents { /* for emulated exceptions */ EVENT_EXCEPTION, /* for async events */ -EVENT_ASYNC +EVENT_ASYNC, +/* for clock read/writes */ +/* some of grteater codes are reserved for clocks */ +EVENT_CLOCK }; /* Asynchronous events IDs */ @@ -34,6 +37,8 @@ enum ReplayAsyncEventKind { typedef enum ReplayAsyncEventKind ReplayAsyncEventKind; typedef struct ReplayState { +/*! Cached clock values. */ +int64_t cached_clock[REPLAY_CLOCK_COUNT]; /*! Current step - number of processed instructions and timer events. */ uint64_t current_step; /*! Number of instructions to
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 02/02/2015 13:28, Pavel Dovgaluk wrote: cpu-exception_index = EXCP_INTERRUPT; next_tb = 0; +qemu_notify_event(); Why is this needed? It is needed to wake up iothread in replay mode. Otherwise it waits for additional time and replay becomes too slow. What event (something from a timerlist?) is ready, that has not been notified to the iothread yet? qemu_notify_event() should never be needed in common case. It's probably missing somewhere else. I think in this case there are no events at all - just reading timers values that were made while recording. We have to replay these reads by waking iothread. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 09/21] replay: interrupts and exceptions
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:52, Pavel Dovgalyuk wrote: +if (replay_mode == REPLAY_MODE_RECORD) { +replay_save_instructions(); +replay_put_event(EVENT_EXCEPTION); +return true; Missing mutex lock/unlock. I think not. We just have to write number of already executed instructions. This number is not linked to exception event. They could be read in replay while processing some other event. @@ -1294,6 +1295,9 @@ bool x86_cpu_exec_interrupt(CPUState *cs, int interrupt_request) if (interrupt_request CPU_INTERRUPT_POLL) { cs-interrupt_request = ~CPU_INTERRUPT_POLL; apic_poll_irq(cpu-apic_state); +if (replay_mode != REPLAY_MODE_NONE) { +return true; +} } #endif Can you explain this? It probably needs a comment. Each function call should process one interrupt at once, because it corresponds to single event in the log. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 08/21] cpu: replay instructions sequence
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:52, Pavel Dovgalyuk wrote: This patch adds calls to replay functions into the icount setup block. In record mode number of executed instructions is written to the log. In replay mode number of istructions to execute is taken from the replay log. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- cpu-exec.c |1 + cpus.c | 28 ++-- replay/replay.c | 24 replay/replay.h |4 4 files changed, 47 insertions(+), 10 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 49f01f5..99a0993 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -531,6 +531,7 @@ int cpu_exec(CPUArchState *env) } cpu-exception_index = EXCP_INTERRUPT; next_tb = 0; +qemu_notify_event(); Why is this needed? It is needed to wake up iothread in replay mode. Otherwise it waits for additional time and replay becomes too slow. cpu_loop_exit(cpu); } break; Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 00/21] Deterministic replay core
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 22/01/2015 09:51, Pavel Dovgalyuk wrote: These patches include only core function of the replay, excluding the support for replaying serial, audio, network, and USB devices' operations. Reverse debugging and monitor commands were also excluded to be submitted later as separate patches. BTW, I am interested in the serial part of the RR functionality, since it is probably simpler than block. Right, it's quite simple. But replaying block layer gives us a usable version of record/replay. That is why serial devices replay is not included in the very first series. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 04/21] replay: internal functions for replay log
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 22/01/2015 09:51, Pavel Dovgalyuk wrote: This patch adds functions to perform read and write operations with replay log. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- +void replay_check_error(void) Could this be static? (I haven't checked). No, because it is used from several replay files. +{ +if (replay_file) { +if (feof(replay_file)) { +fprintf(stderr, replay file is over\n); +exit(1); Perhaps qemu_system_vmstop_request_prepare + qemu_system_vmstop_request(RUN_STATE_PAUSED) instead of exit? Those two functions are thread-safe. There is no need to stop when replay file is over (because we cannot replay more). Should we send shutdown request instead? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v8 00/21] Deterministic replay core
Ping? Pavel Dovgalyuk -Original Message- From: Pavel Dovgalyuk [mailto:pavel.dovga...@ispras.ru] Sent: Thursday, January 22, 2015 11:52 AM To: qemu-devel@nongnu.org Cc: peter.mayd...@linaro.org; peter.crosthwa...@xilinx.com; ebl...@redhat.com; mark.bur...@greensocs.com; r...@ispras.ru; batuz...@ispras.ru; maria.klimushenk...@ispras.ru; pavel.dovga...@ispras.ru; pbonz...@redhat.com; alex.ben...@linaro.org; afaer...@suse.de; fred.kon...@greensocs.com Subject: [RFC PATCH v8 00/21] Deterministic replay core This set of patches is related to the reverse execution and deterministic replay of qemu execution This implementation of deterministic replay can be used for deterministic debugging of guest code through gdb remote interface. These patches include only core function of the replay, excluding the support for replaying serial, audio, network, and USB devices' operations. Reverse debugging and monitor commands were also excluded to be submitted later as separate patches. Execution recording writes non-deterministic events log, which can be later used for replaying the execution anywhere and for unlimited number of times. It also supports checkpointing for faster rewinding during reverse debugging. Execution replaying reads the log and replays all non-deterministic events including external input, hardware clocks, and interrupts. Deterministic replay has the following features: * Deterministically replays whole system execution and all contents of the memory, state of the hadrware devices, clocks, and screen of the VM. * Writes execution log into the file for latter replaying for multiple times on different machines. * Supports i386, x86_64, and ARM hardware platforms. * Performs deterministic replay of all operations with keyboard and mouse input devices. * Supports auto-checkpointing for convenient reverse debugging. Usage of the record/replay: * First, record the execution, by adding the following string to the command line: '-record fname=replay.bin -icount 7 -net none'. Block devices' images are not actually changed in the recording mode, because all of the changes are written to the temporary overlay file. * Then you can replay it for the multiple times by using another command line option: '-replay fname=replay.bin -icount 7 -net none' * '-net none' option should also be specified if network replay patches are not applied. Paper with short description of deterministic replay implementation: http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html Modifications of qemu include: * wrappers for clock and time functions to save their return values in the log * saving different asynchronous events (e.g. system shutdown) into the log * synchronization of the bottom halves execution * synchronization of the threads from thread pool * recording/replaying user input (mouse and keyboard) * adding internal events for cpu and io synchronization v8 changes: * Simplified processing of the shutdown event (as suggested by Paolo Bonzini) * Replaced stack of bottom halves in AIO context with QSIMPLEQ (as suggested by Paolo Bonzini) * Moved replay_submode out of the series (as suggested by Paolo Bonzini) * Moved suffix option out of the series * Converted some of the defines into enums (as suggested by Paolo Bonzini) * Encapsulated save_tm/read_tm calls into the single function (as suggested by Paolo Bonzini) * Moved record/replay options to icount group (as suggested by Paolo Bonzini) * Updated mutex protection for the events queue (as suggested by Paolo Bonzini) * Added mutex to protect replay log file (as suggested by Paolo Bonzini) * Minor cleanups v7 changes: * Removed patches that were applied to upstream. v6 changes: * Fixed replay stub return value (as suggested by Eric Blake) * Fixed icount warping. * Virtual rt clock now uses cpu_get_clock() (as suggested by Paolo Bonzini) * Replated get_clock_realtime and get_clock calls with qemu clock requests (as suggested by Paolo Bonzini) * Modified can_do_io logic to allow requesting icount from cpu_exec function (as suggested by Paolo Bonzini) * Removed applied patches. v5 changes: * Minor changes. * Used fixed-width integer types for read/write functions (as suggested by Alex Bennee) * Moved savevm-related code out of the core. * Added new traced clock for deterministic virtual clock warping (as suggested by Paolo Bonzini) * Fixed exception_index reset for user mode (as suggested by Paolo Bonzini) * Adopted Paolo's icount patches * Fixed hardware interrupts replaying v4 changes: * Updated block drivers to support new bdrv_open interface. * Moved migration patches into separate series (as suggested by Paolo Bonzini) * Fixed a bug in replay_break operation. * Fixed rtl8139 migration for replay. * Fixed 'period' parameter processing for
Re: [Qemu-devel] [RFC PATCH v7 15/21] replay: checkpoints
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 13/01/2015 10:15, Pavel Dovgaluk wrote: The numbers have no meaning. They just have to be distinct in different places. This is easier to achieve if you give a name to each place. Sorry, missed one thing. run_all is used to distinguish timers processed in AIO by calling of timerlistgroup_run_timers function and in main loop by calling qemu_clock_run_all_timers. Should you instead distinguish which TimerListGroup is being run? Then main loop would checkpoint once for every TimerListGroup (which means twice). I checked that. We should distinguish only run_all_timers and all other calls. Distinguishing the TimerListGroups does not work - replay hangs. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 19/01/2015 14:10, Pavel Dovgaluk wrote: Because 'A' is written only inside some of the replay_run_event callbacks. It depends on type of the event and it's processing function inside the QEMU core. There could be no 'A' at all. Why can't that code write the 'E' as well? Because such callbacks do not know that they are called from record/replay event. They may be called from record/replay code and from other parts of QEMU. And they may write save something low-level like timer request. I do not understand. Can you make an example with actual function names and data that is written? I haven't found any actual examples. Probably they occurred in previous version which wrote virtual timers into the log. Anyway I will add the mutex then. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 19/01/2015 13:03, Pavel Dovgaluk wrote: It will work for protecting the events list (I've already did this). But that will not work for protecting the log file. replay_run_event can write some data to the log. And also some other function like replay_checkpoint can also write to the log simultaneously (if we will remove the global mutex). That's why we cannot simply replace the global mutex with some kind of local one. That's why you have to document very well the architecture and the locking policy. Ok. For example, why can't replay_run_event (or something that it calls) take the replay lock locally, when it writes to the log? replay_run_event can take the lock. Suppose that it writes data 'A'. replay_run_event itself corresponds to some event 'E'. We expect that the following sequence of the events should occur: 'E', 'A'. But if something will be written to the log between 'E' and 'A' then replay_run_event in replay mode will stuck, because it will not see its data 'A'. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 19/01/2015 13:43, Pavel Dovgaluk wrote: For example, why can't replay_run_event (or something that it calls) take the replay lock locally, when it writes to the log? replay_run_event can take the lock. Suppose that it writes data 'A'. replay_run_event itself corresponds to some event 'E'. We expect that the following sequence of the events should occur: 'E', 'A'. But if something will be written to the log between 'E' and 'A' then replay_run_event in replay mode will stuck, because it will not see its data 'A'. It would be easier if you pointed me to actual code in the series. But this doesn't seem impossible to fix by atomically writing the 'E' and 'A' in the same critical section. Because 'A' is written only inside some of the replay_run_event callbacks. It depends on type of the event and it's processing function inside the QEMU core. There could be no 'A' at all. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 19/01/2015 14:01, Pavel Dovgaluk wrote: It would be easier if you pointed me to actual code in the series. But this doesn't seem impossible to fix by atomically writing the 'E' and 'A' in the same critical section. Because 'A' is written only inside some of the replay_run_event callbacks. It depends on type of the event and it's processing function inside the QEMU core. There could be no 'A' at all. Why can't that code write the 'E' as well? Because such callbacks do not know that they are called from record/replay event. They may be called from record/replay code and from other parts of QEMU. And they may write save something low-level like timer request. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 16/01/2015 09:03, Pavel Dovgaluk wrote: There is one problem with protecting log file inside the replay code. We probably should have the following sequence of calls: lock_replay_mutex replay_save_events replay_run_event unlock_replay_mutex But replay_run_event can also generate some log events and we will have deadlock here. That is why we rely on global mutex. I think replay_run_event should run with the lock _not_ taken, that is: qemu_mutex_lock(lock); while (!QTAILQ_EMPTY(events_list)) { Event *event = QTAILQ_FIRST(events_list); QTAILQ_REMOVE(events_list, event, events); qemu_mutex_unlock(lock); replay_run_event(event); g_free(event); qemu_mutex_lock(lock); } qemu_mutex_unlock(lock); It will work for protecting the events list (I've already did this). But that will not work for protecting the log file. replay_run_event can write some data to the log. And also some other function like replay_checkpoint can also write to the log simultaneously (if we will remove the global mutex). That's why we cannot simply replace the global mutex with some kind of local one. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 13/01/2015 10:21, Pavel Dovgaluk wrote: +/*! Reads next clock event from the input. */ +int64_t replay_read_clock(unsigned int kind) +{ +if (kind = REPLAY_CLOCK_COUNT) { +fprintf(stderr, invalid clock ID %d for replay\n, kind); +exit(1); +} + +replay_exec_instructions(); + +if (replay_file) { +if (skip_async_events(EVENT_CLOCK + kind)) { +replay_read_next_clock(kind); +} +int64_t ret = replay_state.cached_clock[kind]; + +return ret; +} + +fprintf(stderr, REPLAY INTERNAL ERROR %d\n, __LINE__); +exit(1); +} Is this thread safe? It is, because order of main_loop and cpu_exec executions is protected by global mutex. Please document exactly which globals are protected by the rr QemuMutex, and which by the global mutex. But I think as many variables as possible should be protected by the rr QemuMutex, for two reasons: 1) people are working on threaded TCG. While SMP record/replay is a whole different story, even UP TCG will be multithreaded. 2) in the end it makes reviewing the code simpler. There is one problem with protecting log file inside the replay code. We probably should have the following sequence of calls: lock_replay_mutex replay_save_events replay_run_event unlock_replay_mutex But replay_run_event can also generate some log events and we will have deadlock here. That is why we rely on global mutex. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 21/21] replay: recording of the user input
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:01, Pavel Dovgalyuk wrote: +void qemu_input_event_send(QemuConsole *src, InputEvent *evt) { -QemuInputHandlerState *s; - if (!runstate_is_running() !runstate_check(RUN_STATE_SUSPENDED)) { return; } +if (replay_mode == REPLAY_MODE_PLAY) { +/* Nothing */ +} else if (replay_mode == REPLAY_MODE_RECORD) { +replay_add_input_event(evt); +} else { +qemu_input_event_send_impl(src, evt); +} Similar to other cases, please wrap this into a single function, something like if (replay_handle_input_event(evt)) { return; } /* ... original contents of qemu_input_event_send ... */ I can wrap this, but cannot get rid of _impl function. In replay mode we have to make two things: - Deny real user imput (qemu_input_event_send should do nothing) - Read input from the log (qemu_input_event_send should process read data) Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 08/21] replay: interrupts and exceptions
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:40, Pavel Dovgaluk wrote: Perhaps check the replay_interrupt() outside, in an with if (unlikely(interrupt_request))? You mean that I should wrap whole condition into unlikely? No, I wanted to have a single check of replay_interrupt() and/or replay_has_interrupt(). BTW, I think this is incorrect: +if ((replay_mode != REPLAY_MODE_PLAY +|| replay_has_interrupt()) + cc-cpu_exec_interrupt(cpu, interrupt_request)) { +replay_interrupt(); because cc-cpu_exec_interrupt() can exit with cpu_loop_exit(cpu). Haven't found any. Do you have an example? I guess it could be something like this: if (replay_mode == REPLAY_MODE_PLAY !replay_has_interrupt()) { /* do nothing */ } else if (interrupt_request CPU_INTERRUPT_HALT) { replay_interrupt(); ... cpu_loop_exit(cpu); } else if (interrupt_request CPU_INTERRUPT_INIT) { replay_interrupt(); ... cpu_loop_exit(cpu); } else { replay_interrupt(); if (cc-cpu_exec_interrupt(cpu, interrupt_request)) { next_tb = 0; } } Is it normal that processing of the reset request does not execute cpu_loop_exit(cpu)? #else -if (interrupt_request CPU_INTERRUPT_RESET) { +if ((interrupt_request CPU_INTERRUPT_RESET) + replay_interrupt()) { cpu_reset(cpu); } #endif Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 08/21] replay: interrupts and exceptions
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 14/01/2015 10:07, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:40, Pavel Dovgaluk wrote: Perhaps check the replay_interrupt() outside, in an with if (unlikely(interrupt_request))? You mean that I should wrap whole condition into unlikely? No, I wanted to have a single check of replay_interrupt() and/or replay_has_interrupt(). BTW, I think this is incorrect: +if ((replay_mode != REPLAY_MODE_PLAY +|| replay_has_interrupt()) + cc-cpu_exec_interrupt(cpu, interrupt_request)) { +replay_interrupt(); because cc-cpu_exec_interrupt() can exit with cpu_loop_exit(cpu). Haven't found any. Do you have an example? Yes: cpu_svm_check_intercept_param - helper_svm_check_intercept_param - helper_vmexit - cpu_loop_exit Thanks. if (replay_mode == REPLAY_MODE_PLAY !replay_has_interrupt()) { /* do nothing */ } else if (interrupt_request CPU_INTERRUPT_HALT) { replay_interrupt(); ... cpu_loop_exit(cpu); } else if (interrupt_request CPU_INTERRUPT_INIT) { replay_interrupt(); ... cpu_loop_exit(cpu); } else { replay_interrupt(); if (cc-cpu_exec_interrupt(cpu, interrupt_request)) { next_tb = 0; } } Is it normal that processing of the reset request does not execute cpu_loop_exit(cpu)? I think it is okay. INIT executes cpu_loop_exit() on x86 because processors other than the boot processor are halted after they receive INIT. Then I cannot put everything in one if-else chain because it will change the behavior of the code. After processing RESET branch we can also process hardware interrupts (in unmodified code). Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 15/21] replay: checkpoints
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:01, Pavel Dovgalyuk wrote: +default: +case QEMU_CLOCK_VIRTUAL: +if ((replay_mode != REPLAY_MODE_NONE !runstate_is_running()) +|| !replay_checkpoint(run_all ? 2 : 3)) { +return false; +} +break; Please document the meaning of the numbers by making an enum. The numbers have no meaning. They just have to be distinct in different places. Why do you have to distinguish run_all? It seems to be early versions artifact. I'll remove it. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 15/21] replay: checkpoints
From: Pavel Dovgaluk [mailto:pavel.dovga...@ispras.ru] From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:01, Pavel Dovgalyuk wrote: +default: +case QEMU_CLOCK_VIRTUAL: +if ((replay_mode != REPLAY_MODE_NONE !runstate_is_running()) +|| !replay_checkpoint(run_all ? 2 : 3)) { +return false; +} +break; Please document the meaning of the numbers by making an enum. The numbers have no meaning. They just have to be distinct in different places. Why do you have to distinguish run_all? It seems to be early versions artifact. I'll remove it. Sorry, missed one thing. run_all is used to distinguish timers processed in AIO by calling of timerlistgroup_run_timers function and in main loop by calling qemu_clock_run_all_timers. We need to distinguish that to secure the sequence of the events. It makes sense when we use checkpointing while recording the execution. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 15/21] replay: checkpoints
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 13/01/2015 10:15, Pavel Dovgaluk wrote: The numbers have no meaning. They just have to be distinct in different places. This is easier to achieve if you give a name to each place. Sorry, missed one thing. run_all is used to distinguish timers processed in AIO by calling of timerlistgroup_run_timers function and in main loop by calling qemu_clock_run_all_timers. Should you instead distinguish which TimerListGroup is being run? Then main loop would checkpoint once for every TimerListGroup (which means twice). Then I'll have to introduce some kind of deterministic ID for them and new asynchronous event to be saved instead of common checkpoints. Because these calls can be invoked in any order. But I have no idea of how to make such a deterministic ID. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:00, Pavel Dovgalyuk wrote: +/*! Reads next clock event from the input. */ +int64_t replay_read_clock(unsigned int kind) +{ +if (kind = REPLAY_CLOCK_COUNT) { +fprintf(stderr, invalid clock ID %d for replay\n, kind); +exit(1); +} + +replay_exec_instructions(); + +if (replay_file) { +if (skip_async_events(EVENT_CLOCK + kind)) { +replay_read_next_clock(kind); +} +int64_t ret = replay_state.cached_clock[kind]; + +return ret; +} + +fprintf(stderr, REPLAY INTERNAL ERROR %d\n, __LINE__); +exit(1); +} Is this thread safe? It is, because order of main_loop and cpu_exec executions is protected by global mutex. Please introduce the record/replay QemuMutex in patch 4, and all along the series document which variables it protects. record/replay only uses mutex for events queue, because events may be added to the queue by the different threads. Pavel Dovgalyuk
Re: [Qemu-devel] [PATCH] i386: fix breakpoints handling in icount mode
From: Jan Kiszka [mailto:jan.kis...@siemens.com] On 2014-10-22 13:38, Pavel Dovgalyuk wrote: This patch fixes instructions counting when execution is stopped on breakpoint (e.g. set from gdb). Without a patch extra instruction is translated and icount is incremented by invalid value (which equals to number of executed instructions + 1). Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru --- target-i386/translate.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/target-i386/translate.c b/target-i386/translate.c index 1284173..193cf9f 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -8000,7 +8000,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu, if (bp-pc == pc_ptr !((bp-flags BP_CPU) (tb-flags HF_RF_MASK))) { gen_debug(dc, pc_ptr - dc-cs_base); -break; +goto done_generating; } } } @@ -8049,6 +8049,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu, break; } } +done_generating: if (tb-cflags CF_LAST_IO) gen_io_end(); gen_tb_end(tb, num_insns); Didn't looked into why, just bisected that this patch breaks at least certain guest-originated break- or watchpoints in TCG mode. Can be triggered by booting a Linux kernel with kgdb self-tests enabled. The result is some false reporting of a host-originated debug stop to gdb_set_stop_cpu while gdbserver_state is NULL - SEGV. It seems that kernel sets some hardware breakpoints and QEMU tries to process them with GDB stub. Modifying gdb_set_stop_cpu should help: diff --git a/gdbstub.c b/gdbstub.c index e4a1a79..e8ef546 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -1202,8 +1202,10 @@ static int gdb_handle_packet(GDBState *s, const char *line_buf) void gdb_set_stop_cpu(CPUState *cpu) { -gdbserver_state-c_cpu = cpu; -gdbserver_state-g_cpu = cpu; +if (gdbserver_state) { +gdbserver_state-c_cpu = cpu; +gdbserver_state-g_cpu = cpu; +} } #ifndef CONFIG_USER_ONLY Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 20/21] replay: command line options
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:01, Pavel Dovgalyuk wrote: This patch introduces command line options for enabling recording or replaying virtual machine behavior. -record option starts recording of the execution and saves it into the log, specified with fname parameter. -replay option is intended for replaying previously saved log. What about just using -icount, since it requires that? You mean adding more attributes to -icount option? Or implicitly enabling -icount with -record/replay? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 12/21] replay: recording and replaying clock ticks
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:00, Pavel Dovgalyuk wrote: diff --git a/replay/replay.h b/replay/replay.h index 90a949b..1c18c0e 100755 --- a/replay/replay.h +++ b/replay/replay.h @@ -16,6 +16,16 @@ #include stdint.h #include qapi-types.h +/* replay clock kinds */ +/* rdtsc */ +#define REPLAY_CLOCK_REAL_TICKS 0 +/* host_clock */ +#define REPLAY_CLOCK_HOST 1 +/* virtual_rt_clock */ +#define REPLAY_CLOCK_VIRTUAL_RT 2 + +#define REPLAY_CLOCK_COUNT 3 + extern ReplayMode replay_mode; extern char *replay_image_suffix; @@ -47,6 +57,19 @@ bool replay_interrupt(void); Returns true, when interrupt request is pending */ bool replay_has_interrupt(void); +/* Processing clocks and other time sources */ + +/*! Save the specified clock */ +int64_t replay_save_clock(unsigned int kind, int64_t clock); +/*! Read the specified clock from the log or return cached data */ +int64_t replay_read_clock(unsigned int kind); +/*! Saves or reads the clock depending on the current replay mode. */ +#define REPLAY_CLOCK(clock, value) \ +(replay_mode == REPLAY_MODE_PLAY ? replay_read_clock((clock)) \ +: replay_mode == REPLAY_MODE_RECORD \ +? replay_save_clock((clock), (value)) \ +: (value)) + Inline functions please, not macros. Macro is required here, because I do not want the value to be computed in replay mode at all. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 02/21] replay: global variables and function stubs
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 12:59, Pavel Dovgalyuk wrote: +## +# ReplaySubmode: +# +# Submode of the replay subsystem. +# +# @unknown: used for modes different from play. +# +# @normal: normal replay mode. +# +# Since: 2.3 +## +{ 'enum': 'ReplaySubmode', + 'data': [ 'unknown', 'normal' ] } What is a submode? Can you just check if replay_file is NULL? Submode is not very useful for these core patches. More submodes will be introduces in reverse debugging patches. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 08/21] replay: interrupts and exceptions
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 12/01/2015 13:00, Pavel Dovgalyuk wrote: +if (replay_exception()) { +cc-do_interrupt(cpu); +cpu-exception_index = -1; I cannot see replay_exception() in the series? It is in the patch number 8. @@ -419,21 +434,24 @@ int cpu_exec(CPUArchState *env) cpu-exception_index = EXCP_DEBUG; cpu_loop_exit(cpu); Why not for EXCP_DEBUG? As far as I understand, EXCP_DEBUG is used for external debugger (like gdb). Debugger is usually connected to VM during replay. That is why we do not need to replay EXCP_DEBUG - it may appear in different places, and it does not affect on behavior of the virtual machine. -if (interrupt_request CPU_INTERRUPT_HALT) { +if ((interrupt_request CPU_INTERRUPT_HALT) + replay_interrupt()) { cpu-interrupt_request = ~CPU_INTERRUPT_HALT; cpu-halted = 1; cpu-exception_index = EXCP_HLT; cpu_loop_exit(cpu); } #if defined(TARGET_I386) -if (interrupt_request CPU_INTERRUPT_INIT) { +if ((interrupt_request CPU_INTERRUPT_INIT) + replay_interrupt()) { cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0); do_cpu_init(x86_cpu); cpu-exception_index = EXCP_HALTED; cpu_loop_exit(cpu); } #else -if (interrupt_request CPU_INTERRUPT_RESET) { +if ((interrupt_request CPU_INTERRUPT_RESET) + replay_interrupt()) { cpu_reset(cpu); } #endif Perhaps check the replay_interrupt() outside, in an with if (unlikely(interrupt_request))? You mean that I should wrap whole condition into unlikely? @@ -441,7 +459,10 @@ int cpu_exec(CPUArchState *env) False when the interrupt isn't processed, True when it is, and we should restart on a new TB, and via longjmp via cpu_loop_exit. */ -if (cc-cpu_exec_interrupt(cpu, interrupt_request)) { +if ((replay_mode != REPLAY_MODE_PLAY +|| replay_has_interrupt()) + cc-cpu_exec_interrupt(cpu, interrupt_request)) { +replay_interrupt(); Please put this in a separate function like: if (replay_mode == REPLAY_MODE_PLAY !replay_has_interrupt()) { return false; } ret = cc-cpu_exec_interrupt(cpu, interrupt_request); if (ret) { replay_interrupt(); } return ret; Ok. Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v7 04/21] replay: internal functions for replay log
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 12/01/2015 13:00, Pavel Dovgalyuk wrote: + +volatile unsigned int replay_data_kind = -1; +volatile unsigned int replay_has_unread_data; + Why does this have to be volatile? I think 'volatile' is an artifact from the earlier version. Could be removed then. Pavel Dovgalyuk
Re: [Qemu-devel] [PULL 26/47] cpu-exec: reset exception_index correctly
From: Eduardo Habkost [mailto:ehabk...@redhat.com] On Mon, Dec 15, 2014 at 05:38:10PM +0100, Paolo Bonzini wrote: From: Pavel Dovgalyuk pavel.dovga...@ispras.ru Exception index is reset at every entry at every entry into cpu_exec() function. This may cause missing the exceptions while replaying them. This patch moves exception_index reset to the locations where they are processed. Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru Signed-off-by: Paolo Bonzini pbonz...@redhat.com x86_64-linux-user is broken after applying this patch: [qemu/(e511b4d...)|BISECTING]$ ./install/bin/qemu-x86_64 /bin/true qemu: uncaught target signal 8 (Floating point exception) - core dumped Floating point exception (core dumped) I cannot reproduce this bug. QEMU runs and terminates correctly. Can you show me call stack for the exception? Pavel Dovgalyuk
Re: [Qemu-devel] [PULL 27/47] icount: set can_do_io outside TB execution
From: Paolo Bonzini [mailto:pbonz...@redhat.com] This patch sets can_do_io function to allow reading icount within cpu-exec, but outside TB execution. Will you apply other icount patches that enables/disables using icount in particular TBs? Pavel Dovgalyuk Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cpu-exec.c | 3 +++ cpus.c | 3 +++ 2 files changed, 6 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 4df9856..cce80f0 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -168,7 +168,9 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr) } #endif /* DEBUG_DISAS */ +cpu-can_do_io = 0; next_tb = tcg_qemu_tb_exec(env, tb_ptr); +cpu-can_do_io = 1; trace_exec_tb_exit((void *) (next_tb ~TB_EXIT_MASK), next_tb TB_EXIT_MASK); @@ -543,6 +545,7 @@ int cpu_exec(CPUArchState *env) cpu = current_cpu; env = cpu-env_ptr; cc = CPU_GET_CLASS(cpu); +cpu-can_do_io = 1; #ifdef TARGET_I386 x86_cpu = X86_CPU(cpu); #endif diff --git a/cpus.c b/cpus.c index 91119bb..615d4ae 100644 --- a/cpus.c +++ b/cpus.c @@ -935,6 +935,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg) qemu_thread_get_self(cpu-thread); cpu-thread_id = qemu_get_thread_id(); cpu-exception_index = -1; +cpu-can_do_io = 1; current_cpu = cpu; r = kvm_init_vcpu(cpu); @@ -976,6 +977,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg) qemu_thread_get_self(cpu-thread); cpu-thread_id = qemu_get_thread_id(); cpu-exception_index = -1; +cpu-can_do_io = 1; sigemptyset(waitset); sigaddset(waitset, SIG_IPI); @@ -1019,6 +1021,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg) cpu-thread_id = qemu_get_thread_id(); cpu-created = true; cpu-exception_index = -1; +cpu-can_do_io = 1; } qemu_cond_signal(qemu_cpu_cond); -- 1.8.3.1
Re: [Qemu-devel] [RFC PATCH v6 08/32] icount: implement icount requesting
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini On 10/12/2014 07:35, Pavel Dovgalyuk wrote: No, it worked well and I deleted _nocache version of that function. But I still need _raw one to get the instructions counter. Oh, great. This patch can also go in early. What's the next? Will you upstream some of the patches to simplify reviewing of the others? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v5 07/31] icount: implement icount requesting
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 05/12/2014 06:34, Pavel Dovgaluk wrote: From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 04/12/2014 12:02, Pavel Dovgaluk wrote: Why do you need to do this if !cpu_can_do_io(cpu)? We save number of executed instruction when saving interrupt or exception event. It leads to the call of cpu_get_instructions_counter() from cpu_exec function (through several replay functions). It is correct (because no block is executing at that moment) but is different to prior usage of icount requests. Why is !cpu_can_do_io(cpu) if no block is executing? Because it returns cpu-can_do_io which is equal to zero at that moment. And why is can_do_io zero? :) Is the fix to move the place where can_do_io becomes nonzero? can_do_io is set by gen_io_start function. As I understand, it is used to protect determinism in icount mode, because it allows non-deterministic (port io, raising interrupt) operations only at the end of the translation blocks. When someone tries to use MMIO in the middle of TB, that TB is recompiled to place this instruction at the end of the block. Do you mean that we can set can_do_io before execution of the block and reset it at the beginning of the execution? Pavel Dovgalyuk
Re: [Qemu-devel] [RFC PATCH v5 07/31] icount: implement icount requesting
From: Paolo Bonzini [mailto:pbonz...@redhat.com] On 05/12/2014 11:55, Pavel Dovgaluk wrote: And why is can_do_io zero? :) Is the fix to move the place where can_do_io becomes nonzero? can_do_io is set by gen_io_start function. As I understand, it is used to protect determinism in icount mode, because it allows non-deterministic (port io, raising interrupt) operations only at the end of the translation blocks. When someone tries to use MMIO in the middle of TB, that TB is recompiled to place this instruction at the end of the block. Do you mean that we can set can_do_io before execution of the block and reset it at the beginning of the execution? Yes, we could try setting it after execution of the block and clearing it afterwards. Peter knows that part of icount better though (I know mostly the timer/warping parts). Ok, how about these changes? diff --git a/cpu-exec.c b/cpu-exec.c index f52f292..88675ca 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -168,7 +168,9 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr) } #endif /* DEBUG_DISAS */ +cpu-can_do_io = 0; next_tb = tcg_qemu_tb_exec(env, tb_ptr); +cpu-can_do_io = 1; trace_exec_tb_exit((void *) (next_tb ~TB_EXIT_MASK), next_tb TB_EXIT_MASK); @@ -548,6 +550,7 @@ int cpu_exec(CPUArchState *env) cpu = current_cpu; env = cpu-env_ptr; cc = CPU_GET_CLASS(cpu); +cpu-can_do_io = 1; #ifdef TARGET_I386 x86_cpu = X86_CPU(cpu); #endif diff --git a/cpus.c b/cpus.c index 0c33458..7a45a51 100644 --- a/cpus.c +++ b/cpus.c @@ -934,6 +934,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg) qemu_mutex_lock(qemu_global_mutex); qemu_thread_get_self(cpu-thread); cpu-thread_id = qemu_get_thread_id(); +cpu-can_do_io = 1; current_cpu = cpu; r = kvm_init_vcpu(cpu); @@ -974,6 +975,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg) qemu_mutex_lock_iothread(); qemu_thread_get_self(cpu-thread); cpu-thread_id = qemu_get_thread_id(); +cpu-can_do_io = 1; sigemptyset(waitset); sigaddset(waitset, SIG_IPI); @@ -1016,6 +1018,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg) CPU_FOREACH(cpu) { cpu-thread_id = qemu_get_thread_id(); cpu-created = true; +cpu-can_do_io = 1; } qemu_cond_signal(qemu_cpu_cond);