Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2019-01-17 Thread Artem Pisarenko
> The source is the external timers, that depend on "outer world", e.g., slirp.
> These timers are not recorded/replayed, but may affect the icount warping
> (which should remain deterministic).

I thought we agreed earlier in this discussion that external timers
are deterministic by design, because they are subset of virtual clock
timers. Regarding slirp module, although it interfaces "outer world"
(except when 'restrict' option is on), it still uses IPv6 timer
deterministically. You may check it yourself. Otherwise it shouldn't
use timer of 'virtual clock' type for this purpose and this needs to
be fixed.
So I still don't understand. And I'm not asking you to explain. I'm
just hinting you that your fix may finally happen to be incorrect in
case if it based on wrong assertion I'm talking about.

> Please check the new version of the series.

Sorry, I'm too busy right now for deep analysis. I've just tried to
briefly understand issue you pointed and provide minimal feedback in
order to prevent possible wrong decisions. I'll return to this work
later.



Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2019-01-16 Thread Artem Pisarenko
ср, 16 янв. 2019 г. в 12:15, Pavel Dovgalyuk :
>
> > From: Artem Pisarenko [mailto:artem.k.pisare...@gmail.com]
> > > It seems, that this approach is not always correct.
> > > Now timerlist_deadline_ns uses all virtual timers for deadline 
> > > calculation (including
> > external
> > > ones).
> > > qemu_start_warp_timer uses the deadline for setting warp timer (which 
> > > should be
> > deterministic).
> > > Therefore warp timer may become non-deterministic and replay may behave 
> > > differently compared
> > to
> > > recording phase.
> > >
> > > We have to rollback these or improve somehow to avoid non-determinism.
> >
> > I dont understand how this approach would even introduce
> > non-determinism. I'm not sure about aspects of timers subsystem you
> > mentioned, assuming that maybe we missed something when tried to
> > optimize. But this has nothing to do with determinism as long as we
> > treat all virtual clock timers as deterministic, regardless of
> > EXTERNAL attribute being set or not. They are intended to be used as
> > such by design, aren't? This attribute was introduced purely to avoid
> > extra events in log.
>
> Right, but deadline calculation and icount warp events (that are written into 
> the log too)
> depend on the active virtual timers. Therefore external ones may affect 
> icount warp
> operation sequence.
>
> Pavel Dovgalyuk
>

Sorry, but I still don't understand the source of non-determinism.
>From what you said I may understand that replay module actually has
some additional non-trivial (for me) dependencies on EXTERNAL
attribute other than one, introduced in 'timerlist_run_timers()' (i.e.
it expects deadline calculations somewhere else to match decisions
made in this function, or something like that). I would agree that
these patch series might introduce some incorrect behavior. But it
must be identical in all emulations running in same conditions,
because deadline, warp timer and their effects are all dependent only
on virtual timers, and, therefore, are deterministic too. Of course,
it needs to be fixed. Did you find solution ?



Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2019-01-11 Thread Artem Pisarenko
> It seems, that this approach is not always correct.
> Now timerlist_deadline_ns uses all virtual timers for deadline calculation 
> (including external
> ones).
> qemu_start_warp_timer uses the deadline for setting warp timer (which should 
> be deterministic).
> Therefore warp timer may become non-deterministic and replay may behave 
> differently compared to
> recording phase.
>
> We have to rollback these or improve somehow to avoid non-determinism.

I dont understand how this approach would even introduce
non-determinism. I'm not sure about aspects of timers subsystem you
mentioned, assuming that maybe we missed something when tried to
optimize. But this has nothing to do with determinism as long as we
treat all virtual clock timers as deterministic, regardless of
EXTERNAL attribute being set or not. They are intended to be used as
such by design, aren't? This attribute was introduced purely to avoid
extra events in log.

Artem



Re: [Qemu-devel] [PATCH v8 20/20] replay: document development rules

2018-12-18 Thread Artem Pisarenko
> +Virtual devices can also use realtime clock for the events that do not
change
> +the guest state directly. When the clock ticking should depend on VM
execution
> +speed, use virtual ext clock. It is not deterministic, but its speed
depends
> +on the guest execution. This clock is used by the virtual devices (e.g.,
> +slirp routing device) that lie outside the replayed guest.

Virtual ext clock replaced with virtual clock timers having EXTERNAL
attribute.


Re: [Qemu-devel] [PATCH] netdev: fix socket backend implementation and docs

2018-11-13 Thread Artem Pisarenko
> Incidentally this method should not even be part of this header
> files.  qemu/sockets.h is for code that lives in util/qemu-sockets.c
>
> The parse_host_port declaration and impl should better live in
> net/util.{c,h}, so I'd recommend moving that as the first patch
> in a series.

Ok.

> This is really not a desirable thing todo. While you are changing
> one area of code, but you've got a number of independent bugs
> or improvements you are making. These should be done as a patch
> series, one distinct fix/change per patch. Refactoring should
> especially always be done separately from any functional changes
> to reviewers can clearly see no accidental behaviour changes are
> introduced by the refactoring.

I understand that and I done it intentionally for this specific patch.
These changes are really very and very related to each other (not only
because they mostly live in one source file).
Yes, few specific changes are easily may be represented by separate
patches, but it also means that it doesn't bring much convinience for
review. Converting this to patch series just introduce more complexity and
effort (mostly for me, but for reviewers too enough). Not also it requires
95% of effort already spent, but for single reviewer understanding of
validity of each separate patch will be mostly invalidated by following
patch. Furthermore, if some patch will require rework, then it also most
probably will require rework whole chain of next dependent patches, thus
invalidating their 'reviewed' status if any. I beleive this patch is worth
of exception from principles you mentioned. Whole net/socket.c requires
revision - separating changes wil not simplify it. Just consider it as one
large fix of something hardly broken and given new fresh look and view.

If you still insist on patch series, I'll do it, but later, much later
(didn't planned such large efforts on this).


пн, 12 нояб. 2018 г. в 21:31, Daniel P. Berrangé :

> On Wed, Nov 07, 2018 at 12:57:30PM +0600, Artem Pisarenko wrote:
> > Changes:
> > - user documentation and QAPI 'NetdevSocketOptions' comments updated
> > to match current implementation ('udp' type description added, 'fd'
> > option separated to exclusive type and described, 'localaddr'-related
> > description for 'mcast' type fixed, hostname parts in "[host]:port"
> > options updated to match optional/required syntax, various fixes and
> > improvements in options breakdown and wording);
> > - 'fd' type backend: requirement for socket handle being already
> > binded and connected made explicit and documented;
> > - 'fd' type backend: fix broken SOCK_DGRAM support;
> > - 'fd' type backend: removed multicast support and cleaned up broken
> > workaround for it (never called);
> > - fix possible broken multicasting in win32 platform;
> > - fix parsing of "[host]:port" options (added error handling for cases
> > where "host" part is documented as required but isn't provided);
> > - some error messages improved;
> > - other small fixes and refactoring in code.
> >
> > Signed-off-by: Artem Pisarenko 
> > ---
> >
> > Notes:
> > (Since these changes are closely related, I've combined them in one
> patch.)
>
> This is really not a desirable thing todo. While you are changing
> one area of code, but you've got a number of independent bugs
> or improvements you are making. These should be done as a patch
> series, one distinct fix/change per patch. Refactoring should
> especially always be done separately from any functional changes
> to reviewers can clearly see no accidental behaviour changes are
> introduced by the refactoring.
>
>
> > diff --git a/include/qemu/sockets.h b/include/qemu/sockets.h
> > index 8140fea..3fad004 100644
> > --- a/include/qemu/sockets.h
> > +++ b/include/qemu/sockets.h
> > @@ -46,7 +46,7 @@ void socket_listen_cleanup(int fd, Error **errp);
> >  int socket_dgram(SocketAddress *remote, SocketAddress *local, Error
> **errp);
> >
> >  /* Old, ipv4 only bits.  Don't use for new code. */
> > -int parse_host_port(struct sockaddr_in *saddr, const char *str,
> > +int parse_host_port(struct sockaddr_in *saddr, const char *str, bool
> h_addr_opt,
> >  Error **errp);
>
> Incidentally this method should not even be part of this header
> files.  qemu/sockets.h is for code that lives in util/qemu-sockets.c
>
> The parse_host_port declaration and impl should better live in
> net/util.{c,h}, so I'd recommend moving that as the first patch
> in a series.
>
>
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-
> https://www.instagram.com/dberrange :|
>


[Qemu-devel] [PATCH] netdev: fix socket backend implementation and docs

2018-11-06 Thread Artem Pisarenko
Changes:
- user documentation and QAPI 'NetdevSocketOptions' comments updated
to match current implementation ('udp' type description added, 'fd'
option separated to exclusive type and described, 'localaddr'-related
description for 'mcast' type fixed, hostname parts in "[host]:port"
options updated to match optional/required syntax, various fixes and
improvements in options breakdown and wording);
- 'fd' type backend: requirement for socket handle being already
binded and connected made explicit and documented;
- 'fd' type backend: fix broken SOCK_DGRAM support;
- 'fd' type backend: removed multicast support and cleaned up broken
workaround for it (never called);
- fix possible broken multicasting in win32 platform;
- fix parsing of "[host]:port" options (added error handling for cases
where "host" part is documented as required but isn't provided);
- some error messages improved;
- other small fixes and refactoring in code.

Signed-off-by: Artem Pisarenko 
---

Notes:
(Since these changes are closely related, I've combined them in one patch.)
This patch addresses all issues I've pointed earlier 
(http://lists.nongnu.org/archive/html/qemu-devel/2018-05/msg06695.html), except 
internal protocol, and plus additional ones.

'fd' transport and its special 'af_inet multicast' case deserve special 
attention.
Firstly, it already wasn't working as expected at least in current qemu 
version, so it isn't supposed to break any compabitibility if someone cares.
The only question is a way of fixing it.
It depends on concept behind 'fd' transport, which is unknown to me. Seems 
like initially it was just optional parameter to any transport allowing to 
replace newly created socket descriptor with user supplied one, but at some 
time someone changed it to be separate transport and not accounted 'af_inet 
multicast' case (code analysis shows that condition "is_connected && mcast != 
NULL" in 'net_socket_fd_init_dgram' function never can be true and 
"s->dgram_dst" value is left unassigned).
I would prefer concept of separate transport when qemu doesn't depend on 
underlying domain, type and protocol of socket (almost). To make it 
universal/flexible, qemu shouldn't handle their specifics, such as 'af_inet 
multicast' one. So I fixed things accordingly. For example, if user needs 
multicasting, it should already know that it cannot share one socket between 
its app and qemu instances, so user should use 'mcast' transport of qemu which 
will create separate socket for qemu instance. And there are maybe other socket 
types (possibly not existing at present moment yet) with their own specifics.
Since this concept restricts usage of sockets which cannot work in 
connected state (such as af_inet multicast), I've prepared and ready to submit 
another version of patch which solves this restriction by checking socket type 
and handling it accordingly. Currently it supports only 'af_inet multicast' 
case by preserving existed hack (slightly modified): it extracts multicast 
address from socket and duplicates socket in proper unconnected state. It also 
requires synchronization with user who will unconnect original socket back. All 
of this is very hacky, but I'm open for discussion...

 include/qemu-common.h  |   3 +
 include/qemu/sockets.h |   2 +-
 net/net.c  |   8 ++-
 net/socket.c   | 148 ++---
 qapi/net.json  |  12 ++--
 qemu-options.hx|  85 +---
 6 files changed, 147 insertions(+), 111 deletions(-)

diff --git a/include/qemu-common.h b/include/qemu-common.h
index ed60ba2..d4e4121 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -67,6 +67,8 @@ int qemu_openpty_raw(int *aslave, char *pty_name);
 #define qemu_setsockopt(sockfd, level, optname, optval, optlen) \
 setsockopt(sockfd, level, optname, (const void *)optval, optlen)
 #define qemu_recv(sockfd, buf, len, flags) recv(sockfd, (void *)buf, len, 
flags)
+#define qemu_send(sockfd, buf, len, flags) \
+send(sockfd, (const void *)buf, len, flags)
 #define qemu_sendto(sockfd, buf, len, flags, destaddr, addrlen) \
 sendto(sockfd, (const void *)buf, len, flags, destaddr, addrlen)
 #else
@@ -75,6 +77,7 @@ int qemu_openpty_raw(int *aslave, char *pty_name);
 #define qemu_setsockopt(sockfd, level, optname, optval, optlen) \
 setsockopt(sockfd, level, optname, optval, optlen)
 #define qemu_recv(sockfd, buf, len, flags) recv(sockfd, buf, len, flags)
+#define qemu_send(sockfd, buf, len, flags) send(sockfd, buf, len, flags)
 #define qemu_sendto(sockfd, buf, len, flags, destaddr, addrlen) \
 sendto(sockfd, buf, len, flags, destaddr, addrlen)
 #endif
diff --git a/include/qemu/sockets.h b/include/qemu/sockets.h
index 8140fea..3fad004 100644
--- a/include/qemu/sockets.h
+++ b/include/qemu/sockets.h
@@ -46,7 +46,7 @@ void socket_listen_cleanup(i

[Qemu-devel] [PATCH v3 2/2] tests/test-char: add muxed chardev testing for open/close

2018-11-06 Thread Artem Pisarenko
Validate that frontend callbacks for CHR_EVENT_OPENED/CHR_EVENT_CLOSED
events are being issued when expected and in strictly pairing order.

Signed-off-by: Artem Pisarenko 
---
 tests/test-char.c | 80 +--
 1 file changed, 78 insertions(+), 2 deletions(-)

diff --git a/tests/test-char.c b/tests/test-char.c
index 831e37f..657cb4c 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -16,6 +16,9 @@ static bool quit;
 
 typedef struct FeHandler {
 int read_count;
+bool is_open;
+int openclose_count;
+bool openclose_mismatch;
 int last_event;
 char read_buf[128];
 } FeHandler;
@@ -49,10 +52,24 @@ static void fe_read(void *opaque, const uint8_t *buf, int 
size)
 static void fe_event(void *opaque, int event)
 {
 FeHandler *h = opaque;
+bool new_open_state;
 
 h->last_event = event;
-if (event != CHR_EVENT_BREAK) {
+switch (event) {
+case CHR_EVENT_BREAK:
+break;
+case CHR_EVENT_OPENED:
+case CHR_EVENT_CLOSED:
+h->openclose_count++;
+new_open_state = (event == CHR_EVENT_OPENED);
+if (h->is_open == new_open_state) {
+h->openclose_mismatch = true;
+}
+h->is_open = new_open_state;
+/* no break */
+default:
 quit = true;
+break;
 }
 }
 
@@ -161,7 +178,7 @@ static void char_mux_test(void)
 QemuOpts *opts;
 Chardev *chr, *base;
 char *data;
-FeHandler h1 = { 0, }, h2 = { 0, };
+FeHandler h1 = { 0, false, 0, false, }, h2 = { 0, false, 0, false, };
 CharBackend chr_be1, chr_be2;
 
 opts = qemu_opts_create(qemu_find_opts("chardev"), "mux-label",
@@ -233,6 +250,65 @@ static void char_mux_test(void)
 g_assert_cmpint(h1.last_event, ==, CHR_EVENT_BREAK);
 g_assert_cmpint(h2.last_event, ==, CHR_EVENT_MUX_OUT);
 
+/* open/close state and corresponding events */
+g_assert_true(qemu_chr_fe_backend_open(_be1));
+g_assert_true(qemu_chr_fe_backend_open(_be2));
+g_assert_true(h1.is_open);
+g_assert_false(h1.openclose_mismatch);
+g_assert_true(h2.is_open);
+g_assert_false(h2.openclose_mismatch);
+
+h1.openclose_count = h2.openclose_count = 0;
+
+qemu_chr_fe_set_handlers(_be1, NULL, NULL, NULL, NULL,
+ NULL, NULL, false);
+qemu_chr_fe_set_handlers(_be2, NULL, NULL, NULL, NULL,
+ NULL, NULL, false);
+g_assert_cmpint(h1.openclose_count, ==, 0);
+g_assert_cmpint(h2.openclose_count, ==, 0);
+
+h1.is_open = h2.is_open = false;
+qemu_chr_fe_set_handlers(_be1,
+ NULL,
+ NULL,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+qemu_chr_fe_set_handlers(_be2,
+ NULL,
+ NULL,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+g_assert_cmpint(h1.openclose_count, ==, 1);
+g_assert_false(h1.openclose_mismatch);
+g_assert_cmpint(h2.openclose_count, ==, 1);
+g_assert_false(h2.openclose_mismatch);
+
+qemu_chr_be_event(base, CHR_EVENT_CLOSED);
+qemu_chr_be_event(base, CHR_EVENT_OPENED);
+g_assert_cmpint(h1.openclose_count, ==, 3);
+g_assert_false(h1.openclose_mismatch);
+g_assert_cmpint(h2.openclose_count, ==, 3);
+g_assert_false(h2.openclose_mismatch);
+
+qemu_chr_fe_set_handlers(_be2,
+ fe_can_read,
+ fe_read,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+qemu_chr_fe_set_handlers(_be1,
+ fe_can_read,
+ fe_read,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+
 /* remove first handler */
 qemu_chr_fe_set_handlers(_be1, NULL, NULL, NULL, NULL,
  NULL, NULL, true);
-- 
2.7.4




Re: [Qemu-devel] [PATCH v2 2/2] tests/test-char: add muxed chardev testing for open/close

2018-11-06 Thread Artem Pisarenko
> this is unnecessary change, I can drop on commit

Oops, didn't noticed your message before sent v3.


[Qemu-devel] [PATCH v3 1/2] chardev: fix mess in OPENED/CLOSED events when muxed

2018-11-06 Thread Artem Pisarenko
When chardev is multiplexed (mux=on) there are a lot of cases where
CHR_EVENT_OPENED/CHR_EVENT_CLOSED events pairing (expected from
frontend side) is broken. There are either generation of multiple
repeated or extra CHR_EVENT_OPENED events, or CHR_EVENT_CLOSED just
isn't generated at all.
This is mostly because 'qemu_chr_fe_set_handlers()' function makes its
own (and often wrong) implicit decision on updated frontend state and
invokes 'fd_event' callback with 'CHR_EVENT_OPENED'. And even worse,
it doesn't do symmetric action in opposite direction, as someone may
expect (i.e. it doesn't invoke previously set 'fd_event' with
'CHR_EVENT_CLOSED'). Muxed chardev uses trick by calling this function
again to replace callback handlers with its own ones, but it doesn't
account for such side effect.
Fix that using extended version of this function with added argument
for disabling side effect and keep original function for compatibility
with lots of frontends already using this interface and being
"tolerant" to its side effects.
One more source of event duplication is just line of code in
char-mux.c, which does far more than comment above says (obvious fix).

Signed-off-by: Artem Pisarenko 
---

Notes:
v3:
 - extended and improved commit message with 'why' and 'how' explanation

 chardev/char-fe.c | 33 -
 chardev/char-mux.c| 16 
 include/chardev/char-fe.h | 18 +-
 3 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index a8931f7..b7bcbd5 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -246,14 +246,15 @@ void qemu_chr_fe_deinit(CharBackend *b, bool del)
 }
 }
 
-void qemu_chr_fe_set_handlers(CharBackend *b,
-  IOCanReadHandler *fd_can_read,
-  IOReadHandler *fd_read,
-  IOEventHandler *fd_event,
-  BackendChangeHandler *be_change,
-  void *opaque,
-  GMainContext *context,
-  bool set_open)
+void qemu_chr_fe_set_handlers_full(CharBackend *b,
+   IOCanReadHandler *fd_can_read,
+   IOReadHandler *fd_read,
+   IOEventHandler *fd_event,
+   BackendChangeHandler *be_change,
+   void *opaque,
+   GMainContext *context,
+   bool set_open,
+   bool sync_state)
 {
 Chardev *s;
 int fe_open;
@@ -285,7 +286,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 qemu_chr_fe_take_focus(b);
 /* We're connecting to an already opened device, so let's make sure we
also get the open event */
-if (s->be_open) {
+if (sync_state && s->be_open) {
 qemu_chr_be_event(s, CHR_EVENT_OPENED);
 }
 }
@@ -295,6 +296,20 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 }
 }
 
+void qemu_chr_fe_set_handlers(CharBackend *b,
+  IOCanReadHandler *fd_can_read,
+  IOReadHandler *fd_read,
+  IOEventHandler *fd_event,
+  BackendChangeHandler *be_change,
+  void *opaque,
+  GMainContext *context,
+  bool set_open)
+{
+qemu_chr_fe_set_handlers_full(b, fd_can_read, fd_read, fd_event, be_change,
+  opaque, context, set_open,
+  true);
+}
+
 void qemu_chr_fe_take_focus(CharBackend *b)
 {
 if (!b->chr) {
diff --git a/chardev/char-mux.c b/chardev/char-mux.c
index 6055e76..1199d32 100644
--- a/chardev/char-mux.c
+++ b/chardev/char-mux.c
@@ -283,13 +283,13 @@ void mux_chr_set_handlers(Chardev *chr, GMainContext 
*context)
 MuxChardev *d = MUX_CHARDEV(chr);
 
 /* Fix up the real driver with mux routines */
-qemu_chr_fe_set_handlers(>chr,
- mux_chr_can_read,
- mux_chr_read,
- mux_chr_event,
- NULL,
- chr,
- context, true);
+qemu_chr_fe_set_handlers_full(>chr,
+  mux_chr_can_read,
+  mux_chr_read,
+  mux_chr_event,
+  NULL,
+  chr,
+  context, true, false);
 }
 
 void mux_set_focus(Chardev *chr, int focus)
@@ -367,7 +367,7 @@ static int open_muxes(Chardev *chr)
  * mark mux as OPENED so any new FEs will immedia

[Qemu-devel] [PATCH v3 0/2] chardev: fix mess in OPENED/CLOSED events when muxed

2018-11-06 Thread Artem Pisarenko
This issue actually more complex. Idea of generating events from inside 
function called '*_set_handlers' isn't good, at least its implicit nature, and 
especially a fact, that function decides about open state (see 'fe_open' 
variable), but generates event only in one direction. Combined with 
'mux_chr_set_handlers()' hack this makes things even worse.
Better solution is to change fe interface and rewrite all frontends code (a lot 
of stuff in hw/char/* and somewhere else).
Although first patch doesn't fix any bug (known to me), its main effect is 
optimization of emulation performance by avoiding extra activity.
Added testing demonstrates issue and prevents potential bugs in future.

v3 changes:
 - extended commit message explaining fix (as supposed by Marc-André Lureau)

v2 changes:
 - fix failed unit test
 - 'mux_chr_set_handlers()' hack rewritten (as supposed by Marc-André Lureau)
 - added testing of issue to unit test (new patch)

Artem Pisarenko (2):
  chardev: fix mess in OPENED/CLOSED events when muxed
  tests/test-char: add muxed chardev testing for open/close

 chardev/char-fe.c | 33 +--
 chardev/char-mux.c| 16 +-
 include/chardev/char-fe.h | 18 ++-
 tests/test-char.c | 80 +--
 4 files changed, 127 insertions(+), 20 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PULL 03/48] qemu-timer: introduce timer attributes

2018-11-05 Thread Artem Pisarenko
> hw/core/ptimer.o: In function `timer_new_tl':
> /home/eblake/qemu/include/qemu/timer.h:536: undefined reference to
> `timer_init_tl'
> collect2: error: ld returned 1 exit status
> make: *** [/home/eblake/qemu/rules.mak:124: tests/ptimer-test] Error 1
> make: *** Waiting for unfinished jobs

I wasn't able to reproduce that on
commit 89a603a0c80ae3d6a8711571550b2ae9a01ea909 (is it that commit you
point to ?).
Neither 'make check' fails, nor "include/qemu/timer.h:536" points to
meaningful lines of code, nor full project text search shows anything like
'timer_new_tl' or 'timer_init_tl'. Same for merge
comit b312532fd03413d0e6ae6767ec793a3e30f487b8.
Looks like 'make' just failed to rebuild dependencies correctly and needs
clean/distclean.


[Qemu-devel] [PATCH v2 2/2] tests/test-char: add muxed chardev testing for open/close

2018-11-05 Thread Artem Pisarenko
Validate that frontend callbacks for CHR_EVENT_OPENED/CHR_EVENT_CLOSED
events are being issued when expected and in strictly pairing order.

Signed-off-by: Artem Pisarenko 
---
 tests/test-char.c | 80 +--
 1 file changed, 78 insertions(+), 2 deletions(-)

diff --git a/tests/test-char.c b/tests/test-char.c
index 831e37f..657cb4c 100644
--- a/tests/test-char.c
+++ b/tests/test-char.c
@@ -16,6 +16,9 @@ static bool quit;
 
 typedef struct FeHandler {
 int read_count;
+bool is_open;
+int openclose_count;
+bool openclose_mismatch;
 int last_event;
 char read_buf[128];
 } FeHandler;
@@ -49,10 +52,24 @@ static void fe_read(void *opaque, const uint8_t *buf, int 
size)
 static void fe_event(void *opaque, int event)
 {
 FeHandler *h = opaque;
+bool new_open_state;
 
 h->last_event = event;
-if (event != CHR_EVENT_BREAK) {
+switch (event) {
+case CHR_EVENT_BREAK:
+break;
+case CHR_EVENT_OPENED:
+case CHR_EVENT_CLOSED:
+h->openclose_count++;
+new_open_state = (event == CHR_EVENT_OPENED);
+if (h->is_open == new_open_state) {
+h->openclose_mismatch = true;
+}
+h->is_open = new_open_state;
+/* no break */
+default:
 quit = true;
+break;
 }
 }
 
@@ -161,7 +178,7 @@ static void char_mux_test(void)
 QemuOpts *opts;
 Chardev *chr, *base;
 char *data;
-FeHandler h1 = { 0, }, h2 = { 0, };
+FeHandler h1 = { 0, false, 0, false, }, h2 = { 0, false, 0, false, };
 CharBackend chr_be1, chr_be2;
 
 opts = qemu_opts_create(qemu_find_opts("chardev"), "mux-label",
@@ -233,6 +250,65 @@ static void char_mux_test(void)
 g_assert_cmpint(h1.last_event, ==, CHR_EVENT_BREAK);
 g_assert_cmpint(h2.last_event, ==, CHR_EVENT_MUX_OUT);
 
+/* open/close state and corresponding events */
+g_assert_true(qemu_chr_fe_backend_open(_be1));
+g_assert_true(qemu_chr_fe_backend_open(_be2));
+g_assert_true(h1.is_open);
+g_assert_false(h1.openclose_mismatch);
+g_assert_true(h2.is_open);
+g_assert_false(h2.openclose_mismatch);
+
+h1.openclose_count = h2.openclose_count = 0;
+
+qemu_chr_fe_set_handlers(_be1, NULL, NULL, NULL, NULL,
+ NULL, NULL, false);
+qemu_chr_fe_set_handlers(_be2, NULL, NULL, NULL, NULL,
+ NULL, NULL, false);
+g_assert_cmpint(h1.openclose_count, ==, 0);
+g_assert_cmpint(h2.openclose_count, ==, 0);
+
+h1.is_open = h2.is_open = false;
+qemu_chr_fe_set_handlers(_be1,
+ NULL,
+ NULL,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+qemu_chr_fe_set_handlers(_be2,
+ NULL,
+ NULL,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+g_assert_cmpint(h1.openclose_count, ==, 1);
+g_assert_false(h1.openclose_mismatch);
+g_assert_cmpint(h2.openclose_count, ==, 1);
+g_assert_false(h2.openclose_mismatch);
+
+qemu_chr_be_event(base, CHR_EVENT_CLOSED);
+qemu_chr_be_event(base, CHR_EVENT_OPENED);
+g_assert_cmpint(h1.openclose_count, ==, 3);
+g_assert_false(h1.openclose_mismatch);
+g_assert_cmpint(h2.openclose_count, ==, 3);
+g_assert_false(h2.openclose_mismatch);
+
+qemu_chr_fe_set_handlers(_be2,
+ fe_can_read,
+ fe_read,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+qemu_chr_fe_set_handlers(_be1,
+ fe_can_read,
+ fe_read,
+ fe_event,
+ NULL,
+ ,
+ NULL, false);
+
 /* remove first handler */
 qemu_chr_fe_set_handlers(_be1, NULL, NULL, NULL, NULL,
  NULL, NULL, true);
-- 
2.7.4




[Qemu-devel] [PATCH v2 1/2] chardev: fix mess in OPENED/CLOSED events when muxed

2018-11-05 Thread Artem Pisarenko
When chardev is multiplexed (mux=on) there are a lot of cases, when
CHR_EVENT_OPENED/CHR_EVENT_CLOSED events pairing (expected from
frontend side) is broken. There are either generation of multiple
repeated or extra CHR_EVENT_OPENED events, or CHR_EVENT_CLOSED just
isn't generated at all (when it does with mux=off).
Fix that.

Signed-off-by: Artem Pisarenko 
---
 chardev/char-fe.c | 33 -
 chardev/char-mux.c| 16 
 include/chardev/char-fe.h | 18 +-
 3 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index a8931f7..b7bcbd5 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -246,14 +246,15 @@ void qemu_chr_fe_deinit(CharBackend *b, bool del)
 }
 }
 
-void qemu_chr_fe_set_handlers(CharBackend *b,
-  IOCanReadHandler *fd_can_read,
-  IOReadHandler *fd_read,
-  IOEventHandler *fd_event,
-  BackendChangeHandler *be_change,
-  void *opaque,
-  GMainContext *context,
-  bool set_open)
+void qemu_chr_fe_set_handlers_full(CharBackend *b,
+   IOCanReadHandler *fd_can_read,
+   IOReadHandler *fd_read,
+   IOEventHandler *fd_event,
+   BackendChangeHandler *be_change,
+   void *opaque,
+   GMainContext *context,
+   bool set_open,
+   bool sync_state)
 {
 Chardev *s;
 int fe_open;
@@ -285,7 +286,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 qemu_chr_fe_take_focus(b);
 /* We're connecting to an already opened device, so let's make sure we
also get the open event */
-if (s->be_open) {
+if (sync_state && s->be_open) {
 qemu_chr_be_event(s, CHR_EVENT_OPENED);
 }
 }
@@ -295,6 +296,20 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 }
 }
 
+void qemu_chr_fe_set_handlers(CharBackend *b,
+  IOCanReadHandler *fd_can_read,
+  IOReadHandler *fd_read,
+  IOEventHandler *fd_event,
+  BackendChangeHandler *be_change,
+  void *opaque,
+  GMainContext *context,
+  bool set_open)
+{
+qemu_chr_fe_set_handlers_full(b, fd_can_read, fd_read, fd_event, be_change,
+  opaque, context, set_open,
+  true);
+}
+
 void qemu_chr_fe_take_focus(CharBackend *b)
 {
 if (!b->chr) {
diff --git a/chardev/char-mux.c b/chardev/char-mux.c
index 6055e76..1199d32 100644
--- a/chardev/char-mux.c
+++ b/chardev/char-mux.c
@@ -283,13 +283,13 @@ void mux_chr_set_handlers(Chardev *chr, GMainContext 
*context)
 MuxChardev *d = MUX_CHARDEV(chr);
 
 /* Fix up the real driver with mux routines */
-qemu_chr_fe_set_handlers(>chr,
- mux_chr_can_read,
- mux_chr_read,
- mux_chr_event,
- NULL,
- chr,
- context, true);
+qemu_chr_fe_set_handlers_full(>chr,
+  mux_chr_can_read,
+  mux_chr_read,
+  mux_chr_event,
+  NULL,
+  chr,
+  context, true, false);
 }
 
 void mux_set_focus(Chardev *chr, int focus)
@@ -367,7 +367,7 @@ static int open_muxes(Chardev *chr)
  * mark mux as OPENED so any new FEs will immediately receive
  * OPENED event
  */
-qemu_chr_be_event(chr, CHR_EVENT_OPENED);
+chr->be_open = 1;
 
 return 0;
 }
diff --git a/include/chardev/char-fe.h b/include/chardev/char-fe.h
index 46c997d..c1b7fd9 100644
--- a/include/chardev/char-fe.h
+++ b/include/chardev/char-fe.h
@@ -67,7 +67,7 @@ bool qemu_chr_fe_backend_connected(CharBackend *be);
 bool qemu_chr_fe_backend_open(CharBackend *be);
 
 /**
- * qemu_chr_fe_set_handlers:
+ * qemu_chr_fe_set_handlers_full:
  * @b: a CharBackend
  * @fd_can_read: callback to get the amount of data the frontend may
  *   receive
@@ -79,12 +79,28 @@ bool qemu_chr_fe_backend_open(CharBackend *be);
  * @context: a main loop context or NULL for the default
  * @set_open: whether to call qemu_chr_fe_set_open() implicitely when
  * any of the handler is non-NULL
+ * @sync_state: whether to issue event callback with updated state
  *
  * Set the front end char handlers. The front end 

[Qemu-devel] [PATCH v2 0/2] chardev: fix mess in OPENED/CLOSED events when muxed

2018-11-05 Thread Artem Pisarenko
This issue actually more complex. Idea of generating events from inside 
function called '*_set_handlers' isn't good, at least its implicit nature, and 
especially a fact, that function decides about open state (see 'fe_open' 
variable), but generates event only in one direction. Combined with 
'mux_chr_set_handlers()' hack this makes things even worse.
Better solution is to change fe interface and rewrite all frontends code (a lot 
of stuff in hw/char/* and somewhere else).
Although first patch doesn't fix any bug (known to me), its main effect is 
optimization of emulation performance by avoiding extra activity.
Added testing demonstrates issue and prevents potential bugs in future.

v2 changes:
 - fix failed unit test
 - 'mux_chr_set_handlers()' hack rewritten (as supposed by Marc-André Lureau)
 - added testing of issue to unit test (new patch)

Artem Pisarenko (2):
  chardev: fix mess in OPENED/CLOSED events when muxed
  tests/test-char: add muxed chardev testing for open/close

 chardev/char-fe.c | 33 +--
 chardev/char-mux.c| 16 +-
 include/chardev/char-fe.h | 18 ++-
 tests/test-char.c | 80 +--
 4 files changed, 127 insertions(+), 20 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PATCH] chardev: fix mess in OPENED/CLOSED events when muxed

2018-11-04 Thread Artem Pisarenko
Sorry, I forgot to check unit tests. Although, it's very strange that this
specific test failed while things work functionally...

> I am a bit reluctant to take patches that don't actually "fix" things.
>
> Could you add some tests to demonstrate the problems?

Ok

>> @@ -257,6 +257,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
>>  {
>>  Chardev *s;
>>  int fe_open;
>> +static __thread bool mux_reentered;
>
> Not very elegant. Maybe mux_chr_set_handlers() could call a refactored
> internal chr_fe_set_handlers() with an extra arg "no_open_event" ?

Agree. It would make this hack more elegant. Although, it may become
irrelevant as soon as I'll find soultion for failed docker test...

>> @@ -272,21 +272,24 @@ static void char_mux_finalize(Object *obj)
>>  for (i = 0; i < d->mux_cnt; i++) {
>>  CharBackend *be = d->backends[i];
>>  if (be) {
>> +if (be->chr && be->chr->be_open) {
>> +qemu_chr_be_event(be->chr, CHR_EVENT_CLOSED);
>> +}
>
> It looks like this could be a seperate patch, with a seperate test.

Why this should be separate ? Do you mean that overall "opened/closed"
pairing fix should be separated to "opened" fix+test and "closed" fix+test ?

Since in next version it cannot be single patch anymore, how should I
proceed with it ? Should I publish it as patch series with same subject
marked as V2 or start new patch series?


[Qemu-devel] [PATCH] virtserialport/virtconsole: fix messy opening/closing port

2018-11-01 Thread Artem Pisarenko
This fixes wrong interfacing between virtio serial port and bus
models, and corresponding chardev backends, caused extra and incorrect
activity during guest boot process (when virtserialport device used).

Signed-off-by: Artem Pisarenko 
---

Notes:
Although this doesn't trigger any issue/bug (known to me), this may prevent 
them in future.
Also this patch optimizes emulation performance by avoiding extra activity 
and fixes a case, when serial port wasn't closed if backend closed while being 
disabled (even worse - serial port will open again!).

 hw/char/virtio-console.c  | 24 
 hw/char/virtio-serial-bus.c   |  8 +++-
 include/hw/virtio/virtio-serial.h |  8 
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/hw/char/virtio-console.c b/hw/char/virtio-console.c
index 2cbe1d4..634e9bf 100644
--- a/hw/char/virtio-console.c
+++ b/hw/char/virtio-console.c
@@ -25,6 +25,7 @@
 typedef struct VirtConsole {
 VirtIOSerialPort parent_obj;
 
+bool backend_active;
 CharBackend chr;
 guint watch;
 } VirtConsole;
@@ -149,6 +150,11 @@ static void chr_event(void *opaque, int event)
 VirtIOSerialPort *port = VIRTIO_SERIAL_PORT(vcon);
 
 trace_virtio_console_chr_event(port->id, event);
+
+if (!vcon->backend_active) {
+return;
+}
+
 switch (event) {
 case CHR_EVENT_OPENED:
 virtio_serial_open(port);
@@ -187,17 +193,24 @@ static int chr_be_change(void *opaque)
 return 0;
 }
 
+static bool virtconsole_is_backend_enabled(VirtIOSerialPort *port)
+{
+VirtConsole *vcon = VIRTIO_CONSOLE(port);
+
+return vcon->backend_active;
+}
+
 static void virtconsole_enable_backend(VirtIOSerialPort *port, bool enable)
 {
 VirtConsole *vcon = VIRTIO_CONSOLE(port);
 
-if (!qemu_chr_fe_backend_connected(>chr)) {
-return;
-}
-
 if (enable) {
 VirtIOSerialPortClass *k = VIRTIO_SERIAL_PORT_GET_CLASS(port);
 
+if (!k->is_console && virtio_serial_is_opened(port)
+&& !qemu_chr_fe_backend_open(>chr)) {
+virtio_serial_close(port);
+}
 qemu_chr_fe_set_handlers(>chr, chr_can_read, chr_read,
  k->is_console ? NULL : chr_event,
  chr_be_change, vcon, NULL, false);
@@ -205,6 +218,7 @@ static void virtconsole_enable_backend(VirtIOSerialPort 
*port, bool enable)
 qemu_chr_fe_set_handlers(>chr, NULL, NULL, NULL,
  NULL, NULL, NULL, false);
 }
+vcon->backend_active = enable;
 }
 
 static void virtconsole_realize(DeviceState *dev, Error **errp)
@@ -220,6 +234,7 @@ static void virtconsole_realize(DeviceState *dev, Error 
**errp)
 }
 
 if (qemu_chr_fe_backend_connected(>chr)) {
+vcon->backend_active = true;
 /*
  * For consoles we don't block guest data transfer just
  * because nothing is connected - we'll just let it go
@@ -278,6 +293,7 @@ static void virtserialport_class_init(ObjectClass *klass, 
void *data)
 k->unrealize = virtconsole_unrealize;
 k->have_data = flush_buf;
 k->set_guest_connected = set_guest_connected;
+k->is_backend_enabled = virtconsole_is_backend_enabled;
 k->enable_backend = virtconsole_enable_backend;
 k->guest_writable = guest_writable;
 dc->props = virtserialport_properties;
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 04e3ebe..d23d99d 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -258,6 +258,11 @@ static size_t send_control_event(VirtIOSerial *vser, 
uint32_t port_id,
 }
 
 /* Functions for use inside qemu to open and read from/write to ports */
+bool virtio_serial_is_opened(VirtIOSerialPort *port)
+{
+return port->host_connected;
+}
+
 int virtio_serial_open(VirtIOSerialPort *port)
 {
 /* Don't allow opening an already-open port */
@@ -643,7 +648,8 @@ static void set_status(VirtIODevice *vdev, uint8_t status)
 
 QTAILQ_FOREACH(port, >ports, next) {
 VirtIOSerialPortClass *vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
-if (vsc->enable_backend) {
+if (vsc->is_backend_enabled && vsc->enable_backend
+&& (vsc->is_backend_enabled(port) != vdev->vm_running)) {
 vsc->enable_backend(port, vdev->vm_running);
 }
 }
diff --git a/include/hw/virtio/virtio-serial.h 
b/include/hw/virtio/virtio-serial.h
index 12657a9..4b72562 100644
--- a/include/hw/virtio/virtio-serial.h
+++ b/include/hw/virtio/virtio-serial.h
@@ -58,6 +58,8 @@ typedef struct VirtIOSerialPortClass {
 /* Guest opened/closed device. */
 void (*set_guest_connected)(VirtIOSerialPort *port, int guest_connected);
 
+/* Is backend currently enabled for virtio serial port */
+bool (*is_backend_enabled)(VirtIOSerialPort *port);

[Qemu-devel] [PATCH] chardev: fix mess in OPENED/CLOSED events when muxed

2018-11-01 Thread Artem Pisarenko
When chardev is multiplexed (mux=on) there are a lot of cases, when
CHR_EVENT_OPENED/CHR_EVENT_CLOSED events pairing (expected from
frontend side) is broken. There are either generation of multiple
repeated or extra CHR_EVENT_OPENED events, or CHR_EVENT_CLOSED just
isn't generated at all (when it does with mux=off).
Fix that.

Signed-off-by: Artem Pisarenko 
---

Notes:
This issue actually more complex. Idea of generating events from inside 
function called '*_set_handlers' isn't good, at least its implicit nature, and 
especially a fact, that function decides about open state (see 'fe_open' 
variable), but generates event only in one direction. Combined with 
'mux_chr_set_handlers()' hack this makes things even worse.
Better solution is to change fe interface and rewrite all frontends code (a 
lot of stuff in hw/char/* and somewhere else).
Although this patch doesn't fix any issue/bug (known to me), it prevents 
them in future. Also it optimizes emulation performance by avoiding extra 
activity.
I did several trivial tests on x86_64 target and seems like nothing broken.

 chardev/char-fe.c  |  9 ++---
 chardev/char-mux.c | 13 -
 include/chardev/char-mux.h |  2 +-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index a8931f7..31cf7f0 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -257,6 +257,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 {
 Chardev *s;
 int fe_open;
+static __thread bool mux_reentered;
 
 s = b->chr;
 if (!s) {
@@ -284,14 +285,16 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 if (fe_open) {
 qemu_chr_fe_take_focus(b);
 /* We're connecting to an already opened device, so let's make sure we
-   also get the open event */
-if (s->be_open) {
+   also get the open event (hack: except when chardev is muxed) */
+if (s->be_open && !mux_reentered) {
 qemu_chr_be_event(s, CHR_EVENT_OPENED);
 }
 }
 
 if (CHARDEV_IS_MUX(s)) {
-mux_chr_set_handlers(s, context);
+mux_reentered = true;
+mux_chr_set_handlers(s, fe_open, context);
+mux_reentered = false;
 }
 }
 
diff --git a/chardev/char-mux.c b/chardev/char-mux.c
index 6055e76..9244802 100644
--- a/chardev/char-mux.c
+++ b/chardev/char-mux.c
@@ -272,21 +272,24 @@ static void char_mux_finalize(Object *obj)
 for (i = 0; i < d->mux_cnt; i++) {
 CharBackend *be = d->backends[i];
 if (be) {
+if (be->chr && be->chr->be_open) {
+qemu_chr_be_event(be->chr, CHR_EVENT_CLOSED);
+}
 be->chr = NULL;
 }
 }
 qemu_chr_fe_deinit(>chr, false);
 }
 
-void mux_chr_set_handlers(Chardev *chr, GMainContext *context)
+void mux_chr_set_handlers(Chardev *chr, bool is_open, GMainContext *context)
 {
 MuxChardev *d = MUX_CHARDEV(chr);
 
 /* Fix up the real driver with mux routines */
 qemu_chr_fe_set_handlers(>chr,
- mux_chr_can_read,
- mux_chr_read,
- mux_chr_event,
+ is_open ? mux_chr_can_read : NULL,
+ is_open ? mux_chr_read : NULL,
+ is_open ? mux_chr_event : NULL,
  NULL,
  chr,
  context, true);
@@ -367,7 +370,7 @@ static int open_muxes(Chardev *chr)
  * mark mux as OPENED so any new FEs will immediately receive
  * OPENED event
  */
-qemu_chr_be_event(chr, CHR_EVENT_OPENED);
+chr->be_open = 1;
 
 return 0;
 }
diff --git a/include/chardev/char-mux.h b/include/chardev/char-mux.h
index 1e13187..4b4df6e 100644
--- a/include/chardev/char-mux.h
+++ b/include/chardev/char-mux.h
@@ -55,7 +55,7 @@ typedef struct MuxChardev {
 #define CHARDEV_IS_MUX(chr) \
 object_dynamic_cast(OBJECT(chr), TYPE_CHARDEV_MUX)
 
-void mux_chr_set_handlers(Chardev *chr, GMainContext *context);
+void mux_chr_set_handlers(Chardev *chr, bool is_open, GMainContext *context);
 void mux_set_focus(Chardev *chr, int focus);
 void mux_chr_send_all_event(Chardev *chr, int event);
 
-- 
2.7.4




[Qemu-devel] Features/HelperNetworking wiki page: setup description misses root ownership

2018-10-24 Thread Artem Pisarenko
'Setup' section of https://wiki.qemu.org/Features/HelperNetworking says
that setuid attribute needs to be turned on for 'qemu-bridge-helper'
binary, but it forgets to mention that owner of this file must be root
user. Otherwise, setuid bit makes no sense.
Looks like in most scenarios this binary already installed with root
privileges. But this isn't a case, when you build QEMU from sources and
install it in custom location, running 'make install' by non-privileged
user. Yes, maybe it's obvious for mature linux users, but it took me a lot
of time to figure out why QEMU fails with error: "failed to create tun
device: Operation not permitted" (googling this error shows mostly answers
like "you just forget setuid").
So, it would be useful to add
sudo chown root /usr/local/libexec/qemu-bridge-helper
*before*
sudo chmod u+s /usr/local/libexec/qemu-bridge-helper


Re: [Qemu-devel] [PULL 05/48] qemu-timer: optimize record/replay checkpointing for all clocks

2018-10-19 Thread Artem Pisarenko
> Signed-off-by: Artem Pisarenko 
> Message-Id: <
549dbf4ebfa4c82051d01a264c27f88929fc277b.1539764043.git.artem.k.pisare...@gmail.com
>

Actually this version has nothing common with my person. Neither original
idea, nor content. Except maybe of my involvement in its discussion, commit
message and just few phrases in code comments.


Re: [Qemu-devel] [PATCH v3 4/4] Optimize record/replay checkpointing for all clocks it applies to

2018-10-19 Thread Artem Pisarenko
> …
> This is wrong at least for QEMU_CLOCK_HOST.
> …
> Reading the host clock here is not protected by the checkpoint.
> Therefore it may incur the inconsistency when replaying the execution.

That's why I didn't like idea of this patch and asked for any possible side
effects beforehand. So, here is one.
I have no idea how to fix it properly and would like to recall this
particular patch from these series (remember, it doesn't match series
goal). Lets leave this patch/issue for separate resolution, although I'm
not very interested in it.


Re: [Qemu-devel] [PATCH v3 0/4] Fix and improve core RTC function and documentation

2018-10-18 Thread Artem Pisarenko
> As a start of future refactoring, would you mind moving all this code to
> hw/timer/rtc.c or rtc.c?  It was somewaht generic before, but now it's
> very tied to -rtc.

Yes, sure.


Re: [Qemu-devel] [PATCH v3] Optimize record/replay checkpointing for all clocks it applies to

2018-10-18 Thread Artem Pisarenko
>чт, 18 окт. 2018 г., 23:25 Paolo Bonzini:
>On 18/10/2018 19:10, Artem Pisarenko wrote:
>>
>>> No, you're right. The if should remain in the caller, or
>>> need_replay_checkpoint must be initialized with replay_mode.
>>
>> If initialize 'need_replay_checkpoint', then it should also account for
>> clock != QEMU_CLOCK_REALTIME.
>
> Or you just get a unlock/lock pair for QEMU_CLOCK_REALTIME (which should
> really never happen if e.g. you have no UI).

And still have duplication (just smaller in this case).


>> Why do you want to split up such tightly coupled code?
>
> Because it's *too* coupled and not very readable.

Tightly coupled code in my understanding is property having its roots in
design, which usually has wider context than piece of code in question. So
we canot avoid this by definition (limiting changes only to this code).
Trying to improve it is just like a playing with bubble wrap - pressing
each bubble causes another bubble to pop up.

In our particular case it's because of overall rr design, qemu architecture
and a way how rr integrated in timers. The best we can do in this case is
to localise/quarantine ugly aspects as much as possible, carefully and
plenteously comment and try don't touch them... never... I consider
'timerlist_run_timers()' is already totally infected and we're just late to
save anyone of its residents (by isolating others).


Re: [Qemu-devel] [PATCH v3] Optimize record/replay checkpointing for all clocks it applies to

2018-10-18 Thread Artem Pisarenko
> чт, 18 окт. 2018 г., 20:31 Paolo Bonzini:
>On 18/10/2018 15:23, Artem Pisarenko wrote:
>>> We can also move the switch statement to a separate function, it
>>> simplifies the code:
>>> ...
>>
>> When I prepared this patch my intuition said me to add note in advance:
>> "Paolo, please, don't try to move this to a separate function. I've
>> tried it already. It cannot be done correct, look nice and not decrease
>> performancy at the same time". But I've ignored it... :)
>> Change you did is correct and nice, but (compared to my version) it adds
>> extra unlock/lock pair for running each timer list when it isn't empty
>> and in non-rr mode (where we would ignore checkpoints and execution flow
>> would bypass whole "if (need_replay_checkpoint) {...}" block).
>> Maybe you're aware of it, but I don't think that such change worth it.
>
> No, you're right. The if should remain in the caller, or
> need_replay_checkpoint must be initialized with replay_mode.

If initialize 'need_replay_checkpoint', then it should also account for
clock != QEMU_CLOCK_REALTIME. And here we come to what if+switch block
actually (mostly) does in my version. Finally, you will get duplication of
this whole condition usage between source function and extracted function,
which isn't nice.
Why do you want to split up such tightly coupled code?


Re: [Qemu-devel] [PATCH v3] Optimize record/replay checkpointing for all clocks it applies to

2018-10-18 Thread Artem Pisarenko
> We can also move the switch statement to a separate function, it
> simplifies the code:
> ...

When I prepared this patch my intuition said me to add note in advance:
"Paolo, please, don't try to move this to a separate function. I've tried
it already. It cannot be done correct, look nice and not decrease
performancy at the same time". But I've ignored it... :)
Change you did is correct and nice, but (compared to my version) it adds
extra unlock/lock pair for running each timer list when it isn't empty and
in non-rr mode (where we would ignore checkpoints and execution flow would
bypass whole "if (need_replay_checkpoint) {...}" block).
Maybe you're aware of it, but I don't think that such change worth it.
-- 

С уважением,
  Артем Писаренко


[Qemu-devel] [PATCH v3 4/4] Optimize record/replay checkpointing for all clocks it applies to

2018-10-18 Thread Artem Pisarenko
Removes redundant checkpoints in replay log when there are no expired timers in 
timers list, associated with corresponding clock (i.e. no rr events associated 
with current clock value).
This also improves performance in rr mode.

Signed-off-by: Artem Pisarenko 
---

Oops and again oops. Now it finally should be fine...

v3:
- fixed compiler warning caused non-debug build to fail

 include/qemu/timer.h |  2 +-
 util/qemu-timer.c| 62 +---
 2 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index dc0fd14..bff8dac 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,7 +65,7 @@ typedef enum {
  * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
  *
  * Timers with this attribute do not recorded in rr mode, therefore it could be
- * used for the subsystems that operate outside the guest core. Applicable only
+ * used for the subsystems that operate outside the guest core. Relevant only
  * with virtual clock type.
  */
 
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index e2746cf..47205fe 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -490,6 +490,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 QEMUTimerCB *cb;
 void *opaque;
 bool need_replay_checkpoint = false;
+ReplayCheckpoint replay_checkpoint_id = (ReplayCheckpoint)-1;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -500,43 +501,40 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 goto out;
 }
 
-switch (timer_list->clock->type) {
-case QEMU_CLOCK_REALTIME:
-break;
-default:
-case QEMU_CLOCK_VIRTUAL:
-if (replay_mode != REPLAY_MODE_NONE) {
-/* Checkpoint for virtual clock is redundant in cases where
- * it's being triggered with only non-EXTERNAL timers, because
- * these timers don't change guest state directly.
- * Since it has conditional dependence on specific timers, it is
- * subject to race conditions and requires special handling.
- * See below.
- */
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Postpone actual checkpointing to timer list processing
+ * to properly check if we actually need it.
+ */
+switch (timer_list->clock->type) {
+case QEMU_CLOCK_VIRTUAL:
 need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL;
+break;
+case QEMU_CLOCK_HOST:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_HOST;
+break;
+case QEMU_CLOCK_VIRTUAL_RT:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL_RT;
+break;
+default:
+break;
 }
-break;
-case QEMU_CLOCK_HOST:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
-goto out;
-}
-break;
-case QEMU_CLOCK_VIRTUAL_RT:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
-goto out;
-}
-break;
 }
 
 /*
- * Extract expired timers from active timers list and and process them.
+ * Extract expired timers from active timers list and and process them,
+ * taking into account checkpointing required in rr mode.
  *
- * In rr mode we need "filtered" checkpointing for virtual clock.
- * Checkpoint must be replayed before any non-EXTERNAL timer has been
- * processed and only one time (virtual clock value stays same). But these
- * timers may appear in the timers list while it being processed, so this
- * must be checked until we finally decide that "no timers left - we are
- * done".
+ * Checkpoint must be replayed before any timer has been processed
+ * and only one time. But new timers may appear in the timers list while
+ * it's being processed, so this must be checked until we finally decide
+ * that "no timers left - we are done" (to avoid skipping checkpoint due to
+ * possible races).
+ * Also checkpoint for virtual clock is redundant in cases where it's being
+ * triggered with only non-EXTERNAL timers, because these timers don't
+ * change guest state directly.
  */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
 qemu_mutex_lock(_list->active_timers_lock);
@@ -552,7 +550,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 /* once we got here, checkpoint clock only once */
 need_replay_checkpoint = false;
 qemu_mutex_unlock(_list->active_timers_lock);
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+if (!replay_checkpoint(replay_checkpoint_id)) {
 goto out;
 }
 qemu_mutex_lock(_list->active_timers_lock);
-- 
2.7.4




Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2018-10-18 Thread Artem Pisarenko
Sorry, I forgot to fix long lines in commit messages. Do I need to submit
v4 ?

-- 

С уважением,
  Артем Писаренко


[Qemu-devel] [PATCH v3] Optimize record/replay checkpointing for all clocks it applies to

2018-10-18 Thread Artem Pisarenko
Removes redundant checkpoints in replay log when there are no expired timers in 
timers list, associated with corresponding clock (i.e. no rr events associated 
with current clock value).
This also improves performance in rr mode.

Signed-off-by: Artem Pisarenko 
---

Oops, forgot to commit this fix

v3:
- fixed compiler warning caused non-debug build to fail

 include/qemu/timer.h |  2 +-
 util/qemu-timer.c| 62 +---
 2 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index dc0fd14..bff8dac 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,7 +65,7 @@ typedef enum {
  * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
  *
  * Timers with this attribute do not recorded in rr mode, therefore it could be
- * used for the subsystems that operate outside the guest core. Applicable only
+ * used for the subsystems that operate outside the guest core. Relevant only
  * with virtual clock type.
  */
 
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index e2746cf..47205fe 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -490,6 +490,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 QEMUTimerCB *cb;
 void *opaque;
 bool need_replay_checkpoint = false;
+ReplayCheckpoint replay_checkpoint_id = (ReplayCheckpoint)-1;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -500,43 +501,40 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 goto out;
 }
 
-switch (timer_list->clock->type) {
-case QEMU_CLOCK_REALTIME:
-break;
-default:
-case QEMU_CLOCK_VIRTUAL:
-if (replay_mode != REPLAY_MODE_NONE) {
-/* Checkpoint for virtual clock is redundant in cases where
- * it's being triggered with only non-EXTERNAL timers, because
- * these timers don't change guest state directly.
- * Since it has conditional dependence on specific timers, it is
- * subject to race conditions and requires special handling.
- * See below.
- */
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Postpone actual checkpointing to timer list processing
+ * to properly check if we actually need it.
+ */
+switch (timer_list->clock->type) {
+case QEMU_CLOCK_VIRTUAL:
 need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL;
+break;
+case QEMU_CLOCK_HOST:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_HOST;
+break;
+case QEMU_CLOCK_VIRTUAL_RT:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL_RT;
+break;
+default:
+break;
 }
-break;
-case QEMU_CLOCK_HOST:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
-goto out;
-}
-break;
-case QEMU_CLOCK_VIRTUAL_RT:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
-goto out;
-}
-break;
 }
 
 /*
- * Extract expired timers from active timers list and and process them.
+ * Extract expired timers from active timers list and and process them,
+ * taking into account checkpointing required in rr mode.
  *
- * In rr mode we need "filtered" checkpointing for virtual clock.
- * Checkpoint must be replayed before any non-EXTERNAL timer has been
- * processed and only one time (virtual clock value stays same). But these
- * timers may appear in the timers list while it being processed, so this
- * must be checked until we finally decide that "no timers left - we are
- * done".
+ * Checkpoint must be replayed before any timer has been processed
+ * and only one time. But new timers may appear in the timers list while
+ * it's being processed, so this must be checked until we finally decide
+ * that "no timers left - we are done" (to avoid skipping checkpoint due to
+ * possible races).
+ * Also checkpoint for virtual clock is redundant in cases where it's being
+ * triggered with only non-EXTERNAL timers, because these timers don't
+ * change guest state directly.
  */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
 qemu_mutex_lock(_list->active_timers_lock);
@@ -552,7 +550,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 /* once we got here, checkpoint clock only once */
 need_replay_checkpoint = false;
 qemu_mutex_unlock(_list->active_timers_lock);
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+if (!replay_checkpoint(replay_checkpoint_id)) {
 goto out;
 }
 qemu_mutex_lock(_list->active_timers_lock);
-- 
2.7.4




[Qemu-devel] [PATCH v3 1/4] Revert some patches from recent [PATCH v6] "Fixing record/replay and adding reverse debugging"

2018-10-18 Thread Artem Pisarenko
That patch series introduced new virtual clock type for use in external 
subsystems. It breaks desired behavior in non-record/replay usage scenarios.

This reverts commit 87f4fe7653baf55b5c2f2753fe6003f473c07342.
This reverts commit 775a412bf83f6bc0c5c02091ee06cf649b34c593.
This reverts commit 9888091404a702d7ec79d51b088d994b9fc121bd.

Signed-off-by: Artem Pisarenko 
---
 include/qemu/timer.h | 9 -
 slirp/ip6_icmp.c | 7 +++
 ui/input.c   | 8 
 util/qemu-timer.c| 2 --
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index a005ed2..39ea907 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -42,14 +42,6 @@
  * In icount mode, this clock counts nanoseconds while the virtual
  * machine is running.  It is used to increase @QEMU_CLOCK_VIRTUAL
  * while the CPUs are sleeping and thus not executing instructions.
- *
- * @QEMU_CLOCK_VIRTUAL_EXT: virtual clock for external subsystems
- *
- * The virtual clock only runs during the emulation. It stops
- * when the virtual machine is stopped. The timers for this clock
- * do not recorded in rr mode, therefore this clock could be used
- * for the subsystems that operate outside the guest core.
- *
  */
 
 typedef enum {
@@ -57,7 +49,6 @@ typedef enum {
 QEMU_CLOCK_VIRTUAL = 1,
 QEMU_CLOCK_HOST = 2,
 QEMU_CLOCK_VIRTUAL_RT = 3,
-QEMU_CLOCK_VIRTUAL_EXT = 4,
 QEMU_CLOCK_MAX
 } QEMUClockType;
 
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index 3f41187..ee333d0 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -17,7 +17,7 @@ static void ra_timer_handler(void *opaque)
 {
 Slirp *slirp = opaque;
 timer_mod(slirp->ra_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT) + NDP_Interval);
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 ndp_send_ra(slirp);
 }
 
@@ -27,10 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_EXT,
-   ra_timer_handler, slirp);
+slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
 timer_mod(slirp->ra_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT) + NDP_Interval);
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
 
 void icmp6_cleanup(Slirp *slirp)
diff --git a/ui/input.c b/ui/input.c
index dd7f6d7..51b1019 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -271,7 +271,7 @@ static void qemu_input_queue_process(void *opaque)
 item = QTAILQ_FIRST(queue);
 switch (item->type) {
 case QEMU_INPUT_QUEUE_DELAY:
-timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT)
+timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)
   + item->delay_ms);
 return;
 case QEMU_INPUT_QUEUE_EVENT:
@@ -301,7 +301,7 @@ static void qemu_input_queue_delay(struct 
QemuInputEventQueueHead *queue,
 queue_count++;
 
 if (start_timer) {
-timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT)
+timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)
   + item->delay_ms);
 }
 }
@@ -448,8 +448,8 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_EXT,
- qemu_input_queue_process, _queue);
+kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
+ _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index eb60d8f..86bfe84 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -496,7 +496,6 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 
 switch (timer_list->clock->type) {
 case QEMU_CLOCK_REALTIME:
-case QEMU_CLOCK_VIRTUAL_EXT:
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
@@ -598,7 +597,6 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
 return get_clock();
 default:
 case QEMU_CLOCK_VIRTUAL:
-case QEMU_CLOCK_VIRTUAL_EXT:
 if (use_icount) {
 return cpu_get_icount();
 } else {
-- 
2.7.4




[Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2018-10-18 Thread Artem Pisarenko
Recent patches from series [PATCH v6] "Fixing record/replay and adding reverse 
debugging" introduced new clock type QEMU_CLOCK_VIRTUAL_EXT and replaced 
virtual timers in some external subsystems with it.
This resulted in small change to existing behavior, which I consider to be 
unacceptable.
Processing of virtual timers, belonging to new clock type, was kicked off to 
main loop, which made them asynchronous with vCPU thread and, in icount mode, 
with whole guest execution. This breaks expected determinism in 
non-record/replay icount mode of emulation where these "external subsystems" 
are isolated from host (i.e. they external only to guest core, not to emulation 
environment).

Example for slirp ("user" backend for network device):
User runs qemu in icount mode with rtc clock=vm without any external 
communication interfaces but with "-netdev user,restrict=on". It expects 
deterministic execution, because network services are emulated inside qemu and 
isolated from host. There are no reasons to get reply from DHCP server with 
different delay or something like that.

These series of patches revert those commits and reimplement their 
modifications in a better way.

Current implementation of timers/clock processing is confusing (at least for 
me) because of exceptions from design concept behind them, which already 
introduced by icount mode (which adds QEMU_CLOCK_VIRTUAL_RT). Adding 
QEMU_CLOCK_VIRTUAL_EXT just made things even more complicated. I consider these 
"alternative" virtual clocks to be some kind of hacks being convinient only to 
authors of relevant qemu features. Lets don't touch fundamental clock types and 
keep them orthogonal to special cases of timers handling.

As far as I understand, original intention of author was just to make 
optimization in replay log by avoiding storing extra events which don't change 
guest state directly. Indeed, for example, ipv6 icmp timer in slirp does things 
which external to guest core and ends with sending network packet to guest, but 
record/replay will anyway catch event, corresponding to packet reception in 
guest network frontend, and store it to replay log, so there are no need in 
making checkpoint for corresponding clock when that timer fires.
If so, then we just need to skip checkpoints for clock values, when only these 
specific timers are run. It is individual timers which are specific, not clock.
Adding some kind of attribute/flag/property to individual timer allows any 
special qemu feature (such as record/replay) to inspect it and handle as 
needed. This design achieves less dependencies, more transparency and has more 
intuitive and clear sense. For record/replay feature it's achieved with 
'EXTERNAL' attribute. The only drawback is that it required to add one extra 
mutex unlock/lock pair for virtual clock type in timerlist_run_timers() 
function (see patch 3), but it's being added only when qemu runs in 
record/replay mode.

Finally, this patch series optimizes checkpointing for all other clocks it 
applies to.


v3 changes:
 - fixed failed build in last patch
 - attributes has been refactored and restricted to be used only within 
qemu-timer (as suggested by Stefan Hajnoczi and Paolo Bonzini)

v2 changes:
 - timers/aio interface refactored and improved, addition to couroutines 
removed (as Paolo Bonzini suggested)
 - fixed wrong reimplementation in qemu-timer.c caused race conditions
 - added bonus patch which optimizes checkpointing for other clocks

P.S. I've tried to test record/replay with slirp, but in replay mode qemu 
stucks at guest linux boot after "Configuring network interfaces..." message, 
where DHCP communication takes place. It's broken in a same way both in master 
and master with reverted commits being fixed.

P.S.2. I wasn't able to test these patches on purely clean master branch 
because of bugs https://bugs.launchpad.net/qemu/+bug/1790460 and 
https://bugs.launchpad.net/qemu/+bug/1795369, which workarounded by several 
not-accepted (yet?) modifications.


Artem Pisarenko (4):
  Revert some patches from recent [PATCH v6] "Fixing record/replay and
adding reverse debugging"
  Introduce attributes to qemu timer subsystem
  Restores record/replay behavior related to special virtual clock
processing for timers used in external subsystems.
  Optimize record/replay checkpointing for all clocks it applies to

 include/block/aio.h   |  59 ++---
 include/qemu/timer.h  | 128 +++---
 slirp/ip6_icmp.c  |   9 ++--
 tests/ptimer-test-stubs.c |  13 +++--
 ui/input.c|   9 ++--
 util/qemu-timer.c |  95 +++---
 6 files changed, 201 insertions(+), 112 deletions(-)

-- 
2.7.4




[Qemu-devel] [PATCH v3 3/4] Restores record/replay behavior related to special virtual clock processing for timers used in external subsystems.

2018-10-18 Thread Artem Pisarenko
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it to 
virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode reimplemented using this attribute.

Fixes: 87f4fe7653baf55b5c2f2753fe6003f473c07342
Fixes: 775a412bf83f6bc0c5c02091ee06cf649b34c593
Fixes: 9888091404a702d7ec79d51b088d994b9fc121bd
Signed-off-by: Artem Pisarenko 
---
 include/qemu/timer.h | 11 ++-
 slirp/ip6_icmp.c |  4 +++-
 ui/input.c   |  5 +++--
 util/qemu-timer.c| 50 +++---
 4 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index d45aded..dc0fd14 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -2,6 +2,7 @@
 #define QEMU_TIMER_H
 
 #include "qemu-common.h"
+#include "qemu/bitops.h"
 #include "qemu/notify.h"
 #include "qemu/host-utils.h"
 
@@ -59,9 +60,17 @@ typedef enum {
  * Each attribute corresponds to one bit. Attributes modify the processing
  * of timers when they fire.
  *
- * No attributes defined currently.
+ * The following attributes are available:
+ *
+ * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
+ *
+ * Timers with this attribute do not recorded in rr mode, therefore it could be
+ * used for the subsystems that operate outside the guest core. Applicable only
+ * with virtual clock type.
  */
 
+#define QEMU_TIMER_ATTR_EXTERNAL BIT(0)
+
 typedef struct QEMUTimerList QEMUTimerList;
 
 struct QEMUTimerListGroup {
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index ee333d0..cd1e0b9 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -27,7 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
+slirp->ra_timer = timer_new_full(NULL, QEMU_CLOCK_VIRTUAL,
+ SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL,
+ ra_timer_handler, slirp);
 timer_mod(slirp->ra_timer,
   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
diff --git a/ui/input.c b/ui/input.c
index 51b1019..7c9a410 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -448,8 +448,9 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
- _queue);
+kbd_timer = timer_new_full(NULL, QEMU_CLOCK_VIRTUAL,
+   SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL,
+   qemu_input_queue_process, _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 04527a3..e2746cf 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -489,6 +489,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 bool progress = false;
 QEMUTimerCB *cb;
 void *opaque;
+bool need_replay_checkpoint = false;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -504,8 +505,15 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
-goto out;
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Checkpoint for virtual clock is redundant in cases where
+ * it's being triggered with only non-EXTERNAL timers, because
+ * these timers don't change guest state directly.
+ * Since it has conditional dependence on specific timers, it is
+ * subject to race conditions and requires special handling.
+ * See below.
+ */
+need_replay_checkpoint = true;
 }
 break;
 case QEMU_CLOCK_HOST:
@@ -520,13 +528,38 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 }
 
+/*
+ * Extract expired timers from active timers list and and process them.
+ *
+ * In rr mode we need "filtered" checkpointing for virtual clock.
+ * Checkpoint must be replayed before any non-EXTERNAL timer has been
+ * processed and only one time (virtual clock value stays same). But these
+ * timers may appear in the timers list while it being processed, so this
+ * must be checked until we finally decide that "no timers left - we are
+ * done".
+ */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
-for(;;) {
-qemu_mutex_lock(_list->active_timers_lock);
-ts = timer_list->active_timers;
+qemu_mutex_lock(_list->active_timers_lock);
+while ((ts = timer_list->active_timers)) {
 if (!timer_expired_ns(ts, current_time)) {
+/* No expired timers left.
+ 

[Qemu-devel] [PATCH v3 4/4] Optimize record/replay checkpointing for all clocks it applies to

2018-10-18 Thread Artem Pisarenko
Removes redundant checkpoints in replay log when there are no expired timers in 
timers list, associated with corresponding clock (i.e. no rr events associated 
with current clock value).
This also improves performance in rr mode.

Signed-off-by: Artem Pisarenko 
---

Notes:
v3:
- fixed compiler warning caused non-debug build to fail

 include/qemu/timer.h |  2 +-
 util/qemu-timer.c| 62 +---
 2 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index dc0fd14..bff8dac 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,7 +65,7 @@ typedef enum {
  * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
  *
  * Timers with this attribute do not recorded in rr mode, therefore it could be
- * used for the subsystems that operate outside the guest core. Applicable only
+ * used for the subsystems that operate outside the guest core. Relevant only
  * with virtual clock type.
  */
 
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index e2746cf..216d107 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -490,6 +490,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 QEMUTimerCB *cb;
 void *opaque;
 bool need_replay_checkpoint = false;
+ReplayCheckpoint replay_checkpoint_id;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -500,43 +501,40 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 goto out;
 }
 
-switch (timer_list->clock->type) {
-case QEMU_CLOCK_REALTIME:
-break;
-default:
-case QEMU_CLOCK_VIRTUAL:
-if (replay_mode != REPLAY_MODE_NONE) {
-/* Checkpoint for virtual clock is redundant in cases where
- * it's being triggered with only non-EXTERNAL timers, because
- * these timers don't change guest state directly.
- * Since it has conditional dependence on specific timers, it is
- * subject to race conditions and requires special handling.
- * See below.
- */
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Postpone actual checkpointing to timer list processing
+ * to properly check if we actually need it.
+ */
+switch (timer_list->clock->type) {
+case QEMU_CLOCK_VIRTUAL:
 need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL;
+break;
+case QEMU_CLOCK_HOST:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_HOST;
+break;
+case QEMU_CLOCK_VIRTUAL_RT:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL_RT;
+break;
+default:
+break;
 }
-break;
-case QEMU_CLOCK_HOST:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
-goto out;
-}
-break;
-case QEMU_CLOCK_VIRTUAL_RT:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
-goto out;
-}
-break;
 }
 
 /*
- * Extract expired timers from active timers list and and process them.
+ * Extract expired timers from active timers list and and process them,
+ * taking into account checkpointing required in rr mode.
  *
- * In rr mode we need "filtered" checkpointing for virtual clock.
- * Checkpoint must be replayed before any non-EXTERNAL timer has been
- * processed and only one time (virtual clock value stays same). But these
- * timers may appear in the timers list while it being processed, so this
- * must be checked until we finally decide that "no timers left - we are
- * done".
+ * Checkpoint must be replayed before any timer has been processed
+ * and only one time. But new timers may appear in the timers list while
+ * it's being processed, so this must be checked until we finally decide
+ * that "no timers left - we are done" (to avoid skipping checkpoint due to
+ * possible races).
+ * Also checkpoint for virtual clock is redundant in cases where it's being
+ * triggered with only non-EXTERNAL timers, because these timers don't
+ * change guest state directly.
  */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
 qemu_mutex_lock(_list->active_timers_lock);
@@ -552,7 +550,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 /* once we got here, checkpoint clock only once */
 need_replay_checkpoint = false;
 qemu_mutex_unlock(_list->active_timers_lock);
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+if (!replay_checkpoint(replay_checkpoint_id)) {
 goto out;
 }
 qemu_mutex_lock(_list->active_timers_lock);
-- 
2.7.4




[Qemu-devel] [PATCH v3 2/4] Introduce attributes to qemu timer subsystem

2018-10-18 Thread Artem Pisarenko
Attributes are simple flags, associated with individual timers for their whole 
lifetime.
They intended to be used to mark individual timers for special handling by 
various qemu features which have integration into qemu-timer.
New/init functions family in timer interface updated and refactored (new 
'attribute' argument added, timer_list replaced with timer_list_group+type 
combinations, comments improved to avoid info duplication).
Also existing aio interface extended with attribute-enabled variants of 
functions, which create/initialize timers.

Signed-off-by: Artem Pisarenko 
---

Notes:
v3:
- attributes has been properly incapsulated to qemu-timer (as suggested by 
Stefan Hajnoczi)
- attributes definition and docs refactored to avoid extra enum and use 
simple macros with explicit bit positions (as suggested by Stefan Hajnoczi and 
Paolo Bonzini)
- fixed old "QEMU_TIMER_ATTR(id)" notation (in comments) left from initial 
patch version

v2:
- timer creation/initialize functions reworked and and their unnecessary 
variants removed (as Paolo Bonzini suggested)
- also their comments improved to avoid info duplication

 include/block/aio.h   |  59 ++---
 include/qemu/timer.h  | 110 +++---
 tests/ptimer-test-stubs.c |  13 --
 util/qemu-timer.c |  13 --
 4 files changed, 125 insertions(+), 70 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index f08630c..0ca25df 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -388,6 +388,32 @@ struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, 
Error **errp);
 struct LinuxAioState *aio_get_linux_aio(AioContext *ctx);
 
 /**
+ * aio_timer_new_with_attrs:
+ * @ctx: the aio context
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR_ values
+ *  to assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Allocate a new timer (with attributes) attached to the context @ctx.
+ * The function is responsible for memory allocation.
+ *
+ * The preferred interface is aio_timer_init or aio_timer_init_with_attrs.
+ * Use that unless you really need dynamic memory allocation.
+ *
+ * Returns: a pointer to the new timer
+ */
+static inline QEMUTimer *aio_timer_new_with_attrs(AioContext *ctx,
+  QEMUClockType type,
+  int scale, int attributes,
+  QEMUTimerCB *cb, void 
*opaque)
+{
+return timer_new_full(>tlg, type, scale, attributes, cb, opaque);
+}
+
+/**
  * aio_timer_new:
  * @ctx: the aio context
  * @type: the clock type
@@ -396,10 +422,7 @@ struct LinuxAioState *aio_get_linux_aio(AioContext *ctx);
  * @opaque: the opaque pointer to pass to the callback
  *
  * Allocate a new timer attached to the context @ctx.
- * The function is responsible for memory allocation.
- *
- * The preferred interface is aio_timer_init. Use that
- * unless you really need dynamic memory allocation.
+ * See aio_timer_new_with_attrs for details.
  *
  * Returns: a pointer to the new timer
  */
@@ -407,7 +430,29 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, 
QEMUClockType type,
int scale,
QEMUTimerCB *cb, void *opaque)
 {
-return timer_new_tl(ctx->tlg.tl[type], scale, cb, opaque);
+return timer_new_full(>tlg, type, scale, 0, cb, opaque);
+}
+
+/**
+ * aio_timer_init_with_attrs:
+ * @ctx: the aio context
+ * @ts: the timer
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR_ values
+ *  to assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Initialise a new timer (with attributes) attached to the context @ctx.
+ * The caller is responsible for memory allocation.
+ */
+static inline void aio_timer_init_with_attrs(AioContext *ctx,
+ QEMUTimer *ts, QEMUClockType type,
+ int scale, int attributes,
+ QEMUTimerCB *cb, void *opaque)
+{
+timer_init_full(ts, >tlg, type, scale, attributes, cb, opaque);
 }
 
 /**
@@ -420,14 +465,14 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, 
QEMUClockType type,
  * @opaque: the opaque pointer to pass to the callback
  *
  * Initialise a new timer attached to the context @ctx.
- * The caller is responsible for memory allocation.
+ * See aio_timer_init_with_attrs for details.
  */
 static inline void aio_timer_init(AioContext *ctx,
   QEMUTimer *ts, QEMUClockType type,
  

[Qemu-devel] [PATCH v3 4/4] vl, qapi: offset calculation in RTC_CHANGE event reverted

2018-10-18 Thread Artem Pisarenko
Return value of qemu_timedate_diff(), used for calculation offset in
QAPI 'RTC_CHANGE' event, restored to keep compatibility. Since it
wasn't documented that difference is relative to host clock
advancement, this change also adds important note to 'RTC_CHANGE'
event description to highlight established implementation specifics.

Signed-off-by: Artem Pisarenko 
---
 qapi/misc.json |  3 ++-
 vl.c   | 10 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index ada9af5..d0f5381 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -3070,7 +3070,8 @@
 # Emitted when the guest changes the RTC time.
 #
 # @offset: offset between base RTC clock (as specified by -rtc base), and
-#  new RTC clock value
+#  new RTC clock value. Note that value will be different depending
+#  on clock chosen to drive RTC (specified by -rtc clock).
 #
 # Note: This event is rate-limited.
 #
diff --git a/vl.c b/vl.c
index 78a8a68..c350aba 100644
--- a/vl.c
+++ b/vl.c
@@ -788,10 +788,10 @@ void qemu_system_vmstop_request(RunState state)
 
 /***/
 /* RTC reference time/date access */
-static time_t qemu_ref_timedate(void)
+static time_t qemu_ref_timedate(QEMUClockType clock)
 {
-time_t value = qemu_clock_get_ms(rtc_clock) / 1000;
-switch (rtc_clock) {
+time_t value = qemu_clock_get_ms(clock) / 1000;
+switch (clock) {
 case QEMU_CLOCK_REALTIME:
 value -= rtc_realtime_clock_offset;
 /* no break */
@@ -811,7 +811,7 @@ static time_t qemu_ref_timedate(void)
 
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_ref_timedate();
+time_t ti = qemu_ref_timedate(rtc_clock);
 
 ti += offset;
 
@@ -849,7 +849,7 @@ int qemu_timedate_diff(struct tm *tm)
 break;
 }
 
-return seconds - qemu_ref_timedate();
+return seconds - qemu_ref_timedate(QEMU_CLOCK_HOST);
 }
 
 static void configure_rtc_base_datetime(const char *startdate)
-- 
2.7.4




[Qemu-devel] [PATCH v3 3/4] Fixes RTC bug with base datetime shifts in clock=vm

2018-10-18 Thread Artem Pisarenko
This makes all current "-rtc" option parameters combinations produce
fixed/unambiguous RTC timedate reference for hardware emulation
frontends.
It restores determinism of guest execution when used with clock=vm and
specified base  value.

Buglink: https://bugs.launchpad.net/qemu/+bug/1797033
Signed-off-by: Artem Pisarenko 
---
 vl.c | 58 +++---
 1 file changed, 35 insertions(+), 23 deletions(-)

diff --git a/vl.c b/vl.c
index 10c4275..78a8a68 100644
--- a/vl.c
+++ b/vl.c
@@ -152,8 +152,10 @@ static enum {
 RTC_BASE_LOCALTIME,
 RTC_BASE_DATETIME,
 } rtc_base_type = RTC_BASE_UTC;
-static int rtc_host_datetime_offset = -1; /* valid only for host rtc_clock and
- rtc_base_type=RTC_BASE_DATETIME */
+static time_t rtc_ref_start_datetime;
+static int rtc_realtime_clock_offset; /* used only with QEMU_CLOCK_REALTIME */
+static int rtc_host_datetime_offset = -1; /* valid & used only with
+ RTC_BASE_DATETIME */
 QEMUClockType rtc_clock;
 int vga_interface_type = VGA_NONE;
 static DisplayOptions dpy;
@@ -785,32 +787,42 @@ void qemu_system_vmstop_request(RunState state)
 }
 
 /***/
-/* real time host monotonic timer */
-
-static time_t qemu_timedate(void)
-{
-return qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
-}
-
-/***/
 /* RTC reference time/date access */
+static time_t qemu_ref_timedate(void)
+{
+time_t value = qemu_clock_get_ms(rtc_clock) / 1000;
+switch (rtc_clock) {
+case QEMU_CLOCK_REALTIME:
+value -= rtc_realtime_clock_offset;
+/* no break */
+case QEMU_CLOCK_VIRTUAL:
+value += rtc_ref_start_datetime;
+break;
+case QEMU_CLOCK_HOST:
+if (rtc_base_type == RTC_BASE_DATETIME) {
+value -= rtc_host_datetime_offset;
+}
+break;
+default:
+assert(0);
+}
+return value;
+}
+
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_timedate();
+time_t ti = qemu_ref_timedate();
 
 ti += offset;
 
 switch (rtc_base_type) {
+case RTC_BASE_DATETIME:
 case RTC_BASE_UTC:
 gmtime_r(, tm);
 break;
 case RTC_BASE_LOCALTIME:
 localtime_r(, tm);
 break;
-case RTC_BASE_DATETIME:
-ti -= rtc_host_datetime_offset;
-gmtime_r(, tm);
-break;
 }
 }
 
@@ -819,6 +831,7 @@ int qemu_timedate_diff(struct tm *tm)
 time_t seconds;
 
 switch (rtc_base_type) {
+case RTC_BASE_DATETIME:
 case RTC_BASE_UTC:
 seconds = mktimegm(tm);
 break;
@@ -829,9 +842,6 @@ int qemu_timedate_diff(struct tm *tm)
 seconds = mktime();
 break;
 }
-case RTC_BASE_DATETIME:
-seconds = mktimegm(tm) + rtc_host_datetime_offset;
-break;
 default:
 /* gcc complains: ‘seconds’ may be used uninitialized */
 g_assert_not_reached();
@@ -839,10 +849,10 @@ int qemu_timedate_diff(struct tm *tm)
 break;
 }
 
-return seconds - qemu_timedate();
+return seconds - qemu_ref_timedate();
 }
 
-static void configure_rtc_host_datetime_offset(const char *startdate)
+static void configure_rtc_base_datetime(const char *startdate)
 {
 time_t rtc_start_datetime;
 struct tm tm;
@@ -868,8 +878,8 @@ static void configure_rtc_host_datetime_offset(const char 
*startdate)
  "'2006-06-17T16:01:21' or '2006-06-17'\n");
 exit(1);
 }
-rtc_host_datetime_offset = (qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000)
-   - rtc_start_datetime;
+rtc_host_datetime_offset = rtc_ref_start_datetime - rtc_start_datetime;
+rtc_ref_start_datetime = rtc_start_datetime;
 }
 
 static void configure_rtc(QemuOpts *opts)
@@ -888,7 +898,7 @@ static void configure_rtc(QemuOpts *opts)
 replay_add_blocker(blocker);
 } else {
 rtc_base_type = RTC_BASE_DATETIME;
-configure_rtc_host_datetime_offset(value);
+configure_rtc_base_datetime(value);
 }
 }
 value = qemu_opt_get(opts, "clock");
@@ -3039,6 +3049,8 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 rtc_clock = QEMU_CLOCK_HOST;
+rtc_ref_start_datetime = qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
+rtc_realtime_clock_offset = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
 
 QLIST_INIT (_change_state_head);
 os_setup_early_signal_handling();
-- 
2.7.4




[Qemu-devel] [PATCH v3 2/4] vl: refactor -rtc option references

2018-10-18 Thread Artem Pisarenko
Improve code readability and prepare for fixing bug #1797033

Signed-off-by: Artem Pisarenko 
---

Notes:
v2: fixed compiler warning

 vl.c | 85 ++--
 1 file changed, 53 insertions(+), 32 deletions(-)

diff --git a/vl.c b/vl.c
index 4e25c78..10c4275 100644
--- a/vl.c
+++ b/vl.c
@@ -147,8 +147,13 @@ bool enable_cpu_pm = false;
 int nb_nics;
 NICInfo nd_table[MAX_NICS];
 int autostart;
-static int rtc_utc = 1;
-static int rtc_date_offset = -1; /* -1 means no change */
+static enum {
+RTC_BASE_UTC,
+RTC_BASE_LOCALTIME,
+RTC_BASE_DATETIME,
+} rtc_base_type = RTC_BASE_UTC;
+static int rtc_host_datetime_offset = -1; /* valid only for host rtc_clock and
+ rtc_base_type=RTC_BASE_DATETIME */
 QEMUClockType rtc_clock;
 int vga_interface_type = VGA_NONE;
 static DisplayOptions dpy;
@@ -782,26 +787,30 @@ void qemu_system_vmstop_request(RunState state)
 /***/
 /* real time host monotonic timer */
 
-static time_t qemu_time(void)
+static time_t qemu_timedate(void)
 {
 return qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
 }
 
 /***/
-/* host time/date access */
+/* RTC reference time/date access */
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_time();
+time_t ti = qemu_timedate();
 
 ti += offset;
-if (rtc_date_offset == -1) {
-if (rtc_utc)
-gmtime_r(, tm);
-else
-localtime_r(, tm);
-} else {
-ti -= rtc_date_offset;
+
+switch (rtc_base_type) {
+case RTC_BASE_UTC:
 gmtime_r(, tm);
+break;
+case RTC_BASE_LOCALTIME:
+localtime_r(, tm);
+break;
+case RTC_BASE_DATETIME:
+ti -= rtc_host_datetime_offset;
+gmtime_r(, tm);
+break;
 }
 }
 
@@ -809,23 +818,33 @@ int qemu_timedate_diff(struct tm *tm)
 {
 time_t seconds;
 
-if (rtc_date_offset == -1)
-if (rtc_utc)
-seconds = mktimegm(tm);
-else {
-struct tm tmp = *tm;
-tmp.tm_isdst = -1; /* use timezone to figure it out */
-seconds = mktime();
-   }
-else
-seconds = mktimegm(tm) + rtc_date_offset;
+switch (rtc_base_type) {
+case RTC_BASE_UTC:
+seconds = mktimegm(tm);
+break;
+case RTC_BASE_LOCALTIME:
+{
+struct tm tmp = *tm;
+tmp.tm_isdst = -1; /* use timezone to figure it out */
+seconds = mktime();
+break;
+}
+case RTC_BASE_DATETIME:
+seconds = mktimegm(tm) + rtc_host_datetime_offset;
+break;
+default:
+/* gcc complains: ‘seconds’ may be used uninitialized */
+g_assert_not_reached();
+seconds = -1;
+break;
+}
 
-return seconds - qemu_time();
+return seconds - qemu_timedate();
 }
 
-static void configure_rtc_date_offset(const char *startdate)
+static void configure_rtc_host_datetime_offset(const char *startdate)
 {
-time_t rtc_start_date;
+time_t rtc_start_datetime;
 struct tm tm;
 
 if (sscanf(startdate, "%d-%d-%dT%d:%d:%d", _year, _mon,
@@ -841,15 +860,16 @@ static void configure_rtc_date_offset(const char 
*startdate)
 }
 tm.tm_year -= 1900;
 tm.tm_mon--;
-rtc_start_date = mktimegm();
-if (rtc_start_date == -1) {
+rtc_start_datetime = mktimegm();
+if (rtc_start_datetime == -1) {
 date_fail:
-error_report("invalid date format");
+error_report("invalid datetime format");
 error_printf("valid formats: "
  "'2006-06-17T16:01:21' or '2006-06-17'\n");
 exit(1);
 }
-rtc_date_offset = qemu_time() - rtc_start_date;
+rtc_host_datetime_offset = (qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000)
+   - rtc_start_datetime;
 }
 
 static void configure_rtc(QemuOpts *opts)
@@ -859,15 +879,16 @@ static void configure_rtc(QemuOpts *opts)
 value = qemu_opt_get(opts, "base");
 if (value) {
 if (!strcmp(value, "utc")) {
-rtc_utc = 1;
+rtc_base_type = RTC_BASE_UTC;
 } else if (!strcmp(value, "localtime")) {
 Error *blocker = NULL;
-rtc_utc = 0;
+rtc_base_type = RTC_BASE_LOCALTIME;
 error_setg(, QERR_REPLAY_NOT_SUPPORTED,
   "-rtc base=localtime");
 replay_add_blocker(blocker);
 } else {
-configure_rtc_date_offset(value);
+rtc_base_type = RTC_BASE_DATETIME;
+configure_rtc_host_datetime_offset(value);
 }
 }
 value = qemu_opt_get(opts, "clock");
-- 
2.7.4




[Qemu-devel] [PATCH v3 1/4] vl: improve/fix documentation related to RTC function

2018-10-18 Thread Artem Pisarenko
Documentation describing -rtc option updated to better match current
implementation and highlight some important specifics.

Signed-off-by: Artem Pisarenko 
---
 qemu-options.hx | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index f139459..4a9c065 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3458,25 +3458,29 @@ HXCOMM Silently ignored for compatibility
 DEF("clock", HAS_ARG, QEMU_OPTION_clock, "", QEMU_ARCH_ALL)
 
 DEF("rtc", HAS_ARG, QEMU_OPTION_rtc, \
-"-rtc [base=utc|localtime|date][,clock=host|rt|vm][,driftfix=none|slew]\n" 
\
+"-rtc 
[base=utc|localtime|][,clock=host|rt|vm][,driftfix=none|slew]\n" \
 "set the RTC base and clock, enable drift fix for clock 
ticks (x86 only)\n",
 QEMU_ARCH_ALL)
 
 STEXI
 
-@item -rtc [base=utc|localtime|@var{date}][,clock=host|vm][,driftfix=none|slew]
+@item -rtc 
[base=utc|localtime|@var{datetime}][,clock=host|rt|vm][,driftfix=none|slew]
 @findex -rtc
 Specify @option{base} as @code{utc} or @code{localtime} to let the RTC start 
at the current
 UTC or local time, respectively. @code{localtime} is required for correct date 
in
-MS-DOS or Windows. To start at a specific point in time, provide @var{date} in 
the
+MS-DOS or Windows. To start at a specific point in time, provide 
@var{datetime} in the
 format @code{2006-06-17T16:01:21} or @code{2006-06-17}. The default base is 
UTC.
 
 By default the RTC is driven by the host system time. This allows using of the
 RTC as accurate reference clock inside the guest, specifically if the host
 time is smoothly following an accurate external reference clock, e.g. via NTP.
 If you want to isolate the guest time from the host, you can set @option{clock}
-to @code{rt} instead.  To even prevent it from progressing during suspension,
-you can set it to @code{vm}.
+to @code{rt} instead, which provides host monotonic clock if host support it.
+To even prevent it from progressing during suspension, you can set it to
+@code{vm} (virtual clock). Virtual clock is the clock seen by guest,
+In icount mode of emulation its (long term) speed will be different from
+any host clock, when icount configured to non-auto value or virtual cpu 
sleeping
+is off, and no synchronization algorithm is active.
 
 Enable @option{driftfix} (i386 targets only) if you experience time drift 
problems,
 specifically with Windows' ACPI HAL. This option will try to figure out how
-- 
2.7.4




[Qemu-devel] [PATCH v3 0/4] Fix and improve core RTC function and documentation

2018-10-18 Thread Artem Pisarenko
Modifications are motivated by bug https://bugs.launchpad.net/qemu/+bug/1797033 
I've encountered recently.

Trying to fix it and analyzing its effect on all use cases (not covered in bug 
report) revealed much deeper problems.
This is my first patch to QEMU and I'm not sure whether the way I addressed 
them is proper.
They definitely require either modifications to current implementation 
(breaking compability), or adding clarifications to user documentation. I've 
tried to find compromise.
Changes I propose are driven mostly by my interpretation of user documentation 
and comments in code, rather than by existing implementation.

I've splitted patches in a way that at least some of them may be accepted. 
Following subsets are make sense on their own:
- 1
- 1,2,3

I limited refactoring only to vl.c, although it leads to reworking of all hw/* 
rtc models which have unreasonable duplications of actions performed by vl.c.
I'm sure such kind of rework will reveal more hidden bugs/features, but so far 
I've had enough (it's a long story how muсh effort and pain cost me to find 
that tricky bug above).
I suppose just to draw attention of relevant hw/* maintainers.


v3 changes:
 - commit messages rewritten/shortened/wrapped to match patch requirements
 - fixed minor typo
v2 changes:
 - fixed compiler warning caused non-debug build fail


Artem Pisarenko (4):
  vl: improve/fix documentation related to RTC function
  vl: refactor -rtc option references
  Fixes RTC bug with base datetime shifts in clock=vm
  vl,qapi: offset calculation in RTC_CHANGE event reverted

 qapi/misc.json  |   3 +-
 qemu-options.hx |  14 +---
 vl.c| 105 +---
 3 files changed, 80 insertions(+), 42 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [Qemu-block] [PATCH v2 2/4] Introduce attributes to qemu timer subsystem

2018-10-17 Thread Artem Pisarenko
>ср, 17 окт. 2018 г. в 20:43, Paolo Bonzini:
>On 17/10/2018 15:07, Artem Pisarenko wrote:
>> Yes, but without words ..."when they fire" in attributes comments.
>
> For now the only attribute applies when timers fire; that sentence was a
> way to clarify the intended scope and address Stefan's comment.  I can
> remove it too, but I'd like to understand if you have other ideas for
> attributes.

No, I haven't. I don't think that this partcular narrowing has any
relevance to Stefan's comment, so I just considered it to be unnecessary. I
don't insist.
-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] [PATCH v2 4/4] vl, qapi: offset value calculation in RTC_CHANGE event reverted to match behavior before #1797033 bugfix and documented

2018-10-17 Thread Artem Pisarenko
I fixed them and ready to submit v3.
Anything else ?

ср, 17 окт. 2018 г. в 20:01, Eric Blake :

> On 10/17/18 7:23 AM, Artem Pisarenko wrote:
> > Return value of qemu_timedate_diff(), used for calculation offset in
> QAPI 'RTC_CHANGE' event restored to keep compatibility, although it isn't
> documented that difference is relative to host clock advancement.
> > Added important note to 'RTC_CHANGE' event description to highlight
> established implementation specifics.
> >
>
> Hmm, I just replied to v1 before seeing that you sent v2; my comments
> about long lines and a typo still apply.
>
> > Signed-off-by: Artem Pisarenko 
> > ---
> >   qapi/misc.json |  3 ++-
> >   vl.c   | 10 +-
> >   2 files changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/qapi/misc.json b/qapi/misc.json
> > index ada9af5..ed866f2 100644
> > --- a/qapi/misc.json
> > +++ b/qapi/misc.json
> > @@ -3070,7 +3070,8 @@
> >   # Emitted when the guest changes the RTC time.
> >   #
> >   # @offset: offset between base RTC clock (as specified by -rtc base),
> and
> > -#  new RTC clock value
> > +#  new RTC clock value. Note that value will be different
> depending
> > +#  on clock choosen to drive RTC (specified by -rtc clock).
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266 <(919)%20301-3266>
> Virtualization:  qemu.org | libvirt.org
>
-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] [PATCH] Revert "icount: remove obsolete warp call"

2018-10-17 Thread Artem Pisarenko
> I found the source of the bug. As QEMU becomes more multi-threaded and
non-> synchronized,
> checkpoints move from thread to thread.
> And the event queue that processed at checkpoints should belong to the
same thread
> in both record and replay executions.
>
> Current problem was with the checkpoint for virtual timers. They are
processed from different threads:
> from vCPU and from aio_dispatch function.

What bug you're talking about ? This patch thread started with intention to
fix generic bug with icount, which is being encountered in non-rr mode. As
far as I understood your words (and patch content) it will not fix that
generic bug.
Anyway, it's a good news!
-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] [Qemu-block] [PATCH v2 2/4] Introduce attributes to qemu timer subsystem

2018-10-17 Thread Artem Pisarenko
Yes, but without words ..."when they fire" in attributes comments.

ср, 17 окт. 2018 г. в 17:24, Paolo Bonzini :

> On 17/10/2018 12:57, Artem Pisarenko wrote:
> >> Further down in this patch the notation is QEMU_TIMER_ATTR_, which I
> >> think is clearer because QEMU_TIMER_ATTR(id) looks like a (non-existent)
> >> macro.  Please use the QEMU_TIMER_ATTR_ notation consistently.
> >
> > Yes, I've just forgot to update comments after previous patch version,
> > where it actually was macro.
> >
> >> What is the purpose of this bit?  I guess it's just here as a
> >> placeholder because no real bits have been defined yet.  Hopefully the
> >> next patch removes it (/* This placeholder is removed in the next patch
> >> */ would be a nice way to document this for reviewers).
> >
> > It's just to prevent compilation errors, as required by
> > https://wiki.qemu.org/Contribute/SubmitAPatch#Split_up_long_patches
> >
> >> The enum isn't needed and makes debugging harder since the bit number is
> >> implicit in the enum ordering.  This alternative is clearer and more
> >> concise:
> >>
> >>   #define QEMU_TIMER_ATTR_foo BIT(n)
> >
> > Agree.
>
> Like this?
>
> ...
>
> --

С уважением,
  Артем Писаренко


[Qemu-devel] [PATCH v2 4/4] vl, qapi: offset value calculation in RTC_CHANGE event reverted to match behavior before #1797033 bugfix and documented

2018-10-17 Thread Artem Pisarenko
Return value of qemu_timedate_diff(), used for calculation offset in QAPI 
'RTC_CHANGE' event restored to keep compatibility, although it isn't documented 
that difference is relative to host clock advancement.
Added important note to 'RTC_CHANGE' event description to highlight established 
implementation specifics.

Signed-off-by: Artem Pisarenko 
---
 qapi/misc.json |  3 ++-
 vl.c   | 10 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index ada9af5..ed866f2 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -3070,7 +3070,8 @@
 # Emitted when the guest changes the RTC time.
 #
 # @offset: offset between base RTC clock (as specified by -rtc base), and
-#  new RTC clock value
+#  new RTC clock value. Note that value will be different depending
+#  on clock choosen to drive RTC (specified by -rtc clock).
 #
 # Note: This event is rate-limited.
 #
diff --git a/vl.c b/vl.c
index 78a8a68..c350aba 100644
--- a/vl.c
+++ b/vl.c
@@ -788,10 +788,10 @@ void qemu_system_vmstop_request(RunState state)
 
 /***/
 /* RTC reference time/date access */
-static time_t qemu_ref_timedate(void)
+static time_t qemu_ref_timedate(QEMUClockType clock)
 {
-time_t value = qemu_clock_get_ms(rtc_clock) / 1000;
-switch (rtc_clock) {
+time_t value = qemu_clock_get_ms(clock) / 1000;
+switch (clock) {
 case QEMU_CLOCK_REALTIME:
 value -= rtc_realtime_clock_offset;
 /* no break */
@@ -811,7 +811,7 @@ static time_t qemu_ref_timedate(void)
 
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_ref_timedate();
+time_t ti = qemu_ref_timedate(rtc_clock);
 
 ti += offset;
 
@@ -849,7 +849,7 @@ int qemu_timedate_diff(struct tm *tm)
 break;
 }
 
-return seconds - qemu_ref_timedate();
+return seconds - qemu_ref_timedate(QEMU_CLOCK_HOST);
 }
 
 static void configure_rtc_base_datetime(const char *startdate)
-- 
2.7.4




[Qemu-devel] [PATCH v2 2/4] vl: refactor -rtc option references

2018-10-17 Thread Artem Pisarenko
Improve code readability and prepare for fixing bug #1797033

Signed-off-by: Artem Pisarenko 
---

Notes:
v2: fixed compiler warning

 vl.c | 85 ++--
 1 file changed, 53 insertions(+), 32 deletions(-)

diff --git a/vl.c b/vl.c
index 4e25c78..10c4275 100644
--- a/vl.c
+++ b/vl.c
@@ -147,8 +147,13 @@ bool enable_cpu_pm = false;
 int nb_nics;
 NICInfo nd_table[MAX_NICS];
 int autostart;
-static int rtc_utc = 1;
-static int rtc_date_offset = -1; /* -1 means no change */
+static enum {
+RTC_BASE_UTC,
+RTC_BASE_LOCALTIME,
+RTC_BASE_DATETIME,
+} rtc_base_type = RTC_BASE_UTC;
+static int rtc_host_datetime_offset = -1; /* valid only for host rtc_clock and
+ rtc_base_type=RTC_BASE_DATETIME */
 QEMUClockType rtc_clock;
 int vga_interface_type = VGA_NONE;
 static DisplayOptions dpy;
@@ -782,26 +787,30 @@ void qemu_system_vmstop_request(RunState state)
 /***/
 /* real time host monotonic timer */
 
-static time_t qemu_time(void)
+static time_t qemu_timedate(void)
 {
 return qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
 }
 
 /***/
-/* host time/date access */
+/* RTC reference time/date access */
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_time();
+time_t ti = qemu_timedate();
 
 ti += offset;
-if (rtc_date_offset == -1) {
-if (rtc_utc)
-gmtime_r(, tm);
-else
-localtime_r(, tm);
-} else {
-ti -= rtc_date_offset;
+
+switch (rtc_base_type) {
+case RTC_BASE_UTC:
 gmtime_r(, tm);
+break;
+case RTC_BASE_LOCALTIME:
+localtime_r(, tm);
+break;
+case RTC_BASE_DATETIME:
+ti -= rtc_host_datetime_offset;
+gmtime_r(, tm);
+break;
 }
 }
 
@@ -809,23 +818,33 @@ int qemu_timedate_diff(struct tm *tm)
 {
 time_t seconds;
 
-if (rtc_date_offset == -1)
-if (rtc_utc)
-seconds = mktimegm(tm);
-else {
-struct tm tmp = *tm;
-tmp.tm_isdst = -1; /* use timezone to figure it out */
-seconds = mktime();
-   }
-else
-seconds = mktimegm(tm) + rtc_date_offset;
+switch (rtc_base_type) {
+case RTC_BASE_UTC:
+seconds = mktimegm(tm);
+break;
+case RTC_BASE_LOCALTIME:
+{
+struct tm tmp = *tm;
+tmp.tm_isdst = -1; /* use timezone to figure it out */
+seconds = mktime();
+break;
+}
+case RTC_BASE_DATETIME:
+seconds = mktimegm(tm) + rtc_host_datetime_offset;
+break;
+default:
+/* gcc complains: ‘seconds’ may be used uninitialized */
+g_assert_not_reached();
+seconds = -1;
+break;
+}
 
-return seconds - qemu_time();
+return seconds - qemu_timedate();
 }
 
-static void configure_rtc_date_offset(const char *startdate)
+static void configure_rtc_host_datetime_offset(const char *startdate)
 {
-time_t rtc_start_date;
+time_t rtc_start_datetime;
 struct tm tm;
 
 if (sscanf(startdate, "%d-%d-%dT%d:%d:%d", _year, _mon,
@@ -841,15 +860,16 @@ static void configure_rtc_date_offset(const char 
*startdate)
 }
 tm.tm_year -= 1900;
 tm.tm_mon--;
-rtc_start_date = mktimegm();
-if (rtc_start_date == -1) {
+rtc_start_datetime = mktimegm();
+if (rtc_start_datetime == -1) {
 date_fail:
-error_report("invalid date format");
+error_report("invalid datetime format");
 error_printf("valid formats: "
  "'2006-06-17T16:01:21' or '2006-06-17'\n");
 exit(1);
 }
-rtc_date_offset = qemu_time() - rtc_start_date;
+rtc_host_datetime_offset = (qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000)
+   - rtc_start_datetime;
 }
 
 static void configure_rtc(QemuOpts *opts)
@@ -859,15 +879,16 @@ static void configure_rtc(QemuOpts *opts)
 value = qemu_opt_get(opts, "base");
 if (value) {
 if (!strcmp(value, "utc")) {
-rtc_utc = 1;
+rtc_base_type = RTC_BASE_UTC;
 } else if (!strcmp(value, "localtime")) {
 Error *blocker = NULL;
-rtc_utc = 0;
+rtc_base_type = RTC_BASE_LOCALTIME;
 error_setg(, QERR_REPLAY_NOT_SUPPORTED,
   "-rtc base=localtime");
 replay_add_blocker(blocker);
 } else {
-configure_rtc_date_offset(value);
+rtc_base_type = RTC_BASE_DATETIME;
+configure_rtc_host_datetime_offset(value);
 }
 }
 value = qemu_opt_get(opts, "clock");
-- 
2.7.4




[Qemu-devel] [PATCH v2 1/4] vl: improve/fix documentation related to RTC function

2018-10-17 Thread Artem Pisarenko
Documentation describing -rtc option updated to better match current 
implementation and highlight some important specifics.

Signed-off-by: Artem Pisarenko 
---
 qemu-options.hx | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index f139459..4a9c065 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3458,25 +3458,29 @@ HXCOMM Silently ignored for compatibility
 DEF("clock", HAS_ARG, QEMU_OPTION_clock, "", QEMU_ARCH_ALL)
 
 DEF("rtc", HAS_ARG, QEMU_OPTION_rtc, \
-"-rtc [base=utc|localtime|date][,clock=host|rt|vm][,driftfix=none|slew]\n" 
\
+"-rtc 
[base=utc|localtime|][,clock=host|rt|vm][,driftfix=none|slew]\n" \
 "set the RTC base and clock, enable drift fix for clock 
ticks (x86 only)\n",
 QEMU_ARCH_ALL)
 
 STEXI
 
-@item -rtc [base=utc|localtime|@var{date}][,clock=host|vm][,driftfix=none|slew]
+@item -rtc 
[base=utc|localtime|@var{datetime}][,clock=host|rt|vm][,driftfix=none|slew]
 @findex -rtc
 Specify @option{base} as @code{utc} or @code{localtime} to let the RTC start 
at the current
 UTC or local time, respectively. @code{localtime} is required for correct date 
in
-MS-DOS or Windows. To start at a specific point in time, provide @var{date} in 
the
+MS-DOS or Windows. To start at a specific point in time, provide 
@var{datetime} in the
 format @code{2006-06-17T16:01:21} or @code{2006-06-17}. The default base is 
UTC.
 
 By default the RTC is driven by the host system time. This allows using of the
 RTC as accurate reference clock inside the guest, specifically if the host
 time is smoothly following an accurate external reference clock, e.g. via NTP.
 If you want to isolate the guest time from the host, you can set @option{clock}
-to @code{rt} instead.  To even prevent it from progressing during suspension,
-you can set it to @code{vm}.
+to @code{rt} instead, which provides host monotonic clock if host support it.
+To even prevent it from progressing during suspension, you can set it to
+@code{vm} (virtual clock). Virtual clock is the clock seen by guest,
+In icount mode of emulation its (long term) speed will be different from
+any host clock, when icount configured to non-auto value or virtual cpu 
sleeping
+is off, and no synchronization algorithm is active.
 
 Enable @option{driftfix} (i386 targets only) if you experience time drift 
problems,
 specifically with Windows' ACPI HAL. This option will try to figure out how
-- 
2.7.4




[Qemu-devel] [PATCH v2 3/4] Fixes RTC bug with base datetime shifts in clock=vm

2018-10-17 Thread Artem Pisarenko
This makes all current "-rtc" option parameters combinations produce 
fixed/unambiguous RTC timedate reference for hardware emulation frontends.
It restores determinism of guest execution when used with clock=vm and 
specified base  value.

Buglink: https://bugs.launchpad.net/qemu/+bug/1797033
Signed-off-by: Artem Pisarenko 
---
 vl.c | 58 +++---
 1 file changed, 35 insertions(+), 23 deletions(-)

diff --git a/vl.c b/vl.c
index 10c4275..78a8a68 100644
--- a/vl.c
+++ b/vl.c
@@ -152,8 +152,10 @@ static enum {
 RTC_BASE_LOCALTIME,
 RTC_BASE_DATETIME,
 } rtc_base_type = RTC_BASE_UTC;
-static int rtc_host_datetime_offset = -1; /* valid only for host rtc_clock and
- rtc_base_type=RTC_BASE_DATETIME */
+static time_t rtc_ref_start_datetime;
+static int rtc_realtime_clock_offset; /* used only with QEMU_CLOCK_REALTIME */
+static int rtc_host_datetime_offset = -1; /* valid & used only with
+ RTC_BASE_DATETIME */
 QEMUClockType rtc_clock;
 int vga_interface_type = VGA_NONE;
 static DisplayOptions dpy;
@@ -785,32 +787,42 @@ void qemu_system_vmstop_request(RunState state)
 }
 
 /***/
-/* real time host monotonic timer */
-
-static time_t qemu_timedate(void)
-{
-return qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
-}
-
-/***/
 /* RTC reference time/date access */
+static time_t qemu_ref_timedate(void)
+{
+time_t value = qemu_clock_get_ms(rtc_clock) / 1000;
+switch (rtc_clock) {
+case QEMU_CLOCK_REALTIME:
+value -= rtc_realtime_clock_offset;
+/* no break */
+case QEMU_CLOCK_VIRTUAL:
+value += rtc_ref_start_datetime;
+break;
+case QEMU_CLOCK_HOST:
+if (rtc_base_type == RTC_BASE_DATETIME) {
+value -= rtc_host_datetime_offset;
+}
+break;
+default:
+assert(0);
+}
+return value;
+}
+
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_timedate();
+time_t ti = qemu_ref_timedate();
 
 ti += offset;
 
 switch (rtc_base_type) {
+case RTC_BASE_DATETIME:
 case RTC_BASE_UTC:
 gmtime_r(, tm);
 break;
 case RTC_BASE_LOCALTIME:
 localtime_r(, tm);
 break;
-case RTC_BASE_DATETIME:
-ti -= rtc_host_datetime_offset;
-gmtime_r(, tm);
-break;
 }
 }
 
@@ -819,6 +831,7 @@ int qemu_timedate_diff(struct tm *tm)
 time_t seconds;
 
 switch (rtc_base_type) {
+case RTC_BASE_DATETIME:
 case RTC_BASE_UTC:
 seconds = mktimegm(tm);
 break;
@@ -829,9 +842,6 @@ int qemu_timedate_diff(struct tm *tm)
 seconds = mktime();
 break;
 }
-case RTC_BASE_DATETIME:
-seconds = mktimegm(tm) + rtc_host_datetime_offset;
-break;
 default:
 /* gcc complains: ‘seconds’ may be used uninitialized */
 g_assert_not_reached();
@@ -839,10 +849,10 @@ int qemu_timedate_diff(struct tm *tm)
 break;
 }
 
-return seconds - qemu_timedate();
+return seconds - qemu_ref_timedate();
 }
 
-static void configure_rtc_host_datetime_offset(const char *startdate)
+static void configure_rtc_base_datetime(const char *startdate)
 {
 time_t rtc_start_datetime;
 struct tm tm;
@@ -868,8 +878,8 @@ static void configure_rtc_host_datetime_offset(const char 
*startdate)
  "'2006-06-17T16:01:21' or '2006-06-17'\n");
 exit(1);
 }
-rtc_host_datetime_offset = (qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000)
-   - rtc_start_datetime;
+rtc_host_datetime_offset = rtc_ref_start_datetime - rtc_start_datetime;
+rtc_ref_start_datetime = rtc_start_datetime;
 }
 
 static void configure_rtc(QemuOpts *opts)
@@ -888,7 +898,7 @@ static void configure_rtc(QemuOpts *opts)
 replay_add_blocker(blocker);
 } else {
 rtc_base_type = RTC_BASE_DATETIME;
-configure_rtc_host_datetime_offset(value);
+configure_rtc_base_datetime(value);
 }
 }
 value = qemu_opt_get(opts, "clock");
@@ -3039,6 +3049,8 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 rtc_clock = QEMU_CLOCK_HOST;
+rtc_ref_start_datetime = qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
+rtc_realtime_clock_offset = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
 
 QLIST_INIT (_change_state_head);
 os_setup_early_signal_handling();
-- 
2.7.4




[Qemu-devel] [PATCH v2 0/4] Fix and improve core RTC function and documentation

2018-10-17 Thread Artem Pisarenko
Modifications are motivated by bug https://bugs.launchpad.net/qemu/+bug/1797033 
I've encountered recently.

Trying to fix it and analyzing its effect on all use cases (not covered in bug 
report) revealed much deeper problems.
This is my first patch to QEMU and I'm not sure whether the way I addressed 
them is proper.
They definitely require either modifications to current implementation 
(breaking compability), or adding clarifications to user documentation. I've 
tried to find compromise.
Changes I propose are driven mostly by my interpretation of user documentation 
and comments in code, rather than by existing implementation.

I've splitted patches in a way that at least some of them may be accepted. 
Following subsets are make sense on their own:
- 1
- 1,2,3

I limited refactoring only to vl.c, although it leads to reworking of all hw/* 
rtc models which have unreasonable duplications of actions performed by vl.c.
I'm sure such kind of rework will reveal more hidden bugs/features, but so far 
I've had enough (it's a long story how muсh effort and pain cost me to find 
that tricky bug above).
I suppose just to draw attention of relevant hw/* maintainers.


v2 changes:
 - fixed compiler warning caused non-debug build fail


Artem Pisarenko (4):
  vl: improve/fix documentation related to RTC function
  vl: refactor -rtc option references
  Fixes RTC bug with base datetime shifts in clock=vm
  vl,qapi: offset value calculation in RTC_CHANGE event reverted to
match behavior before #1797033 bugfix and documented

 qapi/misc.json  |   3 +-
 qemu-options.hx |  14 +---
 vl.c| 105 +---
 3 files changed, 80 insertions(+), 42 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PATCH 0/4] Fix and improve core RTC function and documentation

2018-10-17 Thread Artem Pisarenko
I checked it with ./configure --enable-debug and didn't imagined that in
non-debug build (with -O2) compiler will complain:
   "vl.c:847:12: error: ‘seconds’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]"

int qemu_timedate_diff(struct tm *tm)
{
time_t seconds;
switch (rtc_base_type) {
case RTC_BASE_DATETIME:
case RTC_BASE_UTC:
seconds = mktimegm(tm);
break;
case RTC_BASE_LOCALTIME:
{
struct tm tmp = *tm;
tmp.tm_isdst = -1; /* use timezone to figure it out */
seconds = mktime();
break;
}
}
>>>  return seconds - qemu_ref_timedate(QEMU_CLOCK_HOST);
}

Switch covers all defined enum values, so 'second' cannot be left
uninitialized. One way it could be is if someone assigns integer value to
'rtc_base_type' variable explicitly. Ok, let's look at it. This variable
initialized with valid value at definition and elsewhere in code it's being
assigned only valid enum values. Maybe it could be changed outside of
compile unit ? No, it's defined as static. Maybe GCC isn't just smart
enough to check it in whole compile unit scope context ? From my expirience
I've seen that it's able to make even more complex assumptions and smart
decisions. And with higher optimization levels GCC becomes just smarter.
But this time GCC behavior differs...
-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] [Qemu-block] [PATCH v2 2/4] Introduce attributes to qemu timer subsystem

2018-10-17 Thread Artem Pisarenko
> Further down in this patch the notation is QEMU_TIMER_ATTR_, which I
> think is clearer because QEMU_TIMER_ATTR(id) looks like a (non-existent)
> macro.  Please use the QEMU_TIMER_ATTR_ notation consistently.

Yes, I've just forgot to update comments after previous patch version,
where it actually was macro.

> What is the purpose of this bit?  I guess it's just here as a
> placeholder because no real bits have been defined yet.  Hopefully the
> next patch removes it (/* This placeholder is removed in the next patch
> */ would be a nice way to document this for reviewers).

It's just to prevent compilation errors, as required by
https://wiki.qemu.org/Contribute/SubmitAPatch#Split_up_long_patches

> The enum isn't needed and makes debugging harder since the bit number is
> implicit in the enum ordering.  This alternative is clearer and more
> concise:
>
>   #define QEMU_TIMER_ATTR_foo BIT(n)

Agree.

>ср, 17 окт. 2018 г. в 15:15, Paolo Bonzini :
>On 17/10/2018 11:12, Stefan Hajnoczi wrote:
>>> Attributes are simple flags, associated with individual timers for
their whole lifetime.
>>> They intended to be used to mark individual timers for special handling
by various qemu features operating at qemu core level.
>> I'm worried that this sentence suggests various parts of QEMU will stash
>> state in ts->attributes.  That's messy and they shouldn't do this.  Make
>> the field private to qemu-timer.c.
>
> Yes, the contents of the fields are private.  Are you suggesting a
> different wording for the commit message or the "QEMU Timer attributes"
> doc comment, or something more than that?  Possibly removing
> timer_get_attributes altogether?

Actually, attributes aren't intended to be changed externally of timers
code.
The purpose of timer_get_attributes() is just informational, same like if
timer_get_clock(), timer_get_scale(), etc. would exist. But since none of
such accessors exists, adding just one for 'attributes' looks exceptional.
And it isn't being used even after whole patch series applied. As such, it
could be just removed.
Any suggestons to improve commit message and/or comments, if any ?
-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] [PATCH] Revert "icount: remove obsolete warp call"

2018-10-17 Thread Artem Pisarenko
See my last comment in bug report. This kind of modification, even adapted
to changed function name, doesn't solve issue.
I thought long time that it does, but once I catched qemu with a hang. And
of course, I wasn't able to reproduce it. So it just better hides issue.
Take a look at alternative solution from QBox:
https://git.greensocs.com/qemu/qbox/commit/a8ed106032e375e715a531d6e93e4d9ec295dbdb
I didn't catched fail with it (yet).
-- 

С уважением,
  Артем Писаренко


[Qemu-devel] [PATCH v2 4/4] Optimize record/replay checkpointing for all clocks it applies to

2018-10-17 Thread Artem Pisarenko
Removes redundant checkpoints in replay log when there are no expired timers in 
timers list, associated with corresponding clock (i.e. no rr events associated 
with current clock value).
This also improves performance in rr mode.

Signed-off-by: Artem Pisarenko 
---
 include/qemu/timer.h |  2 +-
 util/qemu-timer.c| 62 +---
 2 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 8e3f236..bd627dc 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -70,7 +70,7 @@ typedef enum {
  * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
  *
  * Timers with this attribute do not recorded in rr mode, therefore it could be
- * used for the subsystems that operate outside the guest core. Applicable only
+ * used for the subsystems that operate outside the guest core. Relevant only
  * with virtual clock type.
  */
 
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index e9a0f00..7315ca1 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -495,6 +495,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 QEMUTimerCB *cb;
 void *opaque;
 bool need_replay_checkpoint = false;
+ReplayCheckpoint replay_checkpoint_id;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -505,43 +506,40 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 goto out;
 }
 
-switch (timer_list->clock->type) {
-case QEMU_CLOCK_REALTIME:
-break;
-default:
-case QEMU_CLOCK_VIRTUAL:
-if (replay_mode != REPLAY_MODE_NONE) {
-/* Checkpoint for virtual clock is redundant in cases where
- * it's being triggered with only non-EXTERNAL timers, because
- * these timers don't change guest state directly.
- * Since it has conditional dependence on specific timers, it is
- * subject to race conditions and requires special handling.
- * See below.
- */
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Postpone actual checkpointing to timer list processing
+ * to properly check if we actually need it.
+ */
+switch (timer_list->clock->type) {
+case QEMU_CLOCK_VIRTUAL:
 need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL;
+break;
+case QEMU_CLOCK_HOST:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_HOST;
+break;
+case QEMU_CLOCK_VIRTUAL_RT:
+need_replay_checkpoint = true;
+replay_checkpoint_id = CHECKPOINT_CLOCK_VIRTUAL_RT;
+break;
+default:
+break;
 }
-break;
-case QEMU_CLOCK_HOST:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
-goto out;
-}
-break;
-case QEMU_CLOCK_VIRTUAL_RT:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
-goto out;
-}
-break;
 }
 
 /*
- * Extract expired timers from active timers list and and process them.
+ * Extract expired timers from active timers list and and process them,
+ * taking into account checkpointing required in rr mode.
  *
- * In rr mode we need "filtered" checkpointing for virtual clock.
- * Checkpoint must be replayed before any non-EXTERNAL timer has been
- * processed and only one time (virtual clock value stays same). But these
- * timers may appear in the timers list while it being processed, so this
- * must be checked until we finally decide that "no timers left - we are
- * done".
+ * Checkpoint must be replayed before any timer has been processed
+ * and only one time. But new timers may appear in the timers list while
+ * it's being processed, so this must be checked until we finally decide
+ * that "no timers left - we are done" (to avoid skipping checkpoint due to
+ * possible races).
+ * Also checkpoint for virtual clock is redundant in cases where it's being
+ * triggered with only non-EXTERNAL timers, because these timers don't
+ * change guest state directly.
  */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
 qemu_mutex_lock(_list->active_timers_lock);
@@ -557,7 +555,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 /* once we got here, checkpoint clock only once */
 need_replay_checkpoint = false;
 qemu_mutex_unlock(_list->active_timers_lock);
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+if (!replay_checkpoint(replay_checkpoint_id)) {
 goto out;
 }
 qemu_mutex_lock(_list->active_timers_lock);
-- 
2.7.4




[Qemu-devel] [PATCH v2 3/4] Restores record/replay behavior related to special virtual clock processing for timers used in external subsystems.

2018-10-17 Thread Artem Pisarenko
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it to 
virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode reimplemented using this attribute.

Fixes: 87f4fe7653baf55b5c2f2753fe6003f473c07342
Fixes: 775a412bf83f6bc0c5c02091ee06cf649b34c593
Fixes: 9888091404a702d7ec79d51b088d994b9fc121bd
Signed-off-by: Artem Pisarenko 
---

Notes:
v2: fixes race condition and reimplements synchronization between 
checkpointing and timers processing in qemu-timer.c

 include/qemu/timer.h | 12 +---
 slirp/ip6_icmp.c |  4 +++-
 ui/input.c   |  5 +++--
 util/qemu-timer.c| 50 +++---
 4 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index e225ad4..8e3f236 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,14 +65,20 @@ typedef enum {
  * instead each attribute in bit set accessed with QEMU_TIMER_ATTR_ macro,
  * where  is a unique part of attribute identifier.
  *
- * No attributes defined currently.
+ * The following attributes are available:
+ *
+ * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
+ *
+ * Timers with this attribute do not recorded in rr mode, therefore it could be
+ * used for the subsystems that operate outside the guest core. Applicable only
+ * with virtual clock type.
  */
 
 typedef enum {
-QEMU_TIMER_ATTRBIT__NONE
+QEMU_TIMER_ATTRBIT_EXTERNAL,
 } QEMUTimerAttrBit;
 
-#define QEMU_TIMER_ATTR__NONE (1 << QEMU_TIMER_ATTRBIT__NONE)
+#define QEMU_TIMER_ATTR_EXTERNAL (1 << QEMU_TIMER_ATTRBIT_EXTERNAL)
 
 typedef struct QEMUTimerList QEMUTimerList;
 
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index ee333d0..cd1e0b9 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -27,7 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
+slirp->ra_timer = timer_new_full(NULL, QEMU_CLOCK_VIRTUAL,
+ SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL,
+ ra_timer_handler, slirp);
 timer_mod(slirp->ra_timer,
   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
diff --git a/ui/input.c b/ui/input.c
index 51b1019..7c9a410 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -448,8 +448,9 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
- _queue);
+kbd_timer = timer_new_full(NULL, QEMU_CLOCK_VIRTUAL,
+   SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL,
+   qemu_input_queue_process, _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 2046b68..e9a0f00 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -494,6 +494,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 bool progress = false;
 QEMUTimerCB *cb;
 void *opaque;
+bool need_replay_checkpoint = false;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -509,8 +510,15 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
-goto out;
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Checkpoint for virtual clock is redundant in cases where
+ * it's being triggered with only non-EXTERNAL timers, because
+ * these timers don't change guest state directly.
+ * Since it has conditional dependence on specific timers, it is
+ * subject to race conditions and requires special handling.
+ * See below.
+ */
+need_replay_checkpoint = true;
 }
 break;
 case QEMU_CLOCK_HOST:
@@ -525,13 +533,38 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 }
 
+/*
+ * Extract expired timers from active timers list and and process them.
+ *
+ * In rr mode we need "filtered" checkpointing for virtual clock.
+ * Checkpoint must be replayed before any non-EXTERNAL timer has been
+ * processed and only one time (virtual clock value stays same). But these
+ * timers may appear in the timers list while it being processed, so this
+ * must be checked until we finally decide that "no timers left - we are
+ * done".
+ */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
-for(;;) {
-qemu_mutex_lock(_list->active_timers_lock);
-ts = timer_list->active_timers;
+qemu_mutex_lock(_list->active_timers_lock);
+while ((ts = t

[Qemu-devel] [PATCH v2 2/4] Introduce attributes to qemu timer subsystem

2018-10-17 Thread Artem Pisarenko
Attributes are simple flags, associated with individual timers for their whole 
lifetime.
They intended to be used to mark individual timers for special handling by 
various qemu features operating at qemu core level.
New/init functions family in timer interface updated and refactored (new 
'attribute' argument added, timer_list replaced with timer_list_group+type 
combinations, comments improved to avoid info duplication).
Also existing aio interface extended with attribute-enabled variants of 
functions, which create/initialize timers.

Signed-off-by: Artem Pisarenko 
---

Notes:
v2:
- timer creation/initialize functions reworked and and their unnecessary 
variants removed (as Paolo Bonzini suggested)
- also their comments improved to avoid info duplication

 include/block/aio.h   |  57 ++---
 include/qemu/timer.h  | 128 ++
 tests/ptimer-test-stubs.c |  13 +++--
 util/qemu-timer.c |  18 +--
 4 files changed, 146 insertions(+), 70 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index f08630c..fce9d48 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -388,6 +388,31 @@ struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, 
Error **errp);
 struct LinuxAioState *aio_get_linux_aio(AioContext *ctx);
 
 /**
+ * aio_timer_new_with_attrs:
+ * @ctx: the aio context
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR(id) values to 
assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Allocate a new timer (with attributes) attached to the context @ctx.
+ * The function is responsible for memory allocation.
+ *
+ * The preferred interface is aio_timer_init or aio_timer_init_with_attrs.
+ * Use that unless you really need dynamic memory allocation.
+ *
+ * Returns: a pointer to the new timer
+ */
+static inline QEMUTimer *aio_timer_new_with_attrs(AioContext *ctx,
+  QEMUClockType type,
+  int scale, int attributes,
+  QEMUTimerCB *cb, void 
*opaque)
+{
+return timer_new_full(>tlg, type, scale, attributes, cb, opaque);
+}
+
+/**
  * aio_timer_new:
  * @ctx: the aio context
  * @type: the clock type
@@ -396,10 +421,7 @@ struct LinuxAioState *aio_get_linux_aio(AioContext *ctx);
  * @opaque: the opaque pointer to pass to the callback
  *
  * Allocate a new timer attached to the context @ctx.
- * The function is responsible for memory allocation.
- *
- * The preferred interface is aio_timer_init. Use that
- * unless you really need dynamic memory allocation.
+ * See aio_timer_new_with_attrs for details.
  *
  * Returns: a pointer to the new timer
  */
@@ -407,7 +429,28 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, 
QEMUClockType type,
int scale,
QEMUTimerCB *cb, void *opaque)
 {
-return timer_new_tl(ctx->tlg.tl[type], scale, cb, opaque);
+return timer_new_full(>tlg, type, scale, 0, cb, opaque);
+}
+
+/**
+ * aio_timer_init_with_attrs:
+ * @ctx: the aio context
+ * @ts: the timer
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR(id) values to 
assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Initialise a new timer (with attributes) attached to the context @ctx.
+ * The caller is responsible for memory allocation.
+ */
+static inline void aio_timer_init_with_attrs(AioContext *ctx,
+ QEMUTimer *ts, QEMUClockType type,
+ int scale, int attributes,
+ QEMUTimerCB *cb, void *opaque)
+{
+timer_init_full(ts, >tlg, type, scale, attributes, cb, opaque);
 }
 
 /**
@@ -420,14 +463,14 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, 
QEMUClockType type,
  * @opaque: the opaque pointer to pass to the callback
  *
  * Initialise a new timer attached to the context @ctx.
- * The caller is responsible for memory allocation.
+ * See aio_timer_init_with_attrs for details.
  */
 static inline void aio_timer_init(AioContext *ctx,
   QEMUTimer *ts, QEMUClockType type,
   int scale,
   QEMUTimerCB *cb, void *opaque)
 {
-timer_init_tl(ts, ctx->tlg.tl[type], scale, cb, opaque);
+timer_init_full(ts, >tlg, type, scale, 0, cb, opaque);
 }
 
 /**
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 39ea907..e225ad4 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -52,6 +52,28 @@ typedef enum {
 QEMU_CLOCK_MAX
 } QEMUClockType;
 
+/**
+ * QEM

[Qemu-devel] [PATCH v2 1/4] Revert some patches from recent [PATCH v6] "Fixing record/replay and adding reverse debugging"

2018-10-17 Thread Artem Pisarenko
That patch series introduced new virtual clock type for use in external 
subsystems. It breaks desired behavior in non-record/replay usage scenarios.

This reverts commit 87f4fe7653baf55b5c2f2753fe6003f473c07342.
This reverts commit 775a412bf83f6bc0c5c02091ee06cf649b34c593.
This reverts commit 9888091404a702d7ec79d51b088d994b9fc121bd.

Signed-off-by: Artem Pisarenko 
---
 include/qemu/timer.h | 9 -
 slirp/ip6_icmp.c | 7 +++
 ui/input.c   | 8 
 util/qemu-timer.c| 2 --
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index a005ed2..39ea907 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -42,14 +42,6 @@
  * In icount mode, this clock counts nanoseconds while the virtual
  * machine is running.  It is used to increase @QEMU_CLOCK_VIRTUAL
  * while the CPUs are sleeping and thus not executing instructions.
- *
- * @QEMU_CLOCK_VIRTUAL_EXT: virtual clock for external subsystems
- *
- * The virtual clock only runs during the emulation. It stops
- * when the virtual machine is stopped. The timers for this clock
- * do not recorded in rr mode, therefore this clock could be used
- * for the subsystems that operate outside the guest core.
- *
  */
 
 typedef enum {
@@ -57,7 +49,6 @@ typedef enum {
 QEMU_CLOCK_VIRTUAL = 1,
 QEMU_CLOCK_HOST = 2,
 QEMU_CLOCK_VIRTUAL_RT = 3,
-QEMU_CLOCK_VIRTUAL_EXT = 4,
 QEMU_CLOCK_MAX
 } QEMUClockType;
 
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index 3f41187..ee333d0 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -17,7 +17,7 @@ static void ra_timer_handler(void *opaque)
 {
 Slirp *slirp = opaque;
 timer_mod(slirp->ra_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT) + NDP_Interval);
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 ndp_send_ra(slirp);
 }
 
@@ -27,10 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_EXT,
-   ra_timer_handler, slirp);
+slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
 timer_mod(slirp->ra_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT) + NDP_Interval);
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
 
 void icmp6_cleanup(Slirp *slirp)
diff --git a/ui/input.c b/ui/input.c
index dd7f6d7..51b1019 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -271,7 +271,7 @@ static void qemu_input_queue_process(void *opaque)
 item = QTAILQ_FIRST(queue);
 switch (item->type) {
 case QEMU_INPUT_QUEUE_DELAY:
-timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT)
+timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)
   + item->delay_ms);
 return;
 case QEMU_INPUT_QUEUE_EVENT:
@@ -301,7 +301,7 @@ static void qemu_input_queue_delay(struct 
QemuInputEventQueueHead *queue,
 queue_count++;
 
 if (start_timer) {
-timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT)
+timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)
   + item->delay_ms);
 }
 }
@@ -448,8 +448,8 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_EXT,
- qemu_input_queue_process, _queue);
+kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
+ _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index eb60d8f..86bfe84 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -496,7 +496,6 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 
 switch (timer_list->clock->type) {
 case QEMU_CLOCK_REALTIME:
-case QEMU_CLOCK_VIRTUAL_EXT:
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
@@ -598,7 +597,6 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
 return get_clock();
 default:
 case QEMU_CLOCK_VIRTUAL:
-case QEMU_CLOCK_VIRTUAL_EXT:
 if (use_icount) {
 return cpu_get_icount();
 } else {
-- 
2.7.4




[Qemu-devel] [PATCH v2 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2018-10-17 Thread Artem Pisarenko
Recent patches from series [PATCH v6] "Fixing record/replay and adding reverse 
debugging" introduced new clock type QEMU_CLOCK_VIRTUAL_EXT and replaced 
virtual timers in some external subsystems with it.
This resulted in small change to existing behavior, which I consider to be 
unacceptable.
Processing of virtual timers, belonging to new clock type, was kicked off to 
main loop, which made them asynchronous with vCPU thread and, in icount mode, 
with whole guest execution. This breaks expected determinism in 
non-record/replay icount mode of emulation where these "external subsystems" 
are isolated from host (i.e. they external only to guest core, not to emulation 
environment).

Example for slirp ("user" backend for network device):
User runs qemu in icount mode with rtc clock=vm without any external 
communication interfaces but with "-netdev user,restrict=on". It expects 
deterministic execution, because network services are emulated inside qemu and 
isolated from host. There are no reasons to get reply from DHCP server with 
different delay or something like that.

These series of patches revert those commits and reimplement their 
modifications in a better way.

Current implementation of timers/clock processing is confusing (at least for 
me) because of exceptions from design concept behind them, which already 
introduced by icount mode (which adds QEMU_CLOCK_VIRTUAL_RT). Adding 
QEMU_CLOCK_VIRTUAL_EXT just made things even more complicated. I consider these 
"alternative" virtual clocks to be some kind of hacks being convinient only to 
authors of relevant qemu features. Lets don't touch fundamental clock types and 
keep them orthogonal to special cases of timers handling.

As far as I understand, original intention of author was just to make 
optimization in replay log by avoiding storing extra events which don't change 
guest state directly. Indeed, for example, ipv6 icmp timer in slirp does things 
which external to guest core and ends with sending network packet to guest, but 
record/replay will anyway catch event, corresponding to packet reception in 
guest network frontend, and store it to replay log, so there are no need in 
making checkpoint for corresponding clock when that timer fires.
If so, then we just need to skip checkpoints for clock values, when only these 
specific timers are run. It is individual timers which are specific, not clock.
Adding some kind of attribute/flag/property to individual timer allows any 
special qemu feature (such as record/replay) to inspect it in any place of code 
and handle as needed. This design acheives less dependencies, more transparency 
and has more intuitive and clear sense. For record/replay feature it's acheived 
with 'EXTERNAL' attribute. The only drawback is that it required to add one 
extra mutex unlock/lock pair for virtual clock type in timerlist_run_timers() 
function (see patch 3), but it's being added only when qemu runs in 
record/replay mode.

Finally, this patch series optimizes checkpointing for all other clocks it 
applies to.


v2 changes:
 - timers/aio interface refactored and improved, addition to couroutines 
removed (as Paolo Bonzini suggested)
 - fixed wrong reimplementation in qemu-timer.c caused race conditions
 - added bonus patch which optimizes checkpointing for other clocks

P.S. I've tried to test record/replay with slirp, but in replay mode qemu 
stucks at guest linux boot after "Configuring network interfaces..." message, 
where DHCP communication takes place. It's broken in a same way both in master 
and master with reverted commits being fixed.

P.S.2. I wasn't able to test these patches on purely clean master branch 
because of bugs https://bugs.launchpad.net/qemu/+bug/1790460 and 
https://bugs.launchpad.net/qemu/+bug/1795369, which workarounded by several 
not-accepted (yet?) modifications.


Artem Pisarenko (4):
  Revert some patches from recent [PATCH v6] "Fixing record/replay and
adding reverse debugging"
  Introduce attributes to qemu timer subsystem
  Restores record/replay behavior related to special virtual clock
processing for timers used in external subsystems.
  Optimize record/replay checkpointing for all clocks it applies to

 include/block/aio.h   |  57 +++---
 include/qemu/timer.h  | 143 +-
 slirp/ip6_icmp.c  |   9 +--
 tests/ptimer-test-stubs.c |  13 +++--
 ui/input.c|   9 +--
 util/qemu-timer.c | 100 ++--
 6 files changed, 219 insertions(+), 112 deletions(-)

-- 
2.7.4




[Qemu-devel] [PATCH v2] Introduce attributes to qemu timer subsystem

2018-10-16 Thread Artem Pisarenko
Attributes are simple flags, associated with individual timers for their whole 
lifetime.
They intended to be used to mark individual timers for special handling by 
various qemu features operating at qemu core level.
New/init functions family in timer interface updated and their comments 
improved to avoid info duplication.
Also existing aio interface extended with attribute-enabled variants of 
functions, which create/initialize timers.

Signed-off-by: Artem Pisarenko 
---

Notes:
v2:
- timer creation/initialize functions reworked and and their unnecessary 
variants removed (as Paolo Bonzini suggested)
- also their comments improved to avoid info duplication

 include/block/aio.h   |  53 ++-
 include/qemu/timer.h  | 126 ++
 tests/ptimer-test-stubs.c |   7 +--
 util/qemu-timer.c |  12 +++--
 4 files changed, 136 insertions(+), 62 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index f08630c..07c291a 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -407,10 +407,38 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, 
QEMUClockType type,
int scale,
QEMUTimerCB *cb, void *opaque)
 {
-return timer_new_tl(ctx->tlg.tl[type], scale, cb, opaque);
+return timer_new_full(type, ctx->tlg.tl[type],
+  scale, 0, cb, opaque);
 }
 
 /**
+ * aio_timer_new_with_attrs:
+ * @ctx: the aio context
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR(id) values to 
assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Allocate a new timer (with attributes) attached to the context @ctx.
+ * The function is responsible for memory allocation.
+ *
+ * The preferred interface is aio_timer_init. Use that
+ * unless you really need dynamic memory allocation.
+ *
+ * Returns: a pointer to the new timer
+ */
+static inline QEMUTimer *aio_timer_new_with_attrs(AioContext *ctx,
+  QEMUClockType type,
+  int scale, int attributes,
+  QEMUTimerCB *cb, void 
*opaque)
+{
+return timer_new_full(type, ctx->tlg.tl[type],
+  scale, attributes, cb, opaque);
+}
+
+
+/**
  * aio_timer_init:
  * @ctx: the aio context
  * @ts: the timer
@@ -427,7 +455,28 @@ static inline void aio_timer_init(AioContext *ctx,
   int scale,
   QEMUTimerCB *cb, void *opaque)
 {
-timer_init_tl(ts, ctx->tlg.tl[type], scale, cb, opaque);
+timer_init_full(ts, ctx->tlg.tl[type], scale, 0, cb, opaque);
+}
+
+/**
+ * aio_timer_init_with_attrs:
+ * @ctx: the aio context
+ * @ts: the timer
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR(id) values to 
assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Initialise a new timer (with attributes) attached to the context @ctx.
+ * The caller is responsible for memory allocation.
+ */
+static inline void aio_timer_init_with_attrs(AioContext *ctx,
+ QEMUTimer *ts, QEMUClockType type,
+ int scale, int attributes,
+ QEMUTimerCB *cb, void *opaque)
+{
+timer_init_full(ts, ctx->tlg.tl[type], scale, attributes, cb, opaque);
 }
 
 /**
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 39ea907..64a84ea 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -52,6 +52,28 @@ typedef enum {
 QEMU_CLOCK_MAX
 } QEMUClockType;
 
+/**
+ * QEMU Timer attributes:
+ *
+ * An individual timer may be assigned with one or multiple attributes when
+ * initialized.
+ * Attribute is a static flag, meaning that timer has corresponding property.
+ * Attributes are defined in QEMUTimerAttrBit enum and encoded to bit set,
+ * which used to initialize timer, stored to 'attributes' member and can be
+ * retrieved externally with timer_get_attributes() call.
+ * Values of QEMUTimerAttrBit aren't used directly,
+ * instead each attribute in bit set accessed with QEMU_TIMER_ATTR_ macro,
+ * where  is a unique part of attribute identifier.
+ *
+ * No attributes defined currently.
+ */
+
+typedef enum {
+QEMU_TIMER_ATTRBIT__NONE
+} QEMUTimerAttrBit;
+
+#define QEMU_TIMER_ATTR__NONE (1 << QEMU_TIMER_ATTRBIT__NONE)
+
 typedef struct QEMUTimerList QEMUTimerList;
 
 struct QEMUTimerListGroup {
@@ -67,6 +89,7 @@ struct QEMUTimer {
 QEMUTimerCB *cb;
 void *opaque;
 QEMUTimer *next;
+int attributes;
 int scale;
 };
 
@@ -418,22 +441,24 @@ int64_t ti

[Qemu-devel] [PATCH v2] Restores record/replay behavior related to special virtual clock processing for timers used in external subsystems.

2018-10-16 Thread Artem Pisarenko
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it to 
virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode reimplemented using this attribute.

Fixes: 87f4fe7653baf55b5c2f2753fe6003f473c07342
Fixes: 775a412bf83f6bc0c5c02091ee06cf649b34c593
Fixes: 9888091404a702d7ec79d51b088d994b9fc121bd
Signed-off-by: Artem Pisarenko 
---

Notes:
v2: fixes race condition and reimplements synchronization between 
checkpointing and timers processing in qemu-timer.c

qemu-timer.c:
It has one more difference from original behavior, which I'm not sure about.
If there are no timers will be processed for virtual clock (i.e. timer list 
is empty), then checkpointing will be skipped, although it looks fine for me. 
Is such scenario ever possible?

 include/qemu/timer.h | 10 --
 slirp/ip6_icmp.c |  4 +++-
 ui/input.c   |  5 +++--
 util/qemu-timer.c| 50 +++---
 4 files changed, 57 insertions(+), 12 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 031e3a1..53bfba5 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,11 +65,17 @@ typedef enum {
  * instead each attribute in bit set accessed with QEMU_TIMER_ATTR(id) macro,
  * where 'id' is a unique part of attribute identifier.
  *
- * No attributes defined currently.
+ * The following attributes are available:
+ *
+ * QEMU_TIMER_ATTR(EXTERNAL): drives external subsystem
+ *
+ * Timers with this attribute do not recorded in rr mode, therefore it could be
+ * used for the subsystems that operate outside the guest core. Applicable only
+ * with virtual clock type.
  */
 
 typedef enum {
-/* none */
+QEMU_TIMER_ATTRBIT_EXTERNAL,
 } QEMUTimerAttrBit;
 
 #define QEMU_TIMER_ATTR(id) (1 << QEMU_TIMER_ATTRBIT_ ## id)
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index ee333d0..7c08433 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -27,7 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
+slirp->ra_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
+  QEMU_TIMER_ATTR(EXTERNAL),
+  ra_timer_handler, slirp);
 timer_mod(slirp->ra_timer,
   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
diff --git a/ui/input.c b/ui/input.c
index 51b1019..6279187 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -448,8 +448,9 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
- _queue);
+kbd_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
+QEMU_TIMER_ATTR(EXTERNAL),
+qemu_input_queue_process, _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 29d8e39..7731dd9 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -490,6 +490,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 bool progress = false;
 QEMUTimerCB *cb;
 void *opaque;
+bool need_replay_checkpoint = false;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -505,8 +506,15 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
-goto out;
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Checkpoint for virtual clock is redundant in cases where
+ * it's being triggered with only non-EXTERNAL timers, because
+ * these timers don't change guest state directly.
+ * Since it has conditional dependence on specific timers, it is
+ * subject to race conditions and requires special handling.
+ * See below.
+ */
+need_replay_checkpoint = true;
 }
 break;
 case QEMU_CLOCK_HOST:
@@ -521,13 +529,38 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 }
 
+/*
+ * Extract expired timers from active timers list and and process them.
+ *
+ * In rr mode we need "filtered" checkpointing for virtual clock.
+ * Checkpoint must be replayed before any non-EXTERNAL timer has been
+ * processed and only one time (virtual clock value stays same). But these
+ * timers may appear in the timers list while it being processed, so this
+ * must be checked until we finally decide that "no timers left - we are
+ * done".
+ */
 current_time = qemu_clock_get_ns(timer_list->clock->type);
-for(;;) {
-qemu_mutex_lock(_list->active_tim

[Qemu-devel] [PATCH v2] Restores record/replay behavior related to special virtual clock processing for timers used in external subsystems.

2018-10-15 Thread Artem Pisarenko
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it to 
virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode reimplemented using this attribute.

Fixes: 87f4fe7653baf55b5c2f2753fe6003f473c07342
Fixes: 775a412bf83f6bc0c5c02091ee06cf649b34c593
Fixes: 9888091404a702d7ec79d51b088d994b9fc121bd
Signed-off-by: Artem Pisarenko 
---

Notes:
v2: fixes race condition and reimplements synchronization between 
checkpointing and timers processing in qemu-timer.c

qemu-timer.c:
It has one more difference from original behavior, which I'm not sure about.
If there are no timers will be processed for virtual clock (i.e. timer list 
is empty), then checkpointing will be skipped, although it looks fine for me. 
Is such scenario ever possible?

 include/qemu/timer.h | 10 --
 slirp/ip6_icmp.c |  4 ++-
 ui/input.c   |  5 +--
 util/qemu-timer.c| 88 
 4 files changed, 82 insertions(+), 25 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 031e3a1..53bfba5 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,11 +65,17 @@ typedef enum {
  * instead each attribute in bit set accessed with QEMU_TIMER_ATTR(id) macro,
  * where 'id' is a unique part of attribute identifier.
  *
- * No attributes defined currently.
+ * The following attributes are available:
+ *
+ * QEMU_TIMER_ATTR(EXTERNAL): drives external subsystem
+ *
+ * Timers with this attribute do not recorded in rr mode, therefore it could be
+ * used for the subsystems that operate outside the guest core. Applicable only
+ * with virtual clock type.
  */
 
 typedef enum {
-/* none */
+QEMU_TIMER_ATTRBIT_EXTERNAL,
 } QEMUTimerAttrBit;
 
 #define QEMU_TIMER_ATTR(id) (1 << QEMU_TIMER_ATTRBIT_ ## id)
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index ee333d0..7c08433 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -27,7 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
+slirp->ra_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
+  QEMU_TIMER_ATTR(EXTERNAL),
+  ra_timer_handler, slirp);
 timer_mod(slirp->ra_timer,
   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
diff --git a/ui/input.c b/ui/input.c
index 51b1019..6279187 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -448,8 +448,9 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
- _queue);
+kbd_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
+QEMU_TIMER_ATTR(EXTERNAL),
+qemu_input_queue_process, _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 29d8e39..9aa861d 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -490,6 +490,8 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 bool progress = false;
 QEMUTimerCB *cb;
 void *opaque;
+bool need_replay_checkpoint = false;
+bool nonexternal_expired_pending;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -505,8 +507,15 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
-goto out;
+if (replay_mode != REPLAY_MODE_NONE) {
+/* Checkpoint for virtual clock is redundant in cases where
+ * it's being triggered with only non-EXTERNAL timers, because
+ * these timers don't change guest state directly.
+ * Since it has conditional dependence on specific timers, it is
+ * subject to race conditions and requires special handling.
+ * See below.
+ */
+need_replay_checkpoint = true;
 }
 break;
 case QEMU_CLOCK_HOST:
@@ -522,26 +531,65 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 }
 
 current_time = qemu_clock_get_ns(timer_list->clock->type);
-for(;;) {
-qemu_mutex_lock(_list->active_timers_lock);
-ts = timer_list->active_timers;
-if (!timer_expired_ns(ts, current_time)) {
+nonexternal_expired_pending = false;
+
+/*
+ * Inner loop performs extracting and processing expired timers.
+ *
+ * Outer loop is relevant only to rr mode and required for proper
+ * checkpointing.
+ * Checkpoint must be replayed before any non-EXTERNAL timer has been
+ * processed and only one time (virtual clock value stays same). But these
+ * timers ma

Re: [Qemu-devel] [PATCH 3/3] Restores record/replay behavior related to special virtual clock processing for timers used in external subsystems.

2018-10-15 Thread Artem Pisarenko
Opps, I introduced race condition in 'timerlist_run_timers' function. If
checking procedure decides, that no checkpoint needed, and another thread
adds expired non-EXTERNAL timer between end of checking procedure and end
of following "for (;;)" loop, then this timer will be processed but no
checkpoint will be preceeded. I didn't figured out how to workaround it
yet...

вс, 14 окт. 2018 г. в 20:57, Artem Pisarenko :

> Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it
> to virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
> Virtual clock processing in rr mode reimplemented using this attribute.
>
> Fixes: 87f4fe7653baf55b5c2f2753fe6003f473c07342
> Fixes: 775a412bf83f6bc0c5c02091ee06cf649b34c593
> Fixes: 9888091404a702d7ec79d51b088d994b9fc121bd
> Signed-off-by: Artem Pisarenko 
> ---
>  include/qemu/timer.h | 10 ++--
>  slirp/ip6_icmp.c |  4 +++-
>  ui/input.c   |  5 ++--
>  util/qemu-timer.c| 67
> 
>  4 files changed, 60 insertions(+), 26 deletions(-)
>
> diff --git a/include/qemu/timer.h b/include/qemu/timer.h
> index 031e3a1..53bfba5 100644
> --- a/include/qemu/timer.h
> +++ b/include/qemu/timer.h
> @@ -65,11 +65,17 @@ typedef enum {
>   * instead each attribute in bit set accessed with QEMU_TIMER_ATTR(id)
> macro,
>   * where 'id' is a unique part of attribute identifier.
>   *
> - * No attributes defined currently.
> + * The following attributes are available:
> + *
> + * QEMU_TIMER_ATTR(EXTERNAL): drives external subsystem
> + *
> + * Timers with this attribute do not recorded in rr mode, therefore it
> could be
> + * used for the subsystems that operate outside the guest core.
> Applicable only
> + * with virtual clock type.
>   */
>
>  typedef enum {
> -/* none */
> +QEMU_TIMER_ATTRBIT_EXTERNAL,
>  } QEMUTimerAttrBit;
>
>  #define QEMU_TIMER_ATTR(id) (1 << QEMU_TIMER_ATTRBIT_ ## id)
> diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
> index ee333d0..7c08433 100644
> --- a/slirp/ip6_icmp.c
> +++ b/slirp/ip6_icmp.c
> @@ -27,7 +27,9 @@ void icmp6_init(Slirp *slirp)
>  return;
>  }
>
> -slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler,
> slirp);
> +slirp->ra_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
> +  QEMU_TIMER_ATTR(EXTERNAL),
> +  ra_timer_handler, slirp);
>  timer_mod(slirp->ra_timer,
>qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
>  }
> diff --git a/ui/input.c b/ui/input.c
> index 51b1019..6279187 100644
> --- a/ui/input.c
> +++ b/ui/input.c
> @@ -448,8 +448,9 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
>  }
>
>  if (!kbd_timer) {
> -kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> qemu_input_queue_process,
> - _queue);
> +kbd_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
> +QEMU_TIMER_ATTR(EXTERNAL),
> +qemu_input_queue_process, _queue);
>  }
>  if (queue_count < queue_limit) {
>  qemu_input_queue_delay(_queue, kbd_timer,
> diff --git a/util/qemu-timer.c b/util/qemu-timer.c
> index 29d8e39..8c6d1cb 100644
> --- a/util/qemu-timer.c
> +++ b/util/qemu-timer.c
> @@ -490,6 +490,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
>  bool progress = false;
>  QEMUTimerCB *cb;
>  void *opaque;
> +bool need_replay_checkpoint = false;
>
>  if (!atomic_read(_list->active_timers)) {
>  return false;
> @@ -500,28 +501,52 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
>  goto out;
>  }
>
> -switch (timer_list->clock->type) {
> -case QEMU_CLOCK_REALTIME:
> -break;
> -default:
> -case QEMU_CLOCK_VIRTUAL:
> -if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
> -goto out;
> -}
> -break;
> -case QEMU_CLOCK_HOST:
> -if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
> -goto out;
> -}
> -break;
> -case QEMU_CLOCK_VIRTUAL_RT:
> -if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
> -goto out;
> -}
> -break;
> -}
> -
>  current_time = qemu_clock_get_ns(timer_list->clock->type);
> +
> +if (replay_mode != REPLAY_MODE_NONE) {
> +switch (timer_list->clock->type) {
> +case QEMU_CLOCK_REALTIME:
> +break;
> +default:
> +case QEMU_CLOCK_VIRTUAL:
> + 

[Qemu-devel] [PATCH 3/3] Restores record/replay behavior related to special virtual clock processing for timers used in external subsystems.

2018-10-14 Thread Artem Pisarenko
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it to 
virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode reimplemented using this attribute.

Fixes: 87f4fe7653baf55b5c2f2753fe6003f473c07342
Fixes: 775a412bf83f6bc0c5c02091ee06cf649b34c593
Fixes: 9888091404a702d7ec79d51b088d994b9fc121bd
Signed-off-by: Artem Pisarenko 
---
 include/qemu/timer.h | 10 ++--
 slirp/ip6_icmp.c |  4 +++-
 ui/input.c   |  5 ++--
 util/qemu-timer.c| 67 
 4 files changed, 60 insertions(+), 26 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 031e3a1..53bfba5 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -65,11 +65,17 @@ typedef enum {
  * instead each attribute in bit set accessed with QEMU_TIMER_ATTR(id) macro,
  * where 'id' is a unique part of attribute identifier.
  *
- * No attributes defined currently.
+ * The following attributes are available:
+ *
+ * QEMU_TIMER_ATTR(EXTERNAL): drives external subsystem
+ *
+ * Timers with this attribute do not recorded in rr mode, therefore it could be
+ * used for the subsystems that operate outside the guest core. Applicable only
+ * with virtual clock type.
  */
 
 typedef enum {
-/* none */
+QEMU_TIMER_ATTRBIT_EXTERNAL,
 } QEMUTimerAttrBit;
 
 #define QEMU_TIMER_ATTR(id) (1 << QEMU_TIMER_ATTRBIT_ ## id)
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index ee333d0..7c08433 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -27,7 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
+slirp->ra_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
+  QEMU_TIMER_ATTR(EXTERNAL),
+  ra_timer_handler, slirp);
 timer_mod(slirp->ra_timer,
   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
diff --git a/ui/input.c b/ui/input.c
index 51b1019..6279187 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -448,8 +448,9 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
- _queue);
+kbd_timer = timer_new_a(QEMU_CLOCK_VIRTUAL, SCALE_MS,
+QEMU_TIMER_ATTR(EXTERNAL),
+qemu_input_queue_process, _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 29d8e39..8c6d1cb 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -490,6 +490,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 bool progress = false;
 QEMUTimerCB *cb;
 void *opaque;
+bool need_replay_checkpoint = false;
 
 if (!atomic_read(_list->active_timers)) {
 return false;
@@ -500,28 +501,52 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 goto out;
 }
 
-switch (timer_list->clock->type) {
-case QEMU_CLOCK_REALTIME:
-break;
-default:
-case QEMU_CLOCK_VIRTUAL:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
-goto out;
-}
-break;
-case QEMU_CLOCK_HOST:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
-goto out;
-}
-break;
-case QEMU_CLOCK_VIRTUAL_RT:
-if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
-goto out;
-}
-break;
-}
-
 current_time = qemu_clock_get_ns(timer_list->clock->type);
+
+if (replay_mode != REPLAY_MODE_NONE) {
+switch (timer_list->clock->type) {
+case QEMU_CLOCK_REALTIME:
+break;
+default:
+case QEMU_CLOCK_VIRTUAL:
+/* Check whether there are pending timers used for external
+ * subsystems, before doing replay checkpoint. If there are only
+ * such timers, then checkpoint will be redundant, because these
+ * timers don't change guest state directly.
+ * Procedure optimized to finish traversing timer list quickly,
+ * because it's a rare condition.
+ */
+qemu_mutex_lock(_list->active_timers_lock);
+ts = timer_list->active_timers;
+while (timer_expired_ns(ts, current_time)) {
+if (!(ts->attributes & QEMU_TIMER_ATTR(EXTERNAL))) {
+need_replay_checkpoint = true;
+break;
+}
+ts = ts->next;
+}
+qemu_mutex_unlock(_list->active_timers_lock);
+if (!need_replay_checkpoint) {
+break;
+}
+
+if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+ 

[Qemu-devel] [PATCH 2/3] Introduce attributes to qemu timer subsystem

2018-10-14 Thread Artem Pisarenko
Attributes are simple flags, associated with individual timers for their whole 
lifetime.
They intended to be used to mark individual timers for special handling by 
various qemu features operating at qemu core level.
Existing timer, aio and coroutine interface extended with attribute-enabled 
variants of functions, which create/initialize timers.

Signed-off-by: Artem Pisarenko 
---

Notes:
Conversion and association between QEMUTimerAttrBit and accessor macro are 
dumb.
Maybe better alternatives exist (like QFlags in Qt framework) or existing 
qemu code may be reused, if any.
Attributes also may be better named as flags, but they looks like something 
volatile, whereas 'attribute' expresses constant nature better.

 include/block/aio.h |  50 +++-
 include/qemu/coroutine.h|   5 +-
 include/qemu/timer.h| 110 ++--
 tests/ptimer-test-stubs.c   |   7 +--
 util/qemu-coroutine-sleep.c |   6 ++-
 util/qemu-timer.c   |  12 +++--
 6 files changed, 165 insertions(+), 25 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index f08630c..a6be3fb 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -407,10 +407,35 @@ static inline QEMUTimer *aio_timer_new(AioContext *ctx, 
QEMUClockType type,
int scale,
QEMUTimerCB *cb, void *opaque)
 {
-return timer_new_tl(ctx->tlg.tl[type], scale, cb, opaque);
+return timer_new_a_tl(ctx->tlg.tl[type], scale, 0, cb, opaque);
 }
 
 /**
+ * aio_timer_new_a:
+ * @ctx: the aio context
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR(id) values to 
assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Allocate a new timer (with attributes) attached to the context @ctx.
+ * The function is responsible for memory allocation.
+ *
+ * The preferred interface is aio_timer_init. Use that
+ * unless you really need dynamic memory allocation.
+ *
+ * Returns: a pointer to the new timer
+ */
+static inline QEMUTimer *aio_timer_new_a(AioContext *ctx, QEMUClockType type,
+ int scale, int attributes,
+ QEMUTimerCB *cb, void *opaque)
+{
+return timer_new_a_tl(ctx->tlg.tl[type], scale, attributes, cb, opaque);
+}
+
+
+/**
  * aio_timer_init:
  * @ctx: the aio context
  * @ts: the timer
@@ -427,7 +452,28 @@ static inline void aio_timer_init(AioContext *ctx,
   int scale,
   QEMUTimerCB *cb, void *opaque)
 {
-timer_init_tl(ts, ctx->tlg.tl[type], scale, cb, opaque);
+timer_init_a_tl(ts, ctx->tlg.tl[type], scale, 0, cb, opaque);
+}
+
+/**
+ * aio_timer_init_a:
+ * @ctx: the aio context
+ * @ts: the timer
+ * @type: the clock type
+ * @scale: the scale
+ * @attributes: 0, or one to multiple OR'ed QEMU_TIMER_ATTR(id) values to 
assign
+ * @cb: the callback to call on timer expiry
+ * @opaque: the opaque pointer to pass to the callback
+ *
+ * Initialise a new timer (with attributes) attached to the context @ctx.
+ * The caller is responsible for memory allocation.
+ */
+static inline void aio_timer_init_a(AioContext *ctx,
+QEMUTimer *ts, QEMUClockType type,
+int scale, int attributes,
+QEMUTimerCB *cb, void *opaque)
+{
+timer_init_a_tl(ts, ctx->tlg.tl[type], scale, attributes, cb, opaque);
 }
 
 /**
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 9801e7f..cffc2b2 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -276,7 +276,10 @@ void qemu_co_rwlock_unlock(CoRwlock *lock);
 /**
  * Yield the coroutine for a given duration
  */
-void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns);
+#define qemu_co_sleep_ns(type, ns) \
+qemu_co_sleep_a_ns(type, 0, ns)
+void coroutine_fn qemu_co_sleep_a_ns(QEMUClockType type, int attributes,
+ int64_t ns);
 
 /**
  * Yield until a file descriptor becomes readable
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 39ea907..031e3a1 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -52,6 +52,28 @@ typedef enum {
 QEMU_CLOCK_MAX
 } QEMUClockType;
 
+/**
+ * QEMU Timer attributes:
+ *
+ * An individual timer may be assigned with one or multiple attributes when
+ * initialized.
+ * Attribute is a static flag, meaning that timer has corresponding property.
+ * Attributes are defined in QEMUTimerAttrBit enum and encoded to bit set,
+ * which used to initialize timer, stored to 'attributes' member and can be
+ * retrieved externally with timer_get_attributes() call.
+ * Values of QEMUTimerAttrBit aren't used directly,
+ * instead each attribute i

[Qemu-devel] [PATCH 1/3] Revert some patches from recent series [PATCH v6] "Fixing record/replay and adding reverse debugging", which introduced new virtual clock type for use in external subsystems.

2018-10-14 Thread Artem Pisarenko
This reverts commit 87f4fe7653baf55b5c2f2753fe6003f473c07342.
This reverts commit 775a412bf83f6bc0c5c02091ee06cf649b34c593.
This reverts commit 9888091404a702d7ec79d51b088d994b9fc121bd.

Signed-off-by: Artem Pisarenko 
---
 include/qemu/timer.h | 9 -
 slirp/ip6_icmp.c | 7 +++
 ui/input.c   | 8 
 util/qemu-timer.c| 2 --
 4 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index a005ed2..39ea907 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -42,14 +42,6 @@
  * In icount mode, this clock counts nanoseconds while the virtual
  * machine is running.  It is used to increase @QEMU_CLOCK_VIRTUAL
  * while the CPUs are sleeping and thus not executing instructions.
- *
- * @QEMU_CLOCK_VIRTUAL_EXT: virtual clock for external subsystems
- *
- * The virtual clock only runs during the emulation. It stops
- * when the virtual machine is stopped. The timers for this clock
- * do not recorded in rr mode, therefore this clock could be used
- * for the subsystems that operate outside the guest core.
- *
  */
 
 typedef enum {
@@ -57,7 +49,6 @@ typedef enum {
 QEMU_CLOCK_VIRTUAL = 1,
 QEMU_CLOCK_HOST = 2,
 QEMU_CLOCK_VIRTUAL_RT = 3,
-QEMU_CLOCK_VIRTUAL_EXT = 4,
 QEMU_CLOCK_MAX
 } QEMUClockType;
 
diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index 3f41187..ee333d0 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -17,7 +17,7 @@ static void ra_timer_handler(void *opaque)
 {
 Slirp *slirp = opaque;
 timer_mod(slirp->ra_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT) + NDP_Interval);
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 ndp_send_ra(slirp);
 }
 
@@ -27,10 +27,9 @@ void icmp6_init(Slirp *slirp)
 return;
 }
 
-slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_EXT,
-   ra_timer_handler, slirp);
+slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, 
slirp);
 timer_mod(slirp->ra_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT) + NDP_Interval);
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
 }
 
 void icmp6_cleanup(Slirp *slirp)
diff --git a/ui/input.c b/ui/input.c
index dd7f6d7..51b1019 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -271,7 +271,7 @@ static void qemu_input_queue_process(void *opaque)
 item = QTAILQ_FIRST(queue);
 switch (item->type) {
 case QEMU_INPUT_QUEUE_DELAY:
-timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT)
+timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)
   + item->delay_ms);
 return;
 case QEMU_INPUT_QUEUE_EVENT:
@@ -301,7 +301,7 @@ static void qemu_input_queue_delay(struct 
QemuInputEventQueueHead *queue,
 queue_count++;
 
 if (start_timer) {
-timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_EXT)
+timer_mod(item->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)
   + item->delay_ms);
 }
 }
@@ -448,8 +448,8 @@ void qemu_input_event_send_key_delay(uint32_t delay_ms)
 }
 
 if (!kbd_timer) {
-kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_EXT,
- qemu_input_queue_process, _queue);
+kbd_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, qemu_input_queue_process,
+ _queue);
 }
 if (queue_count < queue_limit) {
 qemu_input_queue_delay(_queue, kbd_timer,
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index eb60d8f..86bfe84 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -496,7 +496,6 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 
 switch (timer_list->clock->type) {
 case QEMU_CLOCK_REALTIME:
-case QEMU_CLOCK_VIRTUAL_EXT:
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
@@ -598,7 +597,6 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
 return get_clock();
 default:
 case QEMU_CLOCK_VIRTUAL:
-case QEMU_CLOCK_VIRTUAL_EXT:
 if (use_icount) {
 return cpu_get_icount();
 } else {
-- 
2.7.4




[Qemu-devel] [PATCH 0/3] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type

2018-10-14 Thread Artem Pisarenko
Recent patches from series [PATCH v6] "Fixing record/replay and adding reverse 
debugging" introduced new clock type QEMU_CLOCK_VIRTUAL_EXT and replaced 
virtual timers in some external subsystems with it.
This resulted in small change to existing behavior, which I consider to be 
unacceptable.
Processing of virtual timers, belonging to new clock type, was kicked off to 
main loop, which made them asynchronous with vCPU thread and, in icount mode, 
with whole guest execution. This breaks expected determinism in 
non-record/replay icount mode of emulation where these "external subsystems" 
are isolated from host (i.e. they external only to guest core, not to emulation 
environment).

Example for slirp ("user" backend for network device):
User runs qemu in icount mode with rtc clock=vm without any external 
communication interfaces but with "-netdev user,restrict=on". It expects 
deterministic execution, because network services are emulated inside qemu and 
isolated from host. There are no reasons to get reply from DHCP server with 
different delay or something like that.

These series of patches revert those commits and reimplement their 
modifications in a better way.

Current implementation of timers/clock processing is confusing (at least for 
me) because of exceptions from design concept behind them, introduced by icount 
mode (which adds QEMU_CLOCK_VIRTUAL_RT). Adding QEMU_CLOCK_VIRTUAL_EXT just 
made things even more complicated. I consider these "alternative" virtual 
clocks to be some kind of hacks being convinient only to authors of relevant 
qemu features. Lets don't touch fundamental clock types and keep them 
orthogonal to special cases of timers handling.

As far as I understand, original intention of author was just to make 
optimization in replay log by avoiding storing extra events which don't change 
guest state directly. Indeed, for example, ipv6 icmp timer in slirp does things 
which external to guest core and ends with sending network packet to guest, but 
record/replay will anyway catch event, corresponding to packet reception in 
guest network frontend, and store it to replay log, so there are no need in 
making checkpoint for corresponding clock when that timer fires.
If so, then we just need to skip checkpoints for clock values, when only these 
specific timers are run. It is individual timers which are specific, not clock.
Adding some kind of attribute/flag/property to individual timer allows any 
special qemu feature (such as record/replay) to inspect it in any place of code 
and handle as needed. This design acheives less dependencies, more transparency 
and has more intuitive and clear sense. For record/replay feature it's acheived 
with 'EXTERNAL' attribute. The only drawback is that it required to add extra 
cycle of traversal through active timer list in timerlist_run_timers() function 
(see patch 3), but I've optimized it in a way that performance degradation 
expected to be very small and appear only in record/replay mode.

P.S. I've tried to test record/replay with slirp, but in replay mode qemu 
stucks at guest linux boot after "Configuring network interfaces..." message, 
where DHCP communication takes place. It's broken in a same way both in master 
and master with reverted commits being fixed.


Artem Pisarenko (3):
  Revert some patches from recent series [PATCH v6] "Fixing
record/replay and adding reverse debugging", which introduced new
virtual clock type for use in external subsystems. These changes
breaks desired behavior in non-record/replay usage scenarios.
  Introduce attributes to qemu timer subsystem
  Restores record/replay behavior related to special virtual clock
processing for timers used in external subsystems.

 include/block/aio.h |  50 +-
 include/qemu/coroutine.h|   5 +-
 include/qemu/timer.h| 125 
 slirp/ip6_icmp.c|   9 ++--
 tests/ptimer-test-stubs.c   |   7 +--
 ui/input.c  |   9 ++--
 util/qemu-coroutine-sleep.c |   6 ++-
 util/qemu-timer.c   |  81 +++-
 8 files changed, 227 insertions(+), 65 deletions(-)

-- 
2.7.4




[Qemu-devel] [Bug 1790460] Re: -icount, sleep=off mode is broken (target slows down or hangs)

2018-10-12 Thread Artem Pisarenko
Long-term testing showed that my trivial adaptation didn't fixed issue and I 
sticked to modification from QBox. It didn't failed yet.
And yes, issue still relevant to current master branch.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1790460

Title:
  -icount,sleep=off mode is broken (target slows down or hangs)

Status in QEMU:
  New

Bug description:
  QEMU running with options "-icount,sleep=off -rtc clock=vm" doesn't execute 
emulation at maximum possible speed.
  Target virtual clock may run faster or slower than realtime clock by N times, 
where N value depends on various unrelated conditions (i.e. random from the 
user point of view). The worst case is when target hangs (hopefully, early in 
booting stage).
  Example scenarios I've described here: 
http://lists.nongnu.org/archive/html/qemu-discuss/2018-08/msg00102.html

  QEMU process just sleeps most of the time (polling, waiting some
  condition, etc.). I've tried to debug issue and came to 99% conclusion
  that there are racing somewhere in qemu internals.

  The feature is broken since v2.6.0 release.
  Bad commit is 281b2201e4e18d5b9a26e1e8d81b62b5581a13be by Pavel Dovgalyuk, 
03/10/2016 05:56 PM:

icount: remove obsolete warp call

qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed
anymore, because it is called in every iteration of main_loop_wait.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini 

  I've reverted commit to all major releases and latest git master
  branch. Issue was fixed for all of them. My adaptation is trivial:
  just restoring removed function call before "qemu_cond_wait(...)"
  line.

  I'm sure following bugs are just particular cases of the issue:
  #1774677, #1653063 .

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1790460/+subscriptions



Re: [Qemu-devel] [PATCH v7 19/19] replay: document development rules

2018-10-11 Thread Artem Pisarenko
Great! I'm voting with all my fingers up for such rules. But I would
suggest even more generic rules which prevent breaking determinism in a
more wide sense. At least, where such breakage is trivial to avoid.

Currently I'm working on modification, which extends conditions where guest
execution is kept deterministic, above such narrow set like "only rtc
clock=vm, no serial devices, no network, no external communication
interfaces at all, etc...". Also I'm dealing with bugs (features?), which
prevents advertised determinism even in such restricted conditions. I found
that my work involves very similar efforts as Pavel's work. And this is
extra hard work. I'm feeling like fighting with hundreds of maintainers and
contributors, whose efforts are in opposite direction. They starting from
qemu underlying core, encouraging asyncronous  processing (aio, threads,
virtio ioeventfd, etc.), and ending with particular modules or hardware
models, which negligently uses any kind of non-blocking calls, callbacks,
and even inappropriate QEMUClockType. Nobody cares about synchronization
with vcpu thread, except Pavel, me and, possibly, 1-2 more persons in a
whole world. I can understand why. Main reason is that it might greatly
degradate performance of emulation, which might be avoided by introducing
very high complexity. So, my words shouldn't be treated as any kind of
criticism. I perfectly understand that it's a complex issue.

Key difference of record/replay is that it solves problem by hooking calls
of any source of asynchrony at low level and just replaying it. In other
words, it deals with end effects, whereas non-record/replay use case
doesn't allow such solution and have to eliminate source of undesired
asynchrony by design. As such, record/replay have strong immunity to
violation of 'generic determinism' rules and even to hidden and tricky bugs
in any module which affects guest state. And that's why development rules
Pavel imposes are so democratic (relative to generic ones, I would like to
exist).

Anyway, it's just generic idea for discussion. I know, it needs to be more
specific. But, if nobody will express interest, I see no reason to continue.

P.S. Trivial example of how qemu could extend conditions for deterministic
execution. Сhardev would perform writing to backend using blocking calls,
thus making possible deterministic execution in use cases, where guest has
only one serial port which outputs data to console and have no interaction
with user. At least it would provide user with option, selecting between
better performance and determinism.


ср, 10 окт. 2018 г. в 19:32, Pavel Dovgalyuk :

> This patch introduces docs/devel/replay.txt which describes the rules
> that should be followed to make virtual devices usable in record/replay
> mode.
>
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  docs/devel/replay.txt |   45 +
>  1 file changed, 45 insertions(+)
>  create mode 100644 docs/devel/replay.txt
>
> diff --git a/docs/devel/replay.txt b/docs/devel/replay.txt
> new file mode 100644
> index 000..61dac1b
> --- /dev/null
> +++ b/docs/devel/replay.txt
> @@ -0,0 +1,45 @@
> +Record/replay mechanism, that could be enabled through icount mode,
> expects
> +the virtual devices to satisfy the following requirements.
> +
> +The main idea behind this document is that everything that affects
> +the guest state during execution in icount mode should be deterministic.
> +
> +Timers
> +==
> +
> +All virtual devices should use virtual clock for timers that change the
> guest
> +state. Virtual clock is deterministic, therefore such timers are
> deterministic
> +too.
> +
> +Virtual devices can also use realtime clock for the events that do not
> change
> +the guest state directly. When the clock ticking should depend on VM
> execution
> +speed, use virtual ext clock. It is not deterministic, but its speed
> depends
> +on the guest execution. This clock is used by the virtual devices (e.g.,
> +slirp routing device) that lie outside the replayed guest.
> +
> +Bottom halves
> +=
> +
> +Bottom half callbacks, that affect the guest state, should be invoked
> through
> +replay_bh_schedule_event or replay_bh_schedule_oneshot_event functions.
> +Their invocations are saved in record mode and synchronized with the
> existing
> +log in replay mode.
> +
> +Saving/restoring the VM state
> +=
> +
> +All fields in the device state structure (including virtual timers)
> +should be restored by loadvm to the same values they had before savevm.
> +
> +Avoid accessing other devices' state, because the order of
> saving/restoring
> +is not defined. It means that you should not call functions like
> +'update_irq' in post_load callback. Save everything explicitly to avoid
> +the dependencies that may make restoring the VM state non-deterministic.
> +
> +Stopping the VM
> +===
> +
> +Stopping the guest should not interfere with its state (with 

Re: [Qemu-devel] [PATCH v7 18/19] replay: init rtc after enabling the replay

2018-10-11 Thread Artem Pisarenko
I guess 'configure_rtc' function is not the only one which call should be
postponed. Since record/replay uses its 'hooks' at very low levels of qemu
core, any option processing is subject to potential indirect dependence on
effects of 'replay_configure' call. The most apparent example would be
'qemu_clock_get_*(QEMU_CLOCK_HOST)' calls, which are used widely.
Furthermore, right now I submitting a patch which adds one such call in
place before any option parsed yet, so it will also need to be moved to
follow this patch. Even if such calls aren't involved in current version,
no guarantees that they will not appear in future (like I already doing
it), because there are nothing which prevents developers from doing so.

One way, how this could be enforced, is to add something like
'REPLAY_MODE_INVALID' to 'ReplayMode' enum, replace initialization of
global 'replay_mode' variable with this value and add corresponding
checks/assertions to 'REPLAY_CLOCK' macro, etc.

But it would be more reliable to move 'icount_opts' processing out of
common processing loop (marked as 'second pass of option parsing') and
place it to new pass, located before current 'second' one (like it already
done for '-nouserconfig' option), and also move 'replace_configure' right
after it. Probably it will also require to tune 'icount_opts' processing
and record/replay initialization. Since record/replay feature is very
specific, I consider it's worth to make such exception for it.

ср, 10 окт. 2018 г. в 19:32, Pavel Dovgalyuk :

> This patch postpones the call of 'configure_rtc' function. This call
> uses host clock to configure the rtc, but host clock access should be
> recorded when using icount record/replay mode. Therefore now rtc
> is configured after switching record/replay mode on.
>
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  vl.c |   11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index a3e2e47..f910072 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2932,6 +2932,7 @@ int main(int argc, char **argv, char **envp)
>  DisplayState *ds;
>  QemuOpts *opts, *machine_opts;
>  QemuOpts *icount_opts = NULL, *accel_opts = NULL;
> +QemuOpts *rtc_opts = NULL;
>  QemuOptsList *olist;
>  int optind;
>  const char *optarg;
> @@ -3738,12 +3739,11 @@ int main(int argc, char **argv, char **envp)
>  warn_report("This option is ignored and will be removed
> soon");
>  break;
>  case QEMU_OPTION_rtc:
> -opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
> optarg,
> -   false);
> -if (!opts) {
> +rtc_opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
> +   optarg, false);
> +if (!rtc_opts) {
>  exit(1);
>  }
> -configure_rtc(opts);
>  break;
>  case QEMU_OPTION_tb_size:
>  #ifndef CONFIG_TCG
> @@ -3954,6 +3954,9 @@ int main(int argc, char **argv, char **envp)
>  loc_set_none();
>
>  replay_configure(icount_opts);
> +if (rtc_opts) {
> +configure_rtc(rtc_opts);
> +}
>
>  if (incoming && !preconfig_exit_requested) {
>  error_report("'preconfig' and 'incoming' options are "
>
> --

С уважением,
  Артем Писаренко


[Qemu-devel] [PATCH 4/4] vl, qapi: offset value calculation in RTC_CHANGE event reverted to match behavior before #1797033 bugfix and documented

2018-10-11 Thread Artem Pisarenko
Return value of qemu_timedate_diff(), used for calculation offset in QAPI 
'RTC_CHANGE' event restored to keep compatibility, although it isn't documented 
that difference is relative to host clock advancement.
Added important note to 'RTC_CHANGE' event description to highlight established 
implementation specifics.

Signed-off-by: Artem Pisarenko 
---
 qapi/misc.json |  3 ++-
 vl.c   | 10 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index ada9af5..ed866f2 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -3070,7 +3070,8 @@
 # Emitted when the guest changes the RTC time.
 #
 # @offset: offset between base RTC clock (as specified by -rtc base), and
-#  new RTC clock value
+#  new RTC clock value. Note that value will be different depending
+#  on clock choosen to drive RTC (specified by -rtc clock).
 #
 # Note: This event is rate-limited.
 #
diff --git a/vl.c b/vl.c
index d60018f..72a4173 100644
--- a/vl.c
+++ b/vl.c
@@ -788,10 +788,10 @@ void qemu_system_vmstop_request(RunState state)
 
 /***/
 /* RTC reference time/date access */
-static time_t qemu_ref_timedate(void)
+static time_t qemu_ref_timedate(QEMUClockType clock)
 {
-time_t value = qemu_clock_get_ms(rtc_clock) / 1000;
-switch (rtc_clock) {
+time_t value = qemu_clock_get_ms(clock) / 1000;
+switch (clock) {
 case QEMU_CLOCK_REALTIME:
 value -= rtc_realtime_clock_offset;
 /* no break */
@@ -811,7 +811,7 @@ static time_t qemu_ref_timedate(void)
 
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_ref_timedate();
+time_t ti = qemu_ref_timedate(rtc_clock);
 
 ti += offset;
 
@@ -844,7 +844,7 @@ int qemu_timedate_diff(struct tm *tm)
 }
 }
 
-return seconds - qemu_ref_timedate();
+return seconds - qemu_ref_timedate(QEMU_CLOCK_HOST);
 }
 
 static void configure_rtc_base_datetime(const char *startdate)
-- 
2.7.4




[Qemu-devel] [PATCH 3/4] Fixes RTC bug with base datetime shifts in clock=vm

2018-10-11 Thread Artem Pisarenko
This makes all current "-rtc" option parameters combinations produce 
fixed/unambiguous RTC timedate reference for hardware emulation frontends.
It restores determinism of guest execution when used with clock=vm and 
specified base  value.

Buglink: https://bugs.launchpad.net/qemu/+bug/1797033
Signed-off-by: Artem Pisarenko 
---
 vl.c | 54 +-
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/vl.c b/vl.c
index 3a1346f..d60018f 100644
--- a/vl.c
+++ b/vl.c
@@ -152,8 +152,10 @@ static enum {
 RTC_BASE_LOCALTIME,
 RTC_BASE_DATETIME,
 } rtc_base_type = RTC_BASE_UTC;
-static int rtc_host_datetime_offset = -1; /* valid only for host rtc_clock and
- rtc_base_type=RTC_BASE_DATETIME */
+static time_t rtc_ref_start_datetime;
+static int rtc_realtime_clock_offset; /* used only with QEMU_CLOCK_REALTIME */
+static int rtc_host_datetime_offset = -1; /* valid & used only with
+ RTC_BASE_DATETIME */
 QEMUClockType rtc_clock;
 int vga_interface_type = VGA_NONE;
 static DisplayOptions dpy;
@@ -785,32 +787,42 @@ void qemu_system_vmstop_request(RunState state)
 }
 
 /***/
-/* real time host monotonic timer */
-
-static time_t qemu_timedate(void)
+/* RTC reference time/date access */
+static time_t qemu_ref_timedate(void)
 {
-return qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
+time_t value = qemu_clock_get_ms(rtc_clock) / 1000;
+switch (rtc_clock) {
+case QEMU_CLOCK_REALTIME:
+value -= rtc_realtime_clock_offset;
+/* no break */
+case QEMU_CLOCK_VIRTUAL:
+value += rtc_ref_start_datetime;
+break;
+case QEMU_CLOCK_HOST:
+if (rtc_base_type == RTC_BASE_DATETIME) {
+value -= rtc_host_datetime_offset;
+}
+break;
+default:
+assert(0);
+}
+return value;
 }
 
-/***/
-/* RTC reference time/date access */
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_timedate();
+time_t ti = qemu_ref_timedate();
 
 ti += offset;
 
 switch (rtc_base_type) {
+case RTC_BASE_DATETIME:
 case RTC_BASE_UTC:
 gmtime_r(, tm);
 break;
 case RTC_BASE_LOCALTIME:
 localtime_r(, tm);
 break;
-case RTC_BASE_DATETIME:
-ti -= rtc_host_datetime_offset;
-gmtime_r(, tm);
-break;
 }
 }
 
@@ -819,6 +831,7 @@ int qemu_timedate_diff(struct tm *tm)
 time_t seconds;
 
 switch (rtc_base_type) {
+case RTC_BASE_DATETIME:
 case RTC_BASE_UTC:
 seconds = mktimegm(tm);
 break;
@@ -829,15 +842,12 @@ int qemu_timedate_diff(struct tm *tm)
 seconds = mktime();
 break;
 }
-case RTC_BASE_DATETIME:
-seconds = mktimegm(tm) + rtc_host_datetime_offset;
-break;
 }
 
-return seconds - qemu_timedate();
+return seconds - qemu_ref_timedate();
 }
 
-static void configure_rtc_host_datetime_offset(const char *startdate)
+static void configure_rtc_base_datetime(const char *startdate)
 {
 time_t rtc_start_datetime;
 struct tm tm;
@@ -863,8 +873,8 @@ static void configure_rtc_host_datetime_offset(const char 
*startdate)
  "'2006-06-17T16:01:21' or '2006-06-17'\n");
 exit(1);
 }
-rtc_host_datetime_offset = (qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000)
-   - rtc_start_datetime;
+rtc_host_datetime_offset = rtc_ref_start_datetime - rtc_start_datetime;
+rtc_ref_start_datetime = rtc_start_datetime;
 }
 
 static void configure_rtc(QemuOpts *opts)
@@ -883,7 +893,7 @@ static void configure_rtc(QemuOpts *opts)
 replay_add_blocker(blocker);
 } else {
 rtc_base_type = RTC_BASE_DATETIME;
-configure_rtc_host_datetime_offset(value);
+configure_rtc_base_datetime(value);
 }
 }
 value = qemu_opt_get(opts, "clock");
@@ -3034,6 +3044,8 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 rtc_clock = QEMU_CLOCK_HOST;
+rtc_ref_start_datetime = qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
+rtc_realtime_clock_offset = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
 
 QLIST_INIT (_change_state_head);
 os_setup_early_signal_handling();
-- 
2.7.4




[Qemu-devel] [PATCH 2/4] vl: refactor -rtc option references

2018-10-11 Thread Artem Pisarenko
Improve code readability and prepare for fixing bug #1797033

Signed-off-by: Artem Pisarenko 
---
 vl.c | 82 +---
 1 file changed, 49 insertions(+), 33 deletions(-)

diff --git a/vl.c b/vl.c
index 4e25c78..3a1346f 100644
--- a/vl.c
+++ b/vl.c
@@ -147,8 +147,13 @@ bool enable_cpu_pm = false;
 int nb_nics;
 NICInfo nd_table[MAX_NICS];
 int autostart;
-static int rtc_utc = 1;
-static int rtc_date_offset = -1; /* -1 means no change */
+static enum {
+RTC_BASE_UTC,
+RTC_BASE_LOCALTIME,
+RTC_BASE_DATETIME,
+} rtc_base_type = RTC_BASE_UTC;
+static int rtc_host_datetime_offset = -1; /* valid only for host rtc_clock and
+ rtc_base_type=RTC_BASE_DATETIME */
 QEMUClockType rtc_clock;
 int vga_interface_type = VGA_NONE;
 static DisplayOptions dpy;
@@ -782,26 +787,30 @@ void qemu_system_vmstop_request(RunState state)
 /***/
 /* real time host monotonic timer */
 
-static time_t qemu_time(void)
+static time_t qemu_timedate(void)
 {
 return qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000;
 }
 
 /***/
-/* host time/date access */
+/* RTC reference time/date access */
 void qemu_get_timedate(struct tm *tm, int offset)
 {
-time_t ti = qemu_time();
+time_t ti = qemu_timedate();
 
 ti += offset;
-if (rtc_date_offset == -1) {
-if (rtc_utc)
-gmtime_r(, tm);
-else
-localtime_r(, tm);
-} else {
-ti -= rtc_date_offset;
+
+switch (rtc_base_type) {
+case RTC_BASE_UTC:
+gmtime_r(, tm);
+break;
+case RTC_BASE_LOCALTIME:
+localtime_r(, tm);
+break;
+case RTC_BASE_DATETIME:
+ti -= rtc_host_datetime_offset;
 gmtime_r(, tm);
+break;
 }
 }
 
@@ -809,23 +818,28 @@ int qemu_timedate_diff(struct tm *tm)
 {
 time_t seconds;
 
-if (rtc_date_offset == -1)
-if (rtc_utc)
-seconds = mktimegm(tm);
-else {
-struct tm tmp = *tm;
-tmp.tm_isdst = -1; /* use timezone to figure it out */
-seconds = mktime();
-   }
-else
-seconds = mktimegm(tm) + rtc_date_offset;
-
-return seconds - qemu_time();
+switch (rtc_base_type) {
+case RTC_BASE_UTC:
+seconds = mktimegm(tm);
+break;
+case RTC_BASE_LOCALTIME:
+{
+struct tm tmp = *tm;
+tmp.tm_isdst = -1; /* use timezone to figure it out */
+seconds = mktime();
+break;
+}
+case RTC_BASE_DATETIME:
+seconds = mktimegm(tm) + rtc_host_datetime_offset;
+break;
+}
+
+return seconds - qemu_timedate();
 }
 
-static void configure_rtc_date_offset(const char *startdate)
+static void configure_rtc_host_datetime_offset(const char *startdate)
 {
-time_t rtc_start_date;
+time_t rtc_start_datetime;
 struct tm tm;
 
 if (sscanf(startdate, "%d-%d-%dT%d:%d:%d", _year, _mon,
@@ -841,15 +855,16 @@ static void configure_rtc_date_offset(const char 
*startdate)
 }
 tm.tm_year -= 1900;
 tm.tm_mon--;
-rtc_start_date = mktimegm();
-if (rtc_start_date == -1) {
+rtc_start_datetime = mktimegm();
+if (rtc_start_datetime == -1) {
 date_fail:
-error_report("invalid date format");
+error_report("invalid datetime format");
 error_printf("valid formats: "
  "'2006-06-17T16:01:21' or '2006-06-17'\n");
 exit(1);
 }
-rtc_date_offset = qemu_time() - rtc_start_date;
+rtc_host_datetime_offset = (qemu_clock_get_ms(QEMU_CLOCK_HOST) / 1000)
+   - rtc_start_datetime;
 }
 
 static void configure_rtc(QemuOpts *opts)
@@ -859,15 +874,16 @@ static void configure_rtc(QemuOpts *opts)
 value = qemu_opt_get(opts, "base");
 if (value) {
 if (!strcmp(value, "utc")) {
-rtc_utc = 1;
+rtc_base_type = RTC_BASE_UTC;
 } else if (!strcmp(value, "localtime")) {
 Error *blocker = NULL;
-rtc_utc = 0;
+rtc_base_type = RTC_BASE_LOCALTIME;
 error_setg(, QERR_REPLAY_NOT_SUPPORTED,
   "-rtc base=localtime");
 replay_add_blocker(blocker);
 } else {
-configure_rtc_date_offset(value);
+rtc_base_type = RTC_BASE_DATETIME;
+configure_rtc_host_datetime_offset(value);
 }
 }
 value = qemu_opt_get(opts, "clock");
-- 
2.7.4




[Qemu-devel] [PATCH 1/4] vl: improve/fix documentation related to RTC function

2018-10-11 Thread Artem Pisarenko
Documentation describing -rtc option updated to better match current 
implementation and highlight some important specifics.

Signed-off-by: Artem Pisarenko 
---
 qemu-options.hx | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index f139459..4a9c065 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3458,25 +3458,29 @@ HXCOMM Silently ignored for compatibility
 DEF("clock", HAS_ARG, QEMU_OPTION_clock, "", QEMU_ARCH_ALL)
 
 DEF("rtc", HAS_ARG, QEMU_OPTION_rtc, \
-"-rtc [base=utc|localtime|date][,clock=host|rt|vm][,driftfix=none|slew]\n" 
\
+"-rtc 
[base=utc|localtime|][,clock=host|rt|vm][,driftfix=none|slew]\n" \
 "set the RTC base and clock, enable drift fix for clock 
ticks (x86 only)\n",
 QEMU_ARCH_ALL)
 
 STEXI
 
-@item -rtc [base=utc|localtime|@var{date}][,clock=host|vm][,driftfix=none|slew]
+@item -rtc 
[base=utc|localtime|@var{datetime}][,clock=host|rt|vm][,driftfix=none|slew]
 @findex -rtc
 Specify @option{base} as @code{utc} or @code{localtime} to let the RTC start 
at the current
 UTC or local time, respectively. @code{localtime} is required for correct date 
in
-MS-DOS or Windows. To start at a specific point in time, provide @var{date} in 
the
+MS-DOS or Windows. To start at a specific point in time, provide 
@var{datetime} in the
 format @code{2006-06-17T16:01:21} or @code{2006-06-17}. The default base is 
UTC.
 
 By default the RTC is driven by the host system time. This allows using of the
 RTC as accurate reference clock inside the guest, specifically if the host
 time is smoothly following an accurate external reference clock, e.g. via NTP.
 If you want to isolate the guest time from the host, you can set @option{clock}
-to @code{rt} instead.  To even prevent it from progressing during suspension,
-you can set it to @code{vm}.
+to @code{rt} instead, which provides host monotonic clock if host support it.
+To even prevent it from progressing during suspension, you can set it to
+@code{vm} (virtual clock). Virtual clock is the clock seen by guest,
+In icount mode of emulation its (long term) speed will be different from
+any host clock, when icount configured to non-auto value or virtual cpu 
sleeping
+is off, and no synchronization algorithm is active.
 
 Enable @option{driftfix} (i386 targets only) if you experience time drift 
problems,
 specifically with Windows' ACPI HAL. This option will try to figure out how
-- 
2.7.4




[Qemu-devel] [PATCH 0/4] Fix and improve core RTC function and documentation

2018-10-11 Thread Artem Pisarenko
Modifications are motivated by bug https://bugs.launchpad.net/qemu/+bug/1797033 
I've encountered recently.

Trying to fix it and analyzing its effect on all use cases (not covered in bug 
report) revealed much deeper problems.
This is my first patch to QEMU and I'm not sure whether the way I addressed 
them is proper.
They definitely require either modifications to current implementation 
(breaking compability), or adding clarifications to user documentation. I've 
tried to find compromise.
Changes I propose are driven mostly by my interpretation of user documentation 
and comments in code, rather than by existing implementation.

I've splitted patches in a way that at least some of them may be accepted. 
Following subsets are make sense on their own:
- 1
- 1,2,3

I limited refactoring only to vl.c, although it leads to reworking of all hw/* 
rtc models which have unreasonable duplications of actions performed by vl.c.
I'm sure such kind of rework will reveal more hidden bugs/features, but so far 
I've had enough (it's a long story how muсh effort and pain cost me to find 
that tricky bug above).
I suppose just to draw attention of relevant hw/* maintainers.

Artem Pisarenko (4):
  vl: improve/fix documentation related to RTC function
  vl: refactor -rtc option references
  Fixes RTC bug with base datetime shifts in clock=vm
  vl,qapi: offset value calculation in RTC_CHANGE event reverted to
match behavior before #1797033 bugfix and documented

 qapi/misc.json  |   3 +-
 qemu-options.hx |  14 +---
 vl.c| 102 
 3 files changed, 76 insertions(+), 43 deletions(-)

-- 
2.7.4




[Qemu-devel] [Bug 1797033] Re: Running with -rtc clock=vm, base= introduces arbitrary base shift at guest startup

2018-10-09 Thread Artem Pisarenko
Why I didn't posted patch to qemu-devel ?
I have no idea how to patch it correctly, because it isn't clear how these 
things are expected to work when referenced across all qemu code and different 
use cases. Should vl.c/qemu_get_timedate() just be fixed ? Does each caller 
expect same behavior from it? Or only selected hw/* RTC implementations (such 
as hw/timer/mc146818rtc.c) have to be modified to use alternative function ? Or 
maybe it isn't specific to clock=vm and should be considered bad behavior in 
any case (i.e. time shouldn't advance before guest execution being started)?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1797033

Title:
  Running with -rtc clock=vm,base= introduces arbitrary base
  shift at guest startup

Status in QEMU:
  New

Bug description:
  When specifying 'base' for RTC to start with, it has incorrect
  implementation in combination with clock=vm.

  I inspected source code. This is because it uses host clock
  (qemu_time() function return value) as reference with
  'rtc_date_offset' operations across several places in code before
  guest execution starts, which has no relevance with clock=vm.

  It works in vast majority of cases only thanks to combination of some
  luck and fast execution of qemu initialization phase relative to host
  real time (i.e. multiple calls of qemu_time() returns same value). But
  if qemu execution is being slow down by high CPU load on host or
  started just before value of second changes, it may accumulate at
  least 1 second (and in hard circumstances even 2+ seconds) of delay
  from specified base datetime before guest becomes ready to start.

  This behavior breaks determinism of guest execution in icount mode.
  (Even if guest doesn't cares about precise time, just one second shift
  may trigger such chain of changes which accumulates significant
  difference in guest state at the moment when guest OS finishes
  booting.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1797033/+subscriptions



[Qemu-devel] [Bug 1797033] [NEW] Running with -rtc clock=vm, base= introduces arbitrary base shift at guest startup

2018-10-09 Thread Artem Pisarenko
Public bug reported:

When specifying 'base' for RTC to start with, it has incorrect
implementation in combination with clock=vm.

I inspected source code. This is because it uses host clock (qemu_time()
function return value) as reference with 'rtc_date_offset' operations
across several places in code before guest execution starts, which has
no relevance with clock=vm.

It works in vast majority of cases only thanks to combination of some
luck and fast execution of qemu initialization phase relative to host
real time (i.e. multiple calls of qemu_time() returns same value). But
if qemu execution is being slow down by high CPU load on host or started
just before value of second changes, it may accumulate at least 1 second
(and in hard circumstances even 2+ seconds) of delay from specified base
datetime before guest becomes ready to start.

This behavior breaks determinism of guest execution in icount mode.
(Even if guest doesn't cares about precise time, just one second shift
may trigger such chain of changes which accumulates significant
difference in guest state at the moment when guest OS finishes booting.)

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1797033

Title:
  Running with -rtc clock=vm,base= introduces arbitrary base
  shift at guest startup

Status in QEMU:
  New

Bug description:
  When specifying 'base' for RTC to start with, it has incorrect
  implementation in combination with clock=vm.

  I inspected source code. This is because it uses host clock
  (qemu_time() function return value) as reference with
  'rtc_date_offset' operations across several places in code before
  guest execution starts, which has no relevance with clock=vm.

  It works in vast majority of cases only thanks to combination of some
  luck and fast execution of qemu initialization phase relative to host
  real time (i.e. multiple calls of qemu_time() returns same value). But
  if qemu execution is being slow down by high CPU load on host or
  started just before value of second changes, it may accumulate at
  least 1 second (and in hard circumstances even 2+ seconds) of delay
  from specified base datetime before guest becomes ready to start.

  This behavior breaks determinism of guest execution in icount mode.
  (Even if guest doesn't cares about precise time, just one second shift
  may trigger such chain of changes which accumulates significant
  difference in guest state at the moment when guest OS finishes
  booting.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1797033/+subscriptions



Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging

2018-10-09 Thread Artem Pisarenko
It wasn't so easy to apply this patch due to problems in compilation of
version you pointed to, and due to content distortions introduced by mail
archive, but I got it worked finally :)

Applying this patch finally made all my tests succeed... almost :)

Now qemu may hang in random moment of emulation, but not hard. Symptoms
looks like I've already reported here:
https://bugs.launchpad.net/qemu/+bug/1790460 . So, this isn't
record/replay-specific. Although, without rr= option I wasn't able cause
this issue to reveal itself, but it doesn't make much sense due to
instability of issue's nature and its hard reproducibility.

Commit I tested against (with patches
applied): 53a19a9a5f9811a911e9b69ef36afb0d66b5d85c .


вт, 9 окт. 2018 г. в 17:26, Pavel Dovgalyuk :

> Maybe this will help?
>
>
>
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg560780.html
>
>
>
> Pavel Dovgalyuk
>
>
>
> *From:* Artem Pisarenko [mailto:artem.k.pisare...@gmail.com]
> *Sent:* Tuesday, October 09, 2018 2:24 PM
> *To:* Pavel Dovgalyuk
>
>
> *Cc:* pavel.dovga...@ispras.ru; qemu-devel@nongnu.org
> *Subject:* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and
> adding reverse debugging
>
>
>
> (Since all previous patches are already merged to master, I'm running
> tests against latest (almost) version from master branch. Following results
> are based on master commit dafd95053611aa14dda40266857608d12ddce658 .)
>
>
>
> Applying this patch made Tests 1 and 2 succeed (at least I wasn't able to
> acheive failures with several attempts).
>
> Also I've tried few tests without sleep=off and/or rtc base options. All
> of them succeed too, except one case - removing sleep=off (regardless of
> -rtc option values or its presence at all) causes qemu to hang hard in
> recording mode at very startup. Process needs to be killed.
>
>
>
> Some info from debugger:
>
> qemu-system-x86_64 [13231] [cores: 2,4,5,7]
>
>   Thread #1 [qemu-system-x86] 13231 [core: 2] (Suspended :
> Container)
>
>   __lll_lock_wait() at lowlevellock.S:135
> 0x7f00b116626d
>
>   __GI___pthread_mutex_lock() at
> pthread_mutex_lock.c:80 0x7f00b115fdbd
>
>   qemu_mutex_lock_impl() at qemu-thread-posix.c:66
> 0x947ac4
>
>   replay_mutex_lock() at replay-internal.c:206
> 0x7f3dea
>
>   os_host_main_loop_wait() at main-loop.c:235
> 0x94335e
>
>   main_loop_wait() at main-loop.c:497 0x943429
>
>   main_loop() at vl.c:1,853 0x5be70f
>
>   main() at vl.c:4,575 0x5c56e0
>
>   Thread #2 [qemu-system-x86] 13282 [core: 4] (Suspended :
> Container)
>
>   Thread #3 [qemu-system-x86] 13283 [core: 5] (Suspended :
> Container)
>
>   Thread #4 [qemu-system-x86] 13284 [core: 7] (Suspended : Step)
>
>   cpu_get_icount_raw() at cpus.c:301 0x45a0a0
>
>   replay_get_current_step() at replay.c:67 0x7f2f14
>
>   replay_save_instructions() at replay-internal.c:225
> 0x7f3ea0
>
>   replay_save_clock() at replay-time.c:24 0x7f483d
>
>   icount_warp_rt() at cpus.c:512 0x45a745
>
>   qemu_account_warp_timer() at cpus.c:690
> 0x45ad55
>
>   qemu_tcg_rr_cpu_thread_fn() at cpus.c:1,498
> 0x45c554
>
>   qemu_thread_start() at qemu-thread-posix.c:504
> 0x9485cf
>
>   start_thread() at pthread_create.c:333
> 0x7f00b115d6ba
>
>   clone() at clone.S:109 0x7f00b0e9341d
>
> gdb (7.11.1)
>
>
>
> Threads #2,3 are just waiting in poll or similar. Nothing extraordinary.
>
>
>
> Thread #4 cycles inside do {} while() loop of cpu_get_icount_raw()
> function:
>
> do {
>
> start = seqlock_read_begin(_state.vm_clock_seqlock);
>
> icount = cpu_get_icount_raw_locked();
>
> } while (seqlock_read_retry(_state.vm_clock_seqlock, start));
>
>
>
> Value of timers_state.vm_clock_seqlock.sequence is always 3.
>
>
>
> вт, 9 окт. 2018 г. в 15:04, Pavel Dovgalyuk :
>
> Please try the following patch.
>
> There was a problem with rtc option in record/replay mode.
>
>
>
> diff --git a/vl.c b/vl.c
>
> index 40d5d0f..afe1c20 100644
>
> --- a/vl.c
>
> +++ b/vl.c
>
> @@ -2885,6 +2885,7 @@ int main(int argc, char **argv, char **envp)
>
>  DisplayState *ds;
>
>  QemuOpts *opts, *machine_opts;
>
>  QemuOpts *icount_opts = NULL, *a

Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging

2018-10-09 Thread Artem Pisarenko
(Since all previous patches are already merged to master, I'm running tests
against latest (almost) version from master branch. Following results are
based on master commit dafd95053611aa14dda40266857608d12ddce658 .)

Applying this patch made Tests 1 and 2 succeed (at least I wasn't able to
acheive failures with several attempts).
Also I've tried few tests without sleep=off and/or rtc base options. All of
them succeed too, except one case - removing sleep=off (regardless of -rtc
option values or its presence at all) causes qemu to hang hard in recording
mode at very startup. Process needs to be killed.

Some info from debugger:
qemu-system-x86_64 [13231] [cores: 2,4,5,7]
Thread #1 [qemu-system-x86] 13231 [core: 2] (Suspended : Container)
__lll_lock_wait() at lowlevellock.S:135 0x7f00b116626d
__GI___pthread_mutex_lock() at pthread_mutex_lock.c:80 0x7f00b115fdbd
qemu_mutex_lock_impl() at qemu-thread-posix.c:66 0x947ac4
replay_mutex_lock() at replay-internal.c:206 0x7f3dea
os_host_main_loop_wait() at main-loop.c:235 0x94335e
main_loop_wait() at main-loop.c:497 0x943429
main_loop() at vl.c:1,853 0x5be70f
main() at vl.c:4,575 0x5c56e0
Thread #2 [qemu-system-x86] 13282 [core: 4] (Suspended : Container)
Thread #3 [qemu-system-x86] 13283 [core: 5] (Suspended : Container)
Thread #4 [qemu-system-x86] 13284 [core: 7] (Suspended : Step)
cpu_get_icount_raw() at cpus.c:301 0x45a0a0
replay_get_current_step() at replay.c:67 0x7f2f14
replay_save_instructions() at replay-internal.c:225 0x7f3ea0
replay_save_clock() at replay-time.c:24 0x7f483d
icount_warp_rt() at cpus.c:512 0x45a745
qemu_account_warp_timer() at cpus.c:690 0x45ad55
qemu_tcg_rr_cpu_thread_fn() at cpus.c:1,498 0x45c554
qemu_thread_start() at qemu-thread-posix.c:504 0x9485cf
start_thread() at pthread_create.c:333 0x7f00b115d6ba
clone() at clone.S:109 0x7f00b0e9341d
gdb (7.11.1)

Threads #2,3 are just waiting in poll or similar. Nothing extraordinary.

Thread #4 cycles inside do {} while() loop of cpu_get_icount_raw() function:
do {
start = seqlock_read_begin(_state.vm_clock_seqlock);
icount = cpu_get_icount_raw_locked();
} while (seqlock_read_retry(_state.vm_clock_seqlock, start));

Value of timers_state.vm_clock_seqlock.sequence is always 3.

вт, 9 окт. 2018 г. в 15:04, Pavel Dovgalyuk :

> Please try the following patch.
>
> There was a problem with rtc option in record/replay mode.
>
>
>
> diff --git a/vl.c b/vl.c
>
> index 40d5d0f..afe1c20 100644
>
> --- a/vl.c
>
> +++ b/vl.c
>
> @@ -2885,6 +2885,7 @@ int main(int argc, char **argv, char **envp)
>
>  DisplayState *ds;
>
>  QemuOpts *opts, *machine_opts;
>
>  QemuOpts *icount_opts = NULL, *accel_opts = NULL;
>
> +QemuOpts *rtc_opts = NULL;
>
>  QemuOptsList *olist;
>
>  int optind;
>
>  const char *optarg;
>
> @@ -3691,12 +3692,11 @@ int main(int argc, char **argv, char **envp)
>
>  warn_report("This option is ignored and will be removed
> soon");
>
>  break;
>
>  case QEMU_OPTION_rtc:
>
> -opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
> optarg,
>
> -   false);
>
> -if (!opts) {
>
> +rtc_opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
>
> +   optarg, false);
>
> +if (!rtc_opts) {
>
>  exit(1);
>
>  }
>
> -configure_rtc(opts);
>
>  break;
>
>  case QEMU_OPTION_tb_size:
>
> #ifndef CONFIG_TCG
>
> @@ -3907,6 +3907,9 @@ int main(int argc, char **argv, char **envp)
>
>  loc_set_none();
>
>  replay_configure(icount_opts);
>
> +    if (rtc_opts) {
>
> +configure_rtc(rtc_opts);
>
> +}
>
>  if (incoming && !preconfig_exit_requested) {
>
>  error_report("'preconfig' and 'incoming' options are "
>
>
>
> Pavel Dovgalyuk
>
>
>
> *From:* Artem Pisarenko [mailto:artem.k.pisare...@gmail.com]
> *Sent:* Thursday, October 04, 2018 4:16 PM
> *To:* dovgaluk
> *Cc:* pavel.dovga...@ispras.ru; qemu-devel@nongnu.org
> *Subject:* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and
> adding reverse debugging
>
>
>
> No, it didn't changed test results, at least for
> https://github.com/ispras/qemu/tree/rr-180911 . Even step values it
> stucks on are same for most runs.
>
> Playing with master and my own branch gives different results for tests
> without sleep=off and -rtc base. It seems that patch you mentioned didn't
> changed them very much.
>
> The only thing can be said for sure, is that thi

Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging

2018-10-04 Thread Artem Pisarenko
No, it didn't changed test results, at least for
https://github.com/ispras/qemu/tree/rr-180911 . Even step values it stucks
on are same for most runs.
Playing with master and my own branch gives different results for tests
without sleep=off and -rtc base. It seems that patch you mentioned didn't
changed them very much.
The only thing can be said for sure, is that this patch does not fix issues
completely. But MAY fix them partially or in some other specific cases...

ср, 3 окт. 2018 г. в 12:47, dovgaluk :

> Can you try applying this patch?
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg563798.html
>
> I also encountered the problems with x86_64 replaying and found the
> misprint in
> the code which was fixed later, than sending the series to the mailing
> list.
>
> Pavel Dovgalyuk
>
>
> Artem Pisarenko писал 2018-10-02 10:02:
> > I've added "-monitor stdio" option to command line of Test 1 and
> > repeated entering command during execution:
> >
> >   QEMU 3.0.50 monitor - type 'help' for more information
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 311736195
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 318198367
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 324737211
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 329890795
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 607069789
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 607069789
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 607069789
> >   ...
> >
> > Some notes on value of step it stucks on:
> > - mostly it's same (even across different record-replay pairs);
> > - stressing host during replay may cause it to change even for same
> > record-replay pair (i.e. different replay executions for same file
> > recorded).
> >
> > This specific case seems to be stable to reproduce.
> >
> > вт, 2 окт. 2018 г. в 0:22, Artem Pisarenko
> > :
> >
> >> I've posted bug report with extended tests (incl. case without
> >> sleep=off). You may find guest image (kernel) in bug description.
> >> https://bugs.launchpad.net/qemu/+bug/1795369 [1]
> >>
> >> The most annoying thing is that some issues are almost not
> >> reproducible. There are definitely race conditions somewhere in qemu
> >> code. Running 'stress-ng' utility with CPU and I/O stressors in
> >> parallel with qemu execution greatly minimizes amount of attempts
> >> when I'm trying to trigger some of issues I encounter.
> >>
> >> I'll try 'info monitor' command tomorrow, but no guarantees that
> >> I'll be able to reproduce issue again.
> >>
> >> Speaking about '-nographic' and SDL... I've noted that UI greatly
> >> minimizes possibility of hanging (but not avoids it completely) when
> >> using icount in general, so this effect isn't rr-specific. I've
> >> already reported this bug too.
> >>
> >> пн, 1 окт. 2018 г., 20:14 dovgaluk :
> >>
> >>> Artem Pisarenko писал 2018-09-30 14:01:
> >>>> Feature still broken :(
> >>>
> >>> Thanks for testing.
> >>>
> >>>>
> >>>> Brief description of my tests.
> >>>>
> >>>> Guest image is Linux, which just powers off after kernel boots
> >>>> (instead of proceeding to user-space /init or /sbin/init).
> >>>> Base cmdline:
> >>>> qemu-system-x86_64 -nodefaults -machine pc,accel=tcg -m 2048
> >>> -cpu
> >>>> qemu64 -rtc clock=vm,base=2000-01-01T00:00:00 -kernel bzImage
> >>> -initrd
> >>>> rootfs -append 'nokaslr console=ttyS0 rdinit=/init_poweroff'
> >>>> -nographic -serial SERIAL_VALUE -icount
> >>>> 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin
> >>>
> >>> I've never tried it with sleep=off. Can you remove it and try
> >>> again?
> >>>
> >>> We also seen a problem with '-nographic'. When we remove this
> >>> option and
> >>> QEMU runs with SDL
> >>> window, everything is ok. There is some problem with main loop
> >>> which may
> >>> sleep when there
> >>> is no GUI to update, or something like

Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging

2018-10-02 Thread Artem Pisarenko
I've added "-monitor stdio" option to command line of Test 1 and repeated
entering command during execution:

  QEMU 3.0.50 monitor - type 'help' for more information
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 311736195
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 318198367
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 324737211
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 329890795
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 607069789
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 607069789
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 607069789
  ...

Some notes on value of step it stucks on:
- mostly it's same (even across different record-replay pairs);
- stressing host during replay may cause it to change even for same
record-replay pair (i.e. different replay executions for same file
recorded).

This specific case seems to be stable to reproduce.

вт, 2 окт. 2018 г. в 0:22, Artem Pisarenko :

> I've posted bug report with extended tests (incl. case without sleep=off).
> You may find guest image (kernel) in bug description.
> https://bugs.launchpad.net/qemu/+bug/1795369
>
> The most annoying thing is that some issues are almost not reproducible.
> There are definitely race conditions somewhere in qemu code. Running
> 'stress-ng' utility with CPU and I/O stressors in parallel with qemu
> execution greatly minimizes amount of attempts when I'm trying to trigger
> some of issues I encounter.
>
> I'll try 'info monitor' command tomorrow, but no guarantees that I'll be
> able to reproduce issue again.
>
> Speaking about '-nographic' and SDL... I've noted that UI greatly
> minimizes possibility of hanging (but not avoids it completely) when using
> icount in general, so this effect isn't rr-specific. I've already reported
> this bug too.
>
>
> пн, 1 окт. 2018 г., 20:14 dovgaluk :
>
>> Artem Pisarenko писал 2018-09-30 14:01:
>> > Feature still broken :(
>>
>> Thanks for testing.
>>
>> >
>> > Brief description of my tests.
>> >
>> > Guest image is Linux, which just powers off after kernel boots
>> > (instead of proceeding to user-space /init or /sbin/init).
>> > Base cmdline:
>> > qemu-system-x86_64 -nodefaults -machine pc,accel=tcg -m 2048 -cpu
>> > qemu64 -rtc clock=vm,base=2000-01-01T00:00:00 -kernel bzImage -initrd
>> > rootfs -append 'nokaslr console=ttyS0 rdinit=/init_poweroff'
>> > -nographic -serial SERIAL_VALUE -icount
>> > 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin
>>
>> I've never tried it with sleep=off. Can you remove it and try again?
>>
>> We also seen a problem with '-nographic'. When we remove this option and
>> QEMU runs with SDL
>> window, everything is ok. There is some problem with main loop which may
>> sleep when there
>> is no GUI to update, or something like that. We couldn't fix it yet.
>>
>> >
>> > Test 1. When SERIAL_VALUE=none
>> > Running with RR_VALUE=record completes successfully.
>> > Running with RR_VALUE=replay doesn't completes. qemu process just
>> > eating ~100% cpu and memory usage doesn't grow after some moment. I
>> > don't see what happens because of problem no.2 (see below).
>>
>> Try 'info replay' monitor command. Does instruction counter increases?
>>
>> >
>> > Test 2. When SERIAL_VALUE=stdio
>> > Running with RR_VALUE=record completes successfully.
>> >
>> > Running with RR_VALUE=replay caues exit with error:
>> >
>> > "qemu-system-x86_64: Missing character write event in the replay log"
>> >
>> > These problems are same with qemu 2.12 (both vanilla and with previous
>> > versions of these patches applied). Furthemore, I consider whole
>> > icount mode broken and determinism isn't achievable.
>> > The irony is that I actually don't need record/replay feature. I've
>> > tried to use it only as instrument to debug failing determinism in
>> > qemu code. But since replay/record feature itself relies on
>> > determinism, which is broken, it's no wonder why it fails also (I just
>> > hoped to bypass it).
>> >
>> > Contact me if you need more details. I just tired a lot trying to get
>> > all these things working... Hope is leaving me...
>>
>> Can you share the kernel in case the icount still broken?
>>
>>
>> Pavel Dovgalyuk
>>
>> --
>
> С уважением,
>   Артем Писаренко
>
-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging

2018-10-01 Thread Artem Pisarenko
I've posted bug report with extended tests (incl. case without sleep=off).
You may find guest image (kernel) in bug description.
https://bugs.launchpad.net/qemu/+bug/1795369

The most annoying thing is that some issues are almost not reproducible.
There are definitely race conditions somewhere in qemu code. Running
'stress-ng' utility with CPU and I/O stressors in parallel with qemu
execution greatly minimizes amount of attempts when I'm trying to trigger
some of issues I encounter.

I'll try 'info monitor' command tomorrow, but no guarantees that I'll be
able to reproduce issue again.

Speaking about '-nographic' and SDL... I've noted that UI greatly minimizes
possibility of hanging (but not avoids it completely) when using icount in
general, so this effect isn't rr-specific. I've already reported this bug
too.


пн, 1 окт. 2018 г., 20:14 dovgaluk :

> Artem Pisarenko писал 2018-09-30 14:01:
> > Feature still broken :(
>
> Thanks for testing.
>
> >
> > Brief description of my tests.
> >
> > Guest image is Linux, which just powers off after kernel boots
> > (instead of proceeding to user-space /init or /sbin/init).
> > Base cmdline:
> > qemu-system-x86_64 -nodefaults -machine pc,accel=tcg -m 2048 -cpu
> > qemu64 -rtc clock=vm,base=2000-01-01T00:00:00 -kernel bzImage -initrd
> > rootfs -append 'nokaslr console=ttyS0 rdinit=/init_poweroff'
> > -nographic -serial SERIAL_VALUE -icount
> > 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin
>
> I've never tried it with sleep=off. Can you remove it and try again?
>
> We also seen a problem with '-nographic'. When we remove this option and
> QEMU runs with SDL
> window, everything is ok. There is some problem with main loop which may
> sleep when there
> is no GUI to update, or something like that. We couldn't fix it yet.
>
> >
> > Test 1. When SERIAL_VALUE=none
> > Running with RR_VALUE=record completes successfully.
> > Running with RR_VALUE=replay doesn't completes. qemu process just
> > eating ~100% cpu and memory usage doesn't grow after some moment. I
> > don't see what happens because of problem no.2 (see below).
>
> Try 'info replay' monitor command. Does instruction counter increases?
>
> >
> > Test 2. When SERIAL_VALUE=stdio
> > Running with RR_VALUE=record completes successfully.
> >
> > Running with RR_VALUE=replay caues exit with error:
> >
> > "qemu-system-x86_64: Missing character write event in the replay log"
> >
> > These problems are same with qemu 2.12 (both vanilla and with previous
> > versions of these patches applied). Furthemore, I consider whole
> > icount mode broken and determinism isn't achievable.
> > The irony is that I actually don't need record/replay feature. I've
> > tried to use it only as instrument to debug failing determinism in
> > qemu code. But since replay/record feature itself relies on
> > determinism, which is broken, it's no wonder why it fails also (I just
> > hoped to bypass it).
> >
> > Contact me if you need more details. I just tired a lot trying to get
> > all these things working... Hope is leaving me...
>
> Can you share the kernel in case the icount still broken?
>
>
> Pavel Dovgalyuk
>
> --

С уважением,
  Артем Писаренко


[Qemu-devel] [Bug 1795369] Re: Record/replay (icount rr) causes emulation hang or exit with error about missing events in log

2018-10-01 Thread Artem Pisarenko
Applying recent patches from http://lists.nongnu.org/archive/html/qemu-
devel/2018-09/msg04038.html doesn't fix any of issues.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1795369

Title:
  Record/replay (icount rr) causes emulation hang or exit with error
  about missing events in log

Status in QEMU:
  New

Bug description:
  Test case description:

  Guest image is Linux, which just powers off after kernel boots (instead of 
proceeding to user-space /init or /sbin/init).
  Base cmdline:
qemu-system-x86_64 \
  -nodefaults -nographic -machine pc,accel=tcg -m 2048 -cpu qemu64 \
  -kernel bzImage -initrd rootfs -append 'nokaslr console=ttyS0 
rdinit=/init_poweroff' \
  -serial SERIAL_VALUE \
  -rtc clock=vm,base=2000-01-01T00:00:00 \
  -icount 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin

  Test 1.
  When SERIAL_VALUE=none
  Running with RR_VALUE=record completes successfully.
  Running with RR_VALUE=replay doesn't completes. qemu process just eating 
~100% cpu and memory usage doesn't grow after some moment. I don't see what 
happens because of problem no.2 (see below).

  Test 2.
  When SERIAL_VALUE=stdio
  Running with RR_VALUE=record completes successfully.
  Running with RR_VALUE=replay causes exit with error:
  "qemu-system-x86_64: Missing character write event in the replay log"

  Tests 3,4,5...
  SERIAL_VALUE=stdio. Playing with "-rtc" clock and base suboptions, "-icount" 
sleep suboptions produces non-repeatable results.
  In most cases running with RR_VALUE=record completes successfully (but may 
hang at very begining).
  Running with RR_VALUE=replay with combinations of removing "-rtc base=..." 
and "-icount sleep=..." goes better, but at different places of boot process it 
may either hang (as in test 1) or exit with error (as in test 2).
  When qemu "hangs", it may also happen differently: either it can be stopped 
by Ctrl-C, or have to be killed.

  
  Guest image uploaded here: 
https://drive.google.com/open?id=1SHG4HyBdcPutc5Au4pyhN8z9w52et51A

  QEMU built from master (commit 042938f46e1c477419d1931381fdadffaa49d45e) with:
  /configure --prefix= --target-list=x86_64-softmmu 
--enable-debug --disable-pie --enable-tcg --disable-tcg-interpreter 
--enable-virtfs --disable-docs --disable-guest-agent --disable-modules 
--disable-gnutls --disable-nettle --disable-gcrypt --disable-sdl 
--disable-curses --disable-vnc --disable-vnc-sasl --disable-vnc-jpeg 
--disable-vnc-png --disable-cocoa --disable-xen --disable-xen-pci-passthrough 
--disable-brlapi --disable-curl --disable-fdt --disable-bluez --disable-kvm 
--disable-hax --disable-hvf --disable-whpx --disable-rdma --disable-vde 
--disable-netmap --disable-cap-ng --disable-spice --disable-rbd 
--disable-libiscsi --disable-libnfs --disable-smartcard --disable-libusb 
--disable-live-block-migration --disable-usb-redir --disable-glusterfs 
--disable-tpm --disable-libssh2 --disable-numa --disable-libxml2 
--disable-opengl --disable-virglrenderer --disable-qom-cast-debug 
--disable-tools --disable-vxhs --disable-crypto-afalg --disable-capstone 
--disable-replication --disable-xfsctl --disable-seccomp --disable-pvrdma 
--disable-libpmem

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1795369/+subscriptions



[Qemu-devel] [Bug 1795369] [NEW] Record/replay (icount rr) causes emulation hang or exit with error about missing events in log

2018-10-01 Thread Artem Pisarenko
Public bug reported:

Test case description:

Guest image is Linux, which just powers off after kernel boots (instead of 
proceeding to user-space /init or /sbin/init).
Base cmdline:
  qemu-system-x86_64 \
-nodefaults -nographic -machine pc,accel=tcg -m 2048 -cpu qemu64 \
-kernel bzImage -initrd rootfs -append 'nokaslr console=ttyS0 
rdinit=/init_poweroff' \
-serial SERIAL_VALUE \
-rtc clock=vm,base=2000-01-01T00:00:00 \
-icount 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin

Test 1.
When SERIAL_VALUE=none
Running with RR_VALUE=record completes successfully.
Running with RR_VALUE=replay doesn't completes. qemu process just eating ~100% 
cpu and memory usage doesn't grow after some moment. I don't see what happens 
because of problem no.2 (see below).

Test 2.
When SERIAL_VALUE=stdio
Running with RR_VALUE=record completes successfully.
Running with RR_VALUE=replay causes exit with error:
"qemu-system-x86_64: Missing character write event in the replay log"

Tests 3,4,5...
SERIAL_VALUE=stdio. Playing with "-rtc" clock and base suboptions, "-icount" 
sleep suboptions produces non-repeatable results.
In most cases running with RR_VALUE=record completes successfully (but may hang 
at very begining).
Running with RR_VALUE=replay with combinations of removing "-rtc base=..." and 
"-icount sleep=..." goes better, but at different places of boot process it may 
either hang (as in test 1) or exit with error (as in test 2).
When qemu "hangs", it may also happen differently: either it can be stopped by 
Ctrl-C, or have to be killed.


Guest image uploaded here: 
https://drive.google.com/open?id=1SHG4HyBdcPutc5Au4pyhN8z9w52et51A

QEMU built from master (commit 042938f46e1c477419d1931381fdadffaa49d45e) with:
/configure --prefix= --target-list=x86_64-softmmu 
--enable-debug --disable-pie --enable-tcg --disable-tcg-interpreter 
--enable-virtfs --disable-docs --disable-guest-agent --disable-modules 
--disable-gnutls --disable-nettle --disable-gcrypt --disable-sdl 
--disable-curses --disable-vnc --disable-vnc-sasl --disable-vnc-jpeg 
--disable-vnc-png --disable-cocoa --disable-xen --disable-xen-pci-passthrough 
--disable-brlapi --disable-curl --disable-fdt --disable-bluez --disable-kvm 
--disable-hax --disable-hvf --disable-whpx --disable-rdma --disable-vde 
--disable-netmap --disable-cap-ng --disable-spice --disable-rbd 
--disable-libiscsi --disable-libnfs --disable-smartcard --disable-libusb 
--disable-live-block-migration --disable-usb-redir --disable-glusterfs 
--disable-tpm --disable-libssh2 --disable-numa --disable-libxml2 
--disable-opengl --disable-virglrenderer --disable-qom-cast-debug 
--disable-tools --disable-vxhs --disable-crypto-afalg --disable-capstone 
--disable-replication --disable-xfsctl --disable-seccomp --disable-pvrdma 
--disable-libpmem

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1795369

Title:
  Record/replay (icount rr) causes emulation hang or exit with error
  about missing events in log

Status in QEMU:
  New

Bug description:
  Test case description:

  Guest image is Linux, which just powers off after kernel boots (instead of 
proceeding to user-space /init or /sbin/init).
  Base cmdline:
qemu-system-x86_64 \
  -nodefaults -nographic -machine pc,accel=tcg -m 2048 -cpu qemu64 \
  -kernel bzImage -initrd rootfs -append 'nokaslr console=ttyS0 
rdinit=/init_poweroff' \
  -serial SERIAL_VALUE \
  -rtc clock=vm,base=2000-01-01T00:00:00 \
  -icount 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin

  Test 1.
  When SERIAL_VALUE=none
  Running with RR_VALUE=record completes successfully.
  Running with RR_VALUE=replay doesn't completes. qemu process just eating 
~100% cpu and memory usage doesn't grow after some moment. I don't see what 
happens because of problem no.2 (see below).

  Test 2.
  When SERIAL_VALUE=stdio
  Running with RR_VALUE=record completes successfully.
  Running with RR_VALUE=replay causes exit with error:
  "qemu-system-x86_64: Missing character write event in the replay log"

  Tests 3,4,5...
  SERIAL_VALUE=stdio. Playing with "-rtc" clock and base suboptions, "-icount" 
sleep suboptions produces non-repeatable results.
  In most cases running with RR_VALUE=record completes successfully (but may 
hang at very begining).
  Running with RR_VALUE=replay with combinations of removing "-rtc base=..." 
and "-icount sleep=..." goes better, but at different places of boot process it 
may either hang (as in test 1) or exit with error (as in test 2).
  When qemu "hangs", it may also happen differently: either it can be stopped 
by Ctrl-C, or have to be killed.

  
  Guest image uploaded here: 
https://drive.google.com/open?id=1SHG4HyBdcPutc5Au4pyhN8z9w52et51A

  QEMU built from master (commit 042938f46e1c477419d1931381fdadffaa49d45e) with:
  /configure --prefix= 

[Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging

2018-09-30 Thread Artem Pisarenko
Feature still broken :(

Brief description of my tests.

Guest image is Linux, which just powers off after kernel boots (instead of
proceeding to user-space /init or /sbin/init).
Base cmdline:
qemu-system-x86_64 -nodefaults -machine pc,accel=tcg -m 2048 -cpu qemu64
-rtc clock=vm,base=2000-01-01T00:00:00 -kernel bzImage -initrd rootfs
-append 'nokaslr console=ttyS0 rdinit=/init_poweroff' -nographic -serial
SERIAL_VALUE -icount 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin

Test 1. When SERIAL_VALUE=none
Running with RR_VALUE=record completes successfully.
Running with RR_VALUE=replay doesn't completes. qemu process just eating
~100% cpu and memory usage doesn't grow after some moment. I don't see what
happens because of problem no.2 (see below).

Test 2. When SERIAL_VALUE=stdio
Running with RR_VALUE=record completes successfully.
Running with RR_VALUE=replay caues exit with error:
"qemu-system-x86_64: Missing character write event in the replay log"

These problems are same with qemu 2.12 (both vanilla and with previous
versions of these patches applied). Furthemore, I consider whole icount
mode broken and determinism isn't achievable.
The irony is that I actually don't need record/replay feature. I've tried
to use it only as instrument to debug failing determinism in qemu code. But
since replay/record feature itself relies on determinism, which is broken,
it's no wonder why it fails also (I just hoped to bypass it).

Contact me if you need more details. I just tired a lot trying to get all
these things working... Hope is leaving me...

-- 

С уважением,
  Артем Писаренко


Re: [Qemu-devel] backend for blk or fs with guaranteed blocking/synchronous I/O

2018-09-10 Thread Artem Pisarenko
It looks like things are even worse. Guest demonstrates strange timings
even without access to anything external to machine. I've added Paolo
Bonzini to CC, because issue looks related to cpu/tcg/memory stuff.

I've written simple test script running parallel 'dd' utility processes
operating on files located in RAM. QEMU machine with multiple vCPUs.
Moreover, it have separate NUMA nodes for each vCPU.
Script in brief: it accepts argument with desired processes count, for each
process it mounts tmpfs, binded to node memory, and runs 'dd', binded both
to node cpu and memory, which copies files located on that tmpfs.
It's expected that overall execution time of N parallel processes (or speed
of copying) should always be the same, not depending on N value (of course,
provided that N <= nodes_count and 'dd' is single-threaded). Because it's
just as simple as simple loops of instructions just loading and storing
values in memory local to each CPU. No common resources should be involved
- neither software (such as some target OS lock/mutex), nor hardware (such
as memory bus). It should be almost ideal parallelization.
But it's not only degradates when increasing N, but even does it
proportionally !!! Same test running oh host machine (just multicore, no
NUMA) shows expected results: it has degradation (because of common memory
bus), but with non-linear dependency on N.

Script ("test.sh"):
#!/bin/bash
N=$1
# Preparation...
if command -v numactl >/dev/null; then
  USE_NUMA_BIND=1
else
  USE_NUMA_BIND=0
fi
for i in $(seq 0 $((N - 1)));
do
  mkdir -p /mnt/testmnt_$i
  if [[ "$USE_NUMA_BIND" == 1 ]] ; then
TMPFS_EXTRA_OPT=",mpol=bind:$i"; fi
  mount -t tmpfs -o
size=25M,noatime,nodiratime,norelatime$TMPFS_EXTRA_OPT tmpfs /mnt/testmnt_$i
  dd if=/dev/zero of=/mnt/testmnt_$i/testfile_r bs=10M count=1
>/dev/null 2>&1
done
# Running...
for i in $(seq 0 $((N - 1)));
do
  if [[ "$USE_NUMA_BIND" == 1 ]] ; then PREFIX_RUN="numactl
--cpunodebind=$i --membind=$i"; fi
  $PREFIX_RUN dd if=/mnt/testmnt_$i/testfile_r
of=/mnt/testmnt_$i/testfile_w bs=100 count=10 2>&1 | sed -n 's/^.*,
\(.*\)$/\1/p' &
done
# Cleanup...
wait
for i in $(seq 0 $((N - 1))); do umount /mnt/testmnt_$i; done
rm -rf /mnt/testmnt_*

Corresponding QEMU command line fragment:
"-machine accel=tcg -m 2048 -icount 1,sleep=off -rtc clock=vm -smp 10
-cpu qemu64 -numa node -numa node -numa node -numa node -numa node -numa
node -numa node -numa node -numa node -numa node"
(Removing -icount or numa nodes don't change results.)

Example runs on my Intel Core i7-7700 host (adequate results):
  artem@host:~$ sudo ./test.sh 1
  117 MB/s
  artem@host:~$ sudo ./test.sh 10
  91,1 MB/s
  89,3 MB/s
  90,4 MB/s
  85,0 MB/s
  68,7 MB/s
  63,1 MB/s
  62,0 MB/s
  55,9 MB/s
  54,1 MB/s
  56,0 MB/s

Example runs on my tiny linux x86_64 guest (strange results):
  root@guest:~# ./test.sh 1
  17.5 MB/s
  root@guest:~# ./test.sh 10
  3.2 MB/s
  2.7 MB/s
  2.6 MB/s
  2.0 MB/s
  2.0 MB/s
  1.9 MB/s
  1.8 MB/s
  1.8 MB/s
  1.8 MB/s
  1.8 MB/s

Please, explain these results. Or maybe I wrong and it's normal ?


чт, 6 сент. 2018 г. в 16:24, Artem Pisarenko :

> Hi all,
>
> I'm developing paravirtualized target linux system which runs multiple
> linux containers (LXC) inside itself. (For those, who unfamiliar with LXC,
> simply put, it's an isolated group of userspace processes with their own
> rootfs.) Each container should be provided access to its rootfs located at
> host and execution of container should be deterministic. Particularly, it
> means that container I/O operations must be synchronized within some
> predefined quantum of guest _virtual_ time, i.e. its I/O activity shouldn't
> be delayed by host performance or activities on host and other containers.
> In other words, guest should see it's like either infinite throughput and
> zero latency, or some predefined throughput/latency characteristics
> guaranteed per each rootfs.
>
> While other sources of non-determinism are seem to be eliminated (using
> TCG, -icount, etc.), asynchronous I/O still introduces it.
>
> What is scope of "(asynchronous) I/O" term within qemu? Is it something
> related to block devices layer only, or generic term, covering whole
> datapath between vCPU and backend?
> If it relates to block devices only, does usage of VirtFS guarantee
> deterministic access, or it still involves some asynchrony relative to
> guest virtual clock?
> Is it possible to force asynchronous I/O within qemu to be blocking by
> some external means (host OS configuration, hooks, etc.) ? I know, it may
> greatly slow down guest performance, but it's still better than nothing.
> Maybe some trivial patch can be made to qemu code at virtio, block backend

Re: [Qemu-devel] backend for blk or fs with guaranteed blocking/synchronous I/O

2018-09-07 Thread Artem Pisarenko
No. I don't need realtime behavior. Realtime implies determinism, but
determinism doesn't implies realtime. Of course, I realize that there are
other sources of non-determinism exist, but these are separate stories.
Here I just trying to eliminate one of them - asynchronous emulation of I/O
inside qemu. Realtime isn't solution here.

Firstly, implementing realtime still leaves dependency on host machine (its
performance, hardware configuration, etc.) and number of containers
running. Yes, it will be deterministic, but results are tied to given host
and containers count.
Secondly, it's just an overkill for problem being solved. The problem area
is bounded by guest and QEMU implementation. Using realtime requires to
fight complexities on host also (host kernel must be realtime, system
configuration must be tuned, all possible latencies must be carefully
traced, etc.). I perfectly understand how complex to design realtime system
in generic, and implementing it using linux makes things even more complex.
Thirdly, it works only for KVM (and possibly other virtualization
hypervisors). It's not my case, since my guest running with TCG and
-icount,sleep=off.

It seems you got me wrong. I'll try to explain problem in other way.

Guest virtual clock must run independent of realtime (host) clock. They
might be synchronized only in order to wait for some QEMU/host operation to
be completed, i.e. guest time is being frozen by host performance
bottlenecks, but it's transparent for guest. This is how works (or, at
least, should work) "-icount,sleep=off" in time domain of CPU emulation.
But I/O operations are seems to not respect this "policy". When QEMU
processes I/O request from guest, it allows virtual time to run freely
until backend completes operation and result passed back to guest. And this
is what makes guest to "feel" speed/latency of I/O. It's the core of the
problem.

To explain problem even better I've written a simple script
(test_run_multiple_containers.sh), which emulates execution of multiple
containers:
#!/bin/bash
N=$1
for i in $(seq 1 $N);
do
  dd if=/dev/zero of=/tmp/testfile_$i bs=1K count=10 2>&1 | sed -n
's/^.*, \(.*\)$/\1/p' &
done
wait
rm -f /tmp/testfile*
Where N is a number of containers running in parallel, and /tmp/testfile_$i
is a file located in $i container's rootfs (dedicated mount point, blk
device or something else).
Running
./test_run_multiple_containers.sh 1
on real machine should output value, which corresponds to maximum write
speed. Lets define it as "max_io_throughput".
Running this script on real machine with different N values should give
ouptuts with roughly identical values like "max_io_throughput / N".
What I need is that running this script on guest should always give
identical and constant values, not depending on N value, current host load
or something else external to guest. (No magic. While running emulation
will cause at most "max_io_throughput" load on host (in terms of real
time), QEMU will throttle guest virtual clock to be N times slower relative
to realtime clock.)

Also I forgot to mention that container's rootfs aren't required to be
persistent and stay on host during execution of containers. They may be
transferred to guest RAM before execution. They're just source images of
rootfs.

чт, 6 сент. 2018 г. в 21:08, Michael S. Tsirkin :

> On Thu, Sep 06, 2018 at 04:24:12PM +0600, Artem Pisarenko wrote:
> > Hi all,
> >
> > I'm developing paravirtualized target linux system which runs multiple
> linux
> > containers (LXC) inside itself. (For those, who unfamiliar with LXC,
> simply
> > put, it's an isolated group of userspace processes with their own
> rootfs.) Each
> > container should be provided access to its rootfs located at host and
> execution
> > of container should be deterministic. Particularly, it means that
> container I/O
> > operations must be synchronized within some predefined quantum of guest
> > _virtual_ time, i.e. its I/O activity shouldn't be delayed by host
> performance
> > or activities on host and other containers. In other words, guest should
> see
> > it's like either infinite throughput and zero latency, or some predefined
> > throughput/latency characteristics guaranteed per each rootfs.
> >
> > While other sources of non-determinism are seem to be eliminated (using
> TCG,
> > -icount, etc.), asynchronous I/O still introduces it.
>
> ...
>
> Just that you should realize that the issues are not limited to QEMU: to
> get real time behaviour out of a Linux host you need a real-time kernel
> and real-time capable hardware/firmware. I'm not an expert on this at
> all, but see e.g. these old presentations:
> https://lwn.net/Articles/656807/
>
> --
> MST
>
-- 

С уважением,
  Артем Писаренко


[Qemu-devel] backend for blk or fs with guaranteed blocking/synchronous I/O

2018-09-06 Thread Artem Pisarenko
Hi all,

I'm developing paravirtualized target linux system which runs multiple
linux containers (LXC) inside itself. (For those, who unfamiliar with LXC,
simply put, it's an isolated group of userspace processes with their own
rootfs.) Each container should be provided access to its rootfs located at
host and execution of container should be deterministic. Particularly, it
means that container I/O operations must be synchronized within some
predefined quantum of guest _virtual_ time, i.e. its I/O activity shouldn't
be delayed by host performance or activities on host and other containers.
In other words, guest should see it's like either infinite throughput and
zero latency, or some predefined throughput/latency characteristics
guaranteed per each rootfs.

While other sources of non-determinism are seem to be eliminated (using
TCG, -icount, etc.), asynchronous I/O still introduces it.

What is scope of "(asynchronous) I/O" term within qemu? Is it something
related to block devices layer only, or generic term, covering whole
datapath between vCPU and backend?
If it relates to block devices only, does usage of VirtFS guarantee
deterministic access, or it still involves some asynchrony relative to
guest virtual clock?
Is it possible to force asynchronous I/O within qemu to be blocking by some
external means (host OS configuration, hooks, etc.) ? I know, it may
greatly slow down guest performance, but it's still better than nothing.
Maybe some trivial patch can be made to qemu code at virtio, block backend
or platform syscalls level?
Maybe I/O automatically (and guaranteed) fallbacks to synchronous mode in
some particular configurations, such as using block device with image
located on tmpfs in RAM (either directly or via overlay fs) ? If so, it's
great!
Or maybe some other solutions exists?...

Main problem is to organize access from guest linux to some file system at
host (directory, mount point, image file... doesn't matter) in
deterministic manner.
Secondary problem is to optimize performance as much as possible by:
- avoiding unnecessary overheads (e.g. using virtio infrastructure,
preference virtfs over blk device, etc.);
- allowing some asynchrony within defined quantum of time (e.g. 10ms), i.e.
i/o order and speed are free to float within each quantum borders, while
result seen by guest at end of quantum is always same.

Actually, what I'm trying to achieve have direct contradiction with most
people trying to avoid, because synchronous I/O degradates performance in
vast majority of usage scenarios.

Does anyone have any thoughts on this?

Best regards,
  Artem Pisarenko
-- 

С уважением,
  Артем Писаренко


[Qemu-devel] [Bug 1790460] Re: -icount, sleep=off mode is broken (target slows down or hangs)

2018-09-04 Thread Artem Pisarenko
This modification also fixes issue:
https://git.greensocs.com/qemu/qbox/commit/a8ed106032e375e715a531d6e93e4d9ec295dbdb
Although I tested it only on v2.12.0 and didn't noticed performance improvement.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1790460

Title:
  -icount,sleep=off mode is broken (target slows down or hangs)

Status in QEMU:
  New

Bug description:
  QEMU running with options "-icount,sleep=off -rtc clock=vm" doesn't execute 
emulation at maximum possible speed.
  Target virtual clock may run faster or slower than realtime clock by N times, 
where N value depends on various unrelated conditions (i.e. random from the 
user point of view). The worst case is when target hangs (hopefully, early in 
booting stage).
  Example scenarios I've described here: 
http://lists.nongnu.org/archive/html/qemu-discuss/2018-08/msg00102.html

  QEMU process just sleeps most of the time (polling, waiting some
  condition, etc.). I've tried to debug issue and came to 99% conclusion
  that there are racing somewhere in qemu internals.

  The feature is broken since v2.6.0 release.
  Bad commit is 281b2201e4e18d5b9a26e1e8d81b62b5581a13be by Pavel Dovgalyuk, 
03/10/2016 05:56 PM:

icount: remove obsolete warp call

qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed
anymore, because it is called in every iteration of main_loop_wait.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini 

  I've reverted commit to all major releases and latest git master
  branch. Issue was fixed for all of them. My adaptation is trivial:
  just restoring removed function call before "qemu_cond_wait(...)"
  line.

  I'm sure following bugs are just particular cases of the issue:
  #1774677, #1653063 .

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1790460/+subscriptions



[Qemu-devel] [Bug 1790460] [NEW] -icount, sleep=off mode is broken (target slows down or hangs)

2018-09-03 Thread Artem Pisarenko
Public bug reported:

QEMU running with options "-icount,sleep=off -rtc clock=vm" doesn't execute 
emulation at maximum possible speed.
Target virtual clock may run faster or slower than realtime clock by N times, 
where N value depends on various unrelated conditions (i.e. random from the 
user point of view). The worst case is when target hangs (hopefully, early in 
booting stage).
Example scenarios I've described here: 
http://lists.nongnu.org/archive/html/qemu-discuss/2018-08/msg00102.html

QEMU process just sleeps most of the time (polling, waiting some
condition, etc.). I've tried to debug issue and came to 99% conclusion
that there are racing somewhere in qemu internals.

The feature is broken since v2.6.0 release.
Bad commit is 281b2201e4e18d5b9a26e1e8d81b62b5581a13be by Pavel Dovgalyuk, 
03/10/2016 05:56 PM:

  icount: remove obsolete warp call
  
  qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed
  anymore, because it is called in every iteration of main_loop_wait.
  
  Reviewed-by: Paolo Bonzini 

  Signed-off-by: Pavel Dovgalyuk 
  Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP>
  Signed-off-by: Paolo Bonzini 

I've reverted commit to all major releases and latest git master branch.
Issue was fixed for all of them. My adaptation is trivial: just
restoring removed function call before "qemu_cond_wait(...)" line.

I'm sure following bugs are just particular cases of the issue:
#1774677, #1653063 .

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1790460

Title:
  -icount,sleep=off mode is broken (target slows down or hangs)

Status in QEMU:
  New

Bug description:
  QEMU running with options "-icount,sleep=off -rtc clock=vm" doesn't execute 
emulation at maximum possible speed.
  Target virtual clock may run faster or slower than realtime clock by N times, 
where N value depends on various unrelated conditions (i.e. random from the 
user point of view). The worst case is when target hangs (hopefully, early in 
booting stage).
  Example scenarios I've described here: 
http://lists.nongnu.org/archive/html/qemu-discuss/2018-08/msg00102.html

  QEMU process just sleeps most of the time (polling, waiting some
  condition, etc.). I've tried to debug issue and came to 99% conclusion
  that there are racing somewhere in qemu internals.

  The feature is broken since v2.6.0 release.
  Bad commit is 281b2201e4e18d5b9a26e1e8d81b62b5581a13be by Pavel Dovgalyuk, 
03/10/2016 05:56 PM:

icount: remove obsolete warp call

qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed
anymore, because it is called in every iteration of main_loop_wait.

Reviewed-by: Paolo Bonzini 

Signed-off-by: Pavel Dovgalyuk 
Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini 

  I've reverted commit to all major releases and latest git master
  branch. Issue was fixed for all of them. My adaptation is trivial:
  just restoring removed function call before "qemu_cond_wait(...)"
  line.

  I'm sure following bugs are just particular cases of the issue:
  #1774677, #1653063 .

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1790460/+subscriptions



[Qemu-devel] virtio-vsock feature has no TCG (non-KVM) support

2018-06-01 Thread Artem Pisarenko
Please, add important note to https://wiki.qemu.org/Features/VirtioVsock page,
that this feature only supported in KVM accelerated mode. It's not obvious.
Furthermore, it isn't checked by qemu when invoking with "-device
vhost-vsock-pci,..." and user encounters this only when communicating (via
AF_VSOCK) application fails to connect() with weird "Connection timed out"
error.
-- 

С уважением,
  Артем Писаренко


[Qemu-devel] "socket" host network backend: suggested improvements and fixes

2018-05-30 Thread Artem Pisarenko
Hi to all.

I'm working on integrating QEMU networking to simulation environment and
found socket backend very convenient: it's simple, easy to use (i.e no
intermediate things required, such as tap/tun adapter, vde switch, etc.)
and transparent to host environment (i.e. it doesn't pollutes system with
bunch of virtual devices, etc.).

Although, it have some problems, closely related to each other. I've
investigated source code and played with it a little, but I'm not ready
submit a complete patch. So, here are my thoughts...

1. Internal protocol (only qemu instances can join its network devices
between). I suggest to make it available to plug with external software,
i.e freeze communication protocol at current state and document it in
docs/interop/ directory.

2. Transport options wrongly documented. Section "2.3.6 Network options"
lists "-netdev socket,..." entries. It gives very different basic
understanding of how it works from actual one.
 2.1. It has two entries: listen/connect (TCP connecton) and mcast
(multicast UDP), but 'qemu --help' outputs additional one - udp (UDP
tunnel), which is undocumented, but looks like working.
 2.2. Each entry has fd=h parameter, which looks like it's an optional
parameter, but code analysis (net/socket.c) shows that, in fact, it's a
separate transport option exclusive to listed ones. It used as follows:
user creates/opens whatever (almost) custom socket/file it wants, connects
it with other endpoint and passes file descriptor (handle) to qemu, which
just recv/send over it and nothing more.
 2.3. As a consequence, if you try to invoke any transport/variant option
with "fd=", you'll get an error: "exactly one of listen=, connect=, mcast=
or udp= is required". And again, error message is incomplete - it misses
"fd=" option.

3. "fd=" transport doesn't work with SOCK_DGRAM type sockets. It's due to
an implementation error in net/socket.c. Function
net_socket_receive_dgram() uses sendo() call with s->dgram_dst value which
is undefined for this case (and, of course, cannot be defined).
Although net_socket_fd_init() execution branch is smart enough to detect
type of socket passed with "fd=", but its "connected" state forgotten
afterwards. Suggested fix: replace sendto() with send(), which correctly
assumes already connected socket, and add corresponding connect() calls for
"mcast=" and "udp=" init sequences.

(For those, who interested, currently I've got working network
communication with unmodified qemu 2.12.0 in Linux using UNIX domain
sockets created by socketpair(AF_LOCAL, SOCK_STREAM, ...), one of which
passed to spawned child qemu process via -netdev socket,fd=... and other
one, used in parent application process to send/receive packets. Protocol,
used by qemu, is simple and implements only data plane: it just transfers
raw ethernet frames in binary form, for datagram-type sockets it's
straightforward, and for stream-type sockets each frame prepended with
uint32 length in network byte order, without any delimiters and escaping.)
-- 

С уважением,
  Артем Писаренко