RE: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
> -Original Message- > From: Philippe Mathieu-Daudé > Sent: Wednesday, April 17, 2024 2:14 PM > To: Li Zhijian ; Zhang, Hailiang > ; pet...@redhat.com; faro...@suse.de > Cc: qemu-devel@nongnu.org; Zhang, Chen ; Wen > Congyang ; Xie Changlong > > Subject: Re: [PATCH v2] migration/colo: Fix bdrv_graph_rdlock_main_loop: > Assertion `!qemu_in_coroutine()' failed. > > On 17/4/24 04:56, Li Zhijian via wrote: > > bdrv_activate_all() should not be called from the coroutine context, > > move it to the QEMU thread colo_process_incoming_thread() with the > > bql_lock protected. > > > > The backtrace is as follows: > > #4 0x561af7948362 in bdrv_graph_rdlock_main_loop () > at ../block/graph-lock.c:260 > > #5 0x561af7907a68 in graph_lockable_auto_lock_mainloop > (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259 > > #6 0x561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) > at ../block.c:6906 > > #7 0x561af762b4af in colo_incoming_co () at ../migration/colo.c:935 > > #8 0x561af7607e57 in process_incoming_migration_co (opaque=0x0) > at ../migration/migration.c:793 > > #9 0x561af7adbeeb in coroutine_trampoline (i0=-106876144, > i1=22042) at ../util/coroutine-ucontext.c:175 > > #10 0x7fd2a5cf21c0 in () at /lib64/libc.so.6 > > > > CC: Fabiano Rosas > > Cc: qemu-sta...@nongnu.org > > > Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277 > > Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and > > bdrv_is_root_node() GRAPH_RDLOCK") > > Signed-off-by: Li Zhijian It looks good to me. And already verified this patch in my environment. After address Phillippe's comments please add: Reviewed-by: Zhang Chen Tested-by: Zhang Chen Thanks Chen > > --- > > V2: fix missing bql_unlock() in error path. > > --- > > migration/colo.c | 18 ++ > > 1 file changed, 10 insertions(+), 8 deletions(-) > > > > diff --git a/migration/colo.c b/migration/colo.c index > > 84632a603e..5600a43d78 100644 > > --- a/migration/colo.c > > +++ b/migration/colo.c > > @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void > *opaque) > > return NULL; > > } > > > > +/* Make sure all file formats throw away their mutable metadata */ > > +bql_lock(); > > Note there is also the convenient BQL_LOCK_GUARD() macro. > > > +bdrv_activate_all(_err); > > +if (local_err) { > > +bql_unlock(); > > +error_report_err(local_err); > > +return NULL; > > +} > > +bql_unlock(); > > + > > failover_init_state();
RE: COLO state?
> -Original Message- > From: Fabiano Rosas > Sent: Thursday, April 11, 2024 10:40 PM > To: Zhang, Hailiang ; Zhang, Chen > ; Li Zhijian > Cc: qemu-devel@nongnu.org; Peter Xu > Subject: COLO state? > > Hi COLO maintainers, > > Would you please take a look at this issue? > > https://gitlab.com/qemu-project/qemu/-/issues/2277 > Sure, please share more details in gitlab of your test. For example the primary/secondary VM boot scripts and the backtrace...etc... > The reporter claims it affects from 9.0-rc2 all the way back to QEMU 7.2. I > don't have any kind of setup for COLO, so it will take me a while to be able > to > verify this. > > Could you also provide clarification on what is the state of COLO these days? > Are any of you looking at it? Do we know of active users of the feature? Currently, all the COLO code already merged by QEMU. But there is no support for libvirt (Need to passthrough QMP command to QEMU for enable COLO). Any users can use the WIKI page(https://wiki.qemu.org/Features/COLO) to setup COLO based on upstream code.(Maybe have issues? I will double check it... ). As we know, many cloud services provider build there HA/FT system or products based on COLO project. > > Also, is the MAINTAINERS file reflecting the actual maintenance state > according to you? I will check w/ Hailiang/Zhijian/Congyang/Changlong(Block replication) to see if he will keep working on COLO or not. Thanks Chen > > COLO Framework > M: Hailiang Zhang > S: Maintained > F: migration/colo* > F: include/migration/colo.h > F: include/migration/failover.h > F: docs/COLO-FT.txt > > COLO Proxy > M: Zhang Chen > M: Li Zhijian > S: Supported > F: docs/colo-proxy.txt > F: net/colo* > F: net/filter-rewriter.c > F: net/filter-mirror.c > F: tests/qtest/test-filter* > > Thanks
RE: [PATCH V2 09/11] migration: privatize colo interfaces
> -Original Message- > From: Steve Sistare > Sent: Tuesday, March 12, 2024 1:49 AM > To: qemu-devel@nongnu.org > Cc: Alex Williamson ; Cedric Le Goater > ; Michael S. Tsirkin ; David Hildenbrand > ; Peter Xu ; Fabiano Rosas > ; Zhang, Hailiang ; Zhang, > Chen ; Li Zhijian ; Jason Wang > ; Hyman Huang ; Song > Gao ; Alistair Francis ; > Steve Sistare > Subject: [PATCH V2 09/11] migration: privatize colo interfaces > > Remove private migration interfaces from net/colo-compare.c and push them > to migration/colo.c. > > Signed-off-by: Steve Sistare Reviewed-by: Zhang Chen Thanks Chen > --- > migration/colo.c | 17 +++-- > net/colo-compare.c | 3 +-- > stubs/colo.c | 1 - > 3 files changed, 12 insertions(+), 9 deletions(-) > > diff --git a/migration/colo.c b/migration/colo.c index 315e31f..84632a6 > 100644 > --- a/migration/colo.c > +++ b/migration/colo.c > @@ -63,9 +63,9 @@ static bool colo_runstate_is_stopped(void) > return runstate_check(RUN_STATE_COLO) || !runstate_is_running(); } > > -static void colo_checkpoint_notify(void *opaque) > +static void colo_checkpoint_notify(void) > { > -MigrationState *s = opaque; > +MigrationState *s = migrate_get_current(); > int64_t next_notify_time; > > qemu_event_set(>colo_checkpoint_event); > @@ -74,10 +74,15 @@ static void colo_checkpoint_notify(void *opaque) > timer_mod(s->colo_delay_timer, next_notify_time); } > > +static void colo_checkpoint_notify_timer(void *opaque) { > +colo_checkpoint_notify(); > +} > + > void colo_checkpoint_delay_set(void) > { > if (migration_in_colo_state()) { > -colo_checkpoint_notify(migrate_get_current()); > +colo_checkpoint_notify(); > } > } > > @@ -162,7 +167,7 @@ static void primary_vm_do_failover(void) > * kick COLO thread which might wait at > * qemu_sem_wait(>colo_checkpoint_sem). > */ > -colo_checkpoint_notify(s); > +colo_checkpoint_notify(); > > /* > * Wake up COLO thread which may blocked in recv() or send(), @@ -518,7 > +523,7 @@ out: > > static void colo_compare_notify_checkpoint(Notifier *notifier, void *data) { > -colo_checkpoint_notify(data); > +colo_checkpoint_notify(); > } > > static void colo_process_checkpoint(MigrationState *s) @@ -642,7 +647,7 > @@ void migrate_start_colo_process(MigrationState *s) > bql_unlock(); > qemu_event_init(>colo_checkpoint_event, false); > s->colo_delay_timer = timer_new_ms(QEMU_CLOCK_HOST, > -colo_checkpoint_notify, s); > +colo_checkpoint_notify_timer, NULL); > > qemu_sem_init(>colo_exit_sem, 0); > colo_process_checkpoint(s); > diff --git a/net/colo-compare.c b/net/colo-compare.c index f2dfc0e..c4ad0ab > 100644 > --- a/net/colo-compare.c > +++ b/net/colo-compare.c > @@ -28,7 +28,6 @@ > #include "sysemu/iothread.h" > #include "net/colo-compare.h" > #include "migration/colo.h" > -#include "migration/migration.h" > #include "util.h" > > #include "block/aio-wait.h" > @@ -189,7 +188,7 @@ static void > colo_compare_inconsistency_notify(CompareState *s) > notify_remote_frame(s); > } else { > notifier_list_notify(_compare_notifiers, > - migrate_get_current()); > + NULL); > } > } > > diff --git a/stubs/colo.c b/stubs/colo.c index 08c9f98..f8c069b 100644 > --- a/stubs/colo.c > +++ b/stubs/colo.c > @@ -2,7 +2,6 @@ > #include "qemu/notify.h" > #include "net/colo-compare.h" > #include "migration/colo.h" > -#include "migration/migration.h" > #include "qemu/error-report.h" > #include "qapi/qapi-commands-migration.h" > > -- > 1.8.3.1
RE: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=
> -Original Message- > From: Michael Tokarev > Sent: Tuesday, January 9, 2024 1:44 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: qemu-triv...@nongnu.org; Li Zhijian > Subject: Re: [PATCH trivial] colo: examples: remove mentions of script= and > (wrong) downscript= > > 09.01.2024 05:08, Zhang, Chen : > > > > > >> -Original Message- > >> From: Michael Tokarev > >> Sent: Sunday, January 7, 2024 7:25 PM > >> To: qemu-devel@nongnu.org > >> Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang, > >> Chen ; Li Zhijian > >> Subject: [PATCH trivial] colo: examples: remove mentions of script= > >> and > >> (wrong) downscript= > >> > >> There's no need to repeat script=/etc/qemu-ifup in examples, as it is > >> already in there. More, all examples uses incorrect "down script=" > >> (which should be "downscript="). > > > > Yes, good catch. > > Reviewed-by: Zhang Chen > > > >> --- > >> I'm not sure we need so many identical examples, and why it uses > >> vnet=off, - it looks like vnet= should also be dropped. > > > > Do you means the "vnet_hdr_support" in docs? > > Nope, it was a thinko on my part, I mean vhost=off parameter - which is right > next to script=. > Why vhost is explicitly disabled here, while it isn't even enabled by default? > Because Qemu net filter can't support vhost. Vhost puts virtio emulation code into the kernel, taking QEMU userspace out of the picture. So, the filter can't works to get network data. > And do we really need that many examples like this, maybe it's a good idea to > remove half of them and refer to the other place instead? Yes, nice to see optimized documentation. Thanks Chen > > /mjt
RE: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=
> -Original Message- > From: Michael Tokarev > Sent: Sunday, January 7, 2024 7:25 PM > To: qemu-devel@nongnu.org > Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang, > Chen ; Li Zhijian > Subject: [PATCH trivial] colo: examples: remove mentions of script= and > (wrong) downscript= > > There's no need to repeat script=/etc/qemu-ifup in examples, as it is already > in there. More, all examples uses incorrect "down script=" (which should be > "downscript="). Yes, good catch. Reviewed-by: Zhang Chen > --- > I'm not sure we need so many identical examples, and why it uses vnet=off, - > it looks like vnet= should also be dropped. Do you means the "vnet_hdr_support" in docs? If yes, we can't drop it. Because the filters use this tag to communicate with an independent vnet_header. And when a filter with vnet_hdr_support tag(like filter-mirror) connect to another filter without tag(like filter-redirector), They cannot correctly parse the data sent to each other. Thanks Chen > > docs/colo-proxy.txt | 6 +++--- > qemu-options.hx | 8 > 2 files changed, 7 insertions(+), 7 deletions(-) > > diff --git a/docs/colo-proxy.txt b/docs/colo-proxy.txt index > 1fc38aed1b..e712c883db 100644 > --- a/docs/colo-proxy.txt > +++ b/docs/colo-proxy.txt > @@ -162,7 +162,7 @@ Here is an example using demonstration IP and port > addresses to more clearly describe the usage. > > Primary(ip:3.3.3.3): > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu- > ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off > -chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off > @@ -177,7 +177,7 @@ Primary(ip:3.3.3.3): > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1 > > Secondary(ip:3.3.3.8): > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu- > ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > @@ -202,7 +202,7 @@ Primary(ip:3.3.3.3): > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support > > Secondary(ip:3.3.3.8): > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu- > ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > diff --git a/qemu-options.hx b/qemu-options.hx index > b66570ae00..d667bfa0c2 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -5500,7 +5500,7 @@ SRST > KVM COLO > > primary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu- > ifup,downscript=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev > socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off > -chardev > socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off > @@ -5515,7 +5515,7 @@ SRST > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1 > > secondary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down > script=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > @@ -5526,7 +5526,7 @@ SRST > Xen COLO > > primary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu- > ifup,downscript=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev > socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off > -chardev > socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off > @@ -5542,7 +5542,7 @@ SRST > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,notify_dev=nofity_way,ioth > read=iothread1 > > secondary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down > script=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > -- > 2.39.2
RE: [PATCH] tests/qtest: check the return value
> -Original Message- > From: zhujun2 > Sent: Tuesday, November 21, 2023 2:17 PM > To: Zhang, Chen > Cc: lviv...@redhat.com; pbonz...@redhat.com; qemu-devel@nongnu.org; > th...@redhat.com; zhuj...@cmss.chinamobile.com > Subject: [PATCH] tests/qtest: check the return value > > These variables "ret" are never referenced in the code, that add check logic > for the "ret" > Just tiny comments, please check the len before str, and add the V2 tag in mail title next time. After fix my comments and Thomas comments: Reviewed-by: Zhang Chen Thanks Chen > Signed-off-by: zhujun2 > --- > tests/qtest/test-filter-mirror.c | 1 + > tests/qtest/test-filter-redirector.c | 2 ++ > tests/qtest/virtio-net-test.c| 1 + > 3 files changed, 4 insertions(+) > > diff --git a/tests/qtest/test-filter-mirror.c > b/tests/qtest/test-filter-mirror.c > index adeada3eb8..f3865f7519 100644 > --- a/tests/qtest/test-filter-mirror.c > +++ b/tests/qtest/test-filter-mirror.c > @@ -61,6 +61,7 @@ static void test_mirror(void) > g_assert_cmpint(len, ==, sizeof(send_buf)); > recv_buf = g_malloc(len); > ret = recv(recv_sock[0], recv_buf, len, 0); > +g_assert_cmpint(ret, ==, len); > g_assert_cmpstr(recv_buf, ==, send_buf); > > g_free(recv_buf); > diff --git a/tests/qtest/test-filter-redirector.c b/tests/qtest/test-filter- > redirector.c > index e72e3b7873..a77d5fd8ec 100644 > --- a/tests/qtest/test-filter-redirector.c > +++ b/tests/qtest/test-filter-redirector.c > @@ -118,6 +118,7 @@ static void test_redirector_tx(void) > g_assert_cmpint(len, ==, sizeof(send_buf)); > recv_buf = g_malloc(len); > ret = recv(recv_sock, recv_buf, len, 0); > +g_assert_cmpint(ret, ==, len); > g_assert_cmpstr(recv_buf, ==, send_buf); > > g_free(recv_buf); > @@ -185,6 +186,7 @@ static void test_redirector_rx(void) > g_assert_cmpint(len, ==, sizeof(send_buf)); > recv_buf = g_malloc(len); > ret = recv(backend_sock[0], recv_buf, len, 0); > +g_assert_cmpint(ret, ==, len); > g_assert_cmpstr(recv_buf, ==, send_buf); > > close(send_sock); > diff --git a/tests/qtest/virtio-net-test.c b/tests/qtest/virtio-net-test.c > index > fab5dd8b05..b470d8c6e2 100644 > --- a/tests/qtest/virtio-net-test.c > +++ b/tests/qtest/virtio-net-test.c > @@ -92,6 +92,7 @@ static void tx_test(QVirtioDevice *dev, > > ret = recv(socket, buffer, len, 0); > g_assert_cmpstr(buffer, ==, "TEST"); > +g_assert_cmpint(ret, ==, len); Move it before g_assert_cmpstr(). > } > > static void rx_stop_cont_test(QVirtioDevice *dev, > -- > 2.17.1 > >
RE: [PATCH] tests/qtest: remove unused variables
> -Original Message- > From: qemu-devel-bounces+chen.zhang=intel@nongnu.org devel-bounces+chen.zhang=intel@nongnu.org> On Behalf Of zhujun2 > Sent: Wednesday, November 15, 2023 4:00 PM > To: th...@redhat.com > Cc: lviv...@redhat.com; pbonz...@redhat.com; qemu-devel@nongnu.org; > zhuj...@cmss.chinamobile.com > Subject: [PATCH] tests/qtest: remove unused variables > > These variables are never referenced in the code, just remove them > > Signed-off-by: zhujun2 > --- > tests/qtest/test-filter-mirror.c | 2 +- > tests/qtest/test-filter-redirector.c | 4 ++-- > tests/qtest/virtio-net-test.c| 2 +- > 3 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/tests/qtest/test-filter-mirror.c > b/tests/qtest/test-filter-mirror.c > index adeada3eb8..7aa81daa93 100644 > --- a/tests/qtest/test-filter-mirror.c > +++ b/tests/qtest/test-filter-mirror.c > @@ -60,7 +60,7 @@ static void test_mirror(void) > > g_assert_cmpint(len, ==, sizeof(send_buf)); > recv_buf = g_malloc(len); > -ret = recv(recv_sock[0], recv_buf, len, 0); > +recv(recv_sock[0], recv_buf, len, 0); > g_assert_cmpstr(recv_buf, ==, send_buf); > > g_free(recv_buf); > diff --git a/tests/qtest/test-filter-redirector.c b/tests/qtest/test-filter- > redirector.c > index e72e3b7873..e4dfeff2e0 100644 > --- a/tests/qtest/test-filter-redirector.c > +++ b/tests/qtest/test-filter-redirector.c > @@ -117,7 +117,7 @@ static void test_redirector_tx(void) > > g_assert_cmpint(len, ==, sizeof(send_buf)); > recv_buf = g_malloc(len); > -ret = recv(recv_sock, recv_buf, len, 0); > +recv(recv_sock, recv_buf, len, 0); > g_assert_cmpstr(recv_buf, ==, send_buf); > > g_free(recv_buf); > @@ -184,7 +184,7 @@ static void test_redirector_rx(void) > > g_assert_cmpint(len, ==, sizeof(send_buf)); > recv_buf = g_malloc(len); > -ret = recv(backend_sock[0], recv_buf, len, 0); > +recv(backend_sock[0], recv_buf, len, 0); It looks like add more check for the "ret" is better. For example: g_assert_cmpint(ret, ==, len); Thanks Chen > g_assert_cmpstr(recv_buf, ==, send_buf); > > close(send_sock); > diff --git a/tests/qtest/virtio-net-test.c b/tests/qtest/virtio-net-test.c > index > fab5dd8b05..26df5bbabe 100644 > --- a/tests/qtest/virtio-net-test.c > +++ b/tests/qtest/virtio-net-test.c > @@ -90,7 +90,7 @@ static void tx_test(QVirtioDevice *dev, > g_assert_cmpint(ret, ==, sizeof(len)); > len = ntohl(len); > > -ret = recv(socket, buffer, len, 0); > +recv(socket, buffer, len, 0); > g_assert_cmpstr(buffer, ==, "TEST"); } > > -- > 2.17.1 > > >
RE: [PATCH v6 13/21] virtio-net: Always set populate_hash
> -Original Message- > From: qemu-devel-bounces+chen.zhang=intel@nongnu.org devel-bounces+chen.zhang=intel@nongnu.org> On Behalf Of Akihiko > Odaki > Sent: Monday, October 30, 2023 1:13 PM > Cc: qemu-devel@nongnu.org; Yuri Benditovich > ; Andrew Melnychenko > ; Michael S . Tsirkin ; Jason Wang > ; Akihiko Odaki > Subject: [PATCH v6 13/21] virtio-net: Always set populate_hash > > The member is not cleared during reset so may have a stale value. > /docs/devel/ebpf_rss.rst: populate_hash - for now, not used. eBPF RSS doesn't support hash reporting. We need update docs? And why not clear it in virtio_net_reset function? Thanks Chen > Signed-off-by: Akihiko Odaki > --- > hw/net/virtio-net.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index > 1fa020d905..0fe75b3c08 100644 > --- a/hw/net/virtio-net.c > +++ b/hw/net/virtio-net.c > @@ -650,6 +650,7 @@ static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, > int mergeable_rx_bufs, > n->guest_hdr_len = n->mergeable_rx_bufs ? > sizeof(struct virtio_net_hdr_mrg_rxbuf) : > sizeof(struct virtio_net_hdr); > +n->rss_data.populate_hash = false; > } > > for (i = 0; i < n->max_queue_pairs; i++) { > -- > 2.42.0 >
RE: [PATCH v6 05/21] tap: Remove tap_receive()
> -Original Message- > From: qemu-devel-bounces+chen.zhang=intel@nongnu.org devel-bounces+chen.zhang=intel@nongnu.org> On Behalf Of Akihiko > Odaki > Sent: Monday, October 30, 2023 1:12 PM > Cc: qemu-devel@nongnu.org; Yuri Benditovich > ; Andrew Melnychenko > ; Michael S . Tsirkin ; Jason Wang > ; Akihiko Odaki > Subject: [PATCH v6 05/21] tap: Remove tap_receive() > > The receive member of NetClientInfo is only for legacy clients and the > receive_iov member is always used when it is set. Under normal circumstances we still need to maintain compatibility. It seems that there is no need to remove the tap_receive here. You just need to optimize the tap_receive to call the tap_receive_iov. In the history, we can see a large number of devices still keep this interface, For example, e1000_receive can directly call the e1000_receive_iov. Thanks Chen > > Signed-off-by: Akihiko Odaki > --- > net/tap.c | 36 > 1 file changed, 36 deletions(-) > > diff --git a/net/tap.c b/net/tap.c > index d54e90f184..ab4e5a0e91 100644 > --- a/net/tap.c > +++ b/net/tap.c > @@ -133,41 +133,6 @@ static ssize_t tap_receive_iov(NetClientState *nc, > const struct iovec *iov, > return tap_write_packet(s, iovp, iovcnt); } > > -static ssize_t tap_receive_raw(NetClientState *nc, const uint8_t *buf, size_t > size) -{ > -TAPState *s = DO_UPCAST(TAPState, nc, nc); > -struct iovec iov[2]; > -int iovcnt = 0; > -struct virtio_net_hdr_mrg_rxbuf hdr = { }; > - > -if (s->host_vnet_hdr_len) { > -iov[iovcnt].iov_base = > -iov[iovcnt].iov_len = s->host_vnet_hdr_len; > -iovcnt++; > -} > - > -iov[iovcnt].iov_base = (char *)buf; > -iov[iovcnt].iov_len = size; > -iovcnt++; > - > -return tap_write_packet(s, iov, iovcnt); > -} > - > -static ssize_t tap_receive(NetClientState *nc, const uint8_t *buf, size_t > size) -{ > -TAPState *s = DO_UPCAST(TAPState, nc, nc); > -struct iovec iov[1]; > - > -if (s->host_vnet_hdr_len && !s->using_vnet_hdr) { > -return tap_receive_raw(nc, buf, size); > -} > - > -iov[0].iov_base = (char *)buf; > -iov[0].iov_len = size; > - > -return tap_write_packet(s, iov, 1); > -} > - > #ifndef __sun__ > ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen) { @@ -366,7 > +331,6 @@ int tap_get_fd(NetClientState *nc) static NetClientInfo > net_tap_info = { > .type = NET_CLIENT_DRIVER_TAP, > .size = sizeof(TAPState), > -.receive = tap_receive, > .receive_iov = tap_receive_iov, > .poll = tap_poll, > .cleanup = tap_cleanup, > -- > 2.42.0 >
RE: [PATCH 2/8] colo: Setup ram cache in normal migration path
> -Original Message- > From: Lukas Straub > Sent: Thursday, June 22, 2023 8:15 PM > To: qemu-devel > Cc: Zhang, Hailiang ; Juan Quintela > ; Peter Xu ; Leonardo Bras > ; Zhang, Chen > Subject: [PATCH 2/8] colo: Setup ram cache in normal migration path > > Now that x-colo capability needs to be always enabled on the incoming side > we can use that to initialize the ram cache in the normal migration path. > > Signed-off-by: Lukas Straub > --- > migration/migration.c | 16 > migration/savevm.c| 10 +- > 2 files changed, 13 insertions(+), 13 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c index > dc05c6f6ea..050bd8ffc8 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -404,10 +404,6 @@ int migration_incoming_enable_colo(void) > return -EINVAL; > } > > -if (ram_block_discard_disable(true)) { > -error_report("COLO: cannot disable RAM discard"); > -return -EBUSY; > -} COLO may call here when occur each checkpoint to pin the guest memory, in the Colo_incoming_process_checkpoint(). Is it still working after move to the "process_incoming_migration_co()" ? > migration_colo_enabled = true; > return 0; > } > @@ -519,6 +515,18 @@ process_incoming_migration_co(void *opaque) > goto fail; > } > > +if (migrate_colo()) { > +if (ram_block_discard_disable(true)) { > +error_report("COLO: cannot disable RAM discard"); > +goto fail; > +} > + > +if (colo_init_ram_cache() < 0) { > +error_report("Init ram cache failed"); Change the code path to here need to add COLO tag to the error: error_report("Init COLO ram cache failed"); It looks here removed the original migration_incoming_disable_colo(), But maybe it's OK for goto fail directly. Thanks Chen > +goto fail; > +} > +} > + > mis->largest_page_size = qemu_ram_pagesize_largest(); > postcopy_state_set(POSTCOPY_INCOMING_NONE); > migrate_set_state(>state, MIGRATION_STATUS_NONE, diff --git > a/migration/savevm.c b/migration/savevm.c index bc284087f9..155abb0fda > 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -2302,15 +2302,7 @@ static int > loadvm_handle_recv_bitmap(MigrationIncomingState *mis, > > static int loadvm_process_enable_colo(MigrationIncomingState *mis) { > -int ret = migration_incoming_enable_colo(); > - > -if (!ret) { > -ret = colo_init_ram_cache(); > -if (ret) { > -migration_incoming_disable_colo(); > -} > -} > -return ret; > +return migration_incoming_enable_colo(); > } > > /* > -- > 2.39.2
RE: [PATCH 1/8] colo: Only support the same qemu version on source and destination
> -Original Message- > From: Dong, Eddie > Sent: Friday, June 23, 2023 1:17 AM > To: Lukas Straub ; qemu-devel de...@nongnu.org> > Cc: Zhang, Hailiang ; Juan Quintela > ; Peter Xu ; Leonardo Bras > ; Zhang, Chen > Subject: RE: [PATCH 1/8] colo: Only support the same qemu version on > source and destination > > > > > -Original Message- > > From: qemu-devel-bounces+eddie.dong=intel@nongnu.org > devel-bounces+eddie.dong=intel@nongnu.org> On Behalf Of Lukas > > Straub > > Sent: Thursday, June 22, 2023 5:15 AM > > To: qemu-devel > > Cc: Zhang, Hailiang ; Juan Quintela > > ; Peter Xu ; Leonardo Bras > > ; Zhang, Chen > > Subject: [PATCH 1/8] colo: Only support the same qemu version on > > source and destination > > > > Signed-off-by: Lukas Straub > > --- > > docs/COLO-FT.txt | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt index > > 2e760a4aee..8e64480dbd 100644 > > --- a/docs/COLO-FT.txt > > +++ b/docs/COLO-FT.txt > > @@ -148,6 +148,8 @@ in test procedure. > > Note: Here we are running both instances on the same host for > > testing, change the IP Addresses if you want to run it on two hosts. > > Initially > > 127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host. > > +Note: COLO is a experimental feature, > an experimental feature > > >so currently is should only be > it should ... > > > +used with the same qemu version on sourcee and target. S/sourcee/source Thanks Chen > > > > == Startup qemu == > > 1. Primary: > > -- > > 2.39.2
RE: [PATCH v4 00/10] COLO: improve build options
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru > Subject: [PATCH v4 00/10] COLO: improve build options > > v4: > 01: add r-b by Lukas > 02: new > 03: - keep x-colo capability enum value unconditional > - drop ifdefs in options.c and keep capability check instead > - update stubs > - add missed a-b by Dr. David > 04: keep filter-mirror untouched, add r-b by Juan > > others: new. Some further improvements of COLO module API. May be > merged separately. > > Hi all! > > COLO substem seems to be useless when CONFIG_REPLICATION is unset, as > we simply don't allow to set x-colo capability in this case. So, let's not > compile > in unreachable code and interface we cannot use when > CONFIG_REPLICATION is unset. > > Also, provide personal configure option for COLO Proxy subsystem. This series looks good to me. Please add the new configure option related comments to docs/COLO-FT.txt, block-replication.txt, colo-proxy.txt. Thanks Chen > > Vladimir Sementsov-Ogievskiy (10): > block/meson.build: prefer positive condition for replication > colo: make colo_checkpoint_notify static and provide simpler API > build: move COLO under CONFIG_REPLICATION > configure: add --disable-colo-proxy option > migration: drop colo_incoming_thread from MigrationIncomingState > migration: process_incoming_migration_co: simplify code flow around > ret > migration: split migration_incoming_co > migration: process_incoming_migration_co(): move colo part to colo > migration: disallow change capabilities in COLO state > migration: block incoming colo when capability is disabled > > block/meson.build | 2 +- > hmp-commands.hx| 2 + > include/migration/colo.h | 18 +- > meson_options.txt | 2 + > migration/colo.c | 100 +++-- > migration/meson.build | 6 +- > migration/migration-hmp-cmds.c | 2 + > migration/migration.c | 51 +++-- > migration/migration.h | 11 +++- > migration/options.c| 6 +- > net/meson.build| 13 - > qapi/migration.json| 9 ++- > scripts/meson-buildoptions.sh | 3 + > stubs/colo-compare.c | 7 +++ > stubs/colo.c | 37 > stubs/meson.build | 2 + > 16 files changed, 181 insertions(+), 90 deletions(-) create mode 100644 > stubs/colo-compare.c create mode 100644 stubs/colo.c > > -- > 2.34.1
RE: [PATCH v4 10/10] migration: block incoming colo when capability is disabled
> -Original Message- > From: Lukas Straub > Sent: Friday, May 5, 2023 6:46 AM > To: Vladimir Sementsov-Ogievskiy > Cc: qemu-devel@nongnu.org; quint...@redhat.com; Zhang, Chen > ; Peter Xu ; Leonardo Bras > > Subject: Re: [PATCH v4 10/10] migration: block incoming colo when capability > is disabled > > On Fri, 5 May 2023 01:30:34 +0300 > Vladimir Sementsov-Ogievskiy wrote: > > > On 05.05.23 01:10, Lukas Straub wrote: > > > On Fri, 28 Apr 2023 22:49:28 +0300 > > > Vladimir Sementsov-Ogievskiy wrote: > > > > > >> We generally require same set of capabilities on source and target. > > >> Let's require x-colo capability to use COLO on target. > > >> > > >> Signed-off-by: Vladimir Sementsov-Ogievskiy > > >> > > > > > > Good patch, this is needed anyway for COLO with multifd. > > > > > > Also, it allows to remove a some code, like > > > migration_incoming_enable_colo(), qemu_savevm_send_colo_enable() > etc. > > > I will send patches for that. > > > > Great! But with such changes we should be careful to not break > compatibility between versions.. On the other hand, x-colo - is still > experimental with that x-, so there is no guarantee how it works. > > Given COLO's usecase, I think it should only be run with the same qemu > version on both sides anyway. Maybe we should even enforce that somehow, > while we're at it doing breaking changes. > > For upgrading qemu without downtime, normal migration can still be used. > > Zhang Cheng, what do you think? It's OK for me. We can add the same version requirement comments to the docs/COLO-FT.txt. Thanks Chen > > > > Or you can do it if you like. > > > > To be honest, I don't :) > > > > > > > > Please update the docs like below, then: > > > Reviewed-by: Lukas Straub > > > > > > diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt index > > > 8ec653f81c..2e760a4aee 100644 > > > --- a/docs/COLO-FT.txt > > > +++ b/docs/COLO-FT.txt > > > @@ -210,6 +210,7 @@ children.0=childs0 \ > > > > > > 3. On Secondary VM's QEMU monitor, issue command > > > {"execute":"qmp_capabilities"} > > > +{"execute": "migrate-set-capabilities", "arguments": > > > +{"capabilities": [ {"capability": "x-colo", "state": true } ] } } > > > {"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", > "data": {"host": "0.0.0.0", "port": ""} } } } > > > {"execute": "nbd-server-add", "arguments": {"device": "parent0", > > > "writable": true } } > > > > > > > I'll resend with this addition, thanks for reviewing! > > > > > > > >> --- > > >> migration/migration.c | 6 ++ > > >> 1 file changed, 6 insertions(+) > > >> > > >> diff --git a/migration/migration.c b/migration/migration.c index > > >> 8c5bbf3e94..5e162c0622 100644 > > >> --- a/migration/migration.c > > >> +++ b/migration/migration.c > > >> @@ -395,6 +395,12 @@ int migration_incoming_enable_colo(void) > > >> return -ENOTSUP; > > >> #endif > > >> > > >> +if (!migrate_colo()) { > > >> +error_report("ENABLE_COLO command come in migration stream, > but c-colo " > > >> + "capability is not set"); > > >> +return -EINVAL; > > >> +} > > >> + > > >> if (ram_block_discard_disable(true)) { > > >> error_report("COLO: cannot disable RAM discard"); > > >> return -EBUSY; > > > > > > > > > > > > > > > --
RE: [PATCH v4 10/10] migration: block incoming colo when capability is disabled
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Peter Xu > ; Leonardo Bras > Subject: [PATCH v4 10/10] migration: block incoming colo when capability is > disabled > > We generally require same set of capabilities on source and target. > Let's require x-colo capability to use COLO on target. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Zhang Chen Thanks Chen --- > migration/migration.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/migration/migration.c b/migration/migration.c index > 8c5bbf3e94..5e162c0622 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -395,6 +395,12 @@ int migration_incoming_enable_colo(void) > return -ENOTSUP; > #endif > > +if (!migrate_colo()) { > +error_report("ENABLE_COLO command come in migration stream, but > c-colo " > + "capability is not set"); > +return -EINVAL; > +} > + > if (ram_block_discard_disable(true)) { > error_report("COLO: cannot disable RAM discard"); > return -EBUSY; > -- > 2.34.1
RE: [PATCH v4 09/10] migration: disallow change capabilities in COLO state
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Thursday, May 4, 2023 4:23 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Peter Xu > ; Leonardo Bras > Subject: Re: [PATCH v4 09/10] migration: disallow change capabilities in COLO > state > > On 04.05.23 11:09, Zhang, Chen wrote: > > > > > >> -Original Message- > >> From: Vladimir Sementsov-Ogievskiy > >> Sent: Saturday, April 29, 2023 3:49 AM > >> To: qemu-devel@nongnu.org > >> Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > >> ; vsement...@yandex-team.ru; Peter Xu > >> ; Leonardo Bras > >> Subject: [PATCH v4 09/10] migration: disallow change capabilities in > >> COLO state > >> > >> COLO is not listed as running state in migrate_is_running(), so, it's > >> theoretically possible to disable colo capability in COLO state and > >> the unexpected error in migration_iteration_finish() is reachable. > >> > >> Let's disallow that in qmp_migrate_set_capabilities. Than the error > >> becomes absolutely unreachable: we can get into COLO state only with > >> enabled capability and can't disable it while we are in COLO state. > >> So substitute the error by simple assertion. > >> > >> Signed-off-by: Vladimir Sementsov-Ogievskiy > >> > >> --- > >> migration/migration.c | 5 + > >> migration/options.c | 2 +- > >> 2 files changed, 2 insertions(+), 5 deletions(-) > >> > >> diff --git a/migration/migration.c b/migration/migration.c index > >> 0d912ee0d7..8c5bbf3e94 100644 > >> --- a/migration/migration.c > >> +++ b/migration/migration.c > >> @@ -2751,10 +2751,7 @@ static void > >> migration_iteration_finish(MigrationState *s) > >> runstate_set(RUN_STATE_POSTMIGRATE); > >> break; > >> case MIGRATION_STATUS_COLO: > >> -if (!migrate_colo()) { > >> -error_report("%s: critical error: calling COLO code without " > >> - "COLO enabled", __func__); > >> -} > >> +assert(migrate_colo()); > >> migrate_start_colo_process(s); > >> s->vm_was_running = true; > >> /* Fallthrough */ > >> diff --git a/migration/options.c b/migration/options.c index > >> 865a0214d8..f461d02eeb 100644 > >> --- a/migration/options.c > >> +++ b/migration/options.c > >> @@ -598,7 +598,7 @@ void > >> qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params, > >> MigrationCapabilityStatusList *cap; > >> bool new_caps[MIGRATION_CAPABILITY__MAX]; > >> > >> -if (migration_is_running(s->state)) { > >> +if (migration_is_running(s->state) || migration_in_colo_state()) > >> + { > > > > Make the "MIGRATION_STATUS_COLO" into the " migration_is_running()" > is a better way? > > I wasn't sure that that's correct.. migration_is_running() is used in several > places, to do so, I'd have to analyze them all. Actually, when running in the "MIGRATION_STATUS_COLO" means QEMU can not do a normal migration at the same time. Juan have any comments here? Thanks Chen > > > Like the "migration_is_setup_ot_active()". > > > > Thanks > > Chen > > > >> error_setg(errp, QERR_MIGRATION_ACTIVE); > >> return; > >> } > >> -- > >> 2.34.1 > > > > -- > Best regards, > Vladimir
RE: [PATCH v4 09/10] migration: disallow change capabilities in COLO state
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Peter Xu > ; Leonardo Bras > Subject: [PATCH v4 09/10] migration: disallow change capabilities in COLO > state > > COLO is not listed as running state in migrate_is_running(), so, it's > theoretically possible to disable colo capability in COLO state and the > unexpected error in migration_iteration_finish() is reachable. > > Let's disallow that in qmp_migrate_set_capabilities. Than the error becomes > absolutely unreachable: we can get into COLO state only with enabled > capability and can't disable it while we are in COLO state. So substitute the > error by simple assertion. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > migration/migration.c | 5 + > migration/options.c | 2 +- > 2 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c index > 0d912ee0d7..8c5bbf3e94 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -2751,10 +2751,7 @@ static void > migration_iteration_finish(MigrationState *s) > runstate_set(RUN_STATE_POSTMIGRATE); > break; > case MIGRATION_STATUS_COLO: > -if (!migrate_colo()) { > -error_report("%s: critical error: calling COLO code without " > - "COLO enabled", __func__); > -} > +assert(migrate_colo()); > migrate_start_colo_process(s); > s->vm_was_running = true; > /* Fallthrough */ > diff --git a/migration/options.c b/migration/options.c index > 865a0214d8..f461d02eeb 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -598,7 +598,7 @@ void > qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params, > MigrationCapabilityStatusList *cap; > bool new_caps[MIGRATION_CAPABILITY__MAX]; > > -if (migration_is_running(s->state)) { > +if (migration_is_running(s->state) || migration_in_colo_state()) { Make the "MIGRATION_STATUS_COLO" into the " migration_is_running()" is a better way? Like the "migration_is_setup_ot_active()". Thanks Chen > error_setg(errp, QERR_MIGRATION_ACTIVE); > return; > } > -- > 2.34.1
RE: [PATCH v4 07/10] migration: split migration_incoming_co
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Thursday, May 4, 2023 6:52 AM > To: Peter Xu > Cc: qemu-devel@nongnu.org; lukasstra...@web.de; quint...@redhat.com; > Zhang, Chen ; Zhang, Hailiang > ; Leonardo Bras > Subject: Re: [PATCH v4 07/10] migration: split migration_incoming_co > > On 02.05.23 23:48, Peter Xu wrote: > > On Fri, Apr 28, 2023 at 10:49:25PM +0300, Vladimir Sementsov-Ogievskiy > wrote: > >> Originally, migration_incoming_co was introduced by > >> 25d0c16f625feb3b6 > >> "migration: Switch to COLO process after finishing loadvm" > >> to be able to enter from COLO code to one specific yield point, added > >> by 25d0c16f625feb3b6. > >> > >> Later in 923709896b1b0 > >> "migration: poll the cm event for destination qemu" > >> we reused this variable to wake the migration incoming coroutine from > >> RDMA code. > >> > >> That was doubtful idea. Entering coroutines is a very fragile thing: > >> you should be absolutely sure which yield point you are going to enter. > >> > >> I don't know how much is it safe to enter during qemu_loadvm_state() > >> which I think what RDMA want to do. But for sure RDMA shouldn't enter > >> the special COLO-related yield-point. As well, COLO code doesn't want > >> to enter during qemu_loadvm_state(), it want to enter it's own > >> specific yield-point. > >> > >> As well, when in 8e48ac95865ac97d > >> "COLO: Add block replication into colo process" we added > >> bdrv_invalidate_cache_all() call (now it's called activate_all()) it > >> became possible to enter the migration incoming coroutine during that > >> call which is wrong too. > >> > >> So, let't make these things separate and disjoint: loadvm_co for > >> RDMA, non-NULL during qemu_loadvm_state(), and colo_incoming_co for > >> COLO, non-NULL only around specific yield. > >> > >> Signed-off-by: Vladimir Sementsov-Ogievskiy > >> > >> --- > >> migration/colo.c | 4 ++-- > >> migration/migration.c | 8 ++-- > >> migration/migration.h | 9 - > >> 3 files changed, 16 insertions(+), 5 deletions(-) > > > > The idea looks right to me, but I really know mostly nothing on > > coroutines and also rdma+colo.. > > > > Is the other ref in rdma.c (rdma_cm_poll_handler()) still missing? > > > > Oops right.. I was building with rdma disabled. Will fix. > > Thanks a lot for reviewing! > Yes, I know some people and company try to enable COLO with RDMA. But in my side, I haven't tried this way yet. Thanks Chen > -- > Best regards, > Vladimir
RE: [PATCH v4 06/10] migration: process_incoming_migration_co: simplify code flow around ret
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Peter Xu > ; Leonardo Bras > Subject: [PATCH v4 06/10] migration: process_incoming_migration_co: > simplify code flow around ret > > Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Zhang Chen Thanks Chen > --- > migration/migration.c | 11 ++- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c index > d4fa1a853c..8db0892317 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -533,8 +533,13 @@ process_incoming_migration_co(void *opaque) > /* Else if something went wrong then just fall out of the normal > exit */ > } > > +if (ret < 0) { > +error_report("load of migration failed: %s", strerror(-ret)); > +goto fail; > +} > + > /* we get COLO info, and know if we are in COLO mode */ > -if (!ret && migration_incoming_colo_enabled()) { > +if (migration_incoming_colo_enabled()) { > QemuThread colo_incoming_thread; > > /* Make sure all file formats throw away their mutable metadata */ > @@ -556,10 +561,6 @@ process_incoming_migration_co(void *opaque) > colo_release_ram_cache(); > } > > -if (ret < 0) { > -error_report("load of migration failed: %s", strerror(-ret)); > -goto fail; > -} > mis->bh = qemu_bh_new(process_incoming_migration_bh, mis); > qemu_bh_schedule(mis->bh); > mis->migration_incoming_co = NULL; > -- > 2.34.1
RE: [PATCH v4 05/10] migration: drop colo_incoming_thread from MigrationIncomingState
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Peter Xu > ; Leonardo Bras > Subject: [PATCH v4 05/10] migration: drop colo_incoming_thread from > MigrationIncomingState > > have_colo_incoming_thread variable is unused. colo_incoming_thread can > be local. > > Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Zhang Chen Thanks Chen > --- > migration/migration.c | 7 --- > migration/migration.h | 2 -- > 2 files changed, 4 insertions(+), 5 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c index > 0c14837cd3..d4fa1a853c 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -535,6 +535,8 @@ process_incoming_migration_co(void *opaque) > > /* we get COLO info, and know if we are in COLO mode */ > if (!ret && migration_incoming_colo_enabled()) { > +QemuThread colo_incoming_thread; > + > /* Make sure all file formats throw away their mutable metadata */ > bdrv_activate_all(_err); > if (local_err) { > @@ -542,14 +544,13 @@ process_incoming_migration_co(void *opaque) > goto fail; > } > > -qemu_thread_create(>colo_incoming_thread, "COLO incoming", > +qemu_thread_create(_incoming_thread, "COLO incoming", > colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE); > -mis->have_colo_incoming_thread = true; > qemu_coroutine_yield(); > > qemu_mutex_unlock_iothread(); > /* Wait checkpoint incoming thread exit before free resource */ > -qemu_thread_join(>colo_incoming_thread); > +qemu_thread_join(_incoming_thread); > qemu_mutex_lock_iothread(); > /* We hold the global iothread lock, so it is safe here */ > colo_release_ram_cache(); > diff --git a/migration/migration.h b/migration/migration.h index > 3a918514e7..7721c7658b 100644 > --- a/migration/migration.h > +++ b/migration/migration.h > @@ -162,8 +162,6 @@ struct MigrationIncomingState { > > int state; > > -bool have_colo_incoming_thread; > -QemuThread colo_incoming_thread; > /* The coroutine we should enter (back) after failover */ > Coroutine *migration_incoming_co; > QemuSemaphore colo_incoming_sem; > -- > 2.34.1
RE: [PATCH v4 04/10] configure: add --disable-colo-proxy option
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Paolo Bonzini > ; Marc-André Lureau > ; Daniel P. Berrangé > ; Thomas Huth ; Philippe > Mathieu-Daudé ; Jason Wang > Subject: [PATCH v4 04/10] configure: add --disable-colo-proxy option > > Add option to not build filter-mirror, filter-rewriter and colo-compare when > they are not needed. Typo: This patch still build the filter-mirror/filter-redirector in filter-mirror.c. Please remove the "filter-mirror" here. Other code look good to me. Reviewed-by: Zhang Chen Thanks Chen > > There could be more agile configuration, for example add separate options > for each filter, but that may be done in future on demand. The aim of this > patch is to make possible to disable the whole COLO Proxy subsystem. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Juan Quintela > --- > meson_options.txt | 2 ++ > net/meson.build | 13 ++--- > scripts/meson-buildoptions.sh | 3 +++ > stubs/colo-compare.c | 7 +++ > stubs/meson.build | 1 + > 5 files changed, 23 insertions(+), 3 deletions(-) create mode 100644 > stubs/colo-compare.c > > diff --git a/meson_options.txt b/meson_options.txt index > 2471dd02da..b59e7ae342 100644 > --- a/meson_options.txt > +++ b/meson_options.txt > @@ -289,6 +289,8 @@ option('live_block_migration', type: 'feature', value: > 'auto', > description: 'block migration in the main migration stream') > option('replication', type: 'feature', value: 'auto', > description: 'replication support') > +option('colo_proxy', type: 'feature', value: 'auto', > + description: 'colo-proxy support') > option('bochs', type: 'feature', value: 'auto', > description: 'bochs image format support') option('cloop', type: > 'feature', > value: 'auto', diff --git a/net/meson.build b/net/meson.build index > 87afca3e93..6f4ecde57f 100644 > --- a/net/meson.build > +++ b/net/meson.build > @@ -1,13 +1,10 @@ > softmmu_ss.add(files( >'announce.c', >'checksum.c', > - 'colo-compare.c', > - 'colo.c', >'dump.c', >'eth.c', >'filter-buffer.c', >'filter-mirror.c', > - 'filter-rewriter.c', >'filter.c', >'hub.c', >'net-hmp-cmds.c', > @@ -19,6 +16,16 @@ softmmu_ss.add(files( >'util.c', > )) > > +if get_option('replication').allowed() or \ > +get_option('colo_proxy').allowed() > + softmmu_ss.add(files('colo-compare.c')) > + softmmu_ss.add(files('colo.c')) > +endif > + > +if get_option('colo_proxy').allowed() > + softmmu_ss.add(files('filter-rewriter.c')) > +endif > + > softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c')) > > if have_l2tpv3 > diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh > index d4369a3ad8..036047ce6f 100644 > --- a/scripts/meson-buildoptions.sh > +++ b/scripts/meson-buildoptions.sh > @@ -83,6 +83,7 @@ meson_options_help() { >printf "%s\n" ' capstoneWhether and how to find the capstone > library' >printf "%s\n" ' cloop cloop image format support' >printf "%s\n" ' cocoa Cocoa user interface (macOS only)' > + printf "%s\n" ' colo-proxy colo-proxy support' >printf "%s\n" ' coreaudio CoreAudio sound support' >printf "%s\n" ' crypto-afalgLinux AF_ALG crypto backend driver' >printf "%s\n" ' curlCURL block device driver' > @@ -236,6 +237,8 @@ _meson_option_parse() { > --disable-cloop) printf "%s" -Dcloop=disabled ;; > --enable-cocoa) printf "%s" -Dcocoa=enabled ;; > --disable-cocoa) printf "%s" -Dcocoa=disabled ;; > +--enable-colo-proxy) printf "%s" -Dcolo_proxy=enabled ;; > +--disable-colo-proxy) printf "%s" -Dcolo_proxy=disabled ;; > --enable-coreaudio) printf "%s" -Dcoreaudio=enabled ;; > --disable-coreaudio) printf "%s" -Dcoreaudio=disabled ;; > --enable-coroutine-pool) printf "%s" -Dcoroutine_pool=true ;; diff --git > a/stubs/colo-compare.c b/stubs/colo-compare.c new file mode 100644 index > 00..ec726665be > --- /dev/null > +++ b/stubs/colo-compare.c > @@ -0,0 +1,7 @@ > +#include "qemu/osdep.h" > +#include "qemu/notify.h" > +#include "net/colo-compare.h" > + > +void colo_compare_cleanup(void) > +{ > +} > diff --git a/stubs/meson.build b/stubs/meson.build index > 8412cad15f..a56645e2f7 100644 > --- a/stubs/meson.build > +++ b/stubs/meson.build > @@ -46,6 +46,7 @@ stub_ss.add(files('target-monitor-defs.c')) > stub_ss.add(files('trace-control.c')) > stub_ss.add(files('uuid.c')) > stub_ss.add(files('colo.c')) > +stub_ss.add(files('colo-compare.c')) > stub_ss.add(files('vmstate.c')) > stub_ss.add(files('vm-stop.c')) > stub_ss.add(files('win32-kbd-hook.c')) > -- > 2.34.1
RE: [PATCH v4 02/10] colo: make colo_checkpoint_notify static and provide simpler API
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Zhang, Hailiang > ; Peter Xu ; Leonardo > Bras > Subject: [PATCH v4 02/10] colo: make colo_checkpoint_notify static and > provide simpler API > > colo_checkpoint_notify() is mostly used in colo.c. Outside we use it once > when x-checkpoint-delay migration parameter is set. So, let's simplify the > external API to only that function - notify COLO that parameter was set. This > make external API more robust and hides implementation details from > external callers. Also this helps us to make COLO module optional in further > patch (i.e. we are going to add possibility not build the COLO module). > > Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Zhang Chen Thanks Chen > --- > include/migration/colo.h | 9 - > migration/colo.c | 29 ++--- > migration/options.c | 4 +--- > 3 files changed, 27 insertions(+), 15 deletions(-) > > diff --git a/include/migration/colo.h b/include/migration/colo.h index > 5fbe1a6d5d..7ef315473e 100644 > --- a/include/migration/colo.h > +++ b/include/migration/colo.h > @@ -36,6 +36,13 @@ COLOMode get_colo_mode(void); > /* failover */ > void colo_do_failover(void); > > -void colo_checkpoint_notify(void *opaque); > +/* > + * colo_checkpoint_delay_set > + * > + * Handles change of x-checkpoint-delay migration parameter, called > +from > + * migrate_params_apply() to notify COLO module about the change. > + */ > +void colo_checkpoint_delay_set(void); > + > void colo_shutdown(void); > #endif > diff --git a/migration/colo.c b/migration/colo.c index > 07bfa21fea..c9e0b909b9 100644 > --- a/migration/colo.c > +++ b/migration/colo.c > @@ -65,6 +65,24 @@ static bool colo_runstate_is_stopped(void) > return runstate_check(RUN_STATE_COLO) || !runstate_is_running(); } > > +static void colo_checkpoint_notify(void *opaque) { > +MigrationState *s = opaque; > +int64_t next_notify_time; > + > +qemu_event_set(>colo_checkpoint_event); > +s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST); > +next_notify_time = s->colo_checkpoint_time + > migrate_checkpoint_delay(); > +timer_mod(s->colo_delay_timer, next_notify_time); } > + > +void colo_checkpoint_delay_set(void) > +{ > +if (migration_in_colo_state()) { > +colo_checkpoint_notify(migrate_get_current()); > +} > +} > + > static void secondary_vm_do_failover(void) { > /* COLO needs enable block-replication */ @@ -644,17 +662,6 @@ out: > } > } > > -void colo_checkpoint_notify(void *opaque) -{ > -MigrationState *s = opaque; > -int64_t next_notify_time; > - > -qemu_event_set(>colo_checkpoint_event); > -s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST); > -next_notify_time = s->colo_checkpoint_time + > migrate_checkpoint_delay(); > -timer_mod(s->colo_delay_timer, next_notify_time); > -} > - > void migrate_start_colo_process(MigrationState *s) { > qemu_mutex_unlock_iothread(); > diff --git a/migration/options.c b/migration/options.c index > 53b7fc5d5d..865a0214d8 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -1246,9 +1246,7 @@ static void > migrate_params_apply(MigrateSetParameters *params, Error **errp) > > if (params->has_x_checkpoint_delay) { > s->parameters.x_checkpoint_delay = params->x_checkpoint_delay; > -if (migration_in_colo_state()) { > -colo_checkpoint_notify(s); > -} > +colo_checkpoint_delay_set(); > } > > if (params->has_block_incremental) { > -- > 2.34.1
RE: [PATCH v4 01/10] block/meson.build: prefer positive condition for replication
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Saturday, April 29, 2023 3:49 AM > To: qemu-devel@nongnu.org > Cc: lukasstra...@web.de; quint...@redhat.com; Zhang, Chen > ; vsement...@yandex-team.ru; Philippe Mathieu- > Daudé ; Kevin Wolf ; Hanna Reitz > ; open list:Block layer core > Subject: [PATCH v4 01/10] block/meson.build: prefer positive condition for > replication > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Juan Quintela > Reviewed-by: Philippe Mathieu-Daudé > Reviewed-by: Lukas Straub Reviewed-by: Zhang Chen Thanks Chen > --- > block/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/meson.build b/block/meson.build index > 382bec0e7d..b9a72e219b 100644 > --- a/block/meson.build > +++ b/block/meson.build > @@ -84,7 +84,7 @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file- > win32.c', 'win32-aio.c') > block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, > iokit]) > block_ss.add(when: libiscsi, if_true: files('iscsi-opts.c')) > block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c')) -if not > get_option('replication').disabled() > +if get_option('replication').allowed() >block_ss.add(files('replication.c')) > endif > block_ss.add(when: libaio, if_true: files('linux-aio.c')) > -- > 2.34.1
RE: [PATCH v3 4/4] configure: add --disable-colo-proxy option
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Friday, April 28, 2023 5:30 AM > To: Lukas Straub > Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; > michael.r...@amd.com; arm...@redhat.com; ebl...@redhat.com; > jasow...@redhat.com; quint...@redhat.com; Zhang, Hailiang > ; phi...@linaro.org; th...@redhat.com; > berra...@redhat.com; marcandre.lur...@redhat.com; > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > kw...@redhat.com; Zhang, Chen ; > lizhij...@fujitsu.com > Subject: Re: [PATCH v3 4/4] configure: add --disable-colo-proxy option > > On 28.04.23 00:18, Lukas Straub wrote: > > On Thu, 27 Apr 2023 23:29:46 +0300 > > Vladimir Sementsov-Ogievskiy wrote: > > > >> Add option to not build filter-mirror, filter-rewriter and > >> colo-compare when they are not needed. > >> > >> There could be more agile configuration, for example add separate > >> options for each filter, but that may be done in future on demand. > >> The aim of this patch is to make possible to disable the whole COLO > >> Proxy subsystem. > >> > >> Signed-off-by: Vladimir > >> Sementsov-Ogievskiy > >> --- > >> meson_options.txt | 2 ++ > >> net/meson.build | 14 ++ > >> scripts/meson-buildoptions.sh | 3 +++ > >> stubs/colo-compare.c | 7 +++ > >> stubs/meson.build | 1 + > >> 5 files changed, 23 insertions(+), 4 deletions(-) > >> create mode 100644 stubs/colo-compare.c > >> > >> diff --git a/meson_options.txt b/meson_options.txt index > >> 2471dd02da..b59e7ae342 100644 > >> --- a/meson_options.txt > >> +++ b/meson_options.txt > >> @@ -289,6 +289,8 @@ option('live_block_migration', type: 'feature', > value: 'auto', > >> description: 'block migration in the main migration stream') > >> option('replication', type: 'feature', value: 'auto', > >> description: 'replication support') > >> +option('colo_proxy', type: 'feature', value: 'auto', > >> + description: 'colo-proxy support') > >> option('bochs', type: 'feature', value: 'auto', > >> description: 'bochs image format support') > >> option('cloop', type: 'feature', value: 'auto', diff --git > >> a/net/meson.build b/net/meson.build index 87afca3e93..4cfc850c69 > >> 100644 > >> --- a/net/meson.build > >> +++ b/net/meson.build > >> @@ -1,13 +1,9 @@ > >> softmmu_ss.add(files( > >> 'announce.c', > >> 'checksum.c', > >> - 'colo-compare.c', > >> - 'colo.c', > >> 'dump.c', > >> 'eth.c', > >> 'filter-buffer.c', > >> - 'filter-mirror.c', Need fix here for filter-mirror.c too. > >> - 'filter-rewriter.c', > >> 'filter.c', > >> 'hub.c', > >> 'net-hmp-cmds.c', > >> @@ -19,6 +15,16 @@ softmmu_ss.add(files( > >> 'util.c', > >> )) > >> > >> +if get_option('replication').allowed() or \ > >> +get_option('colo_proxy').allowed() > >> + softmmu_ss.add(files('colo-compare.c')) > >> + softmmu_ss.add(files('colo.c')) > >> +endif > >> + > >> +if get_option('colo_proxy').allowed() > >> + softmmu_ss.add(files('filter-mirror.c', 'filter-rewriter.c')) > >> +endif > >> + > > The last discussion didn't really come to a conclusion, but I still > > think that 'filter-mirror.c' (which also contains filter-redirect) > > should be left unchanged. > > > > OK for me, I'll wait a bit for more comments and resend with > > @@ -22,7 +22,7 @@ if get_option('replication').allowed() or \ >endif > >if get_option('colo_proxy').allowed() > - softmmu_ss.add(files('filter-mirror.c', 'filter-rewriter.c')) > + softmmu_ss.add(files('filter-rewriter.c')) >endif > >softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c')) > > > applied here, if no other strong opinion. > It's OK to me except for the filter-mirror.c related comments. Thanks Chen > -- > Best regards, > Vladimir
RE: [PATCH v2 6/6] tests/migration: Only run auto_converge in slow mode
> -Original Message- > From: Thomas Huth > Sent: Monday, April 24, 2023 2:56 PM > To: quint...@redhat.com; Zhang, Chen > Cc: Daniel P. Berrangé ; qemu-devel@nongnu.org; > qemu-bl...@nongnu.org; Paolo Bonzini ; John > Snow ; Li Zhijian ; Stefan > Hajnoczi ; Laurent Vivier > Subject: Re: [PATCH v2 6/6] tests/migration: Only run auto_converge in slow > mode > > On 24/04/2023 07.58, Juan Quintela wrote: > > "Zhang, Chen" wrote: > >>> -Original Message- > >>> From: Daniel P. Berrangé > >>> Sent: Saturday, April 22, 2023 1:14 AM > >>> To: qemu-devel@nongnu.org > >>> Cc: qemu-bl...@nongnu.org; Paolo Bonzini ; > >>> Thomas Huth ; John Snow ; Li > >>> Zhijian ; Juan Quintela > >>> ; Stefan Hajnoczi ; > Zhang, > >>> Chen ; Laurent Vivier > >>> Subject: [PATCH v2 6/6] tests/migration: Only run auto_converge in > >>> slow mode > >>> > >> > >> What kind of scenario will the qtest open this g_test_init() -m slow to > trigger the slow mode? > > > > The only way that I know is: > > > > export G_TEST_SLOW=1 > > make check (or whatever individual test that you want) > > Or even simpler: > > make check SPEED=slow > > Or if you want to run the test manually: > > QTEST_QEMU_BINARY=./qemu-system-x86_64 \ > tests/qtest/migration-test -m slow > Thanks for the explanation. Thanks Chen > HTH, > Thomas
RE: [PATCH v2 6/6] tests/migration: Only run auto_converge in slow mode
> -Original Message- > From: Daniel P. Berrangé > Sent: Saturday, April 22, 2023 1:14 AM > To: qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; Paolo Bonzini ; > Thomas Huth ; John Snow ; Li > Zhijian ; Juan Quintela ; > Stefan Hajnoczi ; Zhang, Chen > ; Laurent Vivier > Subject: [PATCH v2 6/6] tests/migration: Only run auto_converge in slow > mode > > From: Juan Quintela > > Signed-off-by: Juan Quintela LGTM. Reviewed-by: Zhang Chen > --- > tests/qtest/migration-test.c | 23 +-- > 1 file changed, 21 insertions(+), 2 deletions(-)
RE: [PATCH v2 6/6] tests/migration: Only run auto_converge in slow mode
> -Original Message- > From: Daniel P. Berrangé > Sent: Saturday, April 22, 2023 1:14 AM > To: qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; Paolo Bonzini ; > Thomas Huth ; John Snow ; Li > Zhijian ; Juan Quintela ; > Stefan Hajnoczi ; Zhang, Chen > ; Laurent Vivier > Subject: [PATCH v2 6/6] tests/migration: Only run auto_converge in slow > mode > What kind of scenario will the qtest open this g_test_init() -m slow to trigger the slow mode? Thanks Chen > From: Juan Quintela > > Signed-off-by: Juan Quintela > --- > tests/qtest/migration-test.c | 23 +-- > 1 file changed, 21 insertions(+), 2 deletions(-) > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index > 63bd8a1893..9ed178aa03 100644 > --- a/tests/qtest/migration-test.c > +++ b/tests/qtest/migration-test.c > @@ -1877,6 +1877,21 @@ static void test_validate_uuid_dst_not_set(void) > do_test_validate_uuid(, false); } > > +/* > + * The way auto_converge works, we need to do too many passes to > + * run this test. Auto_converge logic is only run once every > + * three iterations, so: > + * > + * - 3 iterations without auto_converge enabled > + * - 3 iterations with pct = 5 > + * - 3 iterations with pct = 30 > + * - 3 iterations with pct = 55 > + * - 3 iterations with pct = 80 > + * - 3 iterations with pct = 95 (max(95, 80 + 25)) > + * > + * To make things even worse, we need to run the initial stage at > + * 3MB/s so we enter autoconverge even when host is (over)loaded. > + */ > static void test_migrate_auto_converge(void) { > g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); @@ - > 2660,8 +2675,12 @@ int main(int argc, char **argv) > test_validate_uuid_src_not_set); > qtest_add_func("/migration/validate_uuid_dst_not_set", > test_validate_uuid_dst_not_set); > - > -qtest_add_func("/migration/auto_converge", > test_migrate_auto_converge); > +/* > + * See explanation why this test is slow on function definition > + */ > +if (g_test_slow()) { > +qtest_add_func("/migration/auto_converge", > test_migrate_auto_converge); > +} > qtest_add_func("/migration/multifd/tcp/plain/none", > test_multifd_tcp_none); > /* > -- > 2.40.0
RE: [PATCH v2 1/6] tests/qtest: replace qmp_discard_response with qtest_qmp_assert_success
> -Original Message- > From: Daniel P. Berrangé > Sent: Saturday, April 22, 2023 1:14 AM > To: qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; Paolo Bonzini ; > Thomas Huth ; John Snow ; Li > Zhijian ; Juan Quintela ; > Stefan Hajnoczi ; Zhang, Chen > ; Laurent Vivier ; Daniel P. > Berrangé > Subject: [PATCH v2 1/6] tests/qtest: replace qmp_discard_response with > qtest_qmp_assert_success > > The qmp_discard_response method simply ignores the result of the QMP > command, merely unref'ing the object. This is a bad idea for tests as it > leaves > no trace if the QMP command unexpectedly failed. The > qtest_qmp_assert_success method will validate that the QMP command > returned without error, and if errors occur, it will print a message on the > console aiding debugging. > > Signed-off-by: Daniel P. Berrangé Reviewed-by: Zhang Chen > --- > tests/qtest/ahci-test.c | 31 ++-- > tests/qtest/boot-order-test.c| 5 + > tests/qtest/fdc-test.c | 15 +++--- > tests/qtest/ide-test.c | 5 + > tests/qtest/migration-test.c | 5 + > tests/qtest/test-filter-mirror.c | 5 + > tests/qtest/test-filter-redirector.c | 7 ++- > tests/qtest/virtio-blk-test.c| 24 ++--- > 8 files changed, 40 insertions(+), 57 deletions(-) > > diff --git a/tests/qtest/ahci-test.c b/tests/qtest/ahci-test.c index > 1967cd5898..abab761c26 100644 > --- a/tests/qtest/ahci-test.c > +++ b/tests/qtest/ahci-test.c > @@ -36,9 +36,6 @@ > #include "hw/pci/pci_ids.h" > #include "hw/pci/pci_regs.h" > > -/* TODO actually test the results and get rid of this */ -#define > qmp_discard_response(s, ...) qobject_unref(qtest_qmp(s, __VA_ARGS__)) > - > /* Test images sizes in MB */ > #define TEST_IMAGE_SIZE_MB_LARGE (200 * 1024) #define > TEST_IMAGE_SIZE_MB_SMALL 64 @@ -1595,9 +1592,9 @@ static void > test_atapi_tray(void) > rsp = qtest_qmp_receive(ahci->parent->qts); > qobject_unref(rsp); > > -qmp_discard_response(ahci->parent->qts, > - "{'execute': 'blockdev-remove-medium', " > - "'arguments': {'id': 'cd0'}}"); > +qtest_qmp_assert_success(ahci->parent->qts, > + "{'execute': 'blockdev-remove-medium', " > + "'arguments': {'id': 'cd0'}}"); > > /* Test the tray without a medium */ > ahci_atapi_load(ahci, port); > @@ -1607,16 +1604,18 @@ static void test_atapi_tray(void) > atapi_wait_tray(ahci, true); > > /* Re-insert media */ > -qmp_discard_response(ahci->parent->qts, > - "{'execute': 'blockdev-add', " > - "'arguments': {'node-name': 'node0', " > -"'driver': 'raw', " > -"'file': { 'driver': 'file', " > - "'filename': %s }}}", iso); > -qmp_discard_response(ahci->parent->qts, > - "{'execute': 'blockdev-insert-medium'," > - "'arguments': { 'id': 'cd0', " > - "'node-name': 'node0' }}"); > +qtest_qmp_assert_success( > +ahci->parent->qts, > +"{'execute': 'blockdev-add', " > +"'arguments': {'node-name': 'node0', " > + "'driver': 'raw', " > + "'file': { 'driver': 'file', " > +"'filename': %s }}}", iso); > +qtest_qmp_assert_success( > +ahci->parent->qts, > +"{'execute': 'blockdev-insert-medium'," > +"'arguments': { 'id': 'cd0', " > + "'node-name': 'node0' }}"); > > /* Again, the event shows up first */ > qtest_qmp_send(ahci->parent->qts, "{'execute': 'blockdev-close-tray', " > diff --git a/tests/qtest/boot-order-test.c b/tests/qtest/boot-order-test.c > index 0680d79d6d..8f2b6ef05a 100644 > --- a/tests/qtest/boot-order-test.c > +++ b/tests/qtest/boot-order-test.c > @@ -16,9 +16,6 @@ > #include "qapi/qmp/qdict.h" > #include "standard-headers/linux/qemu_fw_cfg.h" > > -/* TODO actually test the results and get rid of this */ -#define > qmp_discard_response(qs, ...) qobject_unref(qtest_qmp(qs, __VA_ARGS__)) > - > typedef struct { > const c
RE: [PATCH v2 4/4] configure: add --disable-colo-filters option
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Friday, April 21, 2023 4:53 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; Zhang, > Hailiang ; phi...@linaro.org; > th...@redhat.com; berra...@redhat.com; marcandre.lur...@redhat.com; > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > kw...@redhat.com; lizhij...@fujitsu.com > Subject: Re: [PATCH v2 4/4] configure: add --disable-colo-filters option > > On 21.04.23 05:22, Zhang, Chen wrote: > > > > > >> -Original Message- > >> From: Vladimir Sementsov-Ogievskiy > >> Sent: Thursday, April 20, 2023 7:26 PM > >> To: Zhang, Chen ; qemu-devel@nongnu.org > >> Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; > arm...@redhat.com; > >> ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; > Zhang, > >> Hailiang ; phi...@linaro.org; > >> th...@redhat.com; berra...@redhat.com; > marcandre.lur...@redhat.com; > >> pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > >> kw...@redhat.com; lizhij...@fujitsu.com > >> Subject: Re: [PATCH v2 4/4] configure: add --disable-colo-filters > >> option > >> > >> On 20.04.23 12:09, Zhang, Chen wrote: > >>> > >>> > >>>> -Original Message- > >>>> From: Vladimir Sementsov-Ogievskiy > >>>> Sent: Thursday, April 20, 2023 6:53 AM > >>>> To: qemu-devel@nongnu.org > >>>> Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; > >> arm...@redhat.com; > >>>> ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; > >> Zhang, > >>>> Hailiang ; phi...@linaro.org; > >>>> th...@redhat.com; berra...@redhat.com; > >> marcandre.lur...@redhat.com; > >>>> pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > >>>> kw...@redhat.com; Zhang, Chen ; > >>>> lizhij...@fujitsu.com; Vladimir Sementsov-Ogievskiy > >>>> > >>>> Subject: [PATCH v2 4/4] configure: add --disable-colo-filters > >>>> option > >>>> > >>>> Add option to not build COLO Proxy subsystem if it is not needed. > >>> > >>> I think no need to add the --disable-colo-filter option. > >>> Net-filters just general infrastructure. Another example is COLO > >>> also use the -chardev socket to connect each filters. No need to add > >>> the -- > >> disable-colo-chardev > >>> Please drop this patch. > >> > >> I don't follow your reasoning. Of course, we don't need any option > >> like this, and can simply include to build all the components we > >> don't use. So "no need" is correct for any --disable-* option. > >> Still, I think, it's good, when you can exclude from build the > >> subsystems that you don't need / don't want to maintain or ship to your > end users. > > > > Yes, I agree with your idea. > > > >> > >> In MAINTAINERS these two filters are in "COLO Proxy" susbsystem. Is > >> it correct? What's wrong with adding an option to not build this > >> subsystem? I can rename it to --disable-colo-proxy. > > > > The history is COLO project contributed QEMU filter framework and filter- > mirror/redirector...etc.. > > And the "COLO Proxy" susbsystem covered the filters do not means it > belong to COLO. > > So, It is unreasonable to tell users that you cannot use > > filter-mirror/rediretor for network debugging Or other purpose because > you have not enabled COLO proxy. > > But we don't say it to users, as these filters are enabled by default. > > But I see your point. And looking at filter-mirror.c I see that there is no > relations with colo. Can't say this about filter-rewriter.c > > So, absolutely correct would be just have separate options > > --disable-net-filter-mirror > --disable-net-filter-rewriter > > and for any other filter we want to be "disable-able", like options for block > drivers (I mean --disable-parallels, --disable-qcow1, --disable-qed, etc for > files describing these drivers in block/) > Yes. > > > > >> > >>> But for COLO network part, still something need to do: > >>> You can add --disable-colo-proxy not to build the net/colo-compare.c > >>> if it is > >> not needed. > >>> This file just for COLO and not belong
RE: [PATCH v2 3/4] build: move COLO under CONFIG_REPLICATION
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Friday, April 21, 2023 4:36 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; Zhang, > Hailiang ; phi...@linaro.org; > th...@redhat.com; berra...@redhat.com; marcandre.lur...@redhat.com; > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > kw...@redhat.com; lizhij...@fujitsu.com > Subject: Re: [PATCH v2 3/4] build: move COLO under CONFIG_REPLICATION > > On 21.04.23 06:02, Zhang, Chen wrote: > > > > > >> -Original Message- > >> From: Vladimir Sementsov-Ogievskiy > >> Sent: Thursday, April 20, 2023 6:53 AM > >> To: qemu-devel@nongnu.org > >> Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; > arm...@redhat.com; > >> ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; > Zhang, > >> Hailiang ; phi...@linaro.org; > >> th...@redhat.com; berra...@redhat.com; > marcandre.lur...@redhat.com; > >> pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > >> kw...@redhat.com; Zhang, Chen ; > >> lizhij...@fujitsu.com; Vladimir Sementsov-Ogievskiy > >> > >> Subject: [PATCH v2 3/4] build: move COLO under CONFIG_REPLICATION > >> > >> We don't allow to use x-colo capability when replication is not > >> configured. So, no reason to build COLO when replication is disabled, > >> it's unusable in this case. > > > > Yes, you are right for current status. Because COLO best practices is > replication + colo live migration + colo proxy. > > But doesn't mean it has to be done in all scenarios as I explanation in V1. > > The better way is allow to use x-colo capability firstly, and separate > > this patch with two config options: --disable-replication and --disable-x- > colo. > > > > But what for? We for sure don't have such scenarios now (COLO without > replication), as it's not allowed by far 7e934f5b27eee1b0d7 (by you and > David). > > If you think we need such scenario, I think it should be a separate series > which reverts 7e934f5b27eee1b0d7 and adds corresponding test and > probably documentation. In the patch 7e934f5b27eee1b0d7 said it's for current independent disk mode, And what we talked about before is the shared disk mode. Rethink about the COLO shared disk mode, this feature still needs some enabling works. It looks OK for now and separate the build options when enabling COLO shared disk mode. Thanks Chen > > > -- > Best regards, > Vladimir
RE: [PATCH v2 3/4] build: move COLO under CONFIG_REPLICATION
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Thursday, April 20, 2023 6:53 AM > To: qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; Zhang, > Hailiang ; phi...@linaro.org; > th...@redhat.com; berra...@redhat.com; marcandre.lur...@redhat.com; > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > kw...@redhat.com; Zhang, Chen ; > lizhij...@fujitsu.com; Vladimir Sementsov-Ogievskiy team.ru> > Subject: [PATCH v2 3/4] build: move COLO under CONFIG_REPLICATION > > We don't allow to use x-colo capability when replication is not configured. > So, > no reason to build COLO when replication is disabled, it's unusable in this > case. Yes, you are right for current status. Because COLO best practices is replication + colo live migration + colo proxy. But doesn't mean it has to be done in all scenarios as I explanation in V1. The better way is allow to use x-colo capability firstly, and separate this patch with two config options: --disable-replication and --disable-x-colo. Thanks Chen > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > hmp-commands.hx| 2 ++ > migration/colo.c | 6 + > migration/meson.build | 6 +++-- > migration/migration-hmp-cmds.c | 2 ++ > migration/migration.c | 19 +++--- > net/meson.build| 5 +++- > qapi/migration.json| 12 ++--- > stubs/colo.c | 47 ++ > stubs/meson.build | 1 + > 9 files changed, 78 insertions(+), 22 deletions(-) create mode 100644 > stubs/colo.c > > diff --git a/hmp-commands.hx b/hmp-commands.hx index > bb85ee1d26..fbd0932232 100644 > --- a/hmp-commands.hx > +++ b/hmp-commands.hx > @@ -1035,6 +1035,7 @@ SRST >migration (or once already in postcopy). > ERST > > +#ifdef CONFIG_REPLICATION > { > .name = "x_colo_lost_heartbeat", > .args_type = "", > @@ -1043,6 +1044,7 @@ ERST >"a failover or takeover is needed.", > .cmd = hmp_x_colo_lost_heartbeat, > }, > +#endif > > SRST > ``x_colo_lost_heartbeat`` > diff --git a/migration/colo.c b/migration/colo.c index > 0716e64689..089c491d70 100644 > --- a/migration/colo.c > +++ b/migration/colo.c > @@ -196,6 +196,12 @@ COLOMode get_colo_mode(void) > } > } > > +bool migrate_colo_enabled(void) > +{ > +MigrationState *s = migrate_get_current(); > +return s->enabled_capabilities[MIGRATION_CAPABILITY_X_COLO]; > +} > + > void colo_do_failover(void) > { > /* Make sure VM stopped while failover happened. */ diff --git > a/migration/meson.build b/migration/meson.build index > 0d1bb9f96e..3fccf79f12 100644 > --- a/migration/meson.build > +++ b/migration/meson.build > @@ -13,8 +13,6 @@ softmmu_ss.add(files( >'block-dirty-bitmap.c', >'channel.c', >'channel-block.c', > - 'colo-failover.c', > - 'colo.c', >'exec.c', >'fd.c', >'global_state.c', > @@ -29,6 +27,10 @@ softmmu_ss.add(files( >'threadinfo.c', > ), gnutls) > > +if get_option('replication').allowed() > + softmmu_ss.add(files('colo-failover.c', 'colo.c')) endif > + > softmmu_ss.add(when: rdma, if_true: files('rdma.c')) if > get_option('live_block_migration').allowed() >softmmu_ss.add(files('block.c')) > diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp- > cmds.c index 72519ea99f..4601c48f41 100644 > --- a/migration/migration-hmp-cmds.c > +++ b/migration/migration-hmp-cmds.c > @@ -640,6 +640,7 @@ void hmp_migrate_start_postcopy(Monitor *mon, > const QDict *qdict) > hmp_handle_error(mon, err); > } > > +#ifdef CONFIG_REPLICATION > void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict) { > Error *err = NULL; > @@ -647,6 +648,7 @@ void hmp_x_colo_lost_heartbeat(Monitor *mon, > const QDict *qdict) > qmp_x_colo_lost_heartbeat(); > hmp_handle_error(mon, err); > } > +#endif > > typedef struct HMPMigrationStatus { > QEMUTimer *timer; > diff --git a/migration/migration.c b/migration/migration.c index > bda4789193..2382958364 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -165,7 +165,9 @@ > INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot, > MIGRATION_CAPABILITY_RDMA_PIN_ALL, > MIGRATION_CAPABILITY_COMPRESS, > MIGRATION_CAPABILITY_XBZRLE, > +#ifdef CONFIG_REPLICATION > MIGRATION_CAPABILITY_X_COLO, > +#endif > MIGRATION_C
RE: [PATCH v2 4/4] configure: add --disable-colo-filters option
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Thursday, April 20, 2023 7:26 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; Zhang, > Hailiang ; phi...@linaro.org; > th...@redhat.com; berra...@redhat.com; marcandre.lur...@redhat.com; > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > kw...@redhat.com; lizhij...@fujitsu.com > Subject: Re: [PATCH v2 4/4] configure: add --disable-colo-filters option > > On 20.04.23 12:09, Zhang, Chen wrote: > > > > > >> -Original Message- > >> From: Vladimir Sementsov-Ogievskiy > >> Sent: Thursday, April 20, 2023 6:53 AM > >> To: qemu-devel@nongnu.org > >> Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; > arm...@redhat.com; > >> ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; > Zhang, > >> Hailiang ; phi...@linaro.org; > >> th...@redhat.com; berra...@redhat.com; > marcandre.lur...@redhat.com; > >> pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > >> kw...@redhat.com; Zhang, Chen ; > >> lizhij...@fujitsu.com; Vladimir Sementsov-Ogievskiy > >> > >> Subject: [PATCH v2 4/4] configure: add --disable-colo-filters option > >> > >> Add option to not build COLO Proxy subsystem if it is not needed. > > > > I think no need to add the --disable-colo-filter option. > > Net-filters just general infrastructure. Another example is COLO also > > use the -chardev socket to connect each filters. No need to add the -- > disable-colo-chardev > > Please drop this patch. > > I don't follow your reasoning. Of course, we don't need any option like this, > and can simply include to build all the components we don't use. So "no > need" is correct for any --disable-* option. > Still, I think, it's good, when you can exclude from build the subsystems that > you don't need / don't want to maintain or ship to your end users. Yes, I agree with your idea. > > In MAINTAINERS these two filters are in "COLO Proxy" susbsystem. Is it > correct? What's wrong with adding an option to not build this subsystem? I > can rename it to --disable-colo-proxy. The history is COLO project contributed QEMU filter framework and filter-mirror/redirector...etc.. And the "COLO Proxy" susbsystem covered the filters do not means it belong to COLO. So, It is unreasonable to tell users that you cannot use filter-mirror/rediretor for network debugging Or other purpose because you have not enabled COLO proxy. > > > But for COLO network part, still something need to do: > > You can add --disable-colo-proxy not to build the net/colo-compare.c if it > > is > not needed. > > This file just for COLO and not belong to network filters. > > net/colo-compare.c is used only only for COLO, which in turn used only with > CONFIG_REPLICATION enabled, see patch 3. So, no reason to add separate > option for it, it should be disabled with --disable-replication. Yes, and as Lukas said, COLO is the only user for the filter-rewriter currently. It is more appropriate to add filter-rewriter replace the filter-mirror here. I saw the patch 3, it is better to move it to this patch. Thanks Chen > > -- > Best regards, > Vladimir
RE: [PATCH v2 4/4] configure: add --disable-colo-filters option
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Thursday, April 20, 2023 6:53 AM > To: qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; michael.r...@amd.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; quint...@redhat.com; Zhang, > Hailiang ; phi...@linaro.org; > th...@redhat.com; berra...@redhat.com; marcandre.lur...@redhat.com; > pbonz...@redhat.com; d...@treblig.org; hre...@redhat.com; > kw...@redhat.com; Zhang, Chen ; > lizhij...@fujitsu.com; Vladimir Sementsov-Ogievskiy team.ru> > Subject: [PATCH v2 4/4] configure: add --disable-colo-filters option > > Add option to not build COLO Proxy subsystem if it is not needed. I think no need to add the --disable-colo-filter option. Net-filters just general infrastructure. Another example is COLO also use the -chardev socket to connect each filters. No need to add the --disable-colo-chardev Please drop this patch. But for COLO network part, still something need to do: You can add --disable-colo-proxy not to build the net/colo-compare.c if it is not needed. This file just for COLO and not belong to network filters. Thanks Chen > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > meson.build | 1 + > meson_options.txt | 2 ++ > net/meson.build | 11 --- > scripts/meson-buildoptions.sh | 3 +++ > 4 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/meson.build b/meson.build > index c44d05a13f..5b2fdfbd3a 100644 > --- a/meson.build > +++ b/meson.build > @@ -1962,6 +1962,7 @@ config_host_data.set('CONFIG_GPROF', > get_option('gprof')) > config_host_data.set('CONFIG_LIVE_BLOCK_MIGRATION', > get_option('live_block_migration').allowed()) > config_host_data.set('CONFIG_QOM_CAST_DEBUG', > get_option('qom_cast_debug')) > config_host_data.set('CONFIG_REPLICATION', > get_option('replication').allowed()) > +config_host_data.set('CONFIG_COLO_FILTERS', > +get_option('colo_filters').allowed()) > > # has_header > config_host_data.set('CONFIG_EPOLL', cc.has_header('sys/epoll.h')) diff --git > a/meson_options.txt b/meson_options.txt index fc9447d267..ffe81317cb > 100644 > --- a/meson_options.txt > +++ b/meson_options.txt > @@ -289,6 +289,8 @@ option('live_block_migration', type: 'feature', value: > 'auto', > description: 'block migration in the main migration stream') > option('replication', type: 'feature', value: 'auto', > description: 'replication support') > +option('colo_filters', type: 'feature', value: 'auto', > + description: 'colo_filters support') > option('bochs', type: 'feature', value: 'auto', > description: 'bochs image format support') option('cloop', type: > 'feature', > value: 'auto', diff --git a/net/meson.build b/net/meson.build index > 38d50b8c96..7e54744aea 100644 > --- a/net/meson.build > +++ b/net/meson.build > @@ -1,12 +1,9 @@ > softmmu_ss.add(files( >'announce.c', >'checksum.c', > - 'colo.c', >'dump.c', >'eth.c', >'filter-buffer.c', > - 'filter-mirror.c', > - 'filter-rewriter.c', >'filter.c', >'hub.c', >'net-hmp-cmds.c', > @@ -22,6 +19,14 @@ if get_option('replication').allowed() >softmmu_ss.add(files('colo-compare.c')) > endif > > +if get_option('replication').allowed() or > +get_option('colo_filters').allowed() > + softmmu_ss.add(files('colo.c')) > +endif > + > +if get_option('colo_filters').allowed() > + softmmu_ss.add(files('filter-mirror.c', 'filter-rewriter.c')) endif > + > softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c')) > > if have_l2tpv3 > diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh > index 009fab1515..cf9d23369f 100644 > --- a/scripts/meson-buildoptions.sh > +++ b/scripts/meson-buildoptions.sh > @@ -83,6 +83,7 @@ meson_options_help() { >printf "%s\n" ' capstoneWhether and how to find the capstone > library' >printf "%s\n" ' cloop cloop image format support' >printf "%s\n" ' cocoa Cocoa user interface (macOS only)' > + printf "%s\n" ' colo-filterscolo_filters support' >printf "%s\n" ' coreaudio CoreAudio sound support' >printf "%s\n" ' crypto-afalgLinux AF_ALG crypto backend driver' >printf "%s\n" ' curlCURL block device driver' > @@ -236,6 +237,8 @@ _meson_option_parse() { > --disable-cloop) printf "%s" -Dcloop=disabled ;; > --enable-cocoa) printf "%s" -Dcocoa=enabled ;; > --disable-cocoa) printf "%s" -Dcocoa=disabled ;; > +--enable-colo-filters) printf "%s" -Dcolo_filters=enabled ;; > +--disable-colo-filters) printf "%s" -Dcolo_filters=disabled ;; > --enable-coreaudio) printf "%s" -Dcoreaudio=enabled ;; > --disable-coreaudio) printf "%s" -Dcoreaudio=disabled ;; > --enable-coroutine-pool) printf "%s" -Dcoroutine_pool=true ;; > -- > 2.34.1
RE: [PATCH] replication: compile out some staff when replication is not configured
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Friday, April 14, 2023 5:51 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; pbonz...@redhat.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; dgilb...@redhat.com; > quint...@redhat.com; hre...@redhat.com; kw...@redhat.com; Zhang, > Hailiang ; lizhij...@fujitsu.com; > wencongya...@huawei.com; xiechanglon...@gmail.com; den- > plotni...@yandex-team.ru > Subject: Re: [PATCH] replication: compile out some staff when replication is > not configured > > On 14.04.23 04:24, Zhang, Chen wrote: > >> So, if I want to have an option to disable all COLO modules, do you > >> mean it should be additional --disable-colo option? Or better keep > >> one option -- disable-replication (and, maybe just rename to to --disable- > colo)? > > I think keep the option --disable-replication is enough. > > Generally speaking, these three modules do not belong to COLO, It has > been decoupled at the time of design. > > Instead, COLO is formed when these three modules are used in > combination. > > But it's not enough to me, I want to have a possibility to not build the > subsystem I don't need. As I said, COLO not a specific subsystem, It is a usage of three general subsystems. Let's back to this patch, it try to not build block replication when not configured. It's OK. Although COLO may be the only user of replication, but can't assume all the COLO used subsystem not needed, even have a --disable-colo. For example in this patch disabled the net/filter-mirror/redirector Qemu network filter is a general framework with many submodules: filter-buffer/replay/mirror/rewriter. Logically speaking, It is completely irrelevant with COLO. Thanks Chen > > -- > Best regards, > Vladimir
RE: [PATCH] replication: compile out some staff when replication is not configured
> -Original Message- > From: qemu-devel-bounces+chen.zhang=intel@nongnu.org devel-bounces+chen.zhang=intel@nongnu.org> On Behalf Of Vladimir > Sementsov-Ogievskiy > Sent: Thursday, April 13, 2023 9:47 PM > To: Zhang, Chen ; qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; pbonz...@redhat.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; dgilb...@redhat.com; > quint...@redhat.com; hre...@redhat.com; kw...@redhat.com; Zhang, > Hailiang ; lizhij...@fujitsu.com; > wencongya...@huawei.com; xiechanglon...@gmail.com; den- > plotni...@yandex-team.ru > Subject: Re: [PATCH] replication: compile out some staff when replication is > not configured > > On 13.04.23 12:52, Zhang, Chen wrote: > > > > > >> -Original Message- > >> From: Vladimir Sementsov-Ogievskiy > >> Sent: Tuesday, April 11, 2023 10:51 PM > >> To: qemu-devel@nongnu.org > >> Cc: qemu-bl...@nongnu.org; pbonz...@redhat.com; > arm...@redhat.com; > >> ebl...@redhat.com; jasow...@redhat.com; dgilb...@redhat.com; > >> quint...@redhat.com; hre...@redhat.com; kw...@redhat.com; Zhang, > >> Hailiang ; Zhang, Chen > >> ; lizhij...@fujitsu.com; > >> wencongya...@huawei.com; xiechanglon...@gmail.com; den- > >> plotni...@yandex-team.ru; Vladimir Sementsov-Ogievskiy > >> > >> Subject: [PATCH] replication: compile out some staff when replication > >> is not configured > >> > >> Don't compile-in replication-related files when replication is > >> disabled in config. > >> > >> Signed-off-by: Vladimir Sementsov-Ogievskiy > >> > >> --- > >> > >> Hi all! > >> > >> I'm unsure, should it be actually separate --disable-colo / > >> --enable-colo options or it's really used only together with > >> replication staff.. So, I decided to start with simpler variant. > >> > > > > For replication, I think there's nothing wrong with the idea. > > But not so for COLO. COLO project consists of three independent parts: > Replication, migration, net-proxy. > > Each one have ability to run alone for other proposals. For example we > > can just run filter-mirror/redirector for networking Analysis/debugs. > > Although the best practice of COLO is to make the three modules work > together, in fact, we can also use only some modules of COLO for other > usage scenarios. Like COLO migration + net-proxy for shared disk, etc... > > So I think no need to disable all COLO related modules when replication is > not configured. > > For details: > > https://wiki.qemu.org/Features/COLO > > > > So, if I want to have an option to disable all COLO modules, do you mean it > should be additional --disable-colo option? Or better keep one option -- > disable-replication (and, maybe just rename to to --disable-colo)? I think keep the option --disable-replication is enough. Generally speaking, these three modules do not belong to COLO, It has been decoupled at the time of design. Instead, COLO is formed when these three modules are used in combination. Thanks Chen > > > Thanks > > Chen > > > >> > >> block/meson.build | 2 +- > >> migration/meson.build | 6 -- > >> net/meson.build | 8 > >> qapi/migration.json | 6 -- > >> stubs/colo.c | 46 > +++ > >> stubs/meson.build | 1 + > >> 6 files changed, 60 insertions(+), 9 deletions(-) create mode > >> 100644 stubs/colo.c > >> > >> diff --git a/block/meson.build b/block/meson.build index > >> 382bec0e7d..b9a72e219b 100644 > >> --- a/block/meson.build > >> +++ b/block/meson.build > >> @@ -84,7 +84,7 @@ block_ss.add(when: 'CONFIG_WIN32', if_true: > >> files('file- win32.c', 'win32-aio.c') > >> block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), > >> coref, > iokit]) > >> block_ss.add(when: libiscsi, if_true: files('iscsi-opts.c')) > >> block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c')) -if > >> not > >> get_option('replication').disabled() > >> +if get_option('replication').allowed() > >> block_ss.add(files('replication.c')) > >> endif > >> block_ss.add(when: libaio, if_true: files('linux-aio.c')) diff > >> --git a/migration/meson.build b/migration/meson.build index > >> 0d1bb9f96e..8180eaea7b 100644 > >> --- a/migration/meson.build > >> +++ b/migration/meson.build > >> @@ -13,8 +13,6 @@ softm
RE: [PATCH] replication: compile out some staff when replication is not configured
> -Original Message- > From: Vladimir Sementsov-Ogievskiy > Sent: Tuesday, April 11, 2023 10:51 PM > To: qemu-devel@nongnu.org > Cc: qemu-bl...@nongnu.org; pbonz...@redhat.com; arm...@redhat.com; > ebl...@redhat.com; jasow...@redhat.com; dgilb...@redhat.com; > quint...@redhat.com; hre...@redhat.com; kw...@redhat.com; Zhang, > Hailiang ; Zhang, Chen > ; lizhij...@fujitsu.com; > wencongya...@huawei.com; xiechanglon...@gmail.com; den- > plotni...@yandex-team.ru; Vladimir Sementsov-Ogievskiy > > Subject: [PATCH] replication: compile out some staff when replication is not > configured > > Don't compile-in replication-related files when replication is disabled in > config. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > > Hi all! > > I'm unsure, should it be actually separate --disable-colo / --enable-colo > options or it's really used only together with replication staff.. So, I > decided > to start with simpler variant. > For replication, I think there's nothing wrong with the idea. But not so for COLO. COLO project consists of three independent parts: Replication, migration, net-proxy. Each one have ability to run alone for other proposals. For example we can just run filter-mirror/redirector for networking Analysis/debugs. Although the best practice of COLO is to make the three modules work together, in fact, we can also use only some modules of COLO for other usage scenarios. Like COLO migration + net-proxy for shared disk, etc... So I think no need to disable all COLO related modules when replication is not configured. For details: https://wiki.qemu.org/Features/COLO Thanks Chen > > block/meson.build | 2 +- > migration/meson.build | 6 -- > net/meson.build | 8 > qapi/migration.json | 6 -- > stubs/colo.c | 46 +++ > stubs/meson.build | 1 + > 6 files changed, 60 insertions(+), 9 deletions(-) create mode 100644 > stubs/colo.c > > diff --git a/block/meson.build b/block/meson.build index > 382bec0e7d..b9a72e219b 100644 > --- a/block/meson.build > +++ b/block/meson.build > @@ -84,7 +84,7 @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file- > win32.c', 'win32-aio.c') > block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, > iokit]) > block_ss.add(when: libiscsi, if_true: files('iscsi-opts.c')) > block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c')) -if not > get_option('replication').disabled() > +if get_option('replication').allowed() >block_ss.add(files('replication.c')) > endif > block_ss.add(when: libaio, if_true: files('linux-aio.c')) diff --git > a/migration/meson.build b/migration/meson.build index > 0d1bb9f96e..8180eaea7b 100644 > --- a/migration/meson.build > +++ b/migration/meson.build > @@ -13,8 +13,6 @@ softmmu_ss.add(files( >'block-dirty-bitmap.c', >'channel.c', >'channel-block.c', > - 'colo-failover.c', > - 'colo.c', >'exec.c', >'fd.c', >'global_state.c', > @@ -29,6 +27,10 @@ softmmu_ss.add(files( >'threadinfo.c', > ), gnutls) > > +if get_option('replication').allowed() > + softmmu_ss.add(files('colo.c', 'colo-failover.c')) endif > + > softmmu_ss.add(when: rdma, if_true: files('rdma.c')) if > get_option('live_block_migration').allowed() >softmmu_ss.add(files('block.c')) > diff --git a/net/meson.build b/net/meson.build index > 87afca3e93..634ab71cc6 100644 > --- a/net/meson.build > +++ b/net/meson.build > @@ -1,13 +1,9 @@ > softmmu_ss.add(files( >'announce.c', >'checksum.c', > - 'colo-compare.c', > - 'colo.c', >'dump.c', >'eth.c', >'filter-buffer.c', > - 'filter-mirror.c', > - 'filter-rewriter.c', >'filter.c', >'hub.c', >'net-hmp-cmds.c', > @@ -19,6 +15,10 @@ softmmu_ss.add(files( >'util.c', > )) > > +if get_option('replication').allowed() > + softmmu_ss.add(files('colo-compare.c', 'colo.c', 'filter-rewriter.c', > +'filter-mirror.c')) endif > + > softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c')) > > if have_l2tpv3 > diff --git a/qapi/migration.json b/qapi/migration.json index > c84fa10e86..5b81e09369 100644 > --- a/qapi/migration.json > +++ b/qapi/migration.json > @@ -1685,7 +1685,8 @@ > ## > { 'struct': 'COLOStatus', >'data': { 'mode': 'COLOMode', 'last-mode': 'COLOMode', > -'reason': 'COLOExitReason' } } > +'reason': 'COLOExitReason' }, > + 'if': 'CONFIG_REPLICATION' } > > ## > # @query-colo-status: > @@ -1702,7 +1703,8 @@ > # Since: 3.1 > ## > { 'command': 'query-colo-status', > - 'returns': 'COLOStatus' } >
RE: [PATCH] MAINTAINERS: Remove and change David Gilbert maintainer entries
> -Original Message- > From: qemu-devel-bounces+chen.zhang=intel@nongnu.org devel-bounces+chen.zhang=intel@nongnu.org> On Behalf Of Dr. David > Alan Gilbert (git) > Sent: Thursday, March 30, 2023 5:55 PM > To: qemu-devel@nongnu.org; stefa...@redhat.com; quint...@redhat.com; > d...@treblig.org > Cc: peter.mayd...@linaro.org; alex.ben...@linaro.org > Subject: [PATCH] MAINTAINERS: Remove and change David Gilbert > maintainer entries > > From: "Dr. David Alan Gilbert" > > I'm leaving Red Hat next week, so clean up the maintainer entries. > > 'virtiofs' is just the device code now, so is pretty small, and Stefan is > still a > maintainer there. > > 'migration' still has Juan. Reviewed-by: Zhang Chen Thanks for the guidance and help with the migration/HMP/QMP... over the years. Good luck Dave. > > For 'HMP' I'll swing that over to my personal email. > > Signed-off-by: Dr. David Alan Gilbert > --- > MAINTAINERS | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index ef45b5e71e..f0f7fb3746 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -2119,7 +2119,6 @@ T: git https://github.com/borntraeger/qemu.git > s390-next > L: qemu-s3...@nongnu.org > > virtiofs > -M: Dr. David Alan Gilbert > M: Stefan Hajnoczi > S: Supported > F: hw/virtio/vhost-user-fs* > @@ -2863,7 +2862,7 @@ F: tests/unit/test-rcu-*.c > F: util/rcu.c > > Human Monitor (HMP) > -M: Dr. David Alan Gilbert > +M: Dr. David Alan Gilbert > S: Maintained > F: monitor/monitor-internal.h > F: monitor/misc.c > @@ -3136,7 +3135,6 @@ F: scripts/checkpatch.pl > > Migration > M: Juan Quintela > -M: Dr. David Alan Gilbert > S: Maintained > F: hw/core/vmstate-if.c > F: include/hw/vmstate-if.h > -- > 2.39.2 >
RE: [PATCH 00/14] migration/ram.c: Refactor compress code
> -Original Message- > From: qemu-devel-bounces+chen.zhang=intel@nongnu.org devel-bounces+chen.zhang=intel@nongnu.org> On Behalf Of Lukas > Straub > Sent: Monday, April 3, 2023 1:56 AM > To: qemu-devel > Cc: Dr. David Alan Gilbert ; Juan Quintela > ; Peter Xu > Subject: [PATCH 00/14] migration/ram.c: Refactor compress code > > This series refactors the ram compress code. > > It first removes ram.c dependencies from the core compress code, then > moves it out to its own file. Finally, on the migration destination side the > initialisation and cleanup of compress threads is moved out of ram.c to > migration.c. This allows using COLO with compress enabled. > Nice to see this optimization~ Support the ram compress feature will reduce COLO checkpoint network bandwidth requirements And may improve COLO protected VM's performance. Thanks Chen > This series is based on the following series: > https://lore.kernel.org/qemu- > devel/af76761aa6978071c5b8e9b872b697db465a5520.1680457631.git.lukasstr > a...@web.de/T/#t > > Lukas Straub (14): > ram.c: Let the compress threads return a CompressResult enum > ram.c: Dont change param->block in the compress thread > ram.c: Reset result after sending queued data > ram.c: Do not call save_page_header() from compress threads > ram.c: Call update_compress_thread_counts from > compress_send_queued_data > ram.c: Remove last ram.c dependency from the core compress code > ram.c: Introduce whitespace (squash with next patch) > ram.c: Move core compression code into its own file > ram.c: Remove whitespace (squash with previous patch) > ram.c: Move core decompression code into its own file > ram compress: Assert that the file buffer matches the result > ram.c: Remove unused include after moving out code > ram-compress.c: Make target independent > migration: Initialize and cleanup decompression in migration.c > > migration/meson.build| 5 +- > migration/migration.c| 9 + > migration/qemu-file.c| 11 + > migration/qemu-file.h| 1 + > migration/ram-compress.c | 483 > ++ > migration/ram-compress.h | 70 ++ > migration/ram.c | 490 +++ > 7 files changed, 615 insertions(+), 454 deletions(-) create mode 100644 > migration/ram-compress.c create mode 100644 migration/ram-compress.h > > -- > 2.30.2
RE: [PATCH] vfio/migration: Fix wrong enum usage
> -Original Message- > From: Qemu-devel > On Behalf Of Avihai Horon > Sent: Sunday, October 16, 2022 4:58 PM > To: qemu-devel@nongnu.org; Alex Williamson > ; Kunkun Jiang > Cc: Avihai Horon > Subject: [PATCH] vfio/migration: Fix wrong enum usage > > vfio_migration_init() initializes VFIOMigration->device_state using enum of > VFIO migration protocol v2. Current implemented protocol is v1 so v1 enum > should be used. Fix it. > > Fixes: 429c72800654 ("vfio/migration: Fix incorrect initialization value for > parameters in VFIOMigration") > Signed-off-by: Avihai Horon Looks good to me. Reviewed-by: Zhang Chen But I found nowhere using the enum of VFIO migration protocol v2 (vfio_device_mig_state). I don't know more about the background, should we remove the redundancy definition or add full support for the VFIO migration protocol v2 ? Thanks Chen > --- > hw/vfio/migration.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index > d9598ce070..8dbbfa2c56 100644 > --- a/hw/vfio/migration.c > +++ b/hw/vfio/migration.c > @@ -803,7 +803,7 @@ static int vfio_migration_init(VFIODevice *vbasedev, > } > > vbasedev->migration = g_new0(VFIOMigration, 1); > -vbasedev->migration->device_state = VFIO_DEVICE_STATE_RUNNING; > +vbasedev->migration->device_state = VFIO_DEVICE_STATE_V1_RUNNING; > vbasedev->migration->vm_running = runstate_is_running(); > > ret = vfio_region_setup(obj, vbasedev, >migration->region, > -- > 2.21.3 >
RE: [PATCH v2] migration/channel-block: fix return value for qio_channel_block_{readv, writev}
> -Original Message- > From: Qemu-devel > On Behalf Of Fiona Ebner > Sent: Thursday, October 13, 2022 4:41 PM > To: qemu-devel@nongnu.org > Cc: dgilb...@redhat.com; quint...@redhat.com; berra...@redhat.com > Subject: [PATCH v2] migration/channel-block: fix return value for > qio_channel_block_{readv, writev} > > in the error case. The documentation in include/io/channel.h states that -1 or > QIO_CHANNEL_ERR_BLOCK should be returned upon error. Simply passing > along the return value from the bdrv-functions has the potential to confuse > the call sides. Non-blocking mode is not implemented currently, so -1 it is. > > Signed-off-by: Fiona Ebner LGTM. Reviewed-by: Zhang Chen But I found the same problem elsewhere, for example: "qio_channel_rdma_writev/readv", "qio_channel_buffer_writev/readv"...etc... Can you send other patches to fix it? Thanks Chen > --- > > v1 -> v2: > * Use error_setg_errno() instead of error_setg(). > * Use "failed" instead of "returned error" in error message. Now > that no numerical error code is used, this sounds a bit nicer > IMHO. > > migration/channel-block.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/migration/channel-block.c b/migration/channel-block.c index > c55c8c93ce..f4ab53acdb 100644 > --- a/migration/channel-block.c > +++ b/migration/channel-block.c > @@ -62,7 +62,8 @@ qio_channel_block_readv(QIOChannel *ioc, > qemu_iovec_init_external(, (struct iovec *)iov, niov); > ret = bdrv_readv_vmstate(bioc->bs, , bioc->offset); > if (ret < 0) { > -return ret; > +error_setg_errno(errp, -ret, "bdrv_readv_vmstate failed"); > +return -1; > } > > bioc->offset += qiov.size; > @@ -86,7 +87,8 @@ qio_channel_block_writev(QIOChannel *ioc, > qemu_iovec_init_external(, (struct iovec *)iov, niov); > ret = bdrv_writev_vmstate(bioc->bs, , bioc->offset); > if (ret < 0) { > -return ret; > +error_setg_errno(errp, -ret, "bdrv_writev_vmstate failed"); > +return -1; > } > > bioc->offset += qiov.size; > -- > 2.30.2 > >
RE: [PATCH] tests/qtest/cxl-test: Remove temporary directories after testing
> -Original Message- > From: Qemu-devel > On Behalf Of Thomas Huth > Sent: Wednesday, October 12, 2022 5:15 PM > To: qemu-devel@nongnu.org; Jonathan Cameron > > Cc: Laurent Vivier ; Michael S . Tsirkin > Subject: [PATCH] tests/qtest/cxl-test: Remove temporary directories after > testing > > The cxl-test leaves some temporary directories behind. Let's clean them up > now! > > Signed-off-by: Thomas Huth LGTM. Reviewed-by: Zhang Chen Thanks Chen > --- > tests/qtest/cxl-test.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/tests/qtest/cxl-test.c b/tests/qtest/cxl-test.c index > cbe0fb549b..61f25a72b6 100644 > --- a/tests/qtest/cxl-test.c > +++ b/tests/qtest/cxl-test.c > @@ -101,6 +101,7 @@ static void cxl_t3d(void) > > qtest_start(cmdline->str); > qtest_end(); > +rmdir(tmpfs); > } > > static void cxl_1pxb_2rp_2t3d(void) > @@ -115,6 +116,7 @@ static void cxl_1pxb_2rp_2t3d(void) > > qtest_start(cmdline->str); > qtest_end(); > +rmdir(tmpfs); > } > > static void cxl_2pxb_4rp_4t3d(void) > @@ -130,6 +132,7 @@ static void cxl_2pxb_4rp_4t3d(void) > > qtest_start(cmdline->str); > qtest_end(); > +rmdir(tmpfs); > } > #endif /* CONFIG_POSIX */ > > -- > 2.31.1 >
RE: [PATCH V5] net/colo.c: Fix the pointer issue reported by Coverity.
> -Original Message- > From: Jason Wang > Sent: Tuesday, August 23, 2022 10:03 AM > To: Zhang, Chen > Cc: Peter Maydell ; Li Zhijian > ; qemu-dev > Subject: Re: [PATCH V5] net/colo.c: Fix the pointer issue reported by > Coverity. > > On Mon, Aug 22, 2022 at 4:29 PM Zhang Chen wrote: > > > > When enabled the virtio-net-pci, guest network packet will load the > > vnet_hdr. In COLO status, the primary VM's network packet maybe > > redirect to another VM, it needs filter-redirect enable the vnet_hdr > > flag at the same time, COLO-proxy will correctly parse the original > > network packet. If have any misconfiguration here, the vnet_hdr_len is > > wrong for parse the packet, the data+offset will point to wrong place. > > > > Signed-off-by: Zhang Chen > > Not sure it's worth 7.1. So I queued this for 7.2. It's fine for me. Thanks Chen > > Thanks > > > --- > > net/colo.c | 25 - > > net/colo.h | 1 + > > net/trace-events | 2 +- > > 3 files changed, 18 insertions(+), 10 deletions(-) > > > > diff --git a/net/colo.c b/net/colo.c > > index 6b0ff562ad..fb2c36a026 100644 > > --- a/net/colo.c > > +++ b/net/colo.c > > @@ -44,21 +44,28 @@ int parse_packet_early(Packet *pkt) { > > int network_length; > > static const uint8_t vlan[] = {0x81, 0x00}; > > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > +uint8_t *data = pkt->data; > > uint16_t l3_proto; > > ssize_t l2hdr_len; > > > > -if (data == NULL) { > > -trace_colo_proxy_main_vnet_info("This packet is not parsed > > correctly, " > > -"pkt->vnet_hdr_len", > > pkt->vnet_hdr_len); > > +assert(data); > > + > > +/* Check the received vnet_hdr_len then add the offset */ > > +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr_v1_hash)) || > > +(pkt->size < sizeof(struct eth_header) + sizeof(struct > > vlan_header) + > > +pkt->vnet_hdr_len)) { > > +/* > > + * The received remote packet maybe misconfiguration here, > > + * Please enable/disable filter module's the vnet_hdr flag at > > + * the same time. > > + */ > > +trace_colo_proxy_main_vnet_info("This received packet load wrong ", > > +pkt->vnet_hdr_len, > > + pkt->size); > > return 1; > > } > > -l2hdr_len = eth_get_l2_hdr_length(data); > > +data += pkt->vnet_hdr_len; > > > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > > -return 1; > > -} > > +l2hdr_len = eth_get_l2_hdr_length(data); > > > > /* > > * TODO: support vlan. > > diff --git a/net/colo.h b/net/colo.h > > index 8b3e8d5a83..22fc3031f7 100644 > > --- a/net/colo.h > > +++ b/net/colo.h > > @@ -18,6 +18,7 @@ > > #include "qemu/jhash.h" > > #include "qemu/timer.h" > > #include "net/eth.h" > > +#include "standard-headers/linux/virtio_net.h" > > > > #define HASHTABLE_MAX_SIZE 16384 > > > > diff --git a/net/trace-events b/net/trace-events index > > 6af927b4b9..823a071bdc 100644 > > --- a/net/trace-events > > +++ b/net/trace-events > > @@ -9,7 +9,7 @@ vhost_user_event(const char *chr, int event) "chr: %s got > event: %d" > > > > # colo.c > > colo_proxy_main(const char *chr) ": %s" > > -colo_proxy_main_vnet_info(const char *sta, int size) ": %s = %d" > > +colo_proxy_main_vnet_info(const char *sta, uint32_t vnet_hdr, int size) > ": %s pkt->vnet_hdr_len = %u, pkt->size = %d" > > > > # colo-compare.c > > colo_compare_main(const char *chr) ": %s" > > -- > > 2.25.1 > >
[PATCH V5] net/colo.c: Fix the pointer issue reported by Coverity.
When enabled the virtio-net-pci, guest network packet will load the vnet_hdr. In COLO status, the primary VM's network packet maybe redirect to another VM, it needs filter-redirect enable the vnet_hdr flag at the same time, COLO-proxy will correctly parse the original network packet. If have any misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the data+offset will point to wrong place. Signed-off-by: Zhang Chen --- net/colo.c | 25 - net/colo.h | 1 + net/trace-events | 2 +- 3 files changed, 18 insertions(+), 10 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..fb2c36a026 100644 --- a/net/colo.c +++ b/net/colo.c @@ -44,21 +44,28 @@ int parse_packet_early(Packet *pkt) { int network_length; static const uint8_t vlan[] = {0x81, 0x00}; -uint8_t *data = pkt->data + pkt->vnet_hdr_len; +uint8_t *data = pkt->data; uint16_t l3_proto; ssize_t l2hdr_len; -if (data == NULL) { -trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " -"pkt->vnet_hdr_len", pkt->vnet_hdr_len); +assert(data); + +/* Check the received vnet_hdr_len then add the offset */ +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr_v1_hash)) || +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) + +pkt->vnet_hdr_len)) { +/* + * The received remote packet maybe misconfiguration here, + * Please enable/disable filter module's the vnet_hdr flag at + * the same time. + */ +trace_colo_proxy_main_vnet_info("This received packet load wrong ", +pkt->vnet_hdr_len, pkt->size); return 1; } -l2hdr_len = eth_get_l2_hdr_length(data); +data += pkt->vnet_hdr_len; -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { -trace_colo_proxy_main("pkt->size < ETH_HLEN"); -return 1; -} +l2hdr_len = eth_get_l2_hdr_length(data); /* * TODO: support vlan. diff --git a/net/colo.h b/net/colo.h index 8b3e8d5a83..22fc3031f7 100644 --- a/net/colo.h +++ b/net/colo.h @@ -18,6 +18,7 @@ #include "qemu/jhash.h" #include "qemu/timer.h" #include "net/eth.h" +#include "standard-headers/linux/virtio_net.h" #define HASHTABLE_MAX_SIZE 16384 diff --git a/net/trace-events b/net/trace-events index 6af927b4b9..823a071bdc 100644 --- a/net/trace-events +++ b/net/trace-events @@ -9,7 +9,7 @@ vhost_user_event(const char *chr, int event) "chr: %s got event: %d" # colo.c colo_proxy_main(const char *chr) ": %s" -colo_proxy_main_vnet_info(const char *sta, int size) ": %s = %d" +colo_proxy_main_vnet_info(const char *sta, uint32_t vnet_hdr, int size) ": %s pkt->vnet_hdr_len = %u, pkt->size = %d" # colo-compare.c colo_compare_main(const char *chr) ": %s" -- 2.25.1
RE: [PATCH V4 RESEND] net/colo.c: Fix the pointer issue reported by Coverity.
> -Original Message- > From: Jason Wang > Sent: Thursday, August 18, 2022 4:04 PM > To: Zhang, Chen > Cc: Peter Maydell ; Li Zhijian > ; qemu-dev > Subject: Re: [PATCH V4 RESEND] net/colo.c: Fix the pointer issue reported by > Coverity. > > On Wed, Aug 17, 2022 at 3:45 PM Zhang, Chen wrote: > > > > Ping Jason and Peter, any comments for this patch? > > > > Thanks > > Chen > > > > > -Original Message- > > > From: Zhang, Chen > > > Sent: Tuesday, August 9, 2022 4:49 PM > > > To: Jason Wang ; Peter Maydell > > > ; Li Zhijian ; > > > qemu-dev > > > Cc: Zhang, Chen > > > Subject: [PATCH V4 RESEND] net/colo.c: Fix the pointer issue > > > reported by Coverity. > > > > > > When enabled the virtio-net-pci, guest network packet will load the > vnet_hdr. > > > In COLO status, the primary VM's network packet maybe redirect to > > > another VM, it need filter-redirect enable the vnet_hdr flag at the > > > same time, COLO- proxy will correctly parse the original network > > > packet. If have any misconfiguration here, the vnet_hdr_len is wrong > > > for parse the packet, the > > > data+offset will point to wrong place. > > > > > > Signed-off-by: Zhang Chen > > > --- > > > net/colo.c | 18 ++ > > > net/colo.h | 1 + > > > 2 files changed, 11 insertions(+), 8 deletions(-) > > > > > > diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..2b5568fff4 > > > 100644 > > > --- a/net/colo.c > > > +++ b/net/colo.c > > > @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { > > > int network_length; > > > static const uint8_t vlan[] = {0x81, 0x00}; > > > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > > +uint8_t *data = pkt->data; > > > uint16_t l3_proto; > > > ssize_t l2hdr_len; > > > > > > -if (data == NULL) { > > > -trace_colo_proxy_main_vnet_info("This packet is not parsed > > > correctly, > " > > > +assert(data); > > > + > > > +/* Check the received vnet_hdr_len then add the offset */ > > > +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr_v1_hash)) || > > > +(pkt->size < sizeof(struct eth_header) + sizeof(struct > > > vlan_header) > > > ++ pkt->vnet_hdr_len)) { > > > +trace_colo_proxy_main_vnet_info("This packet may be load wrong " > > > "pkt->vnet_hdr_len", > > > pkt->vnet_hdr_len); > > Nit: I think we need to be verbose here, e.g put the pkt_size here at least. OK, I will change here to: /* * The received remote packet maybe misconfiguration here, * Please enable/disable filter module's the vnet_hdr flag at the same time. */ trace_colo_proxy_main_vnet_info("This received packet load wrong " "pkt->vnet_hdr_len", pkt->vnet_hdr_len, pkt->size); Thanks Chen > > Thanks > > > > return 1; > > > } > > > -l2hdr_len = eth_get_l2_hdr_length(data); > > > +data += pkt->vnet_hdr_len; > > > > > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > > > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > > > -return 1; > > > -} > > > +l2hdr_len = eth_get_l2_hdr_length(data); > > > > > > /* > > > * TODO: support vlan. > > > diff --git a/net/colo.h b/net/colo.h index 8b3e8d5a83..22fc3031f7 > > > 100644 > > > --- a/net/colo.h > > > +++ b/net/colo.h > > > @@ -18,6 +18,7 @@ > > > #include "qemu/jhash.h" > > > #include "qemu/timer.h" > > > #include "net/eth.h" > > > +#include "standard-headers/linux/virtio_net.h" > > > > > > #define HASHTABLE_MAX_SIZE 16384 > > > > > > -- > > > 2.25.1 > >
RE: [PATCH V4 RESEND] net/colo.c: Fix the pointer issue reported by Coverity.
Ping Jason and Peter, any comments for this patch? Thanks Chen > -Original Message- > From: Zhang, Chen > Sent: Tuesday, August 9, 2022 4:49 PM > To: Jason Wang ; Peter Maydell > ; Li Zhijian ; qemu-dev > > Cc: Zhang, Chen > Subject: [PATCH V4 RESEND] net/colo.c: Fix the pointer issue reported by > Coverity. > > When enabled the virtio-net-pci, guest network packet will load the vnet_hdr. > In COLO status, the primary VM's network packet maybe redirect to another > VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO- > proxy will correctly parse the original network packet. If have any > misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the > data+offset will point to wrong place. > > Signed-off-by: Zhang Chen > --- > net/colo.c | 18 ++ > net/colo.h | 1 + > 2 files changed, 11 insertions(+), 8 deletions(-) > > diff --git a/net/colo.c b/net/colo.c > index 6b0ff562ad..2b5568fff4 100644 > --- a/net/colo.c > +++ b/net/colo.c > @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { > int network_length; > static const uint8_t vlan[] = {0x81, 0x00}; > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > +uint8_t *data = pkt->data; > uint16_t l3_proto; > ssize_t l2hdr_len; > > -if (data == NULL) { > -trace_colo_proxy_main_vnet_info("This packet is not parsed > correctly, " > +assert(data); > + > +/* Check the received vnet_hdr_len then add the offset */ > +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr_v1_hash)) || > +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) > ++ pkt->vnet_hdr_len)) { > +trace_colo_proxy_main_vnet_info("This packet may be load wrong " > "pkt->vnet_hdr_len", > pkt->vnet_hdr_len); > return 1; > } > -l2hdr_len = eth_get_l2_hdr_length(data); > +data += pkt->vnet_hdr_len; > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > -return 1; > -} > +l2hdr_len = eth_get_l2_hdr_length(data); > > /* > * TODO: support vlan. > diff --git a/net/colo.h b/net/colo.h > index 8b3e8d5a83..22fc3031f7 100644 > --- a/net/colo.h > +++ b/net/colo.h > @@ -18,6 +18,7 @@ > #include "qemu/jhash.h" > #include "qemu/timer.h" > #include "net/eth.h" > +#include "standard-headers/linux/virtio_net.h" > > #define HASHTABLE_MAX_SIZE 16384 > > -- > 2.25.1
RE: [PATCH V4] net/colo.c: Fix the pointer issue reported by Coverity.
Please review the V4 RESEND version. Thanks Chen > -Original Message- > From: Zhang, Chen > Sent: Tuesday, August 9, 2022 4:45 PM > To: Jason Wang ; Peter Maydell > ; Li Zhijian ; qemu-dev > > Cc: Zhang, Chen > Subject: [PATCH V4] net/colo.c: Fix the pointer issue reported by Coverity. > > When enabled the virtio-net-pci, guest network packet will load the vnet_hdr. > In COLO status, the primary VM's network packet maybe redirect to another > VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO- > proxy will correctly parse the original network packet. If have any > misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the > data+offset will point to wrong place. > > Signed-off-by: Zhang Chen > --- > net/colo.c | 18 ++ > net/colo.h | 1 + > 2 files changed, 11 insertions(+), 8 deletions(-) > > diff --git a/net/colo.c b/net/colo.c > index 6b0ff562ad..6930a2f4f6 100644 > --- a/net/colo.c > +++ b/net/colo.c > @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { > int network_length; > static const uint8_t vlan[] = {0x81, 0x00}; > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > +uint8_t *data = pkt->data; > uint16_t l3_proto; > ssize_t l2hdr_len; > > -if (data == NULL) { > -trace_colo_proxy_main_vnet_info("This packet is not parsed > correctly, " > +assert(data); > + > +/* Check the received vnet_hdr_len then add the offset */ > +if ((pkt->vnet_hdr_len > sizeof(sizeof(struct virtio_net_hdr_v1_hash))) > || > +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) > ++ pkt->vnet_hdr_len)) { > +trace_colo_proxy_main_vnet_info("This packet may be load wrong " > "pkt->vnet_hdr_len", > pkt->vnet_hdr_len); > return 1; > } > -l2hdr_len = eth_get_l2_hdr_length(data); > +data += pkt->vnet_hdr_len; > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > -return 1; > -} > +l2hdr_len = eth_get_l2_hdr_length(data); > > /* > * TODO: support vlan. > diff --git a/net/colo.h b/net/colo.h > index 8b3e8d5a83..22fc3031f7 100644 > --- a/net/colo.h > +++ b/net/colo.h > @@ -18,6 +18,7 @@ > #include "qemu/jhash.h" > #include "qemu/timer.h" > #include "net/eth.h" > +#include "standard-headers/linux/virtio_net.h" > > #define HASHTABLE_MAX_SIZE 16384 > > -- > 2.25.1
[PATCH V4 RESEND] net/colo.c: Fix the pointer issue reported by Coverity.
When enabled the virtio-net-pci, guest network packet will load the vnet_hdr. In COLO status, the primary VM's network packet maybe redirect to another VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO-proxy will correctly parse the original network packet. If have any misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the data+offset will point to wrong place. Signed-off-by: Zhang Chen --- net/colo.c | 18 ++ net/colo.h | 1 + 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..2b5568fff4 100644 --- a/net/colo.c +++ b/net/colo.c @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { int network_length; static const uint8_t vlan[] = {0x81, 0x00}; -uint8_t *data = pkt->data + pkt->vnet_hdr_len; +uint8_t *data = pkt->data; uint16_t l3_proto; ssize_t l2hdr_len; -if (data == NULL) { -trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " +assert(data); + +/* Check the received vnet_hdr_len then add the offset */ +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr_v1_hash)) || +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) ++ pkt->vnet_hdr_len)) { +trace_colo_proxy_main_vnet_info("This packet may be load wrong " "pkt->vnet_hdr_len", pkt->vnet_hdr_len); return 1; } -l2hdr_len = eth_get_l2_hdr_length(data); +data += pkt->vnet_hdr_len; -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { -trace_colo_proxy_main("pkt->size < ETH_HLEN"); -return 1; -} +l2hdr_len = eth_get_l2_hdr_length(data); /* * TODO: support vlan. diff --git a/net/colo.h b/net/colo.h index 8b3e8d5a83..22fc3031f7 100644 --- a/net/colo.h +++ b/net/colo.h @@ -18,6 +18,7 @@ #include "qemu/jhash.h" #include "qemu/timer.h" #include "net/eth.h" +#include "standard-headers/linux/virtio_net.h" #define HASHTABLE_MAX_SIZE 16384 -- 2.25.1
[PATCH V4] net/colo.c: Fix the pointer issue reported by Coverity.
When enabled the virtio-net-pci, guest network packet will load the vnet_hdr. In COLO status, the primary VM's network packet maybe redirect to another VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO-proxy will correctly parse the original network packet. If have any misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the data+offset will point to wrong place. Signed-off-by: Zhang Chen --- net/colo.c | 18 ++ net/colo.h | 1 + 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..6930a2f4f6 100644 --- a/net/colo.c +++ b/net/colo.c @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { int network_length; static const uint8_t vlan[] = {0x81, 0x00}; -uint8_t *data = pkt->data + pkt->vnet_hdr_len; +uint8_t *data = pkt->data; uint16_t l3_proto; ssize_t l2hdr_len; -if (data == NULL) { -trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " +assert(data); + +/* Check the received vnet_hdr_len then add the offset */ +if ((pkt->vnet_hdr_len > sizeof(sizeof(struct virtio_net_hdr_v1_hash))) || +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) ++ pkt->vnet_hdr_len)) { +trace_colo_proxy_main_vnet_info("This packet may be load wrong " "pkt->vnet_hdr_len", pkt->vnet_hdr_len); return 1; } -l2hdr_len = eth_get_l2_hdr_length(data); +data += pkt->vnet_hdr_len; -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { -trace_colo_proxy_main("pkt->size < ETH_HLEN"); -return 1; -} +l2hdr_len = eth_get_l2_hdr_length(data); /* * TODO: support vlan. diff --git a/net/colo.h b/net/colo.h index 8b3e8d5a83..22fc3031f7 100644 --- a/net/colo.h +++ b/net/colo.h @@ -18,6 +18,7 @@ #include "qemu/jhash.h" #include "qemu/timer.h" #include "net/eth.h" +#include "standard-headers/linux/virtio_net.h" #define HASHTABLE_MAX_SIZE 16384 -- 2.25.1
RE: [PATCH V3] net/colo.c: Fix the pointer issue reported by Coverity.
> -Original Message- > From: Jason Wang > Sent: Tuesday, August 9, 2022 4:43 PM > To: Zhang, Chen > Cc: Peter Maydell ; Li Zhijian > ; qemu-dev > Subject: Re: [PATCH V3] net/colo.c: Fix the pointer issue reported by > Coverity. > > On Tue, Aug 9, 2022 at 4:29 PM Zhang, Chen wrote: > > > > > > > > > -Original Message- > > > From: Jason Wang > > > Sent: Tuesday, August 9, 2022 4:12 PM > > > To: Zhang, Chen > > > Cc: Peter Maydell ; Li Zhijian > > > ; qemu-dev > > > Subject: Re: [PATCH V3] net/colo.c: Fix the pointer issue reported by > Coverity. > > > > > > On Tue, Aug 9, 2022 at 11:05 AM Zhang Chen > wrote: > > > > > > > > When enable the virtio-net-pci, guest network packet will load the > > > > vnet_hdr. In COLO status, the primary VM's network packet maybe > > > > redirect to another VM, it need filter-redirect enable the > > > > vnet_hdr flag at the same time, COLO-proxy will correctly parse > > > > the original network packet. If have any misconfiguration here, > > > > the vnet_hdr_len is wrong for parse the packet, the data+offset will > > > > point > to wrong place. > > > > > > > > Signed-off-by: Zhang Chen > > > > --- > > > > net/colo.c | 18 ++ net/colo.h | 1 + > > > > 2 files changed, 11 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..4f1b4a61f6 > > > > 100644 > > > > --- a/net/colo.c > > > > +++ b/net/colo.c > > > > @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { > > > > int network_length; > > > > static const uint8_t vlan[] = {0x81, 0x00}; > > > > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > > > +uint8_t *data = pkt->data; > > > > uint16_t l3_proto; > > > > ssize_t l2hdr_len; > > > > > > > > -if (data == NULL) { > > > > -trace_colo_proxy_main_vnet_info("This packet is not parsed > correctly, " > > > > +assert(data); > > > > + > > > > +/* Check the received vnet_hdr_len then add the offset */ > > > > +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr)) || > > > > > > Virtio-net starts to support RSS so it means the vnet_hdr size could > > > be greater than virtio_net_hdr now. > > > > > > Or did you actually mean "<" here? > > > > No, I just try to avoid overflow with a maliciously over-large > > vnet_hdr_len value as Peter's comments. > > If enabled the RSS, how to get the maximum of vnet_hdr size? > > With hash_report the maximum is sizeof(virtio_net_hdr_v1_hash). But it might > be extended in the future. > > We can probably start from it. Thanks, I will change here to " if ((pkt->vnet_hdr_len > sizeof(virtio_net_hdr_v1_hash)) ||" in next version. Thanks Chen > > Thanks > > > > > Thanks > > Chen > > > > > > > > Thanks > > > > > > > +(pkt->size < sizeof(struct eth_header) + sizeof(struct > > > > vlan_header) > > > > ++ pkt->vnet_hdr_len)) { > > > > +trace_colo_proxy_main_vnet_info("This packet may be load wrong > " > > > > "pkt->vnet_hdr_len", > > > > pkt->vnet_hdr_len); > > > > return 1; > > > > } > > > > -l2hdr_len = eth_get_l2_hdr_length(data); > > > > +data += pkt->vnet_hdr_len; > > > > > > > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > > > > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > > > > -return 1; > > > > -} > > > > +l2hdr_len = eth_get_l2_hdr_length(data); > > > > > > > > /* > > > > * TODO: support vlan. > > > > diff --git a/net/colo.h b/net/colo.h index 8b3e8d5a83..22fc3031f7 > > > > 100644 > > > > --- a/net/colo.h > > > > +++ b/net/colo.h > > > > @@ -18,6 +18,7 @@ > > > > #include "qemu/jhash.h" > > > > #include "qemu/timer.h" > > > > #include "net/eth.h" > > > > +#include "standard-headers/linux/virtio_net.h" > > > > > > > > #define HASHTABLE_MAX_SIZE 16384 > > > > > > > > -- > > > > 2.25.1 > > > > > >
RE: [PATCH V3] net/colo.c: Fix the pointer issue reported by Coverity.
> -Original Message- > From: Jason Wang > Sent: Tuesday, August 9, 2022 4:12 PM > To: Zhang, Chen > Cc: Peter Maydell ; Li Zhijian > ; qemu-dev > Subject: Re: [PATCH V3] net/colo.c: Fix the pointer issue reported by > Coverity. > > On Tue, Aug 9, 2022 at 11:05 AM Zhang Chen wrote: > > > > When enable the virtio-net-pci, guest network packet will load the > > vnet_hdr. In COLO status, the primary VM's network packet maybe > > redirect to another VM, it need filter-redirect enable the vnet_hdr > > flag at the same time, COLO-proxy will correctly parse the original > > network packet. If have any misconfiguration here, the vnet_hdr_len is > > wrong for parse the packet, the data+offset will point to wrong place. > > > > Signed-off-by: Zhang Chen > > --- > > net/colo.c | 18 ++ > > net/colo.h | 1 + > > 2 files changed, 11 insertions(+), 8 deletions(-) > > > > diff --git a/net/colo.c b/net/colo.c > > index 6b0ff562ad..4f1b4a61f6 100644 > > --- a/net/colo.c > > +++ b/net/colo.c > > @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { > > int network_length; > > static const uint8_t vlan[] = {0x81, 0x00}; > > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > +uint8_t *data = pkt->data; > > uint16_t l3_proto; > > ssize_t l2hdr_len; > > > > -if (data == NULL) { > > -trace_colo_proxy_main_vnet_info("This packet is not parsed > > correctly, " > > +assert(data); > > + > > +/* Check the received vnet_hdr_len then add the offset */ > > +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr)) || > > Virtio-net starts to support RSS so it means the vnet_hdr size could be > greater > than virtio_net_hdr now. > > Or did you actually mean "<" here? No, I just try to avoid overflow with a maliciously over-large vnet_hdr_len value as Peter's comments. If enabled the RSS, how to get the maximum of vnet_hdr size? Thanks Chen > > Thanks > > > +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) > > ++ pkt->vnet_hdr_len)) { > > +trace_colo_proxy_main_vnet_info("This packet may be load wrong " > > "pkt->vnet_hdr_len", > > pkt->vnet_hdr_len); > > return 1; > > } > > -l2hdr_len = eth_get_l2_hdr_length(data); > > +data += pkt->vnet_hdr_len; > > > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > > -return 1; > > -} > > +l2hdr_len = eth_get_l2_hdr_length(data); > > > > /* > > * TODO: support vlan. > > diff --git a/net/colo.h b/net/colo.h > > index 8b3e8d5a83..22fc3031f7 100644 > > --- a/net/colo.h > > +++ b/net/colo.h > > @@ -18,6 +18,7 @@ > > #include "qemu/jhash.h" > > #include "qemu/timer.h" > > #include "net/eth.h" > > +#include "standard-headers/linux/virtio_net.h" > > > > #define HASHTABLE_MAX_SIZE 16384 > > > > -- > > 2.25.1 > >
[PATCH V3] net/colo.c: Fix the pointer issue reported by Coverity.
When enable the virtio-net-pci, guest network packet will load the vnet_hdr. In COLO status, the primary VM's network packet maybe redirect to another VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO-proxy will correctly parse the original network packet. If have any misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the data+offset will point to wrong place. Signed-off-by: Zhang Chen --- net/colo.c | 18 ++ net/colo.h | 1 + 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..4f1b4a61f6 100644 --- a/net/colo.c +++ b/net/colo.c @@ -44,21 +44,23 @@ int parse_packet_early(Packet *pkt) { int network_length; static const uint8_t vlan[] = {0x81, 0x00}; -uint8_t *data = pkt->data + pkt->vnet_hdr_len; +uint8_t *data = pkt->data; uint16_t l3_proto; ssize_t l2hdr_len; -if (data == NULL) { -trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " +assert(data); + +/* Check the received vnet_hdr_len then add the offset */ +if ((pkt->vnet_hdr_len > sizeof(struct virtio_net_hdr)) || +(pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) ++ pkt->vnet_hdr_len)) { +trace_colo_proxy_main_vnet_info("This packet may be load wrong " "pkt->vnet_hdr_len", pkt->vnet_hdr_len); return 1; } -l2hdr_len = eth_get_l2_hdr_length(data); +data += pkt->vnet_hdr_len; -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { -trace_colo_proxy_main("pkt->size < ETH_HLEN"); -return 1; -} +l2hdr_len = eth_get_l2_hdr_length(data); /* * TODO: support vlan. diff --git a/net/colo.h b/net/colo.h index 8b3e8d5a83..22fc3031f7 100644 --- a/net/colo.h +++ b/net/colo.h @@ -18,6 +18,7 @@ #include "qemu/jhash.h" #include "qemu/timer.h" #include "net/eth.h" +#include "standard-headers/linux/virtio_net.h" #define HASHTABLE_MAX_SIZE 16384 -- 2.25.1
RE: [PATCH V2] net/colo.c: Fix the pointer issuse reported by Coverity.
> -Original Message- > From: Peter Maydell > Sent: Friday, August 5, 2022 5:57 PM > To: Zhang, Chen > Cc: Jason Wang ; Li Zhijian ; > qemu-dev > Subject: Re: [PATCH V2] net/colo.c: Fix the pointer issuse reported by > Coverity. > > On Fri, 5 Aug 2022 at 10:53, Zhang Chen wrote: > > > > When enable the virtio-net-pci, guest network packet will load the > > vnet_hdr. In COLO status, the primary VM's network packet maybe > > redirect to another VM, it need filter-redirect enable the vnet_hdr > > flag at the same time, COLO-proxy will correctly parse the original > > network packet. If have any misconfiguration here, the vnet_hdr_len is > > wrong for parse the packet, the data+offset will point to wrong place. > > > > Signed-off-by: Zhang Chen > > --- > > net/colo.c | 17 + > > 1 file changed, 9 insertions(+), 8 deletions(-) > > > > diff --git a/net/colo.c b/net/colo.c > > index 6b0ff562ad..524afa3d9b 100644 > > --- a/net/colo.c > > +++ b/net/colo.c > > @@ -44,21 +44,22 @@ int parse_packet_early(Packet *pkt) { > > int network_length; > > static const uint8_t vlan[] = {0x81, 0x00}; > > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > +uint8_t *data = pkt->data; > > uint16_t l3_proto; > > ssize_t l2hdr_len; > > > > -if (data == NULL) { > > -trace_colo_proxy_main_vnet_info("This packet is not parsed > > correctly, > " > > +assert(data); > > + > > +/* Check the received vnet_hdr_len then add the offset */ > > +if (pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) > > ++ pkt->vnet_hdr_len) { > > I think this expression needs more care to avoid overflow with a maliciously > over-large vnet_hdr_len value. > > Casting pkt->vnet_hdr_len to int64_t would be one way to do that; there > may be better approaches. Yes, you are right. And I found better to use this check to avoid overflow: "pkt->vnet_hdr_len < sizeof(struct virtio_net_hdr)" I will fix it in next version. Thanks Chen > > thanks > -- PMM
[PATCH V2] net/colo.c: Fix the pointer issuse reported by Coverity.
When enable the virtio-net-pci, guest network packet will load the vnet_hdr. In COLO status, the primary VM's network packet maybe redirect to another VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO-proxy will correctly parse the original network packet. If have any misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the data+offset will point to wrong place. Signed-off-by: Zhang Chen --- net/colo.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..524afa3d9b 100644 --- a/net/colo.c +++ b/net/colo.c @@ -44,21 +44,22 @@ int parse_packet_early(Packet *pkt) { int network_length; static const uint8_t vlan[] = {0x81, 0x00}; -uint8_t *data = pkt->data + pkt->vnet_hdr_len; +uint8_t *data = pkt->data; uint16_t l3_proto; ssize_t l2hdr_len; -if (data == NULL) { -trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " +assert(data); + +/* Check the received vnet_hdr_len then add the offset */ +if (pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) ++ pkt->vnet_hdr_len) { +trace_colo_proxy_main_vnet_info("This packet may be load wrong " "pkt->vnet_hdr_len", pkt->vnet_hdr_len); return 1; } -l2hdr_len = eth_get_l2_hdr_length(data); +data += pkt->vnet_hdr_len; -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { -trace_colo_proxy_main("pkt->size < ETH_HLEN"); -return 1; -} +l2hdr_len = eth_get_l2_hdr_length(data); /* * TODO: support vlan. -- 2.25.1
RE: [PATCH] net/colo.c: Fix the pointer issuse reported by Coverity.
> -Original Message- > From: Peter Maydell > Sent: Friday, August 5, 2022 4:53 PM > To: Zhang, Chen > Cc: Jason Wang ; Li Zhijian ; > qemu-dev > Subject: Re: [PATCH] net/colo.c: Fix the pointer issuse reported by Coverity. > > On Fri, 5 Aug 2022 at 06:56, Zhang, Chen wrote: > > > > > > > > > -Original Message- > > > From: Jason Wang I wonder under which case > we > > > can see data == NULL? > > > > > > AFAIK, data is either dup via packet_new() or assigned to a pointer > > > to the buf in packet_new_nocopy(). > > > > Yes, you are right. I just checked it for hint of bugs. > > Do you think no need to do it? > > If you think it is a "should never happen unless QEMU is buggy" case, then > assert(data). OK, I will change it to assert() in V2. Thanks Chen > > thanks > -- PMM
RE: [PATCH] net/colo.c: Fix the pointer issuse reported by Coverity.
> -Original Message- > From: Jason Wang > Sent: Friday, August 5, 2022 11:46 AM > To: Zhang, Chen > Cc: Peter Maydell ; Li Zhijian > ; qemu-dev > Subject: Re: [PATCH] net/colo.c: Fix the pointer issuse reported by Coverity. > > On Tue, Aug 2, 2022 at 4:24 PM Zhang Chen wrote: > > > > When enable the virtio-net-pci, guest network packet will load the > > vnet_hdr. In COLO status, the primary VM's network packet maybe > > redirect to another VM, it need filter-redirect enable the vnet_hdr > > flag at the same time, COLO-proxy will correctly parse the original > > network packet. If have any misconfiguration here, the vnet_hdr_len is > > wrong for parse the packet, the data+offset will point to wrong place. > > > > Signed-off-by: Zhang Chen > > --- > > net/colo.c | 16 ++-- > > 1 file changed, 10 insertions(+), 6 deletions(-) > > > > diff --git a/net/colo.c b/net/colo.c > > index 6b0ff562ad..dfb15b4c14 100644 > > --- a/net/colo.c > > +++ b/net/colo.c > > @@ -44,21 +44,25 @@ int parse_packet_early(Packet *pkt) { > > int network_length; > > static const uint8_t vlan[] = {0x81, 0x00}; > > -uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > +uint8_t *data = pkt->data; > > uint16_t l3_proto; > > ssize_t l2hdr_len; > > > > if (data == NULL) { > > I wonder under which case we can see data == NULL? > > AFAIK, data is either dup via packet_new() or assigned to a pointer to the buf > in packet_new_nocopy(). Yes, you are right. I just checked it for hint of bugs. Do you think no need to do it? Thanks Chen > > Thanks > > > -trace_colo_proxy_main_vnet_info("This packet is not parsed > > correctly, > " > > -"pkt->vnet_hdr_len", > > pkt->vnet_hdr_len); > > +trace_colo_proxy_main("COLO-proxy got NULL data packet "); > > return 1; > > } > > -l2hdr_len = eth_get_l2_hdr_length(data); > > > > -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { > > -trace_colo_proxy_main("pkt->size < ETH_HLEN"); > > +/* Check the received vnet_hdr_len then add the offset */ > > +if (pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) > > ++ pkt->vnet_hdr_len) { > > +trace_colo_proxy_main_vnet_info("This packet may be load wrong " > > +"pkt->vnet_hdr_len", > > + pkt->vnet_hdr_len); > > return 1; > > } > > +data += pkt->vnet_hdr_len; > > + > > +l2hdr_len = eth_get_l2_hdr_length(data); > > > > /* > > * TODO: support vlan. > > -- > > 2.25.1 > >
RE: [PATCH 0/1] Update vfio-user module to the latest
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Jagannathan > Raman > Sent: Tuesday, August 2, 2022 9:24 AM > To: qemu-devel@nongnu.org > Cc: stefa...@gmail.com; berra...@redhat.com > Subject: [PATCH 0/1] Update vfio-user module to the latest > > Hi, > > This patch updates the libvfio-user submodule to the latest. Just a rough idea, why not depends on linux distribution for the libvfio-user.so? It looks no libvfio-user packet in distribution's repo. Hi Thomas/Daniel: For the RFC QEMU user space eBPF support, https://lore.kernel.org/all/20220617073630.535914-6-chen.zh...@intel.com/T/ Maybe introduce the libubpf.so as a subproject like libvfio-user.so is more appropriate? Thanks Chen > > Passed 'make check' & GitLab CI. > > Thank you! > -- > Jag > > Jagannathan Raman (1): > vfio-user: update submodule to latest > > subprojects/libvfio-user | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > -- > 2.20.1 >
[PATCH] net/colo.c: Fix the pointer issuse reported by Coverity.
When enable the virtio-net-pci, guest network packet will load the vnet_hdr. In COLO status, the primary VM's network packet maybe redirect to another VM, it need filter-redirect enable the vnet_hdr flag at the same time, COLO-proxy will correctly parse the original network packet. If have any misconfiguration here, the vnet_hdr_len is wrong for parse the packet, the data+offset will point to wrong place. Signed-off-by: Zhang Chen --- net/colo.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6b0ff562ad..dfb15b4c14 100644 --- a/net/colo.c +++ b/net/colo.c @@ -44,21 +44,25 @@ int parse_packet_early(Packet *pkt) { int network_length; static const uint8_t vlan[] = {0x81, 0x00}; -uint8_t *data = pkt->data + pkt->vnet_hdr_len; +uint8_t *data = pkt->data; uint16_t l3_proto; ssize_t l2hdr_len; if (data == NULL) { -trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " -"pkt->vnet_hdr_len", pkt->vnet_hdr_len); +trace_colo_proxy_main("COLO-proxy got NULL data packet "); return 1; } -l2hdr_len = eth_get_l2_hdr_length(data); -if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { -trace_colo_proxy_main("pkt->size < ETH_HLEN"); +/* Check the received vnet_hdr_len then add the offset */ +if (pkt->size < sizeof(struct eth_header) + sizeof(struct vlan_header) ++ pkt->vnet_hdr_len) { +trace_colo_proxy_main_vnet_info("This packet may be load wrong " +"pkt->vnet_hdr_len", pkt->vnet_hdr_len); return 1; } +data += pkt->vnet_hdr_len; + +l2hdr_len = eth_get_l2_hdr_length(data); /* * TODO: support vlan. -- 2.25.1
RE: [PULL V2 25/25] net/colo.c: fix segmentation fault when packet is not parsed correctly
> -Original Message- > From: Jason Wang > Sent: Monday, August 1, 2022 12:18 PM > To: Zhang, Chen ; Xu, Tao3 > Cc: qemu-devel@nongnu.org; Li Zhijian ; Peter > Maydell > Subject: Re: [PULL V2 25/25] net/colo.c: fix segmentation fault when packet > is not parsed correctly > > > 在 2022/7/29 21:58, Peter Maydell 写道: > > On Wed, 20 Jul 2022 at 10:04, Jason Wang wrote: > >> From: Zhang Chen > >> > >> When COLO use only one vnet_hdr_support parameter between > >> filter-redirector and filter-mirror(or colo-compare), COLO will crash > >> with segmentation fault. Back track as follow: > >> > >> Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation > fault. > >> 0x55cb200b in eth_get_l2_hdr_length (p=0x0) > >> at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296 > >> 296 uint16_t proto = be16_to_cpu(PKT_GET_ETH_HDR(p)->h_proto); > >> (gdb) bt > >> 0 0x55cb200b in eth_get_l2_hdr_length (p=0x0) > >> at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296 > >> 1 0x55cb22b4 in parse_packet_early (pkt=0x56a44840) at > >> net/colo.c:49 > >> 2 0x55cb2b91 in is_tcp_packet (pkt=0x56a44840) at > >> net/filter-rewriter.c:63 > >> > >> So wrong vnet_hdr_len will cause pkt->data become NULL. Add check to > >> raise error and add trace-events to track vnet_hdr_len. > >> > >> Signed-off-by: Tao Xu > >> Signed-off-by: Zhang Chen > >> Reviewed-by: Li Zhijian > >> Signed-off-by: Jason Wang > > Hi -- Coverity points out that there's a problem with this fix (CID > > 1490786): > > > >> @@ -46,7 +46,14 @@ int parse_packet_early(Packet *pkt) > >> static const uint8_t vlan[] = {0x81, 0x00}; > >> uint8_t *data = pkt->data + pkt->vnet_hdr_len; > > data here is set to pkt->data + pkt->vnet_hdr_len. > > If pkt->data is NULL then this addition is C undefined behaviour, and > > if pkt->data is not NULL but the integer added is such that the > > pointer ends up not pointing within data, then that is also C > > undefined behaviour... > > > Right. And I don't see how pkt->data can be NULL, maybe a hint of bug > elsewhere. > > > > > >> uint16_t l3_proto; > >> -ssize_t l2hdr_len = eth_get_l2_hdr_length(data); > >> +ssize_t l2hdr_len; > >> + > >> +if (data == NULL) { > > ...so the compiler is allowed to assume that data cannot be NULL > > here, and this entire error checking clause is logically dead code. > > > > If you're trying to check whether vnet_hdr_len is valid, you > > need to do that as an integer arithmetic check before you start > > using it for pointer arithmetic. And you probably want to be > > checking against some kind of bound, not just for whether the > > result is going to be NULL. > > > Or we can let the caller to validate the Pkt before calling this function. > > > > > > Overall this looks kinda sketchy and could probably use more > > detailed code review. Where does the bogus vnet_hdr_len come from in > > the first place? Is this data from the guest, or from the network > > (ie uncontrolled)? > > > If I understand correctly the vnet_hdr_len is set during handshake > (net_fill_rstate()) which is sent from a network backend. > > Chen or Xu, please post a fix and explain what happens in the changelog. OK, we will send a patch to fix it. Thanks Chen > > Thanks > > > > > >> +trace_colo_proxy_main_vnet_info("This packet is not parsed > correctly, " > >> +"pkt->vnet_hdr_len", > >> pkt->vnet_hdr_len); > >> +return 1; > >> +} > > thanks > > -- PMM > >
RE: [PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH
> > > > > > On Fri, Apr 1, 2022 at 11:59 AM Zhang Chen > wrote: > > > > > > > > > > > > > > If the checkpoint occurs when the guest finishes restarting > > > > > > > but has not started running, the runstate_set() may reject > > > > > > > the transition from COLO to PRELAUNCH with the crash log: > > > > > > > > > > > > > > {"timestamp": {"seconds": 1593484591, "microseconds": > > > > > > > 26605},\ > > > > > > > "event": "RESET", "data": {"guest": true, "reason": > > > > > > > "guest-reset"}} > > > > > > > qemu-system-x86_64: invalid runstate transition: 'colo' -> > 'prelaunch' > > > > > > > > > > > > > > Long-term testing says that it's pretty safe. > > > > > > > > > > > > > > Signed-off-by: Like Xu > > > > > > > Signed-off-by: Zhang Chen > > > > > > > > > > > > I'd expect this to get ack from the relevant maintainers. > > > > > > > > > > > > > > > > The scripts/get_maintainer.pl can't find relevant maintainers for this > patch. > > > > > Maybe Paolo have time to cover this simple patch related to runstate? > > > > > > > > No news for a while, any comments for unmaintained files changes ? > > > > Ping... > > > > > > Adding David and Juan. > > > > This looks OK to me; > > > > Acked-by: Dr. David Alan Gilbert > > Great. > > > > > it should be fine to merge it along with the pull that takes the other > > patches. > > Yes, I've queued this series. Hi Jason, did this series get lost in the net queue branch? Thanks Chen > > Thanks > > > > > Dave > > > > > Thanks > > > > > > > > >
RE: [PATCH] gtk: Add show_tabs=on|off command line option.
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Felix xq > Queißner > Sent: Tuesday, June 28, 2022 12:44 AM > To: qemu-devel@nongnu.org > Cc: kra...@redhat.com; th...@redhat.com; Felix "xq" Queißner > > Subject: [PATCH] gtk: Add show_tabs=on|off command line option. > > The patch adds "show_tabs" command line option for GTK ui similar to > "grab_on_hover". This option allows tabbed view mode to not have to be > enabled by hand at each start of the VM. > > Signed-off-by: Felix "xq" Queißner Thanks your patch, but please use your real name to sign a patch. For the details: docs/devel/submitting-a-patch.rst Thanks Chen > --- > qapi/ui.json| 5 - > qemu-options.hx | 2 +- > ui/gtk.c| 4 > 3 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/qapi/ui.json b/qapi/ui.json index 413371d5e8..049e7957a9 100644 > --- a/qapi/ui.json > +++ b/qapi/ui.json > @@ -1195,12 +1195,15 @@ > # assuming the guest will resize the display to match > # the window size then. Otherwise it defaults to "off". > # Since 3.1 > +# @show-tabs: Displays the tab items by default. > +# Since 7.1 > # > # Since: 2.12 > ## > { 'struct' : 'DisplayGTK', >'data': { '*grab-on-hover' : 'bool', > -'*zoom-to-fit' : 'bool' } } > +'*zoom-to-fit' : 'bool', > +'*show-tabs' : 'bool' } } > > ## > # @DisplayEGLHeadless: > diff --git a/qemu-options.hx b/qemu-options.hx index > 377d22fbd8..2b279afff7 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -1937,7 +1937,7 @@ DEF("display", HAS_ARG, QEMU_OPTION_display, > "[,window-close=on|off]\n" > #endif > #if defined(CONFIG_GTK) > -"-display gtk[,full-screen=on|off][,gl=on|off][,grab-on-hover=on|off]\n" > +"-display gtk[,full-screen=on|off][,gl=on|off][,grab-on- > hover=on|off][,show-tabs=on|off]\n" > "[,show-cursor=on|off][,window-close=on|off]\n" > #endif > #if defined(CONFIG_VNC) > diff --git a/ui/gtk.c b/ui/gtk.c > index 2a791dd2aa..1467b8c7d7 100644 > --- a/ui/gtk.c > +++ b/ui/gtk.c > @@ -2390,6 +2390,10 @@ static void gtk_display_init(DisplayState *ds, > DisplayOptions *opts) > opts->u.gtk.grab_on_hover) { > gtk_menu_item_activate(GTK_MENU_ITEM(s->grab_on_hover_item)); > } > +if (opts->u.gtk.has_show_tabs && > +opts->u.gtk.show_tabs) { > +gtk_menu_item_activate(GTK_MENU_ITEM(s->show_tabs_item)); > +} > gd_clipboard_init(s); > } > > -- > 2.36.1 >
RE: [RFC PATCH 00/12] Introduce QEMU userspace ebpf support
> -Original Message- > From: Andrew Melnichenko > Sent: Wednesday, June 29, 2022 6:43 PM > To: Zhang, Chen > Cc: Jason Wang ; qemu-dev de...@nongnu.org>; Paolo Bonzini ; Daniel P. > Berrangé ; Eduardo Habkost > ; Eric Blake ; Markus > Armbruster ; Peter Maydell > ; Thomas Huth ; Laurent > Vivier ; Yuri Benditovich > > Subject: Re: [RFC PATCH 00/12] Introduce QEMU userspace ebpf support > > Hi all, > Nice idea. > It would be great if future patches would add the BPF map support(if uBPF > allows it). The BPF map support is very useful. But current uBPF project don't support this yet. According to the previous discussion with Thomas and Daniel, we should avoid Introduce new git submodule for QEMU. And related general discussion: Why we should avoid new submodules if possible: https://www.mail-archive.com/qemu-devel@nongnu.org/msg897339.html I think for the future development, we have to submit patch to the uBPF project. Thanks Chen > > On Fri, Jun 17, 2022 at 10:51 AM Zhang Chen wrote: > > > > Hi All, > > > > The goal of this series is to bring the power of ebpf to QEMU. > > It makes QEMU have the ability to extend the capabilities without > > requiring changing source code. Just need to load the eBPF binary file > > even at VM runtime. And already have some userspace ebpf > > implementation like: Intel DPDK eBPF, windows eBPF, etc.. > > The original idea suggested by Jason Wang. > > > > eBPF is a revolutionary technology with origins in the Linux > > kernel that can run sandboxed programs in an operating system kernel. > > It is used to safely and efficiently extend the capabilities of the > > kernel without requiring to change kernel source code or load kernel > > modules.(from https://ebpf.io/) > > > > KVM already got benefits from it, but QEMU did not. Hence we want > > to bring the power of eBPF to QEMU. It can load binary eBPF program > > even when VM running. At the same time, add some hooks in QEMU as > the > > user space eBPF load point. Do the things on different layers. > > > >That’s the advantages of kernel eBPF. Most of the functions can be > > implemented in QEMU. This series just a start of the Power of > Programmability. > > > > 1). Safety: > > > > Building on the foundation of seeing and understanding all system > > calls and combining that with a packet and socket-level view of all > > networking operations allows for revolutionary new approaches to > > securing systems. > > > > 2). Tracing & Profiling: > > > > The ability to attach eBPF programs to trace points as well as kernel > > and user application probe points allows unprecedented visibility into > > the runtime behavior of applications and the system itself. > > > > 3). Networking: > > > > The combination of programmability and efficiency makes eBPF a natural > > fit for all packet processing requirements of networking solutions. > > > > 4). Observability & Monitoring: > > > > Instead of relying on static counters and gauges exposed by the > > perating system, eBPF enables the collection & in-kernel aggregation > > of custom metrics and generation of visibility events based on a wide > > range of possible sources. > > > > QEMU userspace ebpf design based on ubpf project > (https://github.com/iovisor/ubpf). > > The most mature userspace ebpf implementation. This project officially > > support by iovisor(Like BCC and bpftrace). This project includes an > > eBPF assembler, disassembler, interpreter (for all platforms), and JIT > > compiler (for x86-64 and Arm64 targets). Qemu userspace ebpf make the > > ubpf project as the git submodule. > > > > Current implementation support load ebpf program and run it in > > net/filter-ubpf module, this filter can support any user defined rules > > to hanle network packet. At the same time, it's easy for other > > developers to use the ubpf infrastructue in QEMU's other modules from > > the function in /ebpf/ubpf.c, and it support JIT. > > > > For the uBPF License is Apache License 2.0, It's OK to compatible > > with QEMU’s GPLv2 LICENSE same as mason. > > > > TODO: Need to add more comments and test-case for ubpf, current > > implementation not include ebpf verifier. But I think maybe it's not a > > big problem, current ebpf load/unload API exposed by QMP command. > > Qemu is a userspace program, if someone want to hack QEMU, no need to > > load a malicious ubpf program, it can hack QE
RE: [PATCH] common-user: Only compile the common user code if have_user is set
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Thomas Huth > Sent: Wednesday, June 22, 2022 10:03 PM > To: qemu-devel@nongnu.org; Riku Voipio > Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Paolo > Bonzini > Subject: [PATCH] common-user: Only compile the common user code if > have_user is set > > There is no need to waste cycles here if we only compile the system binaries > or tools. Additionally, this change is even a hard requirement for building > the > tools on systems that do not have an entry in the common-user/host/ folder > (since common-user/meson.build is trying to add such a path via the > include_directories() command). > > Reported-by: Michael Tokarev > Signed-off-by: Thomas Huth Looks good to me. Reviewed-by: Zhang Chen > --- > common-user/meson.build | 4 > 1 file changed, 4 insertions(+) > > diff --git a/common-user/meson.build b/common-user/meson.build index > 26212dda5c..ac9de5b9e3 100644 > --- a/common-user/meson.build > +++ b/common-user/meson.build > @@ -1,3 +1,7 @@ > +if not have_user > + subdir_done() > +endif > + > common_user_inc += include_directories('host/' / host_arch) > > user_ss.add(files( > -- > 2.31.1 >
RE: [PATCH 1/1] i386/monitor: Fix page table walking issue for LA57 enabled guest
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Yuan Yao > Sent: Thursday, June 9, 2022 4:35 PM > To: Paolo Bonzini ; Philippe Mathieu-Daudé > ; Dr. David Alan Gilbert ; Zhong, > Yang ; Connor Kuehl > Cc: qemu-devel@nongnu.org; Yao, Yuan ; Yamahata, > Isaku > Subject: [PATCH 1/1] i386/monitor: Fix page table walking issue for LA57 > enabled guest > > Don't skip next leve page table for pdpe/pde when the PG_PRESENT_MASK S/leve/level > is set. > > This fixs the issue that no mapping information was collected from "info > mem" for guest with LA57 enabled. > > Signed-off-by: Yuan Yao LGTM. It should same with the excp_helper.c/mmu_translate() la57 implementation. Reviewed-by: Zhang Chen Add Eric and Markus for double check. Thanks Chen > --- > target/i386/monitor.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/target/i386/monitor.c b/target/i386/monitor.c index > 8e4b4d600c..3339550bbe 100644 > --- a/target/i386/monitor.c > +++ b/target/i386/monitor.c > @@ -489,7 +489,7 @@ static void mem_info_la57(Monitor *mon, > CPUArchState *env) > cpu_physical_memory_read(pdp_addr + l2 * 8, , 8); > pdpe = le64_to_cpu(pdpe); > end = (l0 << 48) + (l1 << 39) + (l2 << 30); > -if (pdpe & PG_PRESENT_MASK) { > +if (!(pdpe & PG_PRESENT_MASK)) { > prot = 0; > mem_print(mon, env, , _prot, end, prot); > continue; > @@ -508,7 +508,7 @@ static void mem_info_la57(Monitor *mon, > CPUArchState *env) > cpu_physical_memory_read(pd_addr + l3 * 8, , 8); > pde = le64_to_cpu(pde); > end = (l0 << 48) + (l1 << 39) + (l2 << 30) + (l3 << 21); > -if (pde & PG_PRESENT_MASK) { > +if (!(pde & PG_PRESENT_MASK)) { > prot = 0; > mem_print(mon, env, , _prot, end, prot); > continue; > > base-commit: 6d940eff4734bcb40b1a25f62d7cec5a396f994a > -- > 2.27.0 >
RE: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a submodule for QEMU
> -Original Message- > From: Daniel P. Berrangé > Sent: Monday, June 20, 2022 6:37 PM > To: Zhang, Chen > Cc: Thomas Huth ; Jason Wang > ; qemu-dev ; Paolo > Bonzini ; Eduardo Habkost ; > Eric Blake ; Markus Armbruster > ; Peter Maydell ; Laurent > Vivier ; Yuri Benditovich > ; Andrew Melnychenko > > Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a > submodule for QEMU > > On Mon, Jun 20, 2022 at 10:29:14AM +, Zhang, Chen wrote: > > > > > > > -Original Message- > > > From: Thomas Huth > > > Sent: Monday, June 20, 2022 5:44 PM > > > To: Zhang, Chen ; Daniel P. Berrangé > > > > > > Cc: Jason Wang ; qemu-dev > > de...@nongnu.org>; Paolo Bonzini ; Eduardo > > > Habkost ; Eric Blake ; > > > Markus Armbruster ; Peter Maydell > > > ; Laurent Vivier ; > > > Yuri Benditovich ; Andrew Melnychenko > > > > > > Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project > > > as a submodule for QEMU > > > > > > On 20/06/2022 11.29, Zhang, Chen wrote: > > > > > > > > > > > >> -Original Message- > > > >> From: Thomas Huth > > > >> Sent: Monday, June 20, 2022 4:47 PM > > > >> To: Daniel P. Berrangé ; Zhang, Chen > > > >> > > > >> Cc: Jason Wang ; qemu-dev > > >> de...@nongnu.org>; Paolo Bonzini ; Eduardo > > > >> Habkost ; Eric Blake ; > > > Markus > > > >> Armbruster ; Peter Maydell > > > >> ; Laurent Vivier ; > > > >> Yuri Benditovich ; Andrew > > > >> Melnychenko > > > >> Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf > > > >> project as a submodule for QEMU > > > >> > > > >> On 20/06/2022 10.11, Daniel P. Berrangé wrote: > > > >>> On Mon, Jun 20, 2022 at 05:59:06AM +, Zhang, Chen wrote: > > > >>>> > > > >>>> > > > >>>>> -Original Message- > > > >>>>> From: Daniel P. Berrangé > > > >>>>> Sent: Friday, June 17, 2022 4:05 PM > > > >>>>> To: Zhang, Chen > > > >>>>> Cc: Jason Wang ; qemu-dev > > >>>>> de...@nongnu.org>; Paolo Bonzini ; > > > >>>>> Eduardo Habkost ; Eric Blake > > > >>>>> ; Markus Armbruster > ; > > > >>>>> Peter Maydell ; Thomas Huth > > > >>>>> ; > > > >> Laurent > > > >>>>> Vivier ; Yuri Benditovich > > > >>>>> ; Andrew Melnychenko > > > >>>>> > > > >>>>> Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf > > > >>>>> project as a submodule for QEMU > > > >>>>> > > > >>>>> On Fri, Jun 17, 2022 at 03:36:19PM +0800, Zhang Chen wrote: > > > >>>>>> Make iovisor/ubpf project be a git submodule for QEMU. > > > >>>>>> It will auto clone ubpf project when configure QEMU. > > > >>>>> > > > >>>>> I don't think we need todo this. As it is brand new > > > >>>>> functionality we don't have any back compat issues. We should > > > >>>>> just expect the distros to ship ubpf if they want their QEMU > > > >>>>> builds to take advantage > > > of it. > > > >>>>> > > > >>>> > > > >>>> Yes, agree. It's the best way to use the uBPF project. > > > >>>> But current status is distros(ubuntu, RHEL...) does not ship > > > >>>> the iovisor/ubpf like the iovisor/bcc. So I have to do it. > > > >>>> Or do you have any better suggestions? > > > >>> > > > >>> If distros want to support the functionality, they can add > > > >>> packages for it IMHO. > > > >> > > > >> Yes, let's please avoid new submodules. Submodules can sometimes > > > >> be a real PITA (e.g. if you forget to update before rsync'ing > > > >> your code to a machine that has limited internet access), and if > > > >> users install QEMU from sources, they can also install ubpf from > sources, too. > > > >> And if distros want to support this feature, they can pack
RE: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a submodule for QEMU
> -Original Message- > From: Thomas Huth > Sent: Monday, June 20, 2022 5:44 PM > To: Zhang, Chen ; Daniel P. Berrangé > > Cc: Jason Wang ; qemu-dev de...@nongnu.org>; Paolo Bonzini ; Eduardo > Habkost ; Eric Blake ; Markus > Armbruster ; Peter Maydell > ; Laurent Vivier ; Yuri > Benditovich ; Andrew Melnychenko > > Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a > submodule for QEMU > > On 20/06/2022 11.29, Zhang, Chen wrote: > > > > > >> -Original Message- > >> From: Thomas Huth > >> Sent: Monday, June 20, 2022 4:47 PM > >> To: Daniel P. Berrangé ; Zhang, Chen > >> > >> Cc: Jason Wang ; qemu-dev >> de...@nongnu.org>; Paolo Bonzini ; Eduardo > >> Habkost ; Eric Blake ; > Markus > >> Armbruster ; Peter Maydell > >> ; Laurent Vivier ; Yuri > >> Benditovich ; Andrew Melnychenko > >> > >> Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as > >> a submodule for QEMU > >> > >> On 20/06/2022 10.11, Daniel P. Berrangé wrote: > >>> On Mon, Jun 20, 2022 at 05:59:06AM +, Zhang, Chen wrote: > >>>> > >>>> > >>>>> -Original Message- > >>>>> From: Daniel P. Berrangé > >>>>> Sent: Friday, June 17, 2022 4:05 PM > >>>>> To: Zhang, Chen > >>>>> Cc: Jason Wang ; qemu-dev >>>>> de...@nongnu.org>; Paolo Bonzini ; Eduardo > >>>>> Habkost ; Eric Blake ; > >>>>> Markus Armbruster ; Peter Maydell > >>>>> ; Thomas Huth ; > >> Laurent > >>>>> Vivier ; Yuri Benditovich > >>>>> ; Andrew Melnychenko > >>>>> > >>>>> Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project > >>>>> as a submodule for QEMU > >>>>> > >>>>> On Fri, Jun 17, 2022 at 03:36:19PM +0800, Zhang Chen wrote: > >>>>>> Make iovisor/ubpf project be a git submodule for QEMU. > >>>>>> It will auto clone ubpf project when configure QEMU. > >>>>> > >>>>> I don't think we need todo this. As it is brand new functionality > >>>>> we don't have any back compat issues. We should just expect the > >>>>> distros to ship ubpf if they want their QEMU builds to take advantage > of it. > >>>>> > >>>> > >>>> Yes, agree. It's the best way to use the uBPF project. > >>>> But current status is distros(ubuntu, RHEL...) does not ship the > >>>> iovisor/ubpf like the iovisor/bcc. So I have to do it. > >>>> Or do you have any better suggestions? > >>> > >>> If distros want to support the functionality, they can add packages > >>> for it IMHO. > >> > >> Yes, let's please avoid new submodules. Submodules can sometimes be a > >> real PITA (e.g. if you forget to update before rsync'ing your code to > >> a machine that has limited internet access), and if users install > >> QEMU from sources, they can also install ubpf from sources, too. > >> And if distros want to support this feature, they can package ubpf on > >> their own, as Daniel said. > > > > Hi Daniel and Thomas, > > > > I don't know much the background history of QEMU submodules, but > meson > > build is a submodule for QEMU too. It means user can't install QEMU > > from sources with limited internet access. > > There is no written policy, but I think the general consensus is that we only > ship code in submodules if: > > 1) It's not available in a required version in distros yet > > and > > 2) it is essentially required to build QEMU (like meson) or if the feature has > been part of the QEMU sources before and then moved to a separate > repository (like slirp). > > We ship meson as a submodule since we require some meson features that > are not available with the meson versions in the distros yet. Once the distros > catch up, we'll likely remove the meson submodule from QEMU. > > > And back to Daniel's comments, Yes, the best way is distros add the > > ubpf packages, But maybe it's too late to implement new features for > > us. We can introduce the submodule now and auto change to the distros's > lib when distros add it. For example QEMU's submodule SLIRP do it in the > same way. > > slirp used to be part of the QEMU repository, but then has been moved to a > separate project a
RE: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a submodule for QEMU
> -Original Message- > From: Daniel P. Berrangé > Sent: Monday, June 20, 2022 6:01 PM > To: Zhang, Chen > Cc: Thomas Huth ; Jason Wang > ; qemu-dev ; Paolo > Bonzini ; Eduardo Habkost ; > Eric Blake ; Markus Armbruster > ; Peter Maydell ; Laurent > Vivier ; Yuri Benditovich > ; Andrew Melnychenko > > Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a > submodule for QEMU > > On Mon, Jun 20, 2022 at 09:29:05AM +, Zhang, Chen wrote: > > > > > > > > On 20/06/2022 10.11, Daniel P. Berrangé wrote: > > > > On Mon, Jun 20, 2022 at 05:59:06AM +, Zhang, Chen wrote: > > > >> > > > >> > > > >>> On Fri, Jun 17, 2022 at 03:36:19PM +0800, Zhang Chen wrote: > > > >>>> Make iovisor/ubpf project be a git submodule for QEMU. > > > >>>> It will auto clone ubpf project when configure QEMU. > > > >>> > > > >>> I don't think we need todo this. As it is brand new > > > >>> functionality we don't have any back compat issues. We should > > > >>> just expect the distros to ship ubpf if they want their QEMU builds to > take advantage of it. > > > >>> > > > >> > > > >> Yes, agree. It's the best way to use the uBPF project. > > > >> But current status is distros(ubuntu, RHEL...) does not ship the > > > >> iovisor/ubpf like the iovisor/bcc. So I have to do it. > > > >> Or do you have any better suggestions? > > > > > > > > If distros want to support the functionality, they can add > > > > packages for it IMHO. > > > > > > Yes, let's please avoid new submodules. Submodules can sometimes be > > > a real PITA (e.g. if you forget to update before rsync'ing your code > > > to a machine that has limited internet access), and if users install > > > QEMU from sources, they can also install ubpf from sources, too. > > > And if distros want to support this feature, they can package ubpf > > > on their own, as Daniel said. > > > > Hi Daniel and Thomas, > > > > I don't know much the background history of QEMU submodules, but > meson > > build is a submodule for QEMU too. It means user can't install QEMU > > from sources with limited internet access. > > And back to Daniel's comments, Yes, the best way is distros add the > > ubpf packages, But maybe it's too late to implement new features for > > us. We can introduce the submodule now and auto change to the distros's > lib when distros add it. For example QEMU's submodule SLIRP do it in the > same way. > > It's already added by most distros and still as a QEMU submodule. It > > make user experience the latest technology with no other dependencies. > > uBPF infrastructure have the ability to extend the capabilities > > without requiring changing source code. If we not allow it, we have to re- > implement all the eBPF assembler, disassembler, interpreter, and JIT > compiler like DPDK userspace eBPF support (DPDK can't use ubpf project by > license issue). > > Slirp is a different scenario. That was functionality that was historically > integrated into QEMU and was then spun out into a standalone project. Since > we had existing users on existing distro releases dependant on Slirp, we > wanted to give a smooth upgrade experiance by bundling Slirp. Essentially > the goal was to avoid the regression if someone deployed new QEMU on > existing distros. > > Meson is a fairly similar scenario. We wanted to swap the build system in > QEMU over to Meson, and that change would affect all existing users of > QEMU. > Many distros didn't have a new enough meson, and so bundling it in QEMU > enables us to give a smooth upgrade path without any regression for existing > users on existing distros. > > This patch, however, is proposing an entirely new piece of functionality that > has no existing users and even once present will be used by relatively few > users compartively speaking. As such there is no upgrade compatibility / > regression scenario that we need to worry about. Anyone interested in the > new functionality can be reasonably asked to either wait for the distro to > package it, or build it themselves. > OK, got your point. For this series, should we introduce an external library "libubpf" in QEMU configure? If this library is found, the relevant files will be compiled and the feature can be enabled in QEMU. Thanks Chen > With regards, > Daniel > -- > |: https://berrange.com -o- > https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o- > https://www.instagram.com/dberrange :|
RE: [RFC PATCH 08/12] qapi: Add FilterUbpfProperties and qemu-options
> -Original Message- > From: Markus Armbruster > Sent: Monday, June 20, 2022 3:45 PM > To: Zhang, Chen > Cc: Jason Wang ; qemu-dev de...@nongnu.org>; Paolo Bonzini ; Daniel > P.Berrangé ; Eduardo Habkost > ; Eric Blake ; Peter Maydell > ; Thomas Huth ; Laurent > Vivier ; Yuri Benditovich > ; Andrew Melnychenko > > Subject: Re: [RFC PATCH 08/12] qapi: Add FilterUbpfProperties and qemu- > options > > Zhang Chen writes: > > > Add filter-ubpf related QOM and qemu-options. > > > > Signed-off-by: Zhang Chen > > --- > > qapi/qom.json | 18 ++ > > qemu-options.hx | 6 ++ > > 2 files changed, 24 insertions(+) > > > > diff --git a/qapi/qom.json b/qapi/qom.json index > > 6a653c6636..820a5218e8 100644 > > --- a/qapi/qom.json > > +++ b/qapi/qom.json > > @@ -444,6 +444,22 @@ > >'base': 'NetfilterProperties', > >'data': { '*vnet_hdr_support': 'bool' } } > > > > +## > > +# @FilterUbpfProperties: > > +# > > +# Properties for filter-ubpf objects. > > +# > > +# @ip-mode: if true, IP packet handle mode is enabled(default: true). > > Space between "enabled" and "(default: true)", please. > > I'm not sure about the name @ip-mode. A mode tends to be an enum. A > boolean tends to be a flag, like @disable-packed-handle-mode. Note that I > reverted the sense there, to make the default false. Thanks for your review Markus~ Makes sense. Current mode just include ip-mode and raw-mode, it looks Need change to a enum mode for it, maybe will add other mode in the future. > > > +# > > +# @ubpf-handler: The filename where the userspace ebpf packets > handler. > > +# > > +# Since: 7.1 > > +## > > +{ 'struct': 'FilterUbpfProperties', > > + 'base': 'NetfilterProperties', > > + 'data': { '*ip-mode': 'bool', > > +'*ubpf-handler': 'str' } } > > + > > ## > > # @InputBarrierProperties: > > # > > @@ -845,6 +861,7 @@ > > 'filter-redirector', > > 'filter-replay', > > 'filter-rewriter', > > +'filter-ubpf', > > 'input-barrier', > > { 'name': 'input-linux', > >'if': 'CONFIG_LINUX' }, > > @@ -911,6 +928,7 @@ > >'filter-redirector': 'FilterRedirectorProperties', > >'filter-replay': 'NetfilterProperties', > >'filter-rewriter':'FilterRewriterProperties', > > + 'filter-ubpf':'FilterUbpfProperties', > >'input-barrier': 'InputBarrierProperties', > >'input-linux':{ 'type': 'InputLinuxProperties', > >'if': 'CONFIG_LINUX' }, diff > > --git a/qemu-options.hx b/qemu-options.hx index 60cf188da4..3dfb858867 > > 100644 > > --- a/qemu-options.hx > > +++ b/qemu-options.hx > > @@ -5080,6 +5080,12 @@ SRST > > stored. The file format is libpcap, so it can be analyzed with > > tools such as tcpdump or Wireshark. > > > > +``-object filter-ubpf,id=id,netdev=dev,ubpf-handler=filename[,ip- > mode][,position=head|tail|id=][,insert=behind|before]`` > > +filter-ubpf is the userspace ebpf network traffic handler on netdev > dev > > +from the userspace ebpf handler file specified by filename. > > I believe the conventional capitalization is eBPF. Yes, will fix it. > > > +If disable ip_mode, the loaded ebpf program will handle raw > > Markup: ``ip_mode``. OK. > > > +network packet. > > Suggest something like "If ``ip_mode`` is off, the eBPF program is fed raw > network packets" (hope I'm not misinterpreting things). OK. I will address your comments in next version. Thanks Chen > > > + > > ``-object colo- > compare,id=id,primary_in=chardevid,secondary_in=chardevid,outdev=chard > evid,iothread=id[,vnet_hdr_support][,notify_dev=id][,compare_timeout=@ > var{ms}][,expired_scan_cycle=@var{ms}][,max_queue_size=@var{size}]`` > > Colo-compare gets packet from primary\_in chardevid and > > secondary\_in, then compare whether the payload of primary > > packet
RE: [RFC PATCH 12/12] test/qtest: Add ubpf basic test case
> -Original Message- > From: Thomas Huth > Sent: Friday, June 17, 2022 5:34 PM > To: Zhang, Chen ; Jason Wang > ; qemu-dev ; Paolo > Bonzini ; Daniel P. Berrangé > ; Eduardo Habkost ; Eric > Blake ; Markus Armbruster > Cc: Peter Maydell ; Laurent Vivier > ; Yuri Benditovich ; > Andrew Melnychenko > Subject: Re: [RFC PATCH 12/12] test/qtest: Add ubpf basic test case > > On 17/06/2022 09.36, Zhang Chen wrote: > > TODO: This test case does not work. Need add ubpf.h header in qtest > > compile "-I ../ubpf/vm -I ../ubpf/vm/inc". > > I'm not sure if we need it in qtest. Because normal tests/qtest not > > including external module test case like fdt. Or we just need a qtest > > case for filter-ubpf module. > > This test will load pre-compiled ebpf binary and run it in QEMU. > > > > Signed-off-by: Zhang Chen > > --- > [...] > > diff --git a/tests/qtest/ubpf-test.c b/tests/qtest/ubpf-test.c new > > file mode 100644 index 00..6e70a99320 > > --- /dev/null > > +++ b/tests/qtest/ubpf-test.c > > @@ -0,0 +1,64 @@ > > +/* > > + * QEMU Userspace eBPF test case > > + * > > + * Copyright(C) 2022 Intel Corporation. > > + * > > + * Author: > > + * Zhang Chen > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2 or > > later. > > + * See the COPYING file in the top-level directory. > > + * > > + */ > > + > > +#include "qemu/osdep.h" > > +#include "libqtest.h" > > +#include "ebpf/ubpf.h" > > + > > +/* > > + * Demo userspace ebpf program > > + * The test binary use clang to build this source code: > > + * demo_ubpf.c > > + * > > + * #include > > + * > > + * static uint32_t double_it(uint32_t a) > > + * { > > + * return (a * 2); > > + * } > > + * > > + * uint32_t bpf_prog(int32_t *arg) { > > + * uint32_t result = 0; > > + * result = double_it(*arg); > > + * > > + * return result; > > + * } > > + * > > + * Build the userspace ebpf program binary file: > > + * clang -O2 -target bpf -c demo_ubpf.c -o demo_ubpf.o > > + * > > + * The external terget source: > > + * printf "%b" '\x05\x00\x00\x00' > integer_5.mem > > + * > > + */ > > + > > +int main(int argc, char **argc) > > +{ > > +UbpfState u_ebpf; > > +char program_path[] = "demo_ubpf.o"; > > +/* uBPF can read target from internal source or external source*/ > > +char target_path[] = "integer_5.mem"; > > + > > +qemu_ubpf_init_jit(_ebpf, true); > > + > > +g_assert_cmpuint(qemu_ubpf_prepare(_ebpf, program_path), ==, > > + 0); > > + > > +g_assert_true(qemu_ubpf_read_target(_ebpf, target_path)); > > + > > +g_assert_cmpuint(qemu_run_ubpf_once(_ebpf, u_ebpf.target, > > +u_ebpf.target_len), ==, 10); > > + > > +ubpf_destroy(u_ebpf.vm); > > + > > +return 0; > > +} > > Apart from the #include "libqtest.h" there is nothing related to qtest in > here ... should this maybe rather go into test/unit/ instead? Rethink about it, I think you are right. The only issue is can we involve submodule's header file in tests/unit? I can't find meson/fdt/SLIRP test cases in the tests. Thanks Chen > > Thomas
RE: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a submodule for QEMU
> -Original Message- > From: Thomas Huth > Sent: Monday, June 20, 2022 4:47 PM > To: Daniel P. Berrangé ; Zhang, Chen > > Cc: Jason Wang ; qemu-dev de...@nongnu.org>; Paolo Bonzini ; Eduardo > Habkost ; Eric Blake ; Markus > Armbruster ; Peter Maydell > ; Laurent Vivier ; Yuri > Benditovich ; Andrew Melnychenko > > Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a > submodule for QEMU > > On 20/06/2022 10.11, Daniel P. Berrangé wrote: > > On Mon, Jun 20, 2022 at 05:59:06AM +, Zhang, Chen wrote: > >> > >> > >>> -Original Message- > >>> From: Daniel P. Berrangé > >>> Sent: Friday, June 17, 2022 4:05 PM > >>> To: Zhang, Chen > >>> Cc: Jason Wang ; qemu-dev >>> de...@nongnu.org>; Paolo Bonzini ; Eduardo > >>> Habkost ; Eric Blake ; > >>> Markus Armbruster ; Peter Maydell > >>> ; Thomas Huth ; > Laurent > >>> Vivier ; Yuri Benditovich > >>> ; Andrew Melnychenko > >>> > >>> Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project > >>> as a submodule for QEMU > >>> > >>> On Fri, Jun 17, 2022 at 03:36:19PM +0800, Zhang Chen wrote: > >>>> Make iovisor/ubpf project be a git submodule for QEMU. > >>>> It will auto clone ubpf project when configure QEMU. > >>> > >>> I don't think we need todo this. As it is brand new functionality we > >>> don't have any back compat issues. We should just expect the distros > >>> to ship ubpf if they want their QEMU builds to take advantage of it. > >>> > >> > >> Yes, agree. It's the best way to use the uBPF project. > >> But current status is distros(ubuntu, RHEL...) does not ship the > >> iovisor/ubpf like the iovisor/bcc. So I have to do it. > >> Or do you have any better suggestions? > > > > If distros want to support the functionality, they can add packages > > for it IMHO. > > Yes, let's please avoid new submodules. Submodules can sometimes be a > real PITA (e.g. if you forget to update before rsync'ing your code to a > machine that has limited internet access), and if users install QEMU from > sources, they can also install ubpf from sources, too. > And if distros want to support this feature, they can package ubpf on their > own, as Daniel said. Hi Daniel and Thomas, I don't know much the background history of QEMU submodules, but meson build is a submodule for QEMU too. It means user can't install QEMU from sources with limited internet access. And back to Daniel's comments, Yes, the best way is distros add the ubpf packages, But maybe it's too late to implement new features for us. We can introduce the submodule now and auto change to the distros's lib when distros add it. For example QEMU's submodule SLIRP do it in the same way. It's already added by most distros and still as a QEMU submodule. It make user experience the latest technology with no other dependencies. uBPF infrastructure have the ability to extend the capabilities without requiring changing source code. If we not allow it, we have to re-implement all the eBPF assembler, disassembler, interpreter, and JIT compiler like DPDK userspace eBPF support (DPDK can't use ubpf project by license issue). Thanks Chen > > Thomas
RE: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a submodule for QEMU
> -Original Message- > From: Daniel P. Berrangé > Sent: Friday, June 17, 2022 4:05 PM > To: Zhang, Chen > Cc: Jason Wang ; qemu-dev de...@nongnu.org>; Paolo Bonzini ; Eduardo > Habkost ; Eric Blake ; Markus > Armbruster ; Peter Maydell > ; Thomas Huth ; Laurent > Vivier ; Yuri Benditovich > ; Andrew Melnychenko > > Subject: Re: [RFC PATCH 01/12] configure: Add iovisor/ubpf project as a > submodule for QEMU > > On Fri, Jun 17, 2022 at 03:36:19PM +0800, Zhang Chen wrote: > > Make iovisor/ubpf project be a git submodule for QEMU. > > It will auto clone ubpf project when configure QEMU. > > I don't think we need todo this. As it is brand new functionality we don't > have > any back compat issues. We should just expect the distros to ship ubpf if > they want their QEMU builds to take advantage of it. > Yes, agree. It's the best way to use the uBPF project. But current status is distros(ubuntu, RHEL...) does not ship the iovisor/ubpf like the iovisor/bcc. So I have to do it. Or do you have any better suggestions? Thanks Chen > > With regards, > Daniel > -- > |: https://berrange.com -o- > https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o- > https://www.instagram.com/dberrange :|
[RFC PATCH 12/12] test/qtest: Add ubpf basic test case
TODO: This test case does not work. Need add ubpf.h header in qtest compile "-I ../ubpf/vm -I ../ubpf/vm/inc". I'm not sure if we need it in qtest. Because normal tests/qtest not including external module test case like fdt. Or we just need a qtest case for filter-ubpf module. This test will load pre-compiled ebpf binary and run it in QEMU. Signed-off-by: Zhang Chen --- tests/qtest/demo_ubpf.o | Bin 0 -> 544 bytes tests/qtest/integer_5.mem | Bin 0 -> 4 bytes tests/qtest/meson.build | 3 +- tests/qtest/ubpf-test.c | 64 ++ 4 files changed, 66 insertions(+), 1 deletion(-) create mode 100644 tests/qtest/demo_ubpf.o create mode 100644 tests/qtest/integer_5.mem create mode 100644 tests/qtest/ubpf-test.c diff --git a/tests/qtest/demo_ubpf.o b/tests/qtest/demo_ubpf.o new file mode 100644 index ..960a411c224348548db42d9ae2716ae3ef4ea249 GIT binary patch literal 544 zcmb<-^>JfjWMqH=MuzVU2p#Csy$>0EHJ20>URVE5RB+`KtNZ(Wl7bhtPlwo1` z_#a~fe9`w0b}Wvq*jzLBo(B^7Zl~EGw9{yl;y@Jrlb@VXQnfh0>yPpQj1IU zk{R@hONvSolYn$(E{LWM&;lC6jK!!0P%$esIrOjt@j;jkO`QXj5BDdO{66uitn o|MP)V1G3ZtWDbzc0_CIIZv+%agepQ)1eEE4qz|MHW + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "libqtest.h" +#include "ebpf/ubpf.h" + +/* + * Demo userspace ebpf program + * The test binary use clang to build this source code: + * demo_ubpf.c + * + * #include + * + * static uint32_t double_it(uint32_t a) + * { + * return (a * 2); + * } + * + * uint32_t bpf_prog(int32_t *arg) { + * uint32_t result = 0; + * result = double_it(*arg); + * + * return result; + * } + * + * Build the userspace ebpf program binary file: + * clang -O2 -target bpf -c demo_ubpf.c -o demo_ubpf.o + * + * The external terget source: + * printf "%b" '\x05\x00\x00\x00' > integer_5.mem + * + */ + +int main(int argc, char **argc) +{ +UbpfState u_ebpf; +char program_path[] = "demo_ubpf.o"; +/* uBPF can read target from internal source or external source*/ +char target_path[] = "integer_5.mem"; + +qemu_ubpf_init_jit(_ebpf, true); + +g_assert_cmpuint(qemu_ubpf_prepare(_ebpf, program_path), ==, 0); + +g_assert_true(qemu_ubpf_read_target(_ebpf, target_path)); + +g_assert_cmpuint(qemu_run_ubpf_once(_ebpf, u_ebpf.target, +u_ebpf.target_len), ==, 10); + +ubpf_destroy(u_ebpf.vm); + +return 0; +} -- 2.25.1
[RFC PATCH 11/12] docs/devel: Add userspace-ebpf.rst
Introduce userspace ebpf basic knowledge. Signed-off-by: Zhang Chen --- docs/devel/userspace-ebpf.rst | 106 ++ 1 file changed, 106 insertions(+) create mode 100644 docs/devel/userspace-ebpf.rst diff --git a/docs/devel/userspace-ebpf.rst b/docs/devel/userspace-ebpf.rst new file mode 100644 index 00..41eb9b04d6 --- /dev/null +++ b/docs/devel/userspace-ebpf.rst @@ -0,0 +1,106 @@ +=== +Userspace eBPF support +=== + +eBPF is a revolutionary technology with origins in the Linux kernel that +can run sandboxed programs in an operating system kernel. It is used to +safely and efficiently extend the capabilities of the kernel without +requiring to change kernel source code or load kernel +modules.(from https://ebpf.io/) + +Recently, I worked on QEMU net filter related jobs, like netfilter/iptables +in kernel. We noticed kernel extend the netfilter original cBPF to eBPF, + +It make Linux kernel have the ability to load code dynamically. Why not +enable user space eBPF in QEMU? It can load binary eBPF program even +when VM running. Add some hooks in QEMU as the user space eBPF load point. +Do the things on different layers. The original idea from Jason Wang. + + +That???s the advantages of kernel eBPF. Most of the functions can be +implemented in QEMU. The Power of Programmability. + +1). Safety: + +Building on the foundation of seeing and understanding all system +calls and combining that with a packet and socket-level view of all +networking operations allows for revolutionary new approaches to +securing systems. + +2). Tracing & Profiling: + +The ability to attach eBPF programs to trace points as well as kernel +and user application probe points allows unprecedented visibility into +the runtime behavior of applications and the system itself. + +3). Networking: + +The combination of programmability and efficiency makes eBPF a natural +fit for all packet processing requirements of networking solutions. + +4). Observability & Monitoring: + +Instead of relying on static counters and gauges exposed by the +perating system, eBPF enables the collection & in-kernel aggregation +of custom metrics and generation of visibility events based on a wide +range of possible sources. + +Qemu userspace ebpf design based on ubpf project (https://github.com/iovisor/ubpf). +The most mature userspace ebpf implementation. This project officially +support by iovisor(Like BCC and bpftrace). Qemu userspace ebpf make +the ubpf project as the git submodule. + +Current implementation support load ebpf program and run it in +filter-ubpf module, developer can easy reuse the ubpf function in +Qemu's other modules from the function in /ebpf/ubpf.c, And it support JIT. +For the uBPF License is Apache License 2.0, It's OK to compatible +with QEMU???s GPLv2 LICENSE same as mason. + +How to use it: +1. Write your ebpf C program. For example filter dst IP: + +bpf_filter.c + +#include +#include + +#define ONE_ONE_ONE_ONE 0x01010101 + +struct ipv4_header { +uint8_t ver_ihl; +uint8_t tos; +uint16_t total_length; +uint16_t id; +uint16_t frag; +uint8_t ttl; +uint8_t proto; +uint16_t csum; +uint32_t src; +uint32_t dst; +}; + +int is_dst_one_one_one_one(void *opaque) { +struct ipv4_header *ipv4_header = (struct ipv4_header*)opaque; + +if (ntohl(ipv4_header->dst) == ONE_ONE_ONE_ONE) { +return 1; +} + +return 0; +} + +2. Build it with clang: + clang -O2 -target bpf -c bpf_filter.c -o ip_dst.o + +3. Load it with Qemu filter-ubpf: + -object filter-ubpf,netdev=hn0,id=ubpf1,queue=tx,ip-mode=on, + ubpf-handler=ip_dst.o + +4. Boot the VM and it will filt IP dst 1.1.1.1 packet. + + +TODO: Need to add more comments and test-case for ubpf, current + implementation not include ebpf verifier. Qemu is a userspace + program, not like kernel ebpf run code in kernel space, I think + if the someone want to hack Qemu code no need to load a malicious + ubpf program, he can hack Qemu code directly. -- 2.25.1
[RFC PATCH 07/12] net/filter: Introduce filter-ubpf module
The filter-ubpf module able to load user defined ebpf program to handle network packet based on filter framework. Signed-off-by: Zhang Chen --- net/filter-ubpf.c | 149 ++ net/meson.build | 1 + 2 files changed, 150 insertions(+) create mode 100644 net/filter-ubpf.c diff --git a/net/filter-ubpf.c b/net/filter-ubpf.c new file mode 100644 index 00..c63a021759 --- /dev/null +++ b/net/filter-ubpf.c @@ -0,0 +1,149 @@ +/* + * QEMU Userspace eBPF Support + * + * Copyright(C) 2022 Intel Corporation. + * + * Author: + * Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "net/filter.h" +#include "net/net.h" +#include "qapi/error.h" +#include "qom/object.h" +#include "qemu/main-loop.h" +#include "qemu/error-report.h" +#include "trace.h" +#include "ebpf/ubpf.h" + +#define TYPE_FILTER_UBPF "filter-ubpf" +OBJECT_DECLARE_SIMPLE_TYPE(FiliterUbpfState, FILTER_UBPF) + +struct FiliterUbpfState { +NetFilterState parent_obj; +bool ip_mode; +char *handler; +UbpfState ubpf; +}; + +static ssize_t filter_ubpf_receive_iov(NetFilterState *nf, + NetClientState *sender, + unsigned flags, + const struct iovec *iov, + int iovcnt, + NetPacketSent *sent_cb) +{ +/* TODO: handle packet by loaded userspace ebpf program */ + +return 0; +} + +static void filter_ubpf_cleanup(NetFilterState *nf) +{ +/* cleanup */ +} + +static void filter_ubpf_setup(NetFilterState *nf, Error **errp) +{ +FiliterUbpfState *s = FILTER_UBPF(nf); + +if (s->handler == NULL) { +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, "filter-ubpf parameter"\ + " 'ubpf-handler' cannot be empty"); +return; +} + +qemu_ubpf_init_jit(>ubpf, true); + +if (qemu_ubpf_prepare(>ubpf, s->handler)) { +error_set(errp, ERROR_CLASS_DEVICE_NOT_ACTIVE, "filter-ubpf parameter"\ + " 'ubpf-handler' cannot be load"); +return; +} +} + +static char *filter_ubpf_get_handler(Object *obj, Error **errp) +{ +FiliterUbpfState *s = FILTER_UBPF(obj); + +return g_strdup(s->handler); +} + +static void filter_ubpf_set_handler(Object *obj, +const char *value, +Error **errp) +{ +FiliterUbpfState *s = FILTER_UBPF(obj); + +g_free(s->handler); +s->handler = g_strdup(value); +if (!s->handler) { +error_setg(errp, "filter ubpf needs 'ubpf-handler' " + "property set"); +return; +} +} + +static bool filter_ubpf_get_mode(Object *obj, Error **errp) +{ +FiliterUbpfState *s = FILTER_UBPF(obj); + +return s->ip_mode; +} + +static void filter_ubpf_set_mode(Object *obj, bool value, Error **errp) +{ +FiliterUbpfState *s = FILTER_UBPF(obj); + +s->ip_mode = value; +} + +static void filter_ubpf_class_init(ObjectClass *oc, void *data) +{ +NetFilterClass *nfc = NETFILTER_CLASS(oc); + +object_class_property_add_str(oc, "ubpf-handler", + filter_ubpf_get_handler, + filter_ubpf_set_handler); +object_class_property_add_bool(oc, "ip-mode", + filter_ubpf_get_mode, + filter_ubpf_set_mode); + +nfc->setup = filter_ubpf_setup; +nfc->cleanup = filter_ubpf_cleanup; +nfc->receive_iov = filter_ubpf_receive_iov; +} + +static void filter_ubpf_init(Object *obj) +{ +FiliterUbpfState *s = FILTER_UBPF(obj); + +/* Filter-ubpf default is ip_mode */ +s->ip_mode = true; +} + +static void filter_ubpf_fini(Object *obj) +{ +/* do some thing */ +} + +static const TypeInfo filter_ubpf_info = { +.name = TYPE_FILTER_UBPF, +.parent = TYPE_NETFILTER, +.class_init = filter_ubpf_class_init, +.instance_init = filter_ubpf_init, +.instance_finalize = filter_ubpf_fini, +.instance_size = sizeof(FiliterUbpfState), +}; + +static void register_types(void) +{ +type_register_static(_ubpf_info); +} + +type_init(register_types); diff --git a/net/meson.build b/net/meson.build index 754e2d1d40..177078fa7a 100644 --- a/net/meson.build +++ b/net/meson.build @@ -14,6 +14,7 @@ softmmu_ss.add(files( 'queue.c', 'socket.c', 'util.c', + 'filter-ubpf.c', )) softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('filter-replay.c')) -- 2.25.1
[RFC PATCH 08/12] qapi: Add FilterUbpfProperties and qemu-options
Add filter-ubpf related QOM and qemu-options. Signed-off-by: Zhang Chen --- qapi/qom.json | 18 ++ qemu-options.hx | 6 ++ 2 files changed, 24 insertions(+) diff --git a/qapi/qom.json b/qapi/qom.json index 6a653c6636..820a5218e8 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -444,6 +444,22 @@ 'base': 'NetfilterProperties', 'data': { '*vnet_hdr_support': 'bool' } } +## +# @FilterUbpfProperties: +# +# Properties for filter-ubpf objects. +# +# @ip-mode: if true, IP packet handle mode is enabled(default: true). +# +# @ubpf-handler: The filename where the userspace ebpf packets handler. +# +# Since: 7.1 +## +{ 'struct': 'FilterUbpfProperties', + 'base': 'NetfilterProperties', + 'data': { '*ip-mode': 'bool', +'*ubpf-handler': 'str' } } + ## # @InputBarrierProperties: # @@ -845,6 +861,7 @@ 'filter-redirector', 'filter-replay', 'filter-rewriter', +'filter-ubpf', 'input-barrier', { 'name': 'input-linux', 'if': 'CONFIG_LINUX' }, @@ -911,6 +928,7 @@ 'filter-redirector': 'FilterRedirectorProperties', 'filter-replay': 'NetfilterProperties', 'filter-rewriter':'FilterRewriterProperties', + 'filter-ubpf':'FilterUbpfProperties', 'input-barrier': 'InputBarrierProperties', 'input-linux':{ 'type': 'InputLinuxProperties', 'if': 'CONFIG_LINUX' }, diff --git a/qemu-options.hx b/qemu-options.hx index 60cf188da4..3dfb858867 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -5080,6 +5080,12 @@ SRST stored. The file format is libpcap, so it can be analyzed with tools such as tcpdump or Wireshark. +``-object filter-ubpf,id=id,netdev=dev,ubpf-handler=filename[,ip-mode][,position=head|tail|id=][,insert=behind|before]`` +filter-ubpf is the userspace ebpf network traffic handler on netdev dev +from the userspace ebpf handler file specified by filename. +If disable ip_mode, the loaded ebpf program will handle raw +network packet. + ``-object colo-compare,id=id,primary_in=chardevid,secondary_in=chardevid,outdev=chardevid,iothread=id[,vnet_hdr_support][,notify_dev=id][,compare_timeout=@var{ms}][,expired_scan_cycle=@var{ms}][,max_queue_size=@var{size}]`` Colo-compare gets packet from primary\_in chardevid and secondary\_in, then compare whether the payload of primary packet -- 2.25.1
[RFC PATCH 05/12] ebpf/uBPF: Add qemu_prepare_ubpf to load ebpf binary
The qemu_prepare_ubpf() can load user defined userspace ebpf binary file to Qemu userspace ebpf VM but not run it. The ebpf program will triggered in the hook point. Signed-off-by: Zhang Chen --- ebpf/ubpf-stub.c | 5 +++ ebpf/ubpf.c | 100 +++ ebpf/ubpf.h | 1 + 3 files changed, 106 insertions(+) diff --git a/ebpf/ubpf-stub.c b/ebpf/ubpf-stub.c index 2e8bf15b91..885bd954b7 100644 --- a/ebpf/ubpf-stub.c +++ b/ebpf/ubpf-stub.c @@ -22,3 +22,8 @@ bool qemu_ubpf_read_target(UbpfState *u_ebpf, char *path) } void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit) {} + +int qemu_ubpf_prepare(UbpfState *u_ebpf, char *code_path) +{ +return 0; +} diff --git a/ebpf/ubpf.c b/ebpf/ubpf.c index 38a6530903..d65fffeda3 100644 --- a/ebpf/ubpf.c +++ b/ebpf/ubpf.c @@ -99,3 +99,103 @@ void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit) { u_ebpf->jit = jit; } + +static uint64_t gather_bytes(uint8_t a, uint8_t b, uint8_t c, + uint8_t d, uint8_t e) +{ +return ((uint64_t)a << 32) | + ((uint32_t)b << 24) | + ((uint32_t)c << 16) | + ((uint16_t)d << 8) | + e; +} + +static void trash_registers(void) +{ +/* Overwrite all caller-save registers */ +asm( +"mov $0xf0, %rax;" +"mov $0xf1, %rcx;" +"mov $0xf2, %rdx;" +"mov $0xf3, %rsi;" +"mov $0xf4, %rdi;" +"mov $0xf5, %r8;" +"mov $0xf6, %r9;" +"mov $0xf7, %r10;" +"mov $0xf8, %r11;" +); +} + +static uint32_t sqrti(uint32_t x) +{ +return sqrt(x); +} + +static uint64_t unwind(uint64_t i) +{ +return i; +} + +static void register_functions(struct ubpf_vm *vm) +{ +ubpf_register(vm, 0, "gather_bytes", gather_bytes); +ubpf_register(vm, 1, "memfrob", memfrob); +ubpf_register(vm, 2, "trash_registers", trash_registers); +ubpf_register(vm, 3, "sqrti", sqrti); +ubpf_register(vm, 4, "strcmp_ext", strcmp); +ubpf_register(vm, 5, "unwind", unwind); +ubpf_set_unwind_function_index(vm, 5); +} + +int qemu_ubpf_prepare(UbpfState *u_ebpf, char *code_path) +{ +bool is_elf; +char *errmsg; +int ret; + +if (!qemu_ubpf_read_code(u_ebpf, code_path)) { +error_report("Ubpf failed to read code"); +return -1; +} + +u_ebpf->vm = ubpf_create(); +if (!u_ebpf->vm) { +error_report("Failed to create ubpf VM"); +return -1; +} + +register_functions(u_ebpf->vm); + +/* + * The ELF magic corresponds to an RSH instruction with an offset, + * which is invalid. + */ + is_elf = u_ebpf->code_len >= SELFMAG && !memcmp(u_ebpf->code, + ELFMAG, SELFMAG); + +if (is_elf) { +ret = ubpf_load_elf(u_ebpf->vm, u_ebpf->code, +u_ebpf->code_len, ); +} else { +ret = ubpf_load(u_ebpf->vm, u_ebpf->code, +u_ebpf->code_len, ); +} + +if (ret < 0) { +error_report("Failed to load ubpf code: %s ", errmsg); +free(errmsg); +ubpf_destroy(u_ebpf->vm); +return -1; +} + +if (u_ebpf->jit) { +u_ebpf->fn = ubpf_compile(u_ebpf->vm, ); +if (u_ebpf->fn == NULL) { +error_report("Failed to ubpf compile: %s", errmsg); +free(errmsg); +return -1; +} +} + +return 0; +} diff --git a/ebpf/ubpf.h b/ebpf/ubpf.h index 808c02565c..9a35efbeb6 100644 --- a/ebpf/ubpf.h +++ b/ebpf/ubpf.h @@ -37,5 +37,6 @@ typedef struct UbpfState { bool qemu_ubpf_read_code(UbpfState *u_ebpf, char *path); bool qemu_ubpf_read_target(UbpfState *u_ebpf, char *path); void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit); +int qemu_ubpf_prepare(UbpfState *u_ebpf, char *code_path); #endif /* QEMU_UBPF_H */ -- 2.25.1
[RFC PATCH 10/12] net/filter-ubpf.c: run the ubpf program to handle network packet
Run the loaded userspace ebpf program with the packet. Signed-off-by: Zhang Chen --- net/filter-ubpf.c | 40 ++-- 1 file changed, 38 insertions(+), 2 deletions(-) diff --git a/net/filter-ubpf.c b/net/filter-ubpf.c index c63a021759..554cc24d8f 100644 --- a/net/filter-ubpf.c +++ b/net/filter-ubpf.c @@ -20,6 +20,8 @@ #include "qemu/error-report.h" #include "trace.h" #include "ebpf/ubpf.h" +#include "colo.h" +#include "util.h" #define TYPE_FILTER_UBPF "filter-ubpf" OBJECT_DECLARE_SIMPLE_TYPE(FiliterUbpfState, FILTER_UBPF) @@ -38,9 +40,43 @@ static ssize_t filter_ubpf_receive_iov(NetFilterState *nf, int iovcnt, NetPacketSent *sent_cb) { -/* TODO: handle packet by loaded userspace ebpf program */ +FiliterUbpfState *s = FILTER_UBPF(nf); +size_t size; +char *buf; +Packet *pkt = NULL; +uint64_t result; + +size = iov_size(iov, iovcnt); +if (!size) { +return 0; +} + +buf = g_malloc(size); +if (unlikely(iov_to_buf(iov, iovcnt, 0, buf, size) != size)) { +g_free(buf); +return 0; +} + +pkt = packet_new_nocopy(buf, size, 0); -return 0; +if (parse_packet_early(pkt)) { +packet_destroy(pkt, NULL); +pkt = NULL; +return 0; +} + +if (s->ip_mode) { +result = qemu_ubpf_run_once(>ubpf, pkt->ip, sizeof(struct ip)); +} else { +result = qemu_ubpf_run_once(>ubpf, pkt->data, pkt->size); +} + +/* If result == 1, means trigger the ebpf program rules */ +if (result) { +return -1; +} else { +return 0; +} } static void filter_ubpf_cleanup(NetFilterState *nf) -- 2.25.1
[RFC PATCH 06/12] ebpf/uBPF: Add qemu_ubpf_run_once excute real ebpf program
Before running this function, we need to ensure that the userspace ebpf program has been loaded correctly. Signed-off-by: Zhang Chen --- ebpf/ubpf-stub.c | 6 ++ ebpf/ubpf.c | 16 ebpf/ubpf.h | 2 ++ 3 files changed, 24 insertions(+) diff --git a/ebpf/ubpf-stub.c b/ebpf/ubpf-stub.c index 885bd954b7..70da421629 100644 --- a/ebpf/ubpf-stub.c +++ b/ebpf/ubpf-stub.c @@ -27,3 +27,9 @@ int qemu_ubpf_prepare(UbpfState *u_ebpf, char *code_path) { return 0; } + +uint64_t qemu_ubpf_run_once(UbpfState *u_ebpf, void *target, +size_t target_len) +{ +return 0; +} diff --git a/ebpf/ubpf.c b/ebpf/ubpf.c index d65fffeda3..8ac513c7ed 100644 --- a/ebpf/ubpf.c +++ b/ebpf/ubpf.c @@ -199,3 +199,19 @@ int qemu_ubpf_prepare(UbpfState *u_ebpf, char *code_path) return 0; } + +uint64_t qemu_ubpf_run_once(UbpfState *u_ebpf, void *target, +size_t target_len) +{ +uint64_t result; + +if (u_ebpf->jit) { +result = u_ebpf->fn(target, target_len); +} else { +if (ubpf_exec(u_ebpf->vm, target, target_len, ) < 0) { +result = UINT64_MAX; +} +} + +return result; +} diff --git a/ebpf/ubpf.h b/ebpf/ubpf.h index 9a35efbeb6..fc40e84e51 100644 --- a/ebpf/ubpf.h +++ b/ebpf/ubpf.h @@ -38,5 +38,7 @@ bool qemu_ubpf_read_code(UbpfState *u_ebpf, char *path); bool qemu_ubpf_read_target(UbpfState *u_ebpf, char *path); void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit); int qemu_ubpf_prepare(UbpfState *u_ebpf, char *code_path); +uint64_t qemu_ubpf_run_once(UbpfState *u_ebpf, void *target, +size_t target_len); #endif /* QEMU_UBPF_H */ -- 2.25.1
[RFC PATCH 04/12] ebpf/uBPF: Introduce ubpf initialize functions
Introduce ubpf.c/ubpf-stub.c with basic read and init_jit functions. Add ubpf related .c files to meson.build. Signed-off-by: Zhang Chen --- ebpf/meson.build | 1 + ebpf/ubpf-stub.c | 24 +++ ebpf/ubpf.c | 101 +++ ebpf/ubpf.h | 4 ++ 4 files changed, 130 insertions(+) create mode 100644 ebpf/ubpf-stub.c create mode 100644 ebpf/ubpf.c diff --git a/ebpf/meson.build b/ebpf/meson.build index 2dd0fd8948..f4457fbd28 100644 --- a/ebpf/meson.build +++ b/ebpf/meson.build @@ -1 +1,2 @@ softmmu_ss.add(when: libbpf, if_true: files('ebpf_rss.c'), if_false: files('ebpf_rss-stub.c')) +softmmu_ss.add(when: ubpf, if_true: files('ubpf.c'), if_false: files('ubpf-stub.c')) diff --git a/ebpf/ubpf-stub.c b/ebpf/ubpf-stub.c new file mode 100644 index 00..2e8bf15b91 --- /dev/null +++ b/ebpf/ubpf-stub.c @@ -0,0 +1,24 @@ +/* + * QEMU Userspace eBPF Stub File + * + * Copyright(C) 2022 Intel Corporation. + * + * Author: + * Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +bool qemu_ubpf_read_code(UbpfState *u_ebpf, char *path) +{ +return 0; +} + +bool qemu_ubpf_read_target(UbpfState *u_ebpf, char *path) +{ +return 0; +} + +void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit) {} diff --git a/ebpf/ubpf.c b/ebpf/ubpf.c new file mode 100644 index 00..38a6530903 --- /dev/null +++ b/ebpf/ubpf.c @@ -0,0 +1,101 @@ +/* + * QEMU Userspace eBPF Support + * + * Copyright(C) 2022 Intel Corporation. + * + * Author: + * Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "ebpf/ubpf.h" + +static void *qemu_ubpf_read(const char *path, size_t maxlen, size_t *len) +{ +FILE *file; +size_t offset = 0, rv; +void *data; + +if (!strcmp(path, "-")) { +file = fdopen(STDIN_FILENO, "r"); +} else { +file = fopen(path, "r"); +} + +if (file == NULL) { +error_report("Failed to open %s: %s", path, strerror(errno)); +return NULL; +} + +data = g_malloc0(maxlen); + +while ((rv = fread(data + offset, 1, maxlen - offset, file)) > 0) { +offset += rv; +} + +if (ferror(file)) { +error_report("Failed to read %s: %s", path, strerror(errno)); +goto err; +} + +if (!feof(file)) { +error_report("Failed to read %s because it is too large" + " (max %u bytes)", path, (unsigned)maxlen); +goto err; +} + +fclose(file); +if (len) { +*len = offset; +} +return data; + +err: +fclose(file); +free(data); +return false; +} + +/* Read Userspace eBPF binary file to QEMU */ +bool qemu_ubpf_read_code(UbpfState *u_ebpf, char *path) +{ +if (!path) { +return false; +} +u_ebpf->code_path = path; + +u_ebpf->code = qemu_ubpf_read(u_ebpf->code_path, MAX_LEN, + _ebpf->code_len); +if (u_ebpf->code) { +return true; +} else { +return false; +} +} + +/* Read Userspace eBPF target */ +bool qemu_ubpf_read_target(UbpfState *u_ebpf, char *path) +{ +if (!path) { +return false; +} +u_ebpf->target_path = path; + +u_ebpf->target = qemu_ubpf_read(u_ebpf->target_path, MAX_LEN, +_ebpf->target_len); +if (u_ebpf->target) { +return true; +} else { +return false; +} +} + +void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit) +{ +u_ebpf->jit = jit; +} diff --git a/ebpf/ubpf.h b/ebpf/ubpf.h index 2562fff503..808c02565c 100644 --- a/ebpf/ubpf.h +++ b/ebpf/ubpf.h @@ -34,4 +34,8 @@ typedef struct UbpfState { char *func; } UbpfState; +bool qemu_ubpf_read_code(UbpfState *u_ebpf, char *path); +bool qemu_ubpf_read_target(UbpfState *u_ebpf, char *path); +void qemu_ubpf_init_jit(UbpfState *u_ebpf, bool jit); + #endif /* QEMU_UBPF_H */ -- 2.25.1
[RFC PATCH 00/12] Introduce QEMU userspace ebpf support
Hi All, The goal of this series is to bring the power of ebpf to QEMU. It makes QEMU have the ability to extend the capabilities without requiring changing source code. Just need to load the eBPF binary file even at VM runtime. And already have some userspace ebpf implementation like: Intel DPDK eBPF, windows eBPF, etc.. The original idea suggested by Jason Wang. eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in an operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules.(from https://ebpf.io/) KVM already got benefits from it, but QEMU did not. Hence we want to bring the power of eBPF to QEMU. It can load binary eBPF program even when VM running. At the same time, add some hooks in QEMU as the user space eBPF load point. Do the things on different layers. That???s the advantages of kernel eBPF. Most of the functions can be implemented in QEMU. This series just a start of the Power of Programmability. 1). Safety: Building on the foundation of seeing and understanding all system calls and combining that with a packet and socket-level view of all networking operations allows for revolutionary new approaches to securing systems. 2). Tracing & Profiling: The ability to attach eBPF programs to trace points as well as kernel and user application probe points allows unprecedented visibility into the runtime behavior of applications and the system itself. 3). Networking: The combination of programmability and efficiency makes eBPF a natural fit for all packet processing requirements of networking solutions. 4). Observability & Monitoring: Instead of relying on static counters and gauges exposed by the perating system, eBPF enables the collection & in-kernel aggregation of custom metrics and generation of visibility events based on a wide range of possible sources. QEMU userspace ebpf design based on ubpf project (https://github.com/iovisor/ubpf). The most mature userspace ebpf implementation. This project officially support by iovisor(Like BCC and bpftrace). This project includes an eBPF assembler, disassembler, interpreter (for all platforms), and JIT compiler (for x86-64 and Arm64 targets). Qemu userspace ebpf make the ubpf project as the git submodule. Current implementation support load ebpf program and run it in net/filter-ubpf module, this filter can support any user defined rules to hanle network packet. At the same time, it's easy for other developers to use the ubpf infrastructue in QEMU's other modules from the function in /ebpf/ubpf.c, and it support JIT. For the uBPF License is Apache License 2.0, It's OK to compatible with QEMU???s GPLv2 LICENSE same as mason. TODO: Need to add more comments and test-case for ubpf, current implementation not include ebpf verifier. But I think maybe it's not a big problem, current ebpf load/unload API exposed by QMP command. Qemu is a userspace program, if someone want to hack QEMU, no need to load a malicious ubpf program, it can hack QEMU code or crash QEMU on host directly(different from kernel ebpf needs strict inspection, but yes, it still need basic check). Any comments are welcome. Thanks Chen Zhang Chen (12): configure: Add iovisor/ubpf project as a submodule for QEMU meson: Add ubpf build config and misc ebpf/uBPF: Introduce userspace ebpf data structure ebpf/uBPF: Introduce ubpf initialize functions ebpf/uBPF: Add qemu_prepare_ubpf to load ebpf binary ebpf/uBPF: Add qemu_ubpf_run_once excute real ebpf program net/filter: Introduce filter-ubpf module qapi: Add FilterUbpfProperties and qemu-options softmmu/vl.c: Add filter-ubpf for netdev as other netfilters net/filter-ubpf.c: run the ubpf program to handle network packet docs/devel: Add userspace-ebpf.rst test/qtest: Add ubpf basic test case .gitmodules | 3 + configure | 20 +++ docs/devel/userspace-ebpf.rst | 106 ++ ebpf/meson.build| 1 + ebpf/ubpf-stub.c| 35 + ebpf/ubpf.c | 217 ebpf/ubpf.h | 44 ++ meson.build | 47 ++ meson_options.txt | 3 + net/filter-ubpf.c | 185 net/meson.build | 1 + qapi/qom.json | 18 +++ qemu-options.hx | 6 + scripts/coverity-scan/COMPONENTS.md | 3 + scripts/meson-buildoptions.sh | 5 + softmmu/vl.c| 3 +- tests/qtest/demo_ubpf.o | Bin 0 -> 544 bytes tests/qtest/integer_5.mem | Bin 0 -> 4 bytes tests/qtest/meson.build | 3 +-
[RFC PATCH 09/12] softmmu/vl.c: Add filter-ubpf for netdev as other netfilters
Signed-off-by: Zhang Chen --- softmmu/vl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/softmmu/vl.c b/softmmu/vl.c index 4c1e94b00e..d924fb1c71 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -1822,7 +1822,8 @@ static bool object_create_early(const char *type) g_str_equal(type, "filter-redirector") || g_str_equal(type, "colo-compare") || g_str_equal(type, "filter-rewriter") || -g_str_equal(type, "filter-replay")) { +g_str_equal(type, "filter-replay") || +g_str_equal(type, "filter-ubpf")) { return false; } -- 2.25.1
[RFC PATCH 03/12] ebpf/uBPF: Introduce userspace ebpf data structure
Add ebpf/ubpf.h for the UbpfState. Signed-off-by: Zhang Chen --- ebpf/ubpf.h | 37 + 1 file changed, 37 insertions(+) create mode 100644 ebpf/ubpf.h diff --git a/ebpf/ubpf.h b/ebpf/ubpf.h new file mode 100644 index 00..2562fff503 --- /dev/null +++ b/ebpf/ubpf.h @@ -0,0 +1,37 @@ +/* + * QEMU Userspace eBPF Header + * + * Copyright(C) 2022 Intel Corporation. + * + * Author: + * Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef QEMU_UBPF_H +#define QEMU_UBPF_H + +#include +#include +#include + +#define MAX_LEN (1024 * 1024) + +typedef struct UbpfState { +bool jit; +char *code_path; +void *code; +size_t code_len; +char *target_path; +void *target; +size_t target_len; +struct ubpf_vm *vm; +ubpf_jit_fn fn; +int type; +char *func; +} UbpfState; + +#endif /* QEMU_UBPF_H */ -- 2.25.1
[RFC PATCH 01/12] configure: Add iovisor/ubpf project as a submodule for QEMU
Make iovisor/ubpf project be a git submodule for QEMU. It will auto clone ubpf project when configure QEMU. Signed-off-by: Zhang Chen --- .gitmodules | 3 +++ configure | 20 ubpf| 1 + 3 files changed, 24 insertions(+) create mode 16 ubpf diff --git a/.gitmodules b/.gitmodules index b8bff47df8..30fb082f39 100644 --- a/.gitmodules +++ b/.gitmodules @@ -64,3 +64,6 @@ [submodule "tests/lcitool/libvirt-ci"] path = tests/lcitool/libvirt-ci url = https://gitlab.com/libvirt/libvirt-ci.git +[submodule "ubpf"] + path = ubpf + url = https://github.com/iovisor/ubpf.git diff --git a/configure b/configure index e69537c756..7dde1429dc 100755 --- a/configure +++ b/configure @@ -326,6 +326,7 @@ else slirp="auto" fi fdt="auto" +ubpf="auto" # 2. Automatically enable/disable other options tcg="enabled" @@ -820,6 +821,14 @@ for opt do ;; --enable-slirp=*) slirp="$optarg" ;; + --disable-ubpf) ubpf="disabled" + ;; + --enable-ubpf) ubpf="enabled" + ;; + --enable-ubpf=git) ubpf="internal" + ;; + --enable-ubpf=*) ubpf="$optarg" + ;; --disable-tcg) tcg="disabled" plugins="no" ;; @@ -2176,6 +2185,16 @@ if test "$have_ubsan" = "yes"; then QEMU_LDFLAGS="-fsanitize=undefined $QEMU_LDFLAGS" fi +## +# check for ubpf + +case "$ubpf" in + auto | enabled | internal) +# Simpler to always update submodule, even if not needed. +git_submodules="${git_submodules} ubpf" +;; +esac + ## # Exclude --warn-common with TSan to suppress warnings from the TSan libraries. @@ -2664,6 +2683,7 @@ if test "$skip_meson" = no; then # QEMU options test "$cfi" != false && meson_option_add "-Dcfi=$cfi" test "$fdt" != auto && meson_option_add "-Dfdt=$fdt" + test "$ubpf" != auto && meson_option_add "-Dubpf=$ubpf" test -n "${LIB_FUZZING_ENGINE+xxx}" && meson_option_add "-Dfuzzing_engine=$LIB_FUZZING_ENGINE" test "$qemu_suffix" != qemu && meson_option_add "-Dqemu_suffix=$qemu_suffix" test "$slirp" != auto && meson_option_add "-Dslirp=$slirp" diff --git a/ubpf b/ubpf new file mode 16 index 00..0dd334daf4 --- /dev/null +++ b/ubpf @@ -0,0 +1 @@ +Subproject commit 0dd334daf4849137fa40d2b7676d2bf920d5c81d -- 2.25.1
[RFC PATCH 02/12] meson: Add ubpf build config and misc
Make meson to build iovisor/ubpf code in Qemu. Signed-off-by: Zhang Chen --- meson.build | 47 + meson_options.txt | 3 ++ scripts/coverity-scan/COMPONENTS.md | 3 ++ scripts/meson-buildoptions.sh | 5 +++ 4 files changed, 58 insertions(+) diff --git a/meson.build b/meson.build index 21cd949082..f370c1aba7 100644 --- a/meson.build +++ b/meson.build @@ -2717,9 +2717,53 @@ if not fdt.found() and fdt_required.length() > 0 error('fdt not available but required by targets ' + ', '.join(fdt_required)) endif +ubpf = not_found +ubpf_opt = 'disabled' +if have_system + ubpf_opt = get_option('ubpf') + if ubpf_opt in ['enabled', 'auto', 'system'] +have_internal = fs.exists(meson.current_source_dir() / 'ubpf/vm/Makefile') +ubpf = dependency('ubpf', kwargs: static_kwargs, + method: 'pkg-config', + required: ubpf_opt == 'system' or + ubpf_opt == 'enabled' and not have_internal) +if ubpf.found() + ubpf_opt = 'system' +elif have_internal + ubpf_opt = 'internal' +else + ubpf_opt = 'disabled' +endif + endif + if ubpf_opt == 'internal' +ubpf_data = configuration_data() + +ubpf_files = files( + 'ubpf/vm/ubpf_jit_x86_64.c', + 'ubpf/vm/ubpf_vm.c', + 'ubpf/vm/ubpf_loader.c', +) + +ubpf_cargs = [ + '-Wno-error', '-w', + '-include', 'ubpf-defs.h' +] + +configure_file(output: 'ubpf-defs.h', configuration: ubpf_data) +ubpf_inc = include_directories('ubpf/vm', 'ubpf/vm/inc') +libubpf = static_library('ubpf', + sources: ubpf_files, + c_args: ubpf_cargs, + include_directories: ubpf_inc) +ubpf = declare_dependency(link_with: libubpf, + include_directories: ubpf_inc) + endif +endif + config_host_data.set('CONFIG_CAPSTONE', capstone.found()) config_host_data.set('CONFIG_FDT', fdt.found()) config_host_data.set('CONFIG_SLIRP', slirp.found()) +config_host_data.set('CONFIG_UBPF', ubpf.found()) # # Generated sources # @@ -3046,6 +3090,8 @@ subdir('softmmu') common_ss.add(capstone) specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone) +common_ss.add(ubpf) + # Work around a gcc bug/misfeature wherein constant propagation looks # through an alias: # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696 @@ -3911,6 +3957,7 @@ summary_info += {'libudev': libudev} # Dummy dependency, keep .found() summary_info += {'FUSE lseek':fuse_lseek.found()} summary_info += {'selinux': selinux} +summary_info += {'ubpf support': ubpf_opt == 'internal' ? ubpf_opt : ubpf} summary(summary_info, bool_yn: true, section: 'Dependencies') if not supported_cpus.contains(cpu) diff --git a/meson_options.txt b/meson_options.txt index 2de94af037..1eb9164857 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -262,6 +262,9 @@ option('slirp', type: 'combo', value: 'auto', option('fdt', type: 'combo', value: 'auto', choices: ['disabled', 'enabled', 'auto', 'system', 'internal'], description: 'Whether and how to find the libfdt library') +option('ubpf', type: 'combo', value: 'auto', + choices: ['disabled', 'enabled', 'auto', 'system', 'internal'], + description: 'Whether and how to find the ubpf library') option('selinux', type: 'feature', value: 'auto', description: 'SELinux support in qemu-nbd') diff --git a/scripts/coverity-scan/COMPONENTS.md b/scripts/coverity-scan/COMPONENTS.md index 183f26a32c..dd28116674 100644 --- a/scripts/coverity-scan/COMPONENTS.md +++ b/scripts/coverity-scan/COMPONENTS.md @@ -72,6 +72,9 @@ char capstone ~ (/qemu)?(/capstone/.*) +ubpf + ~ (/qemu)?(/ubpf/vm/.*) + crypto ~ (/qemu)?((/include)?/crypto/.*|/hw/.*/crypto.*) diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index 00ea4d8cd1..044dde1cff 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -37,6 +37,8 @@ meson_options_help() { printf "%s\n" ' getrandom()' printf "%s\n" ' --enable-slirp[=CHOICE] Whether and how to find the slirp library' printf "%s\n" ' (choices: auto/disabled/enabled/internal/system)' + printf "%s\n" ' --enable-ubpf[=CHOICE] Whether and how to find the ubpf library' + printf "%s\n" ' (choices: auto/disabled/enabled/internal/system)' printf "%s\n" ' --enable-strip Strip targets on install' printf "%s\n" ' --enable-tcg-interpreter TCG with bytecode interpreter (slow)' printf "%s\n" ' --enable-trace-backends=CHOICES' @@ -379,6 +381,9 @@ _meson_option_parse() { --enable-slirp=*) quote_sh "-Dslirp=$2" ;;
RE: [PATCH v7 05/13] multifd: Count the number of bytes sent correctly
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Juan Quintela > Sent: Tuesday, May 31, 2022 6:43 PM > To: qemu-devel@nongnu.org > Cc: Marcel Apfelbaum ; Philippe Mathieu- > Daudé ; Yanan Wang ; Dr. > David Alan Gilbert ; Juan Quintela > ; Eduardo Habkost ; Peter > Xu ; Leonardo Bras > Subject: [PATCH v7 05/13] multifd: Count the number of bytes sent correctly > > Current code asumes that all pages are whole. That is not true for example > for compression already. Fix it for creating a new field > ->sent_bytes that includes it. > > All ram_counters are used only from the migration thread, so we have two > options: > - put a mutex and fill everything when we sent it (not only ram_counters, > also qemu_file->xfer_bytes). > - Create a local variable that implements how much has been sent through > each channel. And when we push another packet, we "add" the previous > stats. > > I choose two due to less changes overall. On the previous code we increase > transferred and then we sent. Current code goes the other way around. It > sents the data, and after the fact, it updates the counters. Notice that each > channel can have a maximum of half a megabyte of data without counting, so > it is not very important. > > Signed-off-by: Juan Quintela > --- > migration/multifd.h | 2 ++ > migration/multifd.c | 14 ++ > 2 files changed, 8 insertions(+), 8 deletions(-) > > diff --git a/migration/multifd.h b/migration/multifd.h index > 71f49b4063..8a45dda58c 100644 > --- a/migration/multifd.h > +++ b/migration/multifd.h > @@ -102,6 +102,8 @@ typedef struct { > uint32_t flags; > /* global number of generated multifd packets */ > uint64_t packet_num; > +/* How many bytes have we sent on the last packet */ > +uint64_t sent_bytes; > /* thread has work to do */ > int pending_job; > /* array of pages to sent. > diff --git a/migration/multifd.c b/migration/multifd.c index > 166246b9b7..eef47c274f 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -394,7 +394,6 @@ static int multifd_send_pages(QEMUFile *f) > static int next_channel; > MultiFDSendParams *p = NULL; /* make happy gcc */ > MultiFDPages_t *pages = multifd_send_state->pages; > -uint64_t transferred; > > if (qatomic_read(_send_state->exiting)) { > return -1; > @@ -429,10 +428,10 @@ static int multifd_send_pages(QEMUFile *f) > p->packet_num = multifd_send_state->packet_num++; > multifd_send_state->pages = p->pages; > p->pages = pages; > -transferred = ((uint64_t) pages->num) * p->page_size + p->packet_len; > -qemu_file_update_transfer(f, transferred); > -ram_counters.multifd_bytes += transferred; > -ram_counters.transferred += transferred; > +ram_transferred_add(p->sent_bytes); > +ram_counters.multifd_bytes += p->sent_bytes; > +qemu_file_update_transfer(f, p->sent_bytes); > +p->sent_bytes = 0; > qemu_mutex_unlock(>mutex); > qemu_sem_post(>sem); > > @@ -605,9 +604,6 @@ int multifd_send_sync_main(QEMUFile *f) > p->packet_num = multifd_send_state->packet_num++; > p->flags |= MULTIFD_FLAG_SYNC; > p->pending_job++; > -qemu_file_update_transfer(f, p->packet_len); > -ram_counters.multifd_bytes += p->packet_len; > -ram_counters.transferred += p->packet_len; > qemu_mutex_unlock(>mutex); > qemu_sem_post(>sem); > > @@ -712,6 +708,8 @@ static void *multifd_send_thread(void *opaque) > } > > qemu_mutex_lock(>mutex); > +p->sent_bytes += p->packet_len;; Typo here about ";;" ? Thanks Chen > +p->sent_bytes += p->next_packet_size; > p->pending_job--; > qemu_mutex_unlock(>mutex); > > -- > 2.35.3 >
RE: [PATCH v7 01/13] multifd: Document the locking of MultiFD{Send/Recv}Params
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Juan Quintela > Sent: Tuesday, May 31, 2022 6:43 PM > To: qemu-devel@nongnu.org > Cc: Marcel Apfelbaum ; Philippe Mathieu- > Daudé ; Yanan Wang ; Dr. > David Alan Gilbert ; Juan Quintela > ; Eduardo Habkost ; Peter > Xu ; Leonardo Bras > Subject: [PATCH v7 01/13] multifd: Document the locking of > MultiFD{Send/Recv}Params > > Reorder the structures so we can know if the fields are: > - Read only > - Their own locking (i.e. sems) > - Protected by 'mutex' > - Only for the multifd channel > > Signed-off-by: Juan Quintela > --- > migration/multifd.h | 90 ++--- > 1 file changed, 53 insertions(+), 37 deletions(-) > > diff --git a/migration/multifd.h b/migration/multifd.h index > 4d8d89e5e5..345cfdb50c 100644 > --- a/migration/multifd.h > +++ b/migration/multifd.h > @@ -65,7 +65,9 @@ typedef struct { > } MultiFDPages_t; > > typedef struct { > -/* this fields are not changed once the thread is created */ > +/* Fiields are only written at creating/deletion time */ S/Fiields/Fields Thanks Chen > +/* No lock required for them, they are read only */ > + > /* channel number */ > uint8_t id; > /* channel thread name */ > @@ -74,39 +76,47 @@ typedef struct { > QemuThread thread; > /* communication channel */ > QIOChannel *c; > -/* sem where to wait for more work */ > -QemuSemaphore sem; > -/* this mutex protects the following parameters */ > -QemuMutex mutex; > -/* is this channel thread running */ > -bool running; > -/* should this thread finish */ > -bool quit; > /* is the yank function registered */ > bool registered_yank; > +/* packet allocated len */ > +uint32_t packet_len; > +/* multifd flags for sending ram */ > +int write_flags; > + > +/* sem where to wait for more work */ > +QemuSemaphore sem; > +/* syncs main thread and channels */ > +QemuSemaphore sem_sync; > + > +/* this mutex protects the following parameters */ > +QemuMutex mutex; > +/* is this channel thread running */ > +bool running; > +/* should this thread finish */ > +bool quit; > +/* multifd flags for each packet */ > +uint32_t flags; > +/* global number of generated multifd packets */ > +uint64_t packet_num; > /* thread has work to do */ > int pending_job; > -/* array of pages to sent */ > +/* array of pages to sent. > + * The owner of 'pages' depends of 'pending_job' value: > + * pending_job == 0 -> migration_thread can use it. > + * pending_job != 0 -> multifd_channel can use it. > + */ > MultiFDPages_t *pages; > -/* packet allocated len */ > -uint32_t packet_len; > + > +/* thread local variables. No locking required */ > + > /* pointer to the packet */ > MultiFDPacket_t *packet; > -/* multifd flags for sending ram */ > -int write_flags; > -/* multifd flags for each packet */ > -uint32_t flags; > /* size of the next packet that contains pages */ > uint32_t next_packet_size; > -/* global number of generated multifd packets */ > -uint64_t packet_num; > -/* thread local variables */ > /* packets sent through this channel */ > uint64_t num_packets; > /* non zero pages sent through this channel */ > uint64_t total_normal_pages; > -/* syncs main thread and channels */ > -QemuSemaphore sem_sync; > /* buffers to send */ > struct iovec *iov; > /* number of iovs used */ > @@ -120,7 +130,9 @@ typedef struct { > } MultiFDSendParams; > > typedef struct { > -/* this fields are not changed once the thread is created */ > +/* Fiields are only written at creating/deletion time */ > +/* No lock required for them, they are read only */ > + > /* channel number */ > uint8_t id; > /* channel thread name */ > @@ -129,31 +141,35 @@ typedef struct { > QemuThread thread; > /* communication channel */ > QIOChannel *c; > +/* packet allocated len */ > +uint32_t packet_len; > + > +/* syncs main thread and channels */ > +QemuSemaphore sem_sync; > + > /* this mutex protects the following parameters */ > QemuMutex mutex; > /* is this channel thread running */ > bool running; > /* should this thread finish */ > bool quit; > +/* multifd flags for each packet */ > +uint32_t flags; > +/* global number of generated multifd packets */ > +uint64_t packet_num; > + > +/* thread local variables. No locking required */ > + > +/* pointer to the packet */ > +MultiFDPacket_t *packet; > +/* size of the next packet that contains pages */ > +uint32_t next_packet_size; > +/* packets sent through this channel */ > +uint64_t num_packets; > /* ramblock host address */ > uint8_t *host; > -/* packet
RE: [PATCH] ebpf: replace deprecated bpf_program__set_socket_filter
> -Original Message- > From: Qemu-devel bounces+chen.zhang=intel@nongnu.org> On Behalf Of Haochen Tong > Sent: Saturday, May 28, 2022 3:07 AM > To: qemu-devel@nongnu.org > Cc: qemu-triv...@nongnu.org; Haochen Tong > Subject: [PATCH] ebpf: replace deprecated bpf_program__set_socket_filter > > bpf_program__set_ functions have been deprecated since libbpf 0.8. > Replace with the equivalent bpf_program__set_type call to avoid a > deprecation warning. > > Signed-off-by: Haochen Tong It looks good to me. By the way, add eBPF maintainers. Reviewed-by: Zhang Chen Thanks Chen > --- > ebpf/ebpf_rss.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/ebpf/ebpf_rss.c b/ebpf/ebpf_rss.c index 118c68da83..cee658c158 > 100644 > --- a/ebpf/ebpf_rss.c > +++ b/ebpf/ebpf_rss.c > @@ -49,7 +49,7 @@ bool ebpf_rss_load(struct EBPFRSSContext *ctx) > goto error; > } > > -bpf_program__set_socket_filter(rss_bpf_ctx- > >progs.tun_rss_steering_prog); > +bpf_program__set_type(rss_bpf_ctx->progs.tun_rss_steering_prog, > + BPF_PROG_TYPE_SOCKET_FILTER); > > if (rss_bpf__load(rss_bpf_ctx)) { > trace_ebpf_error("eBPF RSS", "can not load RSS program"); > -- > 2.36.1 >
RE: [PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH
> -Original Message- > From: Zhang, Chen > Sent: Wednesday, April 27, 2022 5:26 PM > To: Jason Wang ; Paolo Bonzini > > Cc: Li Zhijian ; qemu-dev de...@nongnu.org>; Like Xu > Subject: RE: [PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition > support form COLO to PRELAUNCH > > > > > -Original Message- > > From: Jason Wang > > Sent: Wednesday, April 27, 2022 4:57 PM > > To: Zhang, Chen > > Cc: Li Zhijian ; qemu-dev > de...@nongnu.org>; Like Xu > > Subject: Re: [PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition > > support form COLO to PRELAUNCH > > > > On Fri, Apr 1, 2022 at 11:59 AM Zhang Chen wrote: > > > > > > If the checkpoint occurs when the guest finishes restarting but has > > > not started running, the runstate_set() may reject the transition > > > from COLO to PRELAUNCH with the crash log: > > > > > > {"timestamp": {"seconds": 1593484591, "microseconds": 26605},\ > > > "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}} > > > qemu-system-x86_64: invalid runstate transition: 'colo' -> 'prelaunch' > > > > > > Long-term testing says that it's pretty safe. > > > > > > Signed-off-by: Like Xu > > > Signed-off-by: Zhang Chen > > > > I'd expect this to get ack from the relevant maintainers. > > > > The scripts/get_maintainer.pl can't find relevant maintainers for this patch. > Maybe Paolo have time to cover this simple patch related to runstate? No news for a while, any comments for unmaintained files changes ? Ping... Thanks Chen > > Thanks > Chen > > > Thanks > > > > > --- > > > softmmu/runstate.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index > > > e0d869b21a..c021c56338 100644 > > > --- a/softmmu/runstate.c > > > +++ b/softmmu/runstate.c > > > @@ -127,6 +127,7 @@ static const RunStateTransition > > runstate_transitions_def[] = { > > > { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH }, > > > > > > { RUN_STATE_COLO, RUN_STATE_RUNNING }, > > > +{ RUN_STATE_COLO, RUN_STATE_PRELAUNCH }, > > > { RUN_STATE_COLO, RUN_STATE_SHUTDOWN}, > > > > > > { RUN_STATE_RUNNING, RUN_STATE_DEBUG }, > > > -- > > > 2.25.1 > > >
RE: [PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH
> -Original Message- > From: Jason Wang > Sent: Wednesday, April 27, 2022 4:57 PM > To: Zhang, Chen > Cc: Li Zhijian ; qemu-dev de...@nongnu.org>; Like Xu > Subject: Re: [PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition > support form COLO to PRELAUNCH > > On Fri, Apr 1, 2022 at 11:59 AM Zhang Chen wrote: > > > > If the checkpoint occurs when the guest finishes restarting but has > > not started running, the runstate_set() may reject the transition from > > COLO to PRELAUNCH with the crash log: > > > > {"timestamp": {"seconds": 1593484591, "microseconds": 26605},\ > > "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}} > > qemu-system-x86_64: invalid runstate transition: 'colo' -> 'prelaunch' > > > > Long-term testing says that it's pretty safe. > > > > Signed-off-by: Like Xu > > Signed-off-by: Zhang Chen > > I'd expect this to get ack from the relevant maintainers. > The scripts/get_maintainer.pl can't find relevant maintainers for this patch. Maybe Paolo have time to cover this simple patch related to runstate? Thanks Chen > Thanks > > > --- > > softmmu/runstate.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index > > e0d869b21a..c021c56338 100644 > > --- a/softmmu/runstate.c > > +++ b/softmmu/runstate.c > > @@ -127,6 +127,7 @@ static const RunStateTransition > runstate_transitions_def[] = { > > { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH }, > > > > { RUN_STATE_COLO, RUN_STATE_RUNNING }, > > +{ RUN_STATE_COLO, RUN_STATE_PRELAUNCH }, > > { RUN_STATE_COLO, RUN_STATE_SHUTDOWN}, > > > > { RUN_STATE_RUNNING, RUN_STATE_DEBUG }, > > -- > > 2.25.1 > >
RE: [PATCH V2 0/4] COLO net and runstate bugfix/optimization
> -Original Message- > From: Jason Wang > Sent: Thursday, April 14, 2022 11:12 AM > To: Zhang, Chen > Cc: Li Zhijian ; qemu-dev de...@nongnu.org> > Subject: Re: [PATCH V2 0/4] COLO net and runstate bugfix/optimization > > On Thu, Apr 14, 2022 at 9:52 AM Zhang, Chen > wrote: > > > > No update for a while. Ping... > > > > Thanks > > Chen > > Hi: > > It's near to release, I think we can queue this for 7.1? Sure. I forgot to add "for 7.1" tag. Thanks Chen > > Thanks > > > > > > -Original Message- > > > From: Zhang, Chen > > > Sent: Friday, April 1, 2022 11:47 AM > > > To: Jason Wang ; Li Zhijian > > > > > > Cc: Zhang, Chen ; qemu-dev > > de...@nongnu.org> > > > Subject: [PATCH V2 0/4] COLO net and runstate bugfix/optimization > > > > > > This series fix some COLO related issues in internal stress testing. > > > > > > - V2: > > > - Add more comments in patch 2/4 commit log. > > > > > > Zhang Chen (4): > > > softmmu/runstate.c: add RunStateTransition support form COLO to > > > PRELAUNCH > > > net/colo: Fix a "double free" crash to clear the conn_list > > > net/colo.c: No need to track conn_list for filter-rewriter > > > net/colo.c: fix segmentation fault when packet is not parsed > > > correctly > > > > > > net/colo-compare.c| 2 +- > > > net/colo.c| 11 +-- > > > net/filter-rewriter.c | 2 +- > > > net/trace-events | 1 + > > > softmmu/runstate.c| 1 + > > > 5 files changed, 13 insertions(+), 4 deletions(-) > > > > > > -- > > > 2.25.1 > >
RE: [PATCH V2 0/4] COLO net and runstate bugfix/optimization
No update for a while. Ping... Thanks Chen > -Original Message- > From: Zhang, Chen > Sent: Friday, April 1, 2022 11:47 AM > To: Jason Wang ; Li Zhijian > Cc: Zhang, Chen ; qemu-dev de...@nongnu.org> > Subject: [PATCH V2 0/4] COLO net and runstate bugfix/optimization > > This series fix some COLO related issues in internal stress testing. > > - V2: > - Add more comments in patch 2/4 commit log. > > Zhang Chen (4): > softmmu/runstate.c: add RunStateTransition support form COLO to > PRELAUNCH > net/colo: Fix a "double free" crash to clear the conn_list > net/colo.c: No need to track conn_list for filter-rewriter > net/colo.c: fix segmentation fault when packet is not parsed correctly > > net/colo-compare.c| 2 +- > net/colo.c| 11 +-- > net/filter-rewriter.c | 2 +- > net/trace-events | 1 + > softmmu/runstate.c| 1 + > 5 files changed, 13 insertions(+), 4 deletions(-) > > -- > 2.25.1
[PATCH V2 4/4] net/colo.c: fix segmentation fault when packet is not parsed correctly
When COLO use only one vnet_hdr_support parameter between filter-redirector and filter-mirror(or colo-compare), COLO will crash with segmentation fault. Back track as follow: Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x55cb200b in eth_get_l2_hdr_length (p=0x0) at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296 296 uint16_t proto = be16_to_cpu(PKT_GET_ETH_HDR(p)->h_proto); (gdb) bt 0 0x55cb200b in eth_get_l2_hdr_length (p=0x0) at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296 1 0x55cb22b4 in parse_packet_early (pkt=0x56a44840) at net/colo.c:49 2 0x55cb2b91 in is_tcp_packet (pkt=0x56a44840) at net/filter-rewriter.c:63 So wrong vnet_hdr_len will cause pkt->data become NULL. Add check to raise error and add trace-events to track vnet_hdr_len. Signed-off-by: Tao Xu Signed-off-by: Zhang Chen Reviewed-by: Li Zhijian --- net/colo.c | 9 - net/trace-events | 1 + 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/net/colo.c b/net/colo.c index 694f3c93ef..6b0ff562ad 100644 --- a/net/colo.c +++ b/net/colo.c @@ -46,7 +46,14 @@ int parse_packet_early(Packet *pkt) static const uint8_t vlan[] = {0x81, 0x00}; uint8_t *data = pkt->data + pkt->vnet_hdr_len; uint16_t l3_proto; -ssize_t l2hdr_len = eth_get_l2_hdr_length(data); +ssize_t l2hdr_len; + +if (data == NULL) { +trace_colo_proxy_main_vnet_info("This packet is not parsed correctly, " +"pkt->vnet_hdr_len", pkt->vnet_hdr_len); +return 1; +} +l2hdr_len = eth_get_l2_hdr_length(data); if (pkt->size < ETH_HLEN + pkt->vnet_hdr_len) { trace_colo_proxy_main("pkt->size < ETH_HLEN"); diff --git a/net/trace-events b/net/trace-events index d7a17256cc..6af927b4b9 100644 --- a/net/trace-events +++ b/net/trace-events @@ -9,6 +9,7 @@ vhost_user_event(const char *chr, int event) "chr: %s got event: %d" # colo.c colo_proxy_main(const char *chr) ": %s" +colo_proxy_main_vnet_info(const char *sta, int size) ": %s = %d" # colo-compare.c colo_compare_main(const char *chr) ": %s" -- 2.25.1
[PATCH V2 3/4] net/colo.c: No need to track conn_list for filter-rewriter
Filter-rewriter no need to track connection in conn_list. This patch fix the glib g_queue_is_empty assertion when COLO guest keep a lot of network connection. Signed-off-by: Zhang Chen Reviewed-by: Li Zhijian --- net/colo.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/colo.c b/net/colo.c index 1f8162f59f..694f3c93ef 100644 --- a/net/colo.c +++ b/net/colo.c @@ -218,7 +218,7 @@ Connection *connection_get(GHashTable *connection_track_table, /* * clear the conn_list */ -while (!g_queue_is_empty(conn_list)) { +while (conn_list && !g_queue_is_empty(conn_list)) { connection_destroy(g_queue_pop_head(conn_list)); } } -- 2.25.1
[PATCH V2 1/4] softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH
If the checkpoint occurs when the guest finishes restarting but has not started running, the runstate_set() may reject the transition from COLO to PRELAUNCH with the crash log: {"timestamp": {"seconds": 1593484591, "microseconds": 26605},\ "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}} qemu-system-x86_64: invalid runstate transition: 'colo' -> 'prelaunch' Long-term testing says that it's pretty safe. Signed-off-by: Like Xu Signed-off-by: Zhang Chen --- softmmu/runstate.c | 1 + 1 file changed, 1 insertion(+) diff --git a/softmmu/runstate.c b/softmmu/runstate.c index e0d869b21a..c021c56338 100644 --- a/softmmu/runstate.c +++ b/softmmu/runstate.c @@ -127,6 +127,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH }, { RUN_STATE_COLO, RUN_STATE_RUNNING }, +{ RUN_STATE_COLO, RUN_STATE_PRELAUNCH }, { RUN_STATE_COLO, RUN_STATE_SHUTDOWN}, { RUN_STATE_RUNNING, RUN_STATE_DEBUG }, -- 2.25.1
[PATCH V2 2/4] net/colo: Fix a "double free" crash to clear the conn_list
We notice the QEMU may crash when the guest has too many incoming network connections with the following log: 15197@1593578622.668573:colo_proxy_main : colo proxy connection hashtable full, clear it free(): invalid pointer [1]15195 abort (core dumped) qemu-system-x86_64 This is because we create the s->connection_track_table with g_hash_table_new_full() which is defined as: GHashTable * g_hash_table_new_full (GHashFunc hash_func, GEqualFunc key_equal_func, GDestroyNotify key_destroy_func, GDestroyNotify value_destroy_func); The fourth parameter connection_destroy() will be called to free the memory allocated for all 'Connection' values in the hashtable when we call g_hash_table_remove_all() in the connection_hashtable_reset(). But both connection_track_table and conn_list reference to the same conn instance. It will trigger double free in conn_list clear. So this patch remove free action on hash table side to avoid double free the conn. Signed-off-by: Like Xu Signed-off-by: Zhang Chen --- net/colo-compare.c| 2 +- net/filter-rewriter.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 62554b5b3c..ab054cfd21 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -1324,7 +1324,7 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp) s->connection_track_table = g_hash_table_new_full(connection_key_hash, connection_key_equal, g_free, - connection_destroy); + NULL); colo_compare_iothread(s); diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index bf05023dc3..c18c4c2019 100644 --- a/net/filter-rewriter.c +++ b/net/filter-rewriter.c @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState *nf, Error **errp) s->connection_track_table = g_hash_table_new_full(connection_key_hash, connection_key_equal, g_free, - connection_destroy); + NULL); s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf); } -- 2.25.1
[PATCH V2 0/4] COLO net and runstate bugfix/optimization
This series fix some COLO related issues in internal stress testing. - V2: - Add more comments in patch 2/4 commit log. Zhang Chen (4): softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH net/colo: Fix a "double free" crash to clear the conn_list net/colo.c: No need to track conn_list for filter-rewriter net/colo.c: fix segmentation fault when packet is not parsed correctly net/colo-compare.c| 2 +- net/colo.c| 11 +-- net/filter-rewriter.c | 2 +- net/trace-events | 1 + softmmu/runstate.c| 1 + 5 files changed, 13 insertions(+), 4 deletions(-) -- 2.25.1
RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list
> -Original Message- > From: lizhij...@fujitsu.com > Sent: Friday, April 1, 2022 9:47 AM > To: Zhang, Chen ; Jason Wang > > Cc: qemu-dev ; Like Xu > Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the > conn_list > > > > On 31/03/2022 10:25, Zhang, Chen wrote: > > > >> -Original Message- > >> From: lizhij...@fujitsu.com > >> Sent: Thursday, March 31, 2022 9:15 AM > >> To: Zhang, Chen ; Jason Wang > >> > >> Cc: qemu-dev ; Like Xu > >> > >> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear > >> the conn_list > >> > >> > >> connection_track_table > >> -+-- > >> key1 | conn > >> |---+ > >> -+-- > >> | > >> key2 | conn|--+ > >> | > >> -+-- | > >> | > >> key3 | conn|-+| > >> | > >> -+-- || > >> | > >>|| > >> | > >>|| > >> | > >> + CompareState ++|| > >> | > >> | |VV > >> V > >> +---+ +---+ +---+ > >> |conn_list +--->conn +->conn | > >> connx > >> +---+ +---+ +---+ > >> | | | | | | > >> +---+ +---v+ +---v++---v+ +---v+ > >> |primary | |secondary|primary | |secondary > >> |packet | |packet +|packet | |packet + > >> ++ ++++ ++ > >> | | | | > >> +---v+ +---v++---v+ +---v+ > >> |primary | |secondary|primary | |secondary > >> |packet | |packet +|packet | |packet + > >> ++ ++++ ++ > >> | | | | > >> +---v+ +---v++---v+ +---v+ > >> |primary | |secondary|primary | |secondary > >> |packet | |packet +|packet | |packet + > >> ++ ++++ ++ > >> > >> I recalled that we should above relationships between > >> connection_track_table conn_list and conn. > >> That means both connection_track_table and conn_list reference to the > >> same conn instance. > >> > >> So before this patch, connection_get() is possible to > >> use-after-free/double free conn. where 1st was in > >> connection_hashtable_reset() and 2nd was > >> 221 while (!g_queue_is_empty(conn_list)) { > >> 222 connection_destroy(g_queue_pop_head(conn_list)); > >> 223 } > >> > >> I also doubt that your current abort was just due to above use-after- > >> free/double free. > >> If so, looks it's enough we just update to g_queue_clear(conn_list) > >> in the 2nd place. > > Make sense, but It also means the original patch works here, skip free conn > in connection_hashtable_reset() and do it in: > > 221 while (!g_queue_is_empty(conn_list)) { > > 222 connection_destroy(g_queue_pop_head(conn_list)); > > 223 }. > > It also avoid use-after-free/double free conn. > Although you will not use-after-free here, you have to consider other > situations carefully that > g_hash_table_remove_all() g_hash_table_destroy() were called where the > conn_list should also be freed with you approach. > > I re-checked the code, it looks fine to me.
RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list
> -Original Message- > From: lizhij...@fujitsu.com > Sent: Thursday, March 31, 2022 9:15 AM > To: Zhang, Chen ; Jason Wang > > Cc: qemu-dev ; Like Xu > Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the > conn_list > > > connection_track_table > -+-- > key1 | conn|---+ > -+-- | > key2 | conn|--+| > -+-- || > key3 | conn|-+|| > -+-- ||| > ||| > ||| > + CompareState ++||| > | |VVV > +---+ +---+ +---+ > |conn_list +--->conn +->conn | connx > +---+ +---+ +---+ > | | | | | | > +---+ +---v+ +---v++---v+ +---v+ >|primary | |secondary|primary | |secondary >|packet | |packet +|packet | |packet + >++ ++++ ++ >| | | | >+---v+ +---v++---v+ +---v+ >|primary | |secondary|primary | |secondary >|packet | |packet +|packet | |packet + >++ ++++ ++ >| | | | >+---v+ +---v++---v+ +---v+ >|primary | |secondary|primary | |secondary >|packet | |packet +|packet | |packet + >++ ++++ ++ > > I recalled that we should above relationships between > connection_track_table conn_list and conn. > That means both connection_track_table and conn_list reference to the > same conn instance. > > So before this patch, connection_get() is possible to use-after-free/double > free conn. where 1st was in > connection_hashtable_reset() and 2nd was > 221 while (!g_queue_is_empty(conn_list)) { > 222 connection_destroy(g_queue_pop_head(conn_list)); > 223 } > > I also doubt that your current abort was just due to above use-after- > free/double free. > If so, looks it's enough we just update to g_queue_clear(conn_list) in the 2nd > place. Make sense, but It also means the original patch works here, skip free conn in connection_hashtable_reset() and do it in: 221 while (!g_queue_is_empty(conn_list)) { 222 connection_destroy(g_queue_pop_head(conn_list)); 223 }. It also avoid use-after-free/double free conn. Maybe we can keep the original version to fix it? Thanks Chen > > Thanks > Zhijian > > > On 28/03/2022 17:13, Zhang, Chen wrote: > > > >> -Original Message- > >> From: lizhij...@fujitsu.com > >> Sent: Monday, March 21, 2022 11:06 AM > >> To: Zhang, Chen ; Jason Wang > >> ; lizhij...@fujitsu.com > >> Cc: qemu-dev ; Like Xu > >> > >> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear > >> the conn_list > >> > >> > >> > >> On 09/03/2022 16:38, Zhang Chen wrote: > >>> We notice the QEMU may crash when the guest has too many incoming > >>> network connections with the following log: > >>> > >>> 15197@1593578622.668573:colo_proxy_main : colo proxy connection > >>> hashtable full, clear it > >>> free(): invalid pointer > >>> [1]15195 abort (core dumped) qemu-system-x86_64 > >>> > >>> This is because we create the s->connection_track_table with > >>> g_hash_table_new_full() which is defined as: > >>> > >>> GHashTable * g_hash_table_new_full (GHashFunc hash_func, > >>> GEqualFunc key_equal_func, > >>>
RE: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list
> -Original Message- > From: lizhij...@fujitsu.com > Sent: Monday, March 21, 2022 11:06 AM > To: Zhang, Chen ; Jason Wang > ; lizhij...@fujitsu.com > Cc: qemu-dev ; Like Xu > Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the > conn_list > > > > On 09/03/2022 16:38, Zhang Chen wrote: > > We notice the QEMU may crash when the guest has too many incoming > > network connections with the following log: > > > > 15197@1593578622.668573:colo_proxy_main : colo proxy connection > > hashtable full, clear it > > free(): invalid pointer > > [1]15195 abort (core dumped) qemu-system-x86_64 > > > > This is because we create the s->connection_track_table with > > g_hash_table_new_full() which is defined as: > > > > GHashTable * g_hash_table_new_full (GHashFunc hash_func, > > GEqualFunc key_equal_func, > > GDestroyNotify key_destroy_func, > > GDestroyNotify value_destroy_func); > > > > The fourth parameter connection_destroy() will be called to free the > > memory allocated for all 'Connection' values in the hashtable when we > > call g_hash_table_remove_all() in the connection_hashtable_reset(). > > > > It's unnecessary because we clear the conn_list explicitly later, and > > it's buggy when other agents try to call connection_get() with the > > same connection_track_table. > > > > Signed-off-by: Like Xu > > Signed-off-by: Zhang Chen > > --- > > net/colo-compare.c| 2 +- > > net/filter-rewriter.c | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/net/colo-compare.c b/net/colo-compare.c index > > 62554b5b3c..ab054cfd21 100644 > > --- a/net/colo-compare.c > > +++ b/net/colo-compare.c > > @@ -1324,7 +1324,7 @@ static void > colo_compare_complete(UserCreatable *uc, Error **errp) > > s->connection_track_table = > g_hash_table_new_full(connection_key_hash, > > > > connection_key_equal, > > g_free, > > - connection_destroy); > > + NULL); > > > 202 /* if not found, create a new connection and add to hash table */ > 203 Connection *connection_get(GHashTable *connection_track_table, > 204 ConnectionKey *key, > 205 GQueue *conn_list) > 206 { > 207 Connection *conn = g_hash_table_lookup(connection_track_table, > key); > 208 > 209 if (conn == NULL) { > 210 ConnectionKey *new_key = g_memdup(key, sizeof(*key)); > 211 > 212 conn = connection_new(key); > 213 > 214 if (g_hash_table_size(connection_track_table) > > HASHTABLE_MAX_SIZE) { > 215 trace_colo_proxy_main("colo proxy connection hashtable full," > 216 " clear it"); > 217 connection_hashtable_reset(connection_track_table); > > 197 void connection_hashtable_reset(GHashTable *connection_track_table) > 198 { > 199 g_hash_table_remove_all(connection_track_table); > 200 } > > IIUC, above subroutine will do some cleanup explicitly. And before your > patch, connection_hashtable_reset() will release all keys and their values in > this hashtable. But now, you remove all keys and just one value(conn_list) > instead. Does it means other values will be leaked ? Thanks Zhijian. Re-think about it. Yes, I think you are right. It looks need a lock to avoid into connection_get() when triggered the clear hashtable operation. What do you think? Thanks Chen > > > 218 /* > 219 * clear the conn_list > 220 */ > 221 while (!g_queue_is_empty(conn_list)) { > 222 connection_destroy(g_queue_pop_head(conn_list)); > 223 } > 224 } > 225 > 226 g_hash_table_insert(connection_track_table, new_key, conn); > 227 } > 228 > 229 return conn; > 230 } > > > Thanks > Zhijian > > > > > colo_compare_iothread(s); > > > > diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index > > bf05023dc3..c18c4c2019 100644 > > --- a/net/filter-rewriter.c > > +++ b/net/filter-rewriter.c > > @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState *nf, > Error **errp) > > s->connection_track_table = > g_hash_table_new_full(connection_key_hash, > > > > connection_key_equal, > > g_free, > > - connection_destroy); > > + NULL); > > s->incoming_queue = > qemu_new_net_queue(qemu_netfilter_pass_to_next, nf); > > } > >
RE: [PATCH] docs/devel/migration.rst: Updates content from code and fixes minor issues
Ping Thanks Chen > -Original Message- > From: Zhang, Chen > Sent: Thursday, January 27, 2022 6:33 PM > To: Dr. David Alan Gilbert ; Juan Quintela > ; qemu-dev ; qemu- > triv...@nongnu.org > Cc: Zhang, Chen > Subject: [PATCH] docs/devel/migration.rst: Updates content from code and > fixes minor issues > > Signed-off-by: Zhang Chen > --- > docs/devel/migration.rst | 36 +++- > 1 file changed, 23 insertions(+), 13 deletions(-) > > diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst index > 2401253482..9f190d439a 100644 > --- a/docs/devel/migration.rst > +++ b/docs/devel/migration.rst > @@ -156,12 +156,20 @@ An example (from hw/input/pckbd.c) >.name = "pckbd", >.version_id = 3, >.minimum_version_id = 3, > + .pre_load = kbd_pre_load, > + .post_load = kbd_post_load, > + .pre_save = kbd_pre_save, >.fields = (VMStateField[]) { >VMSTATE_UINT8(write_cmd, KBDState), >VMSTATE_UINT8(status, KBDState), >VMSTATE_UINT8(mode, KBDState), > - VMSTATE_UINT8(pending, KBDState), > + VMSTATE_UINT8(pending_tmp, KBDState), >VMSTATE_END_OF_LIST() > + }, > + .subsections = (const VMStateDescription*[]) { > + _kbd_outport, > + _kbd_extended_state, > + NULL >} >}; > > @@ -278,7 +286,7 @@ Example: >IDEState *s = opaque; > >return ((s->status & DRQ_STAT) != 0) > - || (s->bus->error_status & BM_STATUS_PIO_RETRY); > + || (s->bus->error_status & IDE_RETRY_PIO); >} > >const VMStateDescription vmstate_ide_drive_pio_state = { @@ -312,6 > +320,8 @@ Example: >}, >.subsections = (const VMStateDescription*[]) { >_ide_drive_pio_state, > + _ide_tray_state, > + _ide_atapi_gesn_state, >NULL >} >}; > @@ -481,11 +491,11 @@ versions exist for high bandwidth IO. > > An iterative device must provide: > > - - A ``save_setup`` function that initialises the data structures and > -transmits a first section containing information on the device. In the > -case of RAM this transmits a list of RAMBlocks and sizes. > + - A ``save_setup`` function that initialize the data structures and > +transmits a first section containing information on the device. In the > +case of RAM used to transmit a list of RAMBlocks and sizes. > > - - A ``load_setup`` function that initialises the data structures on the > + - A ``load_setup`` function that initialize the data structures on > + the > destination. > >- A ``save_live_pending`` function that is called repeatedly and must @@ - > 756,13 +766,13 @@ ADVISE->DISCARD->LISTEN->RUNNING->END > (the 'listen thread') which takes over the job of receiving > pages off the migration stream, while the main thread carries > on processing the blob. With this thread able to process page > -reception, the destination now 'sensitises' the RAM to detect > +reception, the destination now 'sensitive' the RAM to detect > any access to missing pages (on Linux using the 'userfault' > system). > > - Running > > -POSTCOPY_RUN causes the destination to synchronise all > +POSTCOPY_RUN causes the destination to synchronize all > state and start the CPUs and IO devices running. The main > thread now finishes processing the migration package and > now carries on as it would for normal precopy migration @@ -771,8 +781,8 > @@ ADVISE->DISCARD->LISTEN->RUNNING->END > > - End > > -The listen thread can now quit, and perform the cleanup of migration > -state, the migration is now complete. > +The listen thread already exited, and perform the cleanup of migration > +state, the migration has been completed. > > Source side page maps > - > @@ -861,9 +871,9 @@ Firmware > > > Migration migrates the copies of RAM and ROM, and thus when running -on > the destination it includes the firmware from the source. Even after - > resetting a VM, the old firmware is used. Only once QEMU has been > restarted -is the new firmware in use. > +on the destination it includes the firmware from the source. The old > +firmware of the VM is still used even rebooted. And the new firmware > +will not be loaded until QEMU is restarted. > > - Changes in firmware size can cause changes in the required RAMBlock size >to hold the firmware and thus migration can fail. In practice it's best > -- > 2.25.1
[PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list
We notice the QEMU may crash when the guest has too many incoming network connections with the following log: 15197@1593578622.668573:colo_proxy_main : colo proxy connection hashtable full, clear it free(): invalid pointer [1]15195 abort (core dumped) qemu-system-x86_64 This is because we create the s->connection_track_table with g_hash_table_new_full() which is defined as: GHashTable * g_hash_table_new_full (GHashFunc hash_func, GEqualFunc key_equal_func, GDestroyNotify key_destroy_func, GDestroyNotify value_destroy_func); The fourth parameter connection_destroy() will be called to free the memory allocated for all 'Connection' values in the hashtable when we call g_hash_table_remove_all() in the connection_hashtable_reset(). It's unnecessary because we clear the conn_list explicitly later, and it's buggy when other agents try to call connection_get() with the same connection_track_table. Signed-off-by: Like Xu Signed-off-by: Zhang Chen --- net/colo-compare.c| 2 +- net/filter-rewriter.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 62554b5b3c..ab054cfd21 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -1324,7 +1324,7 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp) s->connection_track_table = g_hash_table_new_full(connection_key_hash, connection_key_equal, g_free, - connection_destroy); + NULL); colo_compare_iothread(s); diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index bf05023dc3..c18c4c2019 100644 --- a/net/filter-rewriter.c +++ b/net/filter-rewriter.c @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState *nf, Error **errp) s->connection_track_table = g_hash_table_new_full(connection_key_hash, connection_key_equal, g_free, - connection_destroy); + NULL); s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf); } -- 2.25.1
[PATCH 1/4] softmmu/runstate.c: add RunStateTransition support form COLO to PRELAUNCH
If the checkpoint occurs when the guest finishes restarting but has not started running, the runstate_set() may reject the transition from COLO to PRELAUNCH with the crash log: {"timestamp": {"seconds": 1593484591, "microseconds": 26605},\ "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}} qemu-system-x86_64: invalid runstate transition: 'colo' -> 'prelaunch' Long-term testing says that it's pretty safe. Signed-off-by: Like Xu Signed-off-by: Zhang Chen --- softmmu/runstate.c | 1 + 1 file changed, 1 insertion(+) diff --git a/softmmu/runstate.c b/softmmu/runstate.c index e0d869b21a..c021c56338 100644 --- a/softmmu/runstate.c +++ b/softmmu/runstate.c @@ -127,6 +127,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH }, { RUN_STATE_COLO, RUN_STATE_RUNNING }, +{ RUN_STATE_COLO, RUN_STATE_PRELAUNCH }, { RUN_STATE_COLO, RUN_STATE_SHUTDOWN}, { RUN_STATE_RUNNING, RUN_STATE_DEBUG }, -- 2.25.1