I have tested the proxy with this QMP command:

{'execute': 'trace-event-set-state', 'arguments': {'name': 'colo*', 'enable': true} }

I got nothing except these logs on the PVM side:

[email protected]:colo_compare_main : secondary: unsupported packet in
[email protected]:colo_compare_main : secondary: unsupported packet in
[email protected]:colo_compare_main : secondary: unsupported packet in
[email protected]:colo_compare_main : primary: unsupported packet in
[email protected]:colo_compare_main : secondary: unsupported packet in

My guest OS is CentOS 7.5. I did nothing but boot the OS. Generating some network I/O afterwards still produces the same logs.

I did some debugging; there may be a parse error in parse_packet_early() that yields the wrong ETH_P_protocolName.

Thanks,
Zhengtao

From: Zhang, Chen <[email protected]>
Sent: March 5, 2019 23:32
To: wenzt <[email protected]>
Cc: 'qemu-discuss' <[email protected]>
Subject: RE: Latest Qemu-COLO Problems

From: wenzt [mailto:[email protected]]
Sent: Thursday, February 28, 2019 10:00 AM
To: Zhang, Chen <[email protected]>
Cc: 'qemu-discuss' <[email protected]>
Subject: Re: Latest Qemu-COLO Problems

This version: https://github.com/coloft/qemu/tree/colo-v4.1-periodic-mode

This is an old version from 3 years ago; please drop it and use the upstream QEMU code.

Another question: what is the relationship between the proxy and checkpoints? When the PVM and SVM send different network packets, the proxy signals the COLO frame to do a checkpoint. Do they work together? I guess we should set a longer checkpoint interval, like 20s.

Yes, they work together. In addition, we have a periodic checkpoint mechanism, like a timer. You can set the time too.

Does the proxy only work under network workload? In my test, it feels like the proxy is not working.

Yes, as the wiki says, colo-proxy compares the PVM and SVM packets to decide whether to do a checkpoint.
You can enable the COLO debug info to see the proxy’s work on the primary node like this:

{'execute': 'trace-event-set-state', 'arguments': {'name': 'colo*', 'enable': true} }

Thanks
Zhang Chen

From: Zhang, Chen <[email protected]>
Sent: February 28, 2019 9:34
To: wenzt <[email protected]>
Cc: 'qemu-discuss' <qemu-discuss@nongnu.org>
Subject: RE: Latest Qemu-COLO Problems

Which version? The COLO project has always said that the PVM and SVM execute in parallel.

Thanks
Zhang Chen

From: wenzt [mailto:[email protected]]
Sent: Thursday, February 28, 2019 9:21 AM
To: Zhang, Chen <[email protected]>
Cc: 'qemu-discuss' <[email protected]>
Subject: Re: Latest Qemu-COLO Problems

But in an earlier version, I noticed that the SVM always stays in inmigration status, even while doing a checkpoint. No operation can be performed on the SVM.

Thanks,
Zhengtao

From: Zhang, Chen <[email protected]>
Sent: February 27, 2019 18:45
To: wenzt <[email protected]>
Cc: 'qemu-discuss' <qemu-discuss@nongnu.org>
Subject: RE: Latest Qemu-COLO Problems

From: wenzt [mailto:[email protected]]
Sent: Wednesday, February 27, 2019 6:04 PM
To: Zhang, Chen <[email protected]>
Cc: 'qemu-discuss' <[email protected]>
Subject: Re: Latest Qemu-COLO Problems

Thanks for the help!

I don’t know why we keep switching the SVM between Run and Stop. Why don’t we keep the SVM in inmigration status?

Because we need to do checkpoints to sync all state between the PVM and SVM. We can’t guarantee that their state will stay the same after a while.
Thanks
Zhang Chen

Thanks,
Zhengtao

From: Zhang, Chen <[email protected]>
Sent: February 26, 2019 18:41
To: wenzt <[email protected]>
Cc: 'qemu-discuss' <qemu-discuss@nongnu.org>
Subject: RE: Latest Qemu-COLO Problems

By the way, please read the COLO wiki and use these commands to trigger failover on the secondary node:

{ 'execute': 'nbd-server-stop' }
{ "execute": "x-colo-lost-heartbeat" }

Thanks
Zhang Chen

From: Zhang, Chen
Sent: Tuesday, February 26, 2019 2:46 PM
To: 'wenzt' <[email protected]>
Cc: 'qemu-discuss' <[email protected]>
Subject: RE: Latest Qemu-COLO Problems

Sorry for the slow response.
I have fixed this bug in this series:
https://lists.nongnu.org/archive/html/qemu-devel/2019-02/msg06920.html
Please test it.

Thanks
Zhang Chen

From: wenzt [mailto:[email protected]]
Sent: Friday, February 15, 2019 7:54 PM
To: Zhang, Chen <[email protected]>
Cc: 'qemu-discuss' <[email protected]>
Subject: Latest Qemu-COLO Problems

Hi Zhang,

I have tested COLO with qemu-3.1.0 following https://wiki.qemu.org/Features/COLO

I got these problems on the PVM:

{"timestamp": {"seconds": 1550230616, "microseconds": 644348}, "event": "STOP"}
{"timestamp": {"seconds": 1550230616, "microseconds": 719003}, "event": "RESUME"}
{"timestamp": {"seconds": 1550230616, "microseconds": 743554}, "event": "STOP"}
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Can't receive COLO message: Input/output error
{"timestamp": {"seconds": 1550230618, "microseconds": 257209}, "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "error"}}

And on the SVM:

{"timestamp": {"seconds": 1550230616, "microseconds": 731544}, "event": "STOP"}
[email protected]:colo_vm_state_change Change 'run' => 'stop'
[email protected]:colo_send_message Send 'checkpoint-reply' message
[email protected]:colo_receive_message Receive 'vmstate-send' message
22555@1550230616.759522:colo_flush_ram_cache_begin dirty_pages 18446744073708498780
[email protected]:colo_flush_ram_cache_end
[email protected]:colo_receive_message Receive 'vmstate-size' message
[email protected]:colo_send_message Send 'vmstate-received' message
{"timestamp": {"seconds": 1550230616, "microseconds": 837436}, "event": "RESUME"}
qemu-system-x86_64: block.c:5062: bdrv_detach_aio_context: Assertion `!bs->walking_aio_notifiers' failed.
Aborted (core dumped)
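
For reference, the QMP commands quoted in this thread can be built and sent from a small script. The following is a minimal Python sketch, not part of the original messages. The helper names (qmp_command, colo_trace_command, failover_commands, encode, send) and any socket path are hypothetical, and it assumes QEMU exposes its QMP monitor on a Unix socket. The command names and arguments are the ones quoted above; qmp_capabilities is sent first because QMP does not accept other commands until capabilities negotiation is done.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Build one QMP command as a plain dict."""
    cmd = {"execute": name}
    if arguments is not None:
        cmd["arguments"] = arguments
    return cmd


def colo_trace_command(enable=True):
    # Command quoted in the thread: toggle every trace event matching 'colo*'.
    return qmp_command("trace-event-set-state",
                       {"name": "colo*", "enable": enable})


def failover_commands():
    # The two commands quoted in the thread for failover on the secondary node.
    return [qmp_command("nbd-server-stop"),
            qmp_command("x-colo-lost-heartbeat")]


def encode(commands):
    """Serialize commands as newline-terminated JSON, capabilities first."""
    session = [qmp_command("qmp_capabilities")] + list(commands)
    return [json.dumps(c) + "\n" for c in session]


def send(sock_path, commands):
    """Send commands to a QMP Unix socket (hypothetical path); return raw lines read."""
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        f = s.makefile("rw")
        replies.append(f.readline())  # greeting banner sent by QEMU on connect
        for line in encode(commands):
            f.write(line)
            f.flush()
            # Reads one line per command; this may be an asynchronous event
            # (such as STOP or RESUME) rather than the command's own reply.
            replies.append(f.readline())
    return replies
```

For example, send("/tmp/qmp-pvm.sock", [colo_trace_command()]) would enable the colo* trace events on the primary node, assuming QEMU was started with a QMP socket at that (made-up) path.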
