In migration_maybe_pause we have: migrate_set_state(&s->state, *current_active_state, MIGRATION_STATUS_PRE_SWITCHOVER); qemu_sem_wait(&s->pause_sem); migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER, new_state);
the line numbers don't match my 2.12.0 checkout; so I guess that it's that qemu_sem_wait it's stuck at. QEMU must have sent the switch to PRE_SWITCHOVER and that should have sent an event to libvirt, and libvirt should notice that - I'm not sure how to tell whether libvirt has seen that event yet or not? Thank you for your attention. Yes, you are right, QEMU wait semaphore in this place. I use qemu-2.12.1, libvirt-4.0.0. Because I added some debug code, so the line numbers doesn't match open qemu -----邮件原件----- 发件人: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com] 发送时间: 2019年8月21日 19:13 收件人: yuchen (Cloud) <yu.c...@h3c.com>; jdene...@redhat.com 抄送: qemu-devel@nongnu.org 主题: Re: [Qemu-devel] [BUG] living migrate vm pause forever * Yuchen (yu.c...@h3c.com) wrote: > Sometimes, living migrate vm pause forever, migrate job stop, but very small > probability, I can’t reproduce. > qemu wait semaphore from libvirt send migrate continue, however libvirt wait > semaphore from qemu send vm pause. Hi, I've copied in Jiri Denemark from libvirt. Can you confirm exactly which qemu and libvirt versions you're using please. > follow stack: > qemu: > Thread 6 (Thread 0x7f50445f3700 (LWP 18120)): > #0 0x00007f504b84d670 in sem_wait () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #1 0x00005574eda1e164 in qemu_sem_wait (sem=sem@entry=0x5574ef6930e0) > at qemu-2.12/util/qemu-thread-posix.c:322 > #2 0x00005574ed8dd72e in migration_maybe_pause (s=0x5574ef692f50, > current_active_state=0x7f50445f2ae4, new_state=10) > at qemu-2.12/migration/migration.c:2106 > #3 0x00005574ed8df51a in migration_completion (s=0x5574ef692f50) at > qemu-2.12/migration/migration.c:2137 > #4 migration_iteration_run (s=0x5574ef692f50) at > qemu-2.12/migration/migration.c:2311 > #5 migration_thread (opaque=0x5574ef692f50) > atqemu-2.12/migration/migration.c:2415 > #6 0x00007f504b847184 in start_thread () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #7 0x00007f504b574bed in clone () from > /lib/x86_64-linux-gnu/libc.so.6 In migration_maybe_pause we have: migrate_set_state(&s->state, *current_active_state, MIGRATION_STATUS_PRE_SWITCHOVER); qemu_sem_wait(&s->pause_sem); migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER, new_state); the line numbers don't match my 2.12.0 checkout; so I guess that it's that qemu_sem_wait it's stuck at. QEMU must have sent the switch to PRE_SWITCHOVER and that should have sent an event to libvirt, and libvirt should notice that - I'm not sure how to tell whether libvirt has seen that event yet or not? Dave > libvirt: > Thread 95 (Thread 0x7fdb82ffd700 (LWP 28775)): > #0 0x00007fdd177dc404 in pthread_cond_wait@@GLIBC_2.3.2 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #1 0x00007fdd198c3b07 in virCondWait (c=0x7fdbc4003000, > m=0x7fdbc4002f30) at ../../../src/util/virthread.c:252 > #2 0x00007fdd198f36d2 in virDomainObjWait (vm=0x7fdbc4002f20) at > ../../../src/conf/domain_conf.c:3303 > #3 0x00007fdd09ffaa44 in qemuMigrationRun (driver=0x7fdd000037b0, > vm=0x7fdbc4002f20, persist_xml=0x0, > cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n > <uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n <hostname>mss > </hostname>\n > <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., > cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, > flags=777, > resource=0, spec=0x7fdb82ffc670, dconn=0x0, graphicsuri=0x0, > nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990, > migParams=0x7fdb82ffc900) > at ../../../src/qemu/qemu_migration.c:3937 > #4 0x00007fdd09ffb26a in doNativeMigrate (driver=0x7fdd000037b0, > vm=0x7fdbc4002f20, persist_xml=0x0, uri=0x7fdb780073a0 > "tcp://172.16.202.17:49152", > cookiein=0x7fdb780084e0 "<qemu-migration>\n > <name>mss-pl_652</name>\n > <uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n > <hostname>mss</hostname>\n <hos---Type <return> to continue, or q > <return> to quit--- > tuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra".. > tuuid>., cookieinlen=207, cookieout=0x7fdb82ffcad0, > tuuid>cookieoutlen=0x7fdb82ffcac8, flags=777, > resource=0, dconn=0x0, graphicsuri=0x0, nmigrate_disks=0, > migrate_disks=0x0, compression=0x7fdb78007990, migParams=0x7fdb82ffc900) > at ../../../src/qemu/qemu_migration.c:4118 > #5 0x00007fdd09ffd808 in qemuMigrationPerformPhase (driver=0x7fdd000037b0, > conn=0x7fdb500205d0, vm=0x7fdbc4002f20, persist_xml=0x0, > uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0, > nmigrate_disks=0, migrate_disks=0x0, compression=0x7fdb78007990, > migParams=0x7fdb82ffc900, > cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n > <uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n > <hostname>mss</hostname>\n > <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., > cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, > flags=777, > resource=0) at ../../../src/qemu/qemu_migration.c:5030 > #6 0x00007fdd09ffdbb5 in qemuMigrationPerform (driver=0x7fdd000037b0, > conn=0x7fdb500205d0, vm=0x7fdbc4002f20, xmlin=0x0, persist_xml=0x0, > dconnuri=0x0, > uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", graphicsuri=0x0, > listenAddress=0x0, nmigrate_disks=0, migrate_disks=0x0, nbdPort=0, > compression=0x7fdb78007990, > migParams=0x7fdb82ffc900, > cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n > <uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n > <hostname>mss</hostname>\n > <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., > cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, > flags=777, > dname=0x0, resource=0, v3proto=true) at > ../../../src/qemu/qemu_migration.c:5124 > #7 0x00007fdd0a054725 in qemuDomainMigratePerform3 (dom=0x7fdb78007b00, > xmlin=0x0, > cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n > <uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n > <hostname>mss</hostname>\n > <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., > cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, > dconnuri=0x0, > uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, > dname=0x0, resource=0) at ../../../src/qemu/qemu_driver.c:12996 > #8 0x00007fdd199ad0f0 in virDomainMigratePerform3 (domain=0x7fdb78007b00, > xmlin=0x0, > cookiein=0x7fdb780084e0 "<qemu-migration>\n <name>mss-pl_652</name>\n > <uuid>1f2b2334-451e-424b-822a-ea10452abb38</uuid>\n > <hostname>mss</hostname>\n > <hostuuid>334e344a-4130-4336-5534-323544543642</hostuuid>\n</qemu-migra"..., > cookieinlen=207, cookieout=0x7fdb82ffcad0, cookieoutlen=0x7fdb82ffcac8, > dconnuri=0x0, > uri=0x7fdb780073a0 "tcp://172.16.202.17:49152", flags=777, > dname=0x0, bandwidth=0) at ../../../src/libvirt-domain.c:4698 > #9 0x000055d13923a939 in remoteDispatchDomainMigratePerform3 > (server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, > rerr=0x7fdb82ffcbc0, > args=0x7fdb7800b220, ret=0x7fdb78021e90) at > ../../../daemon/remote.c:4528 > #10 0x000055d13921a043 in remoteDispatchDomainMigratePerform3Helper > (server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620, > rerr=0x7fdb82ffcbc0, > args=0x7fdb7800b220, ret=0x7fdb78021e90) at > ../../../daemon/remote_dispatch.h:7944 > #11 0x00007fdd19a260b4 in virNetServerProgramDispatchCall > (prog=0x55d13af98b50, server=0x55d13af90e60, client=0x55d13b0156f0, > msg=0x55d13afbf620) > at ../../../src/rpc/virnetserverprogram.c:436 > #12 0x00007fdd19a25c17 in virNetServerProgramDispatch (prog=0x55d13af98b50, > server=0x55d13af90e60, client=0x55d13b0156f0, msg=0x55d13afbf620) > at ../../../src/rpc/virnetserverprogram.c:307 > #13 0x000055d13925933b in virNetServerProcessMsg (srv=0x55d13af90e60, > client=0x55d13b0156f0, prog=0x55d13af98b50, msg=0x55d13afbf620) > at ../../../src/rpc/virnetserver.c:148 > ---------------------------------------------------------------------- > --------------------------------------------------------------- > 本邮件及其附件含有新华三集团的保密信息,仅限于发送给上面地址中列出 > 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、 > 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本 > 邮件! > This e-mail and its attachments contain confidential information from > New H3C, which is intended only for the person or entity whose address > is listed above. Any use of the information contained herein in any > way (including, but not limited to, total or partial disclosure, > reproduction, or dissemination) by persons other than the intended > recipient(s) is prohibited. If you receive this e-mail in error, > please notify the sender by phone or email immediately and delete it! -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK