On Thu, Apr 16, 2015 at 02:28:52PM +0800, zhang bo wrote:
On 2015/4/10 15:54, Jiri Denemark wrote:

On Wed, Apr 08, 2015 at 15:40:36 +0800, zhang bo wrote:
We recently encountered a problem:
   1) migrate a domain
   2) the client (say, the virsh command) unexpectedly *crashed*
   3) *libvirtd still kept migrating the domain*
   4) after it was restarted, the client didn't know the guest was still migrating.

The problem is that libvirtd and the client have different views of the task
state. After migration, the client may wrongly conclude that something went
wrong because the domain was unexpectedly migrated.

In my opinion, libvirtd should just *execute* tasks, like the hands of a human,
while clients should be the brain to *schedule and remember* tasks.

So, in order to avoid this problem, we should let the client record all the
tasks somewhere and reload the states after its restart. The client may then
cancel or continue the task as it wishes.
Libvirtd should not record the task status.

Not really. It's libvirtd, the daemon, which has to remember everything.
It manages the state of all domains running on a host and synchronizes
all clients that want to change state of the domains. Remember, even if
a client is not restarted, domains may unexpectedly migrate somewhere
else because another client might have asked for it.

That said, if you're implementing a higher management layer which
manages domains using libvirt and you know it is going to be the only
client talking directly to libvirt, you can remember the state there if
you want. However, it's not something libvirt itself should or could do.
But you will most likely need to synchronize the state with libvirtd in
case the client is restarted. Even libvirtd has to synchronize its
internal state with all running QEMU processes when it is restarted
because the state might have changed.
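
As a rough illustration of the synchronization described above, the sketch
below reconciles what a management client recorded before a restart with what
the daemon reports afterwards. Everything in it is hypothetical: `reconcile`,
the note strings and the dict layout are invented for the example, and the
`observed` mapping stands in for whatever the client would gather from
libvirtd after reconnecting. It is not libvirt or nova code.

```python
# Hypothetical sketch: reconcile a client's recorded view with the daemon's.
# `recorded` maps domain UUID -> host the client believes the domain runs on.
# `observed` maps domain UUID -> host on which libvirtd reports it running.

def reconcile(recorded, observed):
    """Return (new_view, notes); the daemon's report always wins."""
    new_view = dict(observed)
    notes = []
    for uuid, host in sorted(observed.items()):
        old = recorded.get(uuid)
        if old is None:
            notes.append((uuid, "discovered on " + host))
        elif old != host:
            notes.append((uuid, "moved from " + old + " to " + host))
    # Domains the client remembered but the daemon no longer reports.
    for uuid in sorted(set(recorded) - set(observed)):
        notes.append((uuid, "no longer running"))
    return new_view, notes


recorded = {"vm-a": "src", "vm-b": "src"}   # state saved before the restart
observed = {"vm-a": "dst", "vm-c": "src"}   # state reported after reconnecting
view, notes = reconcile(recorded, observed)
for uuid, note in notes:
    print(uuid, note)
```

The point of the design is that the client's saved state is only a hint: any
disagreement is resolved in favour of what the daemon reports.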

Jirka

Thank you Jirka.

Let's go a step further: suppose that the client doesn't crash at step 2), but
is just disconnected from libvirtd on the src side.
  1) client(nova) calls virDomainMigrateToURI2() to migrate a guest
  2) libvirtd at src side connects to libvirtd at dest side.
  3) Unfortunately, somehow, client(nova) gets disconnected from libvirtd while
migrating the guest.
  4) the API virDomainMigrateToURI2() returns with an error in client(nova)
  5) but libvirtd isn't aware that the connection to the client is broken, and
keeps migrating the guest to dest.

libvirtd is aware of that, but that doesn't mean it should stop the
migration. If the virDomainMigrateToURI2() call got through the wire,
the migration has started.

  6) the guest is eventually migrated to the dest side.
  7) Because nova on the src side thinks the migration did not succeed in step
4), nova on the dest side will consider the migrated-in guest an unexpected
running guest, and will shut it down.


nova knows the exact error that occurred and thus it can differentiate
between "error: Cannot migrate because 'asdf'" and "error: XML-RPC:
connection broken" or whatever.  If the connection was broken nova
must get all new info (refresh its knowledge state) from libvirt upon
new connection.
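
A minimal sketch of that distinction, under stated assumptions: the two
exception classes and `migrate_and_classify` are hypothetical stand-ins, not
libvirt or nova names. With the libvirt Python bindings the distinction would
typically be drawn by inspecting a `libvirtError` via `get_error_code()` and
`get_error_domain()`; which values indicate a broken connection is left out
here rather than guessed.

```python
# Hypothetical sketch: treat a broken connection as "outcome unknown" and
# refresh from the daemons instead of assuming the migration failed.

class MigrationRefused(Exception):
    """Stand-in for an error the daemon reported about the migration itself."""


class ConnectionLost(Exception):
    """Stand-in for a transport failure while the call was in flight."""


def migrate_and_classify(do_migrate, query_host, uuid, src, dst):
    """Return the host the domain should be recorded on after the attempt.

    do_migrate(uuid, dst) performs the call and may raise either exception.
    query_host(uuid) asks the daemons, over a new connection, where the
    domain is running now.
    """
    try:
        do_migrate(uuid, dst)
    except MigrationRefused:
        return src                 # definite failure: the guest stayed put
    except ConnectionLost:
        return query_host(uuid)    # unknown outcome: ask, do not guess
    return dst


def lost(uuid, dst):
    raise ConnectionLost("connection broken")


def refused(uuid, dst):
    raise MigrationRefused("cannot migrate")


where = {"vm-a": "dst"}   # what the daemons would report after reconnecting
print(migrate_and_classify(lost, where.get, "vm-a", "src", "dst"))
print(migrate_and_classify(refused, where.get, "vm-a", "src", "dst"))
```

In the disconnected case the recorded location comes from the refreshed view,
so a guest that did arrive on the destination is not treated as unexpected.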

  In the end the guest disappears because of the earlier disconnection between
the libvirtd client and server.
  Even though libvirtd remembers everything, the client on the dest side still
wrongly kills the guest after migration.

So how can this problem be solved? Should libvirtd keep watching its clients'
connections and cancel running jobs belonging to a disconnected client as soon
as that client disconnects?


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
