On 09/20/2011 03:24 PM, Juan Quintela wrote:
If we have an error while migrating, and then we issue a
"migrate_cancel" command, the guest hangs.  Fix it by flushing only when
migration is in MIG_STATE_ACTIVE.  In case of error or cancellation,
don't flush.

We had an infinite loop in buffered_close():

         while (!s->has_error && s->buffer_size) {
             buffered_flush(s);
             if (s->freeze_output)
                 s->wait_for_unfreeze(s);
         }

There were no errors, there were things to send, and the connection was
broken.  send() returns -EAGAIN, so we freeze the output, but then we
unfreeze it and try again.

Signed-off-by: Juan Quintela<quint...@redhat.com>

I don't like the idea of adding an extra argument to fix a bug elsewhere. I don't even consider this a safety net, since it still relies on the migration code setting the new argument correctly:

diff --git a/migration.c b/migration.c
index 9a93e3b..15d001e 100644
--- a/migration.c
+++ b/migration.c
@@ -288,7 +288,7 @@ int migrate_fd_cleanup(FdMigrationState *s)

     if (s->file) {
         DPRINTF("closing file\n");
-        if (qemu_fclose(s->file) != 0) {
+        if (qemu_fclose(s->file, s->state == MIG_STATE_ACTIVE) != 0) {
             ret = -1;
         }
         s->file = NULL;

Dan's patch is the right fix; this one can be dropped altogether.

If anything, you may consider making wait_for_unfreeze return an error code, and set has_error when wait_for_unfreeze returns an error. Alternatively, merge QEMUFile and BufferedFile's has_error flags, and use qemu_file_set_error when migration is canceled.
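
As a rough illustration of the first suggestion, here is a minimal standalone sketch, not the actual QEMU code. BufferedState, put_buffer and the two simulated callbacks are hypothetical stand-ins; the only idea taken from the text above is that wait_for_unfreeze returns an int and that a negative return sets has_error, which ends the loop in buffered_close().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the buffered file state; not the real struct. */
typedef struct BufferedState BufferedState;
struct BufferedState {
    size_t buffer_size;   /* bytes still waiting to be sent */
    bool freeze_output;   /* set when the transport reported -EAGAIN */
    int has_error;        /* 0, or a negative errno value */
    /* Assumed change: 0 when writable again, negative errno on failure. */
    int (*wait_for_unfreeze)(BufferedState *s);
    /* Hypothetical transport hook: bytes sent, or negative errno. */
    int (*put_buffer)(BufferedState *s, size_t len);
};

static void buffered_flush(BufferedState *s)
{
    int ret;

    s->freeze_output = false;
    ret = s->put_buffer(s, s->buffer_size);
    if (ret == -EAGAIN) {
        s->freeze_output = true;        /* try again once unfrozen */
    } else if (ret < 0) {
        s->has_error = ret;
    } else {
        s->buffer_size -= (size_t)ret;
    }
}

static int buffered_close(BufferedState *s)
{
    assert(s != NULL);
    while (!s->has_error && s->buffer_size) {
        buffered_flush(s);
        if (s->freeze_output) {
            int ret = s->wait_for_unfreeze(s);
            if (ret < 0) {
                s->has_error = ret;     /* terminates the loop */
            }
        }
    }
    return s->has_error;
}

/* Simulated broken connection: every send attempt reports -EAGAIN. */
static int broken_put_buffer(BufferedState *s, size_t len)
{
    (void)s;
    (void)len;
    return -EAGAIN;
}

/* Simulated wait that notices the failed or canceled migration. */
static int failing_wait(BufferedState *s)
{
    (void)s;
    return -EIO;
}
```

With broken_put_buffer and failing_wait installed, buffered_close() returns -EIO after one pass and leaves the pending data unsent, instead of spinning. If the wait callback returned 0 in the same situation, the loop would spin exactly as in the original code.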

Paolo
