On Wed, Apr 24, 2024 at 10:50:14PM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 24.04.24 22:40, Peter Xu wrote: > > On Wed, Apr 24, 2024 at 08:42:44PM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > 1. Most of callers want to free the error after call. Let's help them. > > > > > > 2. Some callers log the error, some not. We also have places where we > > > log the stored error. Let's instead simply log every migration > > > error. > > > > > > 3. Some callers have to retrieve current migration state only to pass > > > it to migrate_set_error(). In the new helper let's get the state > > > automatically. > > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> > > > --- > > > migration/migration.c | 48 ++++++++++++---------------------------- > > > migration/migration.h | 2 +- > > > migration/multifd.c | 18 ++++++--------- > > > migration/postcopy-ram.c | 3 +-- > > > migration/savevm.c | 16 +++++--------- > > > 5 files changed, 28 insertions(+), 59 deletions(-) > > > > > > diff --git a/migration/migration.c b/migration/migration.c > > > index 696762bc64..806b7b080b 100644 > > > --- a/migration/migration.c > > > +++ b/migration/migration.c > > > @@ -285,7 +285,7 @@ void migration_bh_schedule(QEMUBHFunc *cb, void > > > *opaque) > > > void migration_cancel(const Error *error) > > > { > > > if (error) { > > > - migrate_set_error(current_migration, error); > > > + migrate_report_err(error_copy(error)); > > > } > > > if (migrate_dirty_limit()) { > > > qmp_cancel_vcpu_dirty_limit(false, -1, NULL); > > > @@ -779,13 +779,6 @@ process_incoming_migration_co(void *opaque) > > > } > > > if (ret < 0) { > > > - MigrationState *s = migrate_get_current(); > > > - > > > - if (migrate_has_error(s)) { > > > - WITH_QEMU_LOCK_GUARD(&s->error_mutex) { > > > - error_report_err(s->error); > > > - } > > > - } > > > error_report("load of migration failed: %s", strerror(-ret)); > > > goto fail; > > > } > > > @@ -1402,10 +1395,6 @@ static void migrate_fd_cleanup(MigrationState *s) > > > MIGRATION_STATUS_CANCELLED); > > > } > > > - if (s->error) { > > > - /* It is used on info migrate. We can't free it */ > > > - error_report_err(error_copy(s->error)); > > > - } > > > type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED : > > > MIG_EVENT_PRECOPY_DONE; > > > migration_call_notifiers(s, type, NULL); > > > @@ -1418,12 +1407,14 @@ static void migrate_fd_cleanup_bh(void *opaque) > > > migrate_fd_cleanup(opaque); > > > } > > > -void migrate_set_error(MigrationState *s, const Error *error) > > > +void migrate_report_err(Error *error) > > > { > > > + MigrationState *s = migrate_get_current(); > > > > Avoid passing in *s looks ok. > > > > > QEMU_LOCK_GUARD(&s->error_mutex); > > > if (!s->error) { > > > s->error = error_copy(error); > > > > I think I get your point, but then looks like this error_copy() should be > > removed but forgotten? > > > > I remember I had an attempt to do similarly (but only transfer the > > ownership): > > > > https://lore.kernel.org/qemu-devel/20230829214235.69309-3-pet...@redhat.com/ > > > > However I gave up later and I forgot why. One reason could be that I hit a > > use-after-free, then found that well indeed leaking an Error is much better > > than debugging a UAF. > > Hmm, yes I saw a leaked Error somewhere, and this patch should fix it. But > may be I missed introducing this use-after-free again)
Ah do you mean you fixed a bug? If so please use a standalone patch for that and we'll need to copy stable. I did notice that in this series on patch looks like does more than one thing. It'll be helpful too for reviewers when patch can be split properly. > > > > > So maybe we simply keep it like before? If you like such change, let's > > just be extremely caucious. > > > > > } > > > + error_report_err(error); > > > > The ideal case to me is we don't trigger an error_report() all over the > > place. We're slightly going backward from that regard, IMHO.. > > > > Ideally I think we shouldn't dump anything to stderr, but user should > > always query from qmp. > > > > Sounds reasonable to me. I'm just unsure, if keep it like before, how > should I correctly update logging to stderr in > process_incoming_migration_co(). Could we just drop error reporting to > stderr? Or should it be kept as is for the case when exit_on_error is > true? I'm not sure I get the question, but I'll comment in patch 2 very soon, so we can move the discussion there. Thanks, -- Peter Xu