On Thu, Jul 06, 2023 at 02:33:42PM -0300, Fabiano Rosas wrote:
> Peter Xu <pet...@redhat.com> writes:
>
> > On Thu, Jul 06, 2023 at 10:50:34AM -0300, Fabiano Rosas wrote:
> >> Peter Xu <pet...@redhat.com> writes:
> >>
> >> > On Wed, Jul 05, 2023 at 07:05:13PM -0300, Fabiano Rosas wrote:
> >> >> Peter Xu <pet...@redhat.com> writes:
> >> >>
> >> >> > Provide an explicit reason for qemu_file_shutdown()s, which can be
> >> >> > displayed in query-migrate when used.
> >> >> >
> >> >>
> >> >> Can we consider this to cover the TODO:
> >> >>
> >> >>  * TODO: convert to propagate Error objects instead of squashing
> >> >>  * to a fixed errno value
> >> >>
> >> >> or would that need something fancier?
> >> >
> >> > The TODO seems to say we want to allow qemu_file_shutdown() to report an
> >> > Error* when anything wrong happened (e.g. shutdown() failed)?  While this
> >> > patch was trying to store a specific error string so when query migration
> >> > later it'll show up to the user.  If so, IMHO they're two things.
> >> >
> >>
> >> Ok, just making sure.
> >>
> >> >>
> >> >> > This will make e.g. migrate-pause to display explicit error
> >> >> > descriptions,
> >> >> > from:
> >> >> >
> >> >> >   "error-desc": "Channel error: Input/output error"
> >> >> >
> >> >> > To:
> >> >> >
> >> >> >   "error-desc": "Channel is explicitly shutdown by the user"
> >> >> >
> >> >> > in query-migrate.
> >> >> >
> >> >> > Signed-off-by: Peter Xu <pet...@redhat.com>
> >> >> > ---
> >> >> >  migration/qemu-file.c | 5 ++++-
> >> >> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >> >> >
> >> >> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> >> >> > index 419b4092e7..ff605027de 100644
> >> >> > --- a/migration/qemu-file.c
> >> >> > +++ b/migration/qemu-file.c
> >> >> > @@ -87,7 +87,10 @@ int qemu_file_shutdown(QEMUFile *f)
> >> >> >       *      --> guest crash!
> >> >> >       */
> >> >> >      if (!f->last_error) {
> >> >> > -        qemu_file_set_error(f, -EIO);
> >> >> > +        Error *err = NULL;
> >> >> > +
> >> >> > +        error_setg(&err, "Channel is explicitly shutdown by the
> >> >> > user");
> >> >>
> >> >> It is good that we can grep this message. However, I'm confused about
> >> >> who the "user" is meant to be here and how are they implicated in this
> >> >> error.
> >> >
> >> > Ah, here the user is who sends the "migrate-pause" command, according to
> >> > the example of the commit message.
> >> >
> >>
> >> That's where I'm confused. There are 15 callsites for
> >> qemu_file_shutdown(). Only 2 of them are from migrate-pause. So I'm
> >> missing the logical step that links migrate-pause with this
> >> error_setg().
> >> Are you assuming that the race described will only happen
> >> with migrate-pause and the other invocations would have set an error
> >> already?
> >
> > It's not a race, but I think you're right.  I thought it was always the case
>
> I'm talking about the race with another thread checking f->last_error
> and this thread setting it. Described in commit f5816b5c86ed
> ("migration: Fix race on qemu_file_shutdown()").

I don't quite catch your point yet, sorry.  I thought f5816b5c86ed closed
that race.  What's still missing?

>
> > to shut but actually not: we do shutdown() also in a few places where we
> > don't really fail, either for COLO or for completion of migration.  With
> > the 1st patch, it'll even show in query-migrate.  Thanks for spotting it -
> > I could have done better.
> >
>
> The idea is that we avoid doing IO after the file has been shutdown, so
> we preload this -EIO error. We could just alter the message to "Channel
> has been explicitly shutdown" or "Tried to do IO after channel
> shutdown". It would still be better than the generic EIO message.

My concern (which I thought was what you pointed out, but maybe I just
misread you) is that we call qemu_file_shutdown() even in normal paths, so
an error could pop up in query-migrate even when nothing went wrong.  I
think that's unwanted.

We can still improve the message by setting that specific error only in
e.g. qmp_migrate_pause|cancel(), or in other paths where we know we want to
set the error.  But I'd rather drop this patch for now so the rest of the
patches can be reviewed and merged first; that'll be a cosmetic change.

-- 
Peter Xu
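
For illustration only, here is a minimal standalone sketch of the alternative
discussed above: the generic shutdown path keeps preloading -EIO so that later
IO fails fast, while only an explicit user-requested path records a specific
reason.  This is not QEMU code.  MockFile and every mock_* function are
hypothetical stand-ins, and the first-error-wins rule is an assumption of the
sketch, not something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMUFile; not the real QEMU structure. */
typedef struct MockFile {
    int last_error;         /* 0 means no error has been recorded */
    const char *error_desc; /* optional specific reason, or NULL */
    int shut;               /* non-zero once the channel is shut down */
} MockFile;

/* Record an error only if none is set yet, so the first cause wins. */
static void mock_file_set_error(MockFile *f, int err, const char *desc)
{
    assert(f != NULL);
    if (f->last_error == 0) {
        f->last_error = err;
        f->error_desc = desc;
    }
}

/* Generic shutdown: preload -EIO so any later IO attempt fails fast. */
static void mock_file_shutdown(MockFile *f)
{
    f->shut = 1;
    mock_file_set_error(f, -EIO, NULL);
}

/*
 * Explicit user request (think migrate-pause or cancel): set the specific
 * reason first, then shut down.  The generic -EIO preload becomes a no-op
 * because an error is already recorded.
 */
static void mock_user_pause(MockFile *f)
{
    mock_file_set_error(f, -EIO, "Channel has been explicitly shutdown");
    mock_file_shutdown(f);
}

/* What a status query would report: specific reason if any, else generic. */
static const char *mock_query_error_desc(const MockFile *f)
{
    if (f->last_error == 0) {
        return NULL;
    }
    return f->error_desc ? f->error_desc
                         : "Channel error: Input/output error";
}
```

With this split, a file shut down through mock_user_pause() reports the
specific message, while one shut down through mock_file_shutdown() alone still
reports the generic one.  The sketch does not address whether normal-path
shutdowns should surface any error at all in a status query.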