Peter Xu <pet...@redhat.com> writes:

> On Thu, Jul 06, 2023 at 02:33:42PM -0300, Fabiano Rosas wrote:
>> Peter Xu <pet...@redhat.com> writes:
>> 
>> > On Thu, Jul 06, 2023 at 10:50:34AM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <pet...@redhat.com> writes:
>> >> 
>> >> > On Wed, Jul 05, 2023 at 07:05:13PM -0300, Fabiano Rosas wrote:
>> >> >> Peter Xu <pet...@redhat.com> writes:
>> >> >> 
>> >> >> > Provide an explicit reason for qemu_file_shutdown()s, which can be
>> >> >> > displayed in query-migrate when used.
>> >> >> >
>> >> >> 
>> >> >> Can we consider this to cover the TODO:
>> >> >> 
>> >> >>  * TODO: convert to propagate Error objects instead of squashing
>> >> >>  * to a fixed errno value
>> >> >> 
>> >> >> or would that need something fancier?
>> >> >
>> >> > The TODO seems to say we want to allow qemu_file_shutdown() to report an
>> >> > Error* when anything wrong happened (e.g. shutdown() failed)?  While this
>> >> > patch was trying to store a specific error string so when query migration
>> >> > later it'll show up to the user.  If so, IMHO they're two things.
>> >> >
>> >> 
>> >> Ok, just making sure.
>> >> 
>> >> >> 
>> >> >> > This will make e.g. migrate-pause to display explicit error descriptions,
>> >> >> > from:
>> >> >> >
>> >> >> > "error-desc": "Channel error: Input/output error"
>> >> >> >
>> >> >> > To:
>> >> >> >
>> >> >> > "error-desc": "Channel is explicitly shutdown by the user"
>> >> >> >
>> >> >> > in query-migrate.
>> >> >> >
>> >> >> > Signed-off-by: Peter Xu <pet...@redhat.com>
>> >> >> > ---
>> >> >> >  migration/qemu-file.c | 5 ++++-
>> >> >> >  1 file changed, 4 insertions(+), 1 deletion(-)
>> >> >> >
>> >> >> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> >> >> > index 419b4092e7..ff605027de 100644
>> >> >> > --- a/migration/qemu-file.c
>> >> >> > +++ b/migration/qemu-file.c
>> >> >> > @@ -87,7 +87,10 @@ int qemu_file_shutdown(QEMUFile *f)
>> >> >> >       *      --> guest crash!
>> >> >> >       */
>> >> >> >      if (!f->last_error) {
>> >> >> > -        qemu_file_set_error(f, -EIO);
>> >> >> > +        Error *err = NULL;
>> >> >> > +
>> >> >> > +        error_setg(&err, "Channel is explicitly shutdown by the user");
>> >> >> 
>> >> >> It is good that we can grep this message. However, I'm confused about
>> >> >> who the "user" is meant to be here and how are they implicated in this
>> >> >> error.
>> >> >
>> >> > Ah, here the user is who sends the "migrate-pause" command, according to
>> >> > the example of the commit message.
>> >> >
>> >> 
>> >> That's where I'm confused. There are 15 callsites for
>> >> qemu_file_shutdown(). Only 2 of them are from migrate-pause. So I'm
>> >> missing the logical step that links migrate-pause with this
>> >> error_setg().
>> >> Are you assuming that the race described will only happen
>> >> with migrate-pause and the other invocations would have set an error
>> >> already?
>> >
>> > It's not a race, but I think you're right. I thought it was always the case
>> 
>> I'm talking about the race with another thread checking f->last_error
>> and this thread setting it. Described in commit f5816b5c86ed
>> ("migration: Fix race on qemu_file_shutdown()").
>
> I don't yet catch your point, sorry.  I thought f5816b5c86ed closed that
> race.  What's still missing?
>

I was initially trying to ask if your previous knowledge about the
situation that caused the race could allow you to infer that the error
message would only be relevant in the migrate-pause scenario. But I now
understand that is not the case.

>> 
>> > to shut but actually not: we do shutdown() also in a few places where we
>> > don't really fail, either for COLO or for completion of migration.  With
>> > the 1st patch, it'll even show in query-migrate.  Thanks for spotting it -
>> > I could have done better.
>> >
>> 
>> The idea is that we avoid doing IO after the file has been shutdown, so
>> we preload this -EIO error. We could just alter the message to "Channel
>> has been explicitly shutdown" or "Tried to do IO after channel
>> shutdown". It would still be better than the generic EIO message.
>
> My point is I'm afraid (I thought after you pointed out, but maybe I just
> misread what you said..) we'll call qemu_file_shutdown() even in normal
> paths, so we can see an error popped up in query-migrate even if nothing
> wrong happened. I think that's unwanted.
>

I see. My point was that the error message wouldn't always match the
situation in which qemu_file_shutdown() was called. The fact that we
might not even want the error message at all had not crossed my mind.

> We can still improve that msg by only setting that specific error in e.g.
> qmp_migrate_pause|cancel() or paths where we know we want to set the error,
> but I'd rather drop the patch first so the rest of the patches can be
> reviewed and merged first; that'll be a cosmetic change.

Ok, I agree. Thanks for the clarification.
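
For the record, a rough sketch of how I understand the follow-up idea: the
generic shutdown keeps preloading a plain -EIO so later IO is refused, and
only the paths that know the user asked for it (e.g. migrate-pause) record
the descriptive message beforehand. This is a toy model, not QEMU code:
ToyFile and every toy_* function are hypothetical names made up for
illustration, and the "first error wins" behaviour is an assumption modelled
on the !f->last_error check in the quoted hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for QEMUFile; NOT the real QEMU structure. */
typedef struct ToyFile {
    int last_error;          /* 0 means no error recorded yet */
    const char *error_desc;  /* optional description, NULL if none */
    int shut;                /* non-zero once the channel is shut down */
} ToyFile;

/* Record an error only if none is set yet (first error wins). */
static void toy_file_set_error(ToyFile *f, int err, const char *desc)
{
    assert(f != NULL);
    if (!f->last_error) {
        f->last_error = err;
        f->error_desc = desc;
    }
}

/* Generic shutdown: preload -EIO, with no caller-specific message. */
static void toy_file_shutdown(ToyFile *f)
{
    assert(f != NULL);
    f->shut = 1;
    toy_file_set_error(f, -EIO, NULL);
}

/* User-requested path: set the descriptive message, then shut down. */
static void toy_migrate_pause(ToyFile *f)
{
    toy_file_set_error(f, -EIO,
                       "Channel is explicitly shutdown by the user");
    toy_file_shutdown(f);
}
```

With this split, a shutdown from COLO or from migration completion would
never carry the "by the user" text; only the pause (or cancel) path would.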
