On Fri, May 03, 2024 at 06:31:06PM -0300, Fabiano Rosas wrote:
> Peter Xu <pet...@redhat.com> writes:
> 
> > On Fri, May 03, 2024 at 04:56:08PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <pet...@redhat.com> writes:
> >> 
> >> > On Fri, Apr 26, 2024 at 11:20:35AM -0300, Fabiano Rosas wrote:
> >> >> When the migration using the "file:" URI was implemented, I don't
> >> >> think any of us noticed that if you pass in a file name with the
> >> >> format "/dev/fdset/N", this allows a file descriptor to be passed in
> >> >> to QEMU and that behaves just like the "fd:" URI. So the "file:"
> >> >> support has been added without regard for the fdset part and we got
> >> >> some things wrong.
> >> >> 
> >> >> The first issue is that we should not truncate the migration file if
> >> >> we're allowing an fd + offset. We need to leave the file contents
> >> >> untouched.
> >> >
> >> > I'm wondering whether we can use fallocate() on the ranges instead, so
> >> > that we never open() with O_TRUNC.  Before that..  could you remind me
> >> > why we need to truncate in the first place?  I definitely missed
> >> > something else here too.
> >> 
> >> AFAIK, just to avoid any issues if the file is pre-existing. I don't see
> >> the difference between O_TRUNC and fallocate in this case.
> >
> > Then shall we avoid truncating at all, and leave that choice to the
> > user (who may also get it wrong)?
> >
> 
> Is this a big deal? I'd rather close that possible gap and avoid the bug
> reports.

There's no chance of such a report if the user goes through Libvirt or any
higher-level virt stack, am I right?  This only affects whoever uses QEMU
directly, and only if they forgot to remove a leftover image file?

I'd not worry about those people who use QEMU directly - they aren't the
people we need to care too much about, imho (and I'm definitely one of
them..).  The problem is that I feel it's overkill to introduce a migration
global var just for this purpose.

No strong opinion; if you feel strongly about it I'm ok with it.  But if
one day we want to remove FileOutgoingArgs, I'll also leave that to you
as a trade-off. :-)
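
To make the truncation question concrete, here is a minimal standalone sketch
(not QEMU code) of what happens to pre-existing bytes below the write offset,
with and without O_TRUNC.  The helper names write_at_offset() and
read_byte_at() are made up for illustration, and the code assumes a POSIX
system with pwrite()/pread().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): open 'path' for writing,
 * optionally with O_TRUNC, and write 'len' bytes at 'offset'.
 * Returns 0 on success, -1 on failure.
 */
static int write_at_offset(const char *path, off_t offset,
                           const void *buf, size_t len, bool truncate)
{
    int flags = O_WRONLY | O_CREAT | (truncate ? O_TRUNC : 0);
    ssize_t n;
    int fd;

    assert(path != NULL && buf != NULL);

    fd = open(path, flags, 0600);
    if (fd < 0) {
        return -1;
    }

    /* pwrite() leaves the bytes outside [offset, offset + len) alone */
    n = pwrite(fd, buf, len, offset);

    if (close(fd) < 0 || n != (ssize_t)len) {
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper: return the byte stored at 'offset', or -1 if it
 * cannot be read (e.g. the file is shorter than that).
 */
static int read_byte_at(const char *path, off_t offset)
{
    unsigned char c;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    n = pread(fd, &c, 1, offset);
    close(fd);

    return n == 1 ? c : -1;
}
```

Without O_TRUNC, whatever the user already stored before the offset survives
the write; with O_TRUNC it is discarded and the range below the offset reads
back as zeroes.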

> 
> >> 
> >> >
> >> >> 
> >> >> The second issue is that there's an expectation that QEMU removes the
> >> >> fd after the migration has finished. That's what the "fd:" code
> >> >> does. Otherwise a second migration on the same VM could attempt to
> >> >> provide an fdset with the same name and QEMU would reject it.
> >> >
> >> > Let me check what we do with "fd:" when migration completes or
> >> > cancels.
> >> >
> >> > IIUC it's qio_channel_file_close() that does the final cleanup work on
> >> > e.g. to_dst_file, right?  Then there's qemu_close(), and it has:
> >> >
> >> >     /* Close fd that was dup'd from an fdset */
> >> >     fdset_id = monitor_fdset_dup_fd_find(fd);
> >> >     if (fdset_id != -1) {
> >> >         int ret;
> >> >
> >> >         ret = close(fd);
> >> >         if (ret == 0) {
> >> >             monitor_fdset_dup_fd_remove(fd);
> >> >         }
> >> >
> >> >         return ret;
> >> >     }
> >> >
> >> > Shouldn't this have done the work already?
> >> 
> >> That removes the mon_fdset_fd_dup->fd, we want to remove the
> >> mon_fdset_fd->fd.
> >
> > What I read so far is when we are removing the dup-fds, we'll do one more
> > thing:
> >
> > monitor_fdset_dup_fd_find_remove():
> >                     if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
> >                         monitor_fdset_cleanup(mon_fdset);
> >                     }
> >
> > It means if we removed all the dup-fds correctly, we should also remove the
> > whole fdset, which includes the ->fds, IIUC.
> >
> 
> Since mon_fdset_fd->removed == false, we hit the runstate_is_running()
> problem. I'm not sure, but probably mon_refcount > 0 as well. So the fd
> would not be removed.
> 
> But I'll retest this on Monday just to be sure, it's been a while since I
> wrote some parts of this.

Thanks.  And I hope we can also get some more clues when you dig more out
of the whole add-fd API; I hope we don't pile up more complicated logic on
top of a mystery.  I feel like this is the time we figure things out.
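
For reference, here is a toy model of the condition as I understand it from
the discussion above.  It is not the real monitor code: the struct and the
function name are made up, and the exact check in QEMU may well differ,
which is part of what needs confirming.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state discussed in this thread. */
struct fdset_state {
    bool removed;        /* mon_fdset_fd->removed */
    bool dup_fds_empty;  /* QLIST_EMPTY(&mon_fdset->dup_fds) */
    int mon_refcount;    /* number of connected monitors */
    bool vm_running;     /* runstate_is_running() */
};

/*
 * Assumed rule: the original fd is closed only when it was explicitly
 * removed, or when nothing references the set any more, and in both
 * cases only while the VM is running.
 */
static bool fdset_fd_would_close(const struct fdset_state *s)
{
    bool unused;

    assert(s != NULL);
    unused = s->dup_fds_empty && s->mon_refcount == 0;

    return (s->removed || unused) && s->vm_running;
}
```

If that model is right, closing every dup'd fd is not enough while a monitor
is still connected, and nothing is closed at all while the VM is not running.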

-- 
Peter Xu

