On 02/20/2017 12:15 PM, Kevin Wolf wrote: > Am 18.02.2017 um 11:54 hat Denis V. Lunev geschrieben: >> On 02/17/2017 03:54 PM, Vladimir Sementsov-Ogievskiy wrote: >>> 17.02.2017 17:24, Kevin Wolf wrote: >>>> Am 17.02.2017 um 14:48 hat Denis V. Lunev geschrieben: >>>>> On 02/17/2017 04:34 PM, Kevin Wolf wrote: >>>>>> Am 17.02.2017 um 14:22 hat Denis V. Lunev geschrieben: >>>>>>> But for sure this is bad from the downtime point of view. >>>>>>> On migrate you will have to write to the image and re-read >>>>>>> it again on the target. This would be very slow. This will >>>>>>> not help for the migration with non-shared disk too. >>>>>>> >>>>>>> That is why we have specifically worked in a migration, >>>>>>> which for a good does not influence downtime at all now. >>>>>>> >>>>>>> With a write we are issuing several write requests + sync. >>>>>>> Our measurements shows that bdrv_drain could take around >>>>>>> a second on an averagely loaded conventional system, which >>>>>>> seems unacceptable addition to me. >>>>>> I'm not arguing against optimising migration, I fully agree with >>>>>> you. I >>>>>> just think that we should start with a correct if slow base version >>>>>> and >>>>>> then add optimisation to that, instead of starting with a broken base >>>>>> version and adding to that. >>>>>> >>>>>> Look, whether you do the expensive I/O on open/close and make that a >>>>>> slow operation or whether you do it on invalidate_cache/inactivate >>>>>> doesn't really make a difference in term of slowness because in >>>>>> general >>>>>> both operations are called exactly once. But it does make a difference >>>>>> in terms of correctness. >>>>>> >>>>>> Once you do the optimisation, of course, you'll skip writing those >>>>>> bitmaps that you transfer using a different channel, no matter whether >>>>>> you skip it in bdrv_close() or in bdrv_inactivate(). >>>>>> >>>>>> Kevin >>>>> I do not understand this point as in order to optimize this >>>>> we will have to create specific code path or option from >>>>> the migration code and keep this as an ugly kludge forever. >>>> The point that I don't understand is why it makes any difference for the >>>> follow-up migration series whether the writeout is in bdrv_close() or >>>> bdrv_inactivate(). I don't really see the difference between the two >>>> from a migration POV; both need to be skipped if we transfer the bitmap >>>> using a different channel. >>>> >>>> Maybe I would see the reason if I could find the time to look at the >>>> migration patches first, but unfortunately I don't have this time at the >>>> moment. >>>> >>>> My point is just that generally we want to have a correctly working qemu >>>> after every single patch, and even more importantly after every series. >>>> As the migration series is separate from this, I don't think it's a good >>>> excuse for doing worse than we could easily do here. >>>> >>>> Kevin >>> With open/close all is already ok - bitmaps will not be saved because >>> of BDRV_O_INACTIVE and will not be loaded because of IN_USE. > If the contents of so-called persistent bitmaps is lost across > migration and stale after bdrv_invalidate_cache, that's not "all ok" in > my book. It's buggy. > >> in any case load/store happens outside of VM downtime. >> The target is running at the moment of close on source, >> the source is running at the moment of open on target. > Which makes the operation in open/close meaningless: open reads data > which may still by changed, and close cannot write to the image any more > because ownership is already given up. > > Anyway, all of this isn't really productive. Please, can't you just > answer that simple question that I asked you above: What problems would > you get if you used invalidate_cache/inactivate instead of open/close > for triggering these actions? > > If you can bring up some "this would break X", "it would be very hard to > implement Y correctly then" or "in scenario Z, this would have unwanted > effects", then we can figure out what to do. But not putting things in > the proper place just because you don't feel like it isn't a very strong > argument. > > Kevin This will increase the downtime ~0.5-1 second for migrated VM or will require very intrusive interface from migration code to here to avoid bitmap save for inactivation on that path.
No other arguments so far. Den