On 02/17/2017 03:48 PM, Kevin Wolf wrote: > Am 17.02.2017 um 13:40 hat Vladimir Sementsov-Ogievskiy geschrieben: >> 17.02.2017 15:09, Kevin Wolf wrote: >>> Am 17.02.2017 um 12:46 hat Vladimir Sementsov-Ogievskiy geschrieben: >>>> 16.02.2017 14:49, Kevin Wolf wrote: >>>>> Am 16.02.2017 um 12:25 hat Kevin Wolf geschrieben: >>>>>> Am 15.02.2017 um 11:10 hat Vladimir Sementsov-Ogievskiy geschrieben: >>>>>>> Auto loading bitmaps are bitmaps stored in the disk image, which should >>>>>>> be loaded when the image is opened and become BdrvDirtyBitmaps for the >>>>>>> corresponding drive. >>>>>>> >>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >>>>>>> Reviewed-by: John Snow <js...@redhat.com> >>>>>>> Reviewed-by: Max Reitz <mre...@redhat.com> >>>>>> Why do we need a new BlockDriver callback and special code for it in >>>>>> bdrv_open_common()? The callback is only ever called immediately after >>>>>> .bdrv_open/.bdrv_file_open, so can't the drivers just do this internally >>>>>> in their .bdrv_open implementation? Even more so because qcow2 is the >>>>>> only driver that supports this callback. >>>>> Actually, don't we have to call this in qcow2_invalidate_cache()? >>>>> Currently, I think, after a migration, the autoload bitmaps aren't >>>>> loaded. >>>>> >>>>> By moving the qcow2_load_autoloading_dirty_bitmaps() call to >>>>> qcow2_open(), this would be fixed. >>>>> >>>>> Kevin >>>> Bitmap should not be reloaded on any intermediate qcow2-open's, >>>> reopens, etc. It should be loaded once, on bdrv_open, to not create >>>> extra collisions (between in-memory bitmap and it's stored version). >>>> That was the idea. >>>> >>>> For bitmaps migration there are separate series, we shouldn't load >>>> bitmap from file on migration, as it's version in the file is >>>> outdated. >>> That's not what your series is doing, though. It loads the bitmaps when >> Actually, they will not be loaded as they will have IN_USE flag. >> >>> migration starts and doesn't reload then when migration completes, even >>> though they are stale. Migration with shared storage would just work >>> without an extra series if you did these things in the correct places. >>> >>> As a reminder, this is how migration with shared storage works (or >>> should work with your series): >>> >>> 1. Start destination qemu instance. This calls bdrv_open() with >>> BDRV_O_INACTIVE. We can read in some metadata, though we don't need >>> much more than the image size at this point. Writing to the image is >>> still impossible. >>> >>> 2. Start migration on the source, while the VM is still writing to the >>> image, rendering the cached metadata from step 1 stale. >>> >>> 3. Migration completes: >>> >>> a. Stop the VM >>> >>> b. Inactivate all images in the source qemu. This is where all >>> metadata needs to be written back to the image file, including >>> bitmaps. No writes to the image are possible after this point >>> because BDRV_O_INACTIVE is set. >>> >>> c. Invalidate the caches in the destination qemu, i.e. reload >>> everything from the file that could have changed since step 1, >>> including bitmaps. BDRV_O_INACTIVE is cleared, making the image >>> ready for writes. >>> >>> d. Resume the VM on the destination >>> >>> 4. Exit the source qemu process, which involves bdrv_close(). Note that >>> at this point, no writing to the image file is possible any more, >>> it's the destination qemu process that own the image file now. >>> >>> Your series loads and stores bitmaps in steps 1 and 4. This means that >> Actually - not. in 1 bitmaps are "in use", in 4 INACTIVE is set (and >> it is checked), nothing is stored. >> >>> they are stale on the destination when migration completes, and that >>> bdrv_close() wants to write to an image file that it doesn't own any >>> more, which will cause an assertion failure. If you instead move things >>> to steps 3b and 3c, it will just work. >> Hmm, I understand the idea.. But this will interfere with postcopy >> bitmap migration. So if we really need this, there should be some >> additional control flags or capabilities.. The problem of your >> approach is that bitmap actually migrated in the short state when >> source and destination are stopped, it may take time, as bitmaps may >> be large. > You can always add optimisations, but this is the basic lifecycle > process of block devices in qemu, so it would be good to adhere to it. > So far there are no other pieces of information that are ignored in > bdrv_invalidate()/bdrv_inactivate() and instead only handled in > bdrv_open()/bdrv_close(). It's a matter of consistency, too. > > And not having to add special cases for specific features in the generic > bdrv_open()/close() paths is a big plus for me anyway. > > Kevin But for sure this is bad from the downtime point of view. On migrate you will have to write to the image and re-read it again on the target. This would be very slow. This will not help for the migration with non-shared disk too.
That is why we have specifically worked in a migration, which for a good does not influence downtime at all now. With a write we are issuing several write requests + sync. Our measurements shows that bdrv_drain could take around a second on an averagely loaded conventional system, which seems unacceptable addition to me. Den