On 01/31/2017 05:28 PM, Max Reitz wrote: > On 31.01.2017 13:23, John Snow wrote: >> >> >> On 01/23/2017 11:29 AM, Max Reitz wrote: > > [...] > >>> The drawbacks with this approach would be the following: >>> (1) Is printing a warning enough to make the user shut down the VM as >>> fast as possible and run qemu-img check? >> >> No. >> >>> (2) It is legal to have a greater refcount than the number of internal >>> snapshots plus one. qemu never produces such images, though (or does >>> it?). Could there be existing images where users will be just annoyed by >>> such warnings? Should we add a runtime option to disable them? >>> >> >> I'd assume it's not legal? If you have a refcount > 1 and delete the >> last reference, wouldn't that refcount be then incorrect...? >> >> Or I guess maybe an external tool may be playing games and using it as a >> "sticky" bit of sorts? > > As I said to Berto, it's perfectly legal because it's not forbidden and > it could be useful for data deduplication. >
Wound up seeing that later. I still wonder if that's completely accurate, but... oh well. Not too important. > I don't think that's something anyone does currently, though, so we > could disallow it as I have described. Question is, would it be worth it? > >>> And of course another approach I already mentioned would be to scrap the >>> overlap checks altogether once we have image locking (and I guess we can >>> keep them around in their current form at least until then). >>> >>> Max >>> >> >> I think it's worth having them. > > Fair enough. On one hand I'm biased towards keeping them because I've > written them. On the other hand, I just really don't like all of the > *pre_write_overlap_check calls... > >> Perhaps if you use image locking they >> can be disabled by default, but I wouldn't really advocate for removing >> them. > > If we disable them by default, nobody will use them. We shouldn't have > dead code. > Only if you use the image locking, which is -- oh, oh. Yeah, you do have a point. > We could think about disabling them if the image is locked and enabling > them if it is not locked. This way, they would still get at least some > usage... > >> I think what you are pointing out with refcount blocks not being >> essential to protect is probably pretty sufficient as an optimization, >> too. Or I guess we can merge your ~real~ series to help optimize things. > > Or Berto's idea works out to be good enough. :-) > Follow your dreams! >> I know in the little qcheck utility I wrote I use RBtrees of ranges that >> get merged together; the tool doesn't really care which kinds of >> metadata it is, it just knows that it has "metadata" that should be >> protected. I think it's fast enough. If the qcow2 is kept fairly >> defragmented, it's REALLY fast. IIRC, your patchset has something >> similar, so it should be fast enough too. > > Nothing fancy. It's based on a bitmap of metadata information (more like > a nibble-map, though) that is RLE compressed for areas not currently in > use. Simple, algorithmically speaking, but makes it harder to calculate > the runtime complexity than using a "real" structure, I admit. > > Max > It's a real structure! It's just got some properties that are difficult to analyze :) You just need to consider a load factor alpha and some kind of probabilistic distribution describing sector usage for the bitmap, right? You could probably at least pick a few common cases and calculate that way. Or don't. :) --js
