On 2018-02-09 14:35, Alberto Garcia wrote: > On Fri 09 Feb 2018 02:04:06 PM CET, Max Reitz wrote: >> I don't like completely refusing to open a qcow2 image if one of the >> snapshots is invalid, without giving the user any way of fixing it. >> >> With this patch, the final two images created in 080 cannot be opened >> at all (not even with qemu-img info). Without it, you can, you just >> can't load the snapshots. (Well, in one case. In the other, you can >> even do that, but that's a bug, as you correctly claim.) >> >> More importantly, though, without this patch, you can delete the >> snapshots. qemu-img will complain a bit, and you'll have leaks >> afterwards, but that's nothing qemu-img check can't fix. With this >> patch, you can't, because the image cannot be opened at all so it's >> basically gone for good (unless you get a hex editor). > > What you're saying is correct. The alternative however is to add checks > everywhere the snapshot table is used.
Yes. > There's for example qcow2_check_metadata_overlap() that will happily > allocate a buffer of (l1_size * 8) bytes, that can be up to 32 GB. It will only do that with QCOW2_OL_INACTIVE_L2, though, which is not enabled by default and will cost a ton of performance anyway, when used. (And by the way, I'd be OK with dropping that branch. Or with revisiting my good old series that strived to make the overlap checks faster in general.) > qcow2_expand_zero_clusters() has the same problem but there g_realloc() > is used, aborting the process on failure (we should actually use > g_try_realloc() instead, I'll leave that for a different patch). And this is only used for qemu-img amend from v3 to v2, offline. While aborts are bad, in this case they're not even that bad (because this is not a common operation at all, and because it's fully offline). And I'm still not sure how this patch would make anything better. Without it, qemu-img amend aborts. Yes, too bad. So you can still do qemu-img convert, losing all snapshots, but at least you got something. (Or we just fix qemu-img amend, that tells you there's something bad going on, and then you delete the snapshot in question.) With this patch, your whole data is gone because qemu now refuses to look at anything in the image. > If we want to allow opening images with corrupted snapshots perhaps we > could still do the checks in qcow2_read_snapshots() but instead of > returning an error add a 'corrupted' flag to QCowSnapshot. If you think that's simpler than having the proper checks everywhere we access a snapshot, be my guest. It does sound reasonable and it might be nicer for users than having to figure out which snapshot to delete themselves. > That solution is not without problems because we'd still need to check > for that flag everytime the snapshot is used, and that's calling for > errors in the future. Well, errors if your file is corrupted. Not a good thing but I still prefer it over rejecting files altogether because some non-vital data structure is corrupted, with no chance of allowing the user to fix the issue. An intermediate thing would be to mark the whole image corrupt if some snapshot is broken and then adding functionality to qemu-img check to fix that (by deleting the snapshot, if the user has agreed to that). But whatever we do, there needs to be a way for the user to get all non-corrupted data off the image. Max
Description: OpenPGP digital signature