On 29.03.19 18:31, Vladimir Sementsov-Ogievskiy wrote: > 29.03.2019 20:15, Max Reitz wrote: >> On 29.03.19 12:04, Vladimir Sementsov-Ogievskiy wrote: >>> bdrv_replace_child() calls bdrv_check_perm() with error_abort on >>> loosening permissions. However file-locking operations may fail even >>> in this case, for example on NFS. And this leads to Qemu crash. >>> >>> Let's avoid such errors. Note, that we ignore such things anyway on >>> permission update commit and abort. >>> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >>> --- >>> block/file-posix.c | 12 ++++++++++++ >>> 1 file changed, 12 insertions(+) >>> >>> diff --git a/block/file-posix.c b/block/file-posix.c >>> index db4cccbe51..1cf4ee49eb 100644 >>> --- a/block/file-posix.c >>> +++ b/block/file-posix.c >>> @@ -815,6 +815,18 @@ static int raw_handle_perm_lock(BlockDriverState *bs, >>> >>> switch (op) { >>> case RAW_PL_PREPARE: >>> + if ((s->perm | new_perm) == s->perm && >>> + (s->shared_perm & new_shared) == s->shared_perm) >>> + { >>> + /* >>> + * We are going to unlock bytes, it should not fail. If it >>> fail due >>> + * to some fs-dependent permission-unrelated reasons (which >>> occurs >>> + * sometimes on NFS and leads to abort in bdrv_replace_child) >>> we >>> + * can't prevent such errors by any check here. And we ignore >>> them >>> + * anyway in ABORT and COMMIT. >>> + */ >>> + return 0; >>> + } >>> ret = raw_apply_lock_bytes(s, s->fd, s->perm | new_perm, >>> ~s->shared_perm | ~new_shared, >>> false, errp); >> >> Help me understand the exact issue, please. I understand that there are >> operations like bdrv_replace_child() that pass &error_abort to >> bdrv_check_perm() because they just loosen the permissions, so it should >> not fail. >> >> However, if the whole effect really would be to loosen permissions, >> raw_apply_lock_bytes() wouldn't have failed here in PREPARE anyway: >> @unlock is passed as false, so no bytes will be unlocked. And if >> permissions are just loosened (as your condition checks), it should not >> lock any bytes. > > but it does qemu_lock_fd unconditionally on any locked byte (it may be > loosening, > but not of all bytes).
It only locks bytes that aren't already locked ("!(locked_perm & bit)"). So if (s->perm | new_perm) & ~s->locked_perm == 0 (same for shared), it won't lock anything. That's the same as s->perm | new_perm == s->perm and s->perm == s->locked_perm. So if this patch prevents qemu_lock_fd() errors in raw_apply_lock_bytes(), s->perm must be different from s->locked_perm. > and fcntl is called, which may fail. Hm, yeah, and as Kevin said, raw_check_lock_bytes() if new_perm and new_shared are already covered (though this could be fixed by passing new_perm & ~s->perm and... new_shared | ~s->shared_perm? to raw_check_lock_bytes). It just seems to me that we sometimes do end up in a position where s->perm and s->locked_perm (or ~s->shared_perm and s->locked_shared_perm) do not align, which should not be happening: This would mean that the block layer thinks the permission locks are taken, but file-posix disagrees. Max >> So why does it attempt lock any bytes in the first place? There must be >> some discrepancy between s->perm and s->locked_perm, or ~s->shared_perm >> and s->locked_shared_perm. How does that occur? >> >> Max >> > >
signature.asc
Description: OpenPGP digital signature