29.03.2019 20:58, Vladimir Sementsov-Ogievskiy wrote: > 29.03.2019 20:44, Max Reitz wrote: >> On 29.03.19 18:40, Kevin Wolf wrote: >>> Am 29.03.2019 um 18:30 hat Max Reitz geschrieben: >>>> On 29.03.19 18:24, Kevin Wolf wrote: >>>>> Am 29.03.2019 um 18:15 hat Max Reitz geschrieben: >>>>>> On 29.03.19 12:04, Vladimir Sementsov-Ogievskiy wrote: >>>>>>> bdrv_replace_child() calls bdrv_check_perm() with error_abort on >>>>>>> loosening permissions. However file-locking operations may fail even >>>>>>> in this case, for example on NFS. And this leads to Qemu crash. >>>>>>> >>>>>>> Let's avoid such errors. Note, that we ignore such things anyway on >>>>>>> permission update commit and abort. >>>>>>> >>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]> >>>>>>> --- >>>>>>> block/file-posix.c | 12 ++++++++++++ >>>>>>> 1 file changed, 12 insertions(+) >>>>>>> >>>>>>> diff --git a/block/file-posix.c b/block/file-posix.c >>>>>>> index db4cccbe51..1cf4ee49eb 100644 >>>>>>> --- a/block/file-posix.c >>>>>>> +++ b/block/file-posix.c >>>>>>> @@ -815,6 +815,18 @@ static int raw_handle_perm_lock(BlockDriverState >>>>>>> *bs, >>>>>>> switch (op) { >>>>>>> case RAW_PL_PREPARE: >>>>>>> + if ((s->perm | new_perm) == s->perm && >>>>>>> + (s->shared_perm & new_shared) == s->shared_perm) >>>>>>> + { >>>>>>> + /* >>>>>>> + * We are going to unlock bytes, it should not fail. If it >>>>>>> fail due >>>>>>> + * to some fs-dependent permission-unrelated reasons >>>>>>> (which occurs >>>>>>> + * sometimes on NFS and leads to abort in >>>>>>> bdrv_replace_child) we >>>>>>> + * can't prevent such errors by any check here. And we >>>>>>> ignore them >>>>>>> + * anyway in ABORT and COMMIT. >>>>>>> + */ >>>>>>> + return 0; >>>>>>> + } >>>>>>> ret = raw_apply_lock_bytes(s, s->fd, s->perm | new_perm, >>>>>>> ~s->shared_perm | ~new_shared, >>>>>>> false, errp); >>>>>> >>>>>> Help me understand the exact issue, please. I understand that there are >>>>>> operations like bdrv_replace_child() that pass &error_abort to >>>>>> bdrv_check_perm() because they just loosen the permissions, so it should >>>>>> not fail. >>>>>> >>>>>> However, if the whole effect really would be to loosen permissions, >>>>>> raw_apply_lock_bytes() wouldn't have failed here in PREPARE anyway: >>>>>> @unlock is passed as false, so no bytes will be unlocked. And if >>>>>> permissions are just loosened (as your condition checks), it should not >>>>>> lock any bytes. >>>>>> >>>>>> So why does it attempt lock any bytes in the first place? There must be >>>>>> some discrepancy between s->perm and s->locked_perm, or ~s->shared_perm >>>>>> and s->locked_shared_perm. How does that occur? >>>>> >>>>> I suppose raw_check_lock_bytes() is what is failing, not >>>>> raw_apply_lock_bytes(). >>>> >>>> Hm, maybe in Vladimir's case, but not in e.g. >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1652572 . >>> >>> This is reported against 3.0, which didn't avoid re-locking permissions >>> that we already hold, so there raw_apply_lock_bytes() can still fail. >> >> That makes sense. Which leaves the question why Vladimir still seems to >> see the error there...? >> > > I'm sorry :(. I'm trying to fix bug based on 2.10, and now I see that is > already fixed > upstream. I don't have a reproducer, only old coredumps. > > So, now it looks like we don't need this patch, as on permission loosening > file-posix > don't call any FS apis, yes? >
Ah, you mentioned, that raw_check_lock_bytes is still buggy. -- Best regards, Vladimir
