Re: [Qemu-devel] [PATCH v5 07/10] qcow2: qcow2_co_preadv: improve locking

2019-05-03 Thread Max Reitz
On 30.04.19 11:44, Vladimir Sementsov-Ogievskiy wrote:
> 30.04.2019 11:38, Vladimir Sementsov-Ogievskiy wrote:
>> 29.04.2019 19:37, Max Reitz wrote:
>>> On 02.04.19 17:37, Vladimir Sementsov-Ogievskiy wrote:
 Background: decryption will be done in threads, to take benefit of it,
 we should move it out of the lock first.
>>>
>>> ...which is safe after your commit c972fa123c73501b4, I presume.
>>>
>>> (At first glance, the patched looked a bit weird to me because it
>>> doesn't give a reason why dropping the lock around
>>> qcrypto_block_decrypt() would be OK.)
>>>
 But let's go further: it turns out, that for locking around switch
 cases we have only two variants: when we just do memset(0) not
 releasing the lock (it is useless) and when we actually can handle the
 whole case out of the lock. So, refactor the whole thing to reduce
 locked code region and make it clean.

 Signed-off-by: Vladimir Sementsov-Ogievskiy 
 Reviewed-by: Alberto Garcia 
 ---
   block/qcow2.c | 46 ++
   1 file changed, 22 insertions(+), 24 deletions(-)

 diff --git a/block/qcow2.c b/block/qcow2.c
 index 46e8e39da5..fcf92a7eb6 100644
 --- a/block/qcow2.c
 +++ b/block/qcow2.c
 @@ -1983,6 +1983,7 @@ static coroutine_fn int 
 qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
   ret = qcow2_get_cluster_offset(bs, offset, _bytes, 
 _offset);
>>>
>>> Isn't this the only function in the loop that actually needs the lock?
>>> Wouldn't it make more sense to just take it around this call?
>>>
>>
>> Hmm, looks correct, I'll resend.
>>
>>
> 
> Or not, actually, we may have several qcow2_get_data_offset calls under one 
> lock,
> if clusters are different kinds of ZERO. So, I think better to keep it as is 
> for now.

Hm, but how is this relevant?  For one thing, if that was a problem if
some other party concurrently changes the image, then that would be a
problem in general.  Keeping the lock would hide it for different kinds
of read-as-zero clusters, but it would still appear if data clusters and
other clusters are interleaved, wouldn’t it?

Also, this is a coroutine.  As long as nothing yields, nothing gets
concurrent access.  I don’t see anything outside of
qcow2_get_cluster_offset() that could yield as long as we only see
read-as-zero clusters.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 07/10] qcow2: qcow2_co_preadv: improve locking

2019-04-30 Thread Vladimir Sementsov-Ogievskiy
30.04.2019 11:38, Vladimir Sementsov-Ogievskiy wrote:
> 29.04.2019 19:37, Max Reitz wrote:
>> On 02.04.19 17:37, Vladimir Sementsov-Ogievskiy wrote:
>>> Background: decryption will be done in threads, to take benefit of it,
>>> we should move it out of the lock first.
>>
>> ...which is safe after your commit c972fa123c73501b4, I presume.
>>
>> (At first glance, the patched looked a bit weird to me because it
>> doesn't give a reason why dropping the lock around
>> qcrypto_block_decrypt() would be OK.)
>>
>>> But let's go further: it turns out, that for locking around switch
>>> cases we have only two variants: when we just do memset(0) not
>>> releasing the lock (it is useless) and when we actually can handle the
>>> whole case out of the lock. So, refactor the whole thing to reduce
>>> locked code region and make it clean.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> Reviewed-by: Alberto Garcia 
>>> ---
>>>   block/qcow2.c | 46 ++
>>>   1 file changed, 22 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>> index 46e8e39da5..fcf92a7eb6 100644
>>> --- a/block/qcow2.c
>>> +++ b/block/qcow2.c
>>> @@ -1983,6 +1983,7 @@ static coroutine_fn int 
>>> qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
>>>   ret = qcow2_get_cluster_offset(bs, offset, _bytes, 
>>> _offset);
>>
>> Isn't this the only function in the loop that actually needs the lock?
>> Wouldn't it make more sense to just take it around this call?
>>
> 
> Hmm, looks correct, I'll resend.
> 
> 

Or not, actually, we may have several qcow2_get_data_offset calls under one 
lock,
if clusters are different kinds of ZERO. So, I think better to keep it as is 
for now.



-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH v5 07/10] qcow2: qcow2_co_preadv: improve locking

2019-04-30 Thread Vladimir Sementsov-Ogievskiy
29.04.2019 19:37, Max Reitz wrote:
> On 02.04.19 17:37, Vladimir Sementsov-Ogievskiy wrote:
>> Background: decryption will be done in threads, to take benefit of it,
>> we should move it out of the lock first.
> 
> ...which is safe after your commit c972fa123c73501b4, I presume.
> 
> (At first glance, the patched looked a bit weird to me because it
> doesn't give a reason why dropping the lock around
> qcrypto_block_decrypt() would be OK.)
> 
>> But let's go further: it turns out, that for locking around switch
>> cases we have only two variants: when we just do memset(0) not
>> releasing the lock (it is useless) and when we actually can handle the
>> whole case out of the lock. So, refactor the whole thing to reduce
>> locked code region and make it clean.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> Reviewed-by: Alberto Garcia 
>> ---
>>   block/qcow2.c | 46 ++
>>   1 file changed, 22 insertions(+), 24 deletions(-)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index 46e8e39da5..fcf92a7eb6 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -1983,6 +1983,7 @@ static coroutine_fn int 
>> qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
>>   
>>   ret = qcow2_get_cluster_offset(bs, offset, _bytes, 
>> _offset);
> 
> Isn't this the only function in the loop that actually needs the lock?
> Wouldn't it make more sense to just take it around this call?
> 

Hmm, looks correct, I'll resend.



-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH v5 07/10] qcow2: qcow2_co_preadv: improve locking

2019-04-29 Thread Max Reitz
On 29.04.19 18:37, Max Reitz wrote:
> On 02.04.19 17:37, Vladimir Sementsov-Ogievskiy wrote:
>> Background: decryption will be done in threads, to take benefit of it,
>> we should move it out of the lock first.
> 
> ...which is safe after your commit c972fa123c73501b4, I presume.
> 
> (At first glance, the patched looked a bit weird to me because it
> doesn't give a reason why dropping the lock around
> qcrypto_block_decrypt() would be OK.)

On second thought, I guess the actual reason it's safe is because the
crypto code never yields.

Max

>> But let's go further: it turns out, that for locking around switch
>> cases we have only two variants: when we just do memset(0) not
>> releasing the lock (it is useless) and when we actually can handle the
>> whole case out of the lock. So, refactor the whole thing to reduce
>> locked code region and make it clean.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> Reviewed-by: Alberto Garcia 
>> ---
>>  block/qcow2.c | 46 ++
>>  1 file changed, 22 insertions(+), 24 deletions(-)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index 46e8e39da5..fcf92a7eb6 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -1983,6 +1983,7 @@ static coroutine_fn int 
>> qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
>>  
>>  ret = qcow2_get_cluster_offset(bs, offset, _bytes, 
>> _offset);
> 
> Isn't this the only function in the loop that actually needs the lock?
> Wouldn't it make more sense to just take it around this call?
> 
> Max
> 
>>  if (ret < 0) {
>> +qemu_co_mutex_unlock(>lock);
>>  goto fail;
>>  }
>>  
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 07/10] qcow2: qcow2_co_preadv: improve locking

2019-04-29 Thread Max Reitz
On 02.04.19 17:37, Vladimir Sementsov-Ogievskiy wrote:
> Background: decryption will be done in threads, to take benefit of it,
> we should move it out of the lock first.

...which is safe after your commit c972fa123c73501b4, I presume.

(At first glance, the patched looked a bit weird to me because it
doesn't give a reason why dropping the lock around
qcrypto_block_decrypt() would be OK.)

> But let's go further: it turns out, that for locking around switch
> cases we have only two variants: when we just do memset(0) not
> releasing the lock (it is useless) and when we actually can handle the
> whole case out of the lock. So, refactor the whole thing to reduce
> locked code region and make it clean.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Alberto Garcia 
> ---
>  block/qcow2.c | 46 ++
>  1 file changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 46e8e39da5..fcf92a7eb6 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1983,6 +1983,7 @@ static coroutine_fn int 
> qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
>  
>  ret = qcow2_get_cluster_offset(bs, offset, _bytes, 
> _offset);

Isn't this the only function in the loop that actually needs the lock?
Wouldn't it make more sense to just take it around this call?

Max

>  if (ret < 0) {
> +qemu_co_mutex_unlock(>lock);
>  goto fail;
>  }
>  



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v5 07/10] qcow2: qcow2_co_preadv: improve locking

2019-04-02 Thread Vladimir Sementsov-Ogievskiy
Background: decryption will be done in threads, to take benefit of it,
we should move it out of the lock first.

But let's go further: it turns out, that for locking around switch
cases we have only two variants: when we just do memset(0) not
releasing the lock (it is useless) and when we actually can handle the
whole case out of the lock. So, refactor the whole thing to reduce
locked code region and make it clean.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 46 ++
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 46e8e39da5..fcf92a7eb6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1983,6 +1983,7 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState 
*bs, uint64_t offset,
 
 ret = qcow2_get_cluster_offset(bs, offset, _bytes, 
_offset);
 if (ret < 0) {
+qemu_co_mutex_unlock(>lock);
 goto fail;
 }
 
@@ -1991,39 +1992,36 @@ static coroutine_fn int 
qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
 qemu_iovec_reset(_qiov);
 qemu_iovec_concat(_qiov, qiov, bytes_done, cur_bytes);
 
+if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
+ret == QCOW2_CLUSTER_ZERO_ALLOC ||
+(ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
+{
+qemu_iovec_memset(_qiov, 0, 0, cur_bytes);
+
+bytes -= cur_bytes;
+offset += cur_bytes;
+bytes_done += cur_bytes;
+continue;
+}
+
+qemu_co_mutex_unlock(>lock);
+
 switch (ret) {
 case QCOW2_CLUSTER_UNALLOCATED:
-
-if (bs->backing) {
-BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-qemu_co_mutex_unlock(>lock);
-ret = bdrv_co_preadv(bs->backing, offset, cur_bytes,
- _qiov, 0);
-qemu_co_mutex_lock(>lock);
-if (ret < 0) {
-goto fail;
-}
-} else {
-/* Note: in this case, no need to wait */
-qemu_iovec_memset(_qiov, 0, 0, cur_bytes);
+BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
+ret = bdrv_co_preadv(bs->backing, offset, cur_bytes, _qiov, 0);
+if (ret < 0) {
+goto fail;
 }
 break;
 
-case QCOW2_CLUSTER_ZERO_PLAIN:
-case QCOW2_CLUSTER_ZERO_ALLOC:
-qemu_iovec_memset(_qiov, 0, 0, cur_bytes);
-break;
-
 case QCOW2_CLUSTER_COMPRESSED:
-qemu_co_mutex_unlock(>lock);
 ret = qcow2_co_preadv_compressed(bs, cluster_offset,
  offset, cur_bytes,
  _qiov);
-qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 goto fail;
 }
-
 break;
 
 case QCOW2_CLUSTER_NORMAL:
@@ -2056,11 +2054,9 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState 
*bs, uint64_t offset,
 }
 
 BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-qemu_co_mutex_unlock(>lock);
 ret = bdrv_co_preadv(s->data_file,
  cluster_offset + offset_in_cluster,
  cur_bytes, _qiov, 0);
-qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 goto fail;
 }
@@ -2091,12 +2087,14 @@ static coroutine_fn int 
qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
 bytes -= cur_bytes;
 offset += cur_bytes;
 bytes_done += cur_bytes;
+
+qemu_co_mutex_lock(>lock);
 }
 ret = 0;
 
-fail:
 qemu_co_mutex_unlock(>lock);
 
+fail:
 qemu_iovec_destroy(_qiov);
 qemu_vfree(cluster_data);
 
-- 
2.18.0