10.09.2019 15:31, Maxim Levitsky wrote:
> On Sat, 2019-09-07 at 19:08 +0000, Vladimir Sementsov-Ogievskiy wrote:
>> 06.09.2019 22:57, Maxim Levitsky wrote:
>>> This commit tries to clarify few function arguments,
>>> and add comments describing the encrypt/decrypt interface
>>>
>>> Signed-off-by: Maxim Levitsky <mlevi...@redhat.com>
>>> ---
>>>    block/qcow2-cluster.c | 10 +++----
>>>    block/qcow2-threads.c | 61 ++++++++++++++++++++++++++++++++++---------
>>>    2 files changed, 53 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>>> index f09cc992af..1989b423da 100644
>>> --- a/block/qcow2-cluster.c
>>> +++ b/block/qcow2-cluster.c
>>> @@ -463,8 +463,8 @@ static int coroutine_fn 
>>> do_perform_cow_read(BlockDriverState *bs,
>>>    }
>>>    
>>>    static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
>>> -                                                uint64_t 
>>> src_cluster_offset,
>>> -                                                uint64_t cluster_offset,
>>> +                                                uint64_t 
>>> guest_cluster_offset,
>>> +                                                uint64_t 
>>> host_cluster_offset,
>>>                                                    unsigned 
>>> offset_in_cluster,
>>>                                                    uint8_t *buffer,
>>>                                                    unsigned bytes)
>>> @@ -474,8 +474,8 @@ static bool coroutine_fn 
>>> do_perform_cow_encrypt(BlockDriverState *bs,
>>>            assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
>>>            assert((bytes & ~BDRV_SECTOR_MASK) == 0);
>>>            assert(s->crypto);
>>> -        if (qcow2_co_encrypt(bs, cluster_offset,
>>> -                             src_cluster_offset + offset_in_cluster,
>>> +        if (qcow2_co_encrypt(bs, host_cluster_offset,
>>> +                             guest_cluster_offset + offset_in_cluster,
>>>                                 buffer, bytes) < 0) {
>>>                return false;
>>>            }
>>> @@ -496,7 +496,7 @@ static int coroutine_fn 
>>> do_perform_cow_write(BlockDriverState *bs,
>>>        }
>>>    
>>>        ret = qcow2_pre_write_overlap_check(bs, 0,
>>> -            cluster_offset + offset_in_cluster, qiov->size, true);
>>> +              cluster_offset + offset_in_cluster, qiov->size, true);
>>
>>
>> Hmm, unrelated hunk.
> 
> I was asked to do this to fix coding style, so that wrapped line,
> is 4 characters shifted to the right.

AFAIS, Eric asked only about qcow2_co_encdec calls and definition.. It's OK to 
fix style in code
you touch in you patch anyway, but no reason to fix style somewhere else, it 
dirties patch for
no reason, making it more difficult (a bit, but still) to review and more 
difficult to backport..

> 
>>
>>>        if (ret < 0) {
>>>            return ret;
>>>        }
>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
>>> index 3b1e63fe41..c3cda0c6a5 100644
>>> --- a/block/qcow2-threads.c
>>> +++ b/block/qcow2-threads.c
>>> @@ -234,15 +234,19 @@ static int qcow2_encdec_pool_func(void *opaque)
>>>    }
>>>    
>>>    static int coroutine_fn
>>> -qcow2_co_encdec(BlockDriverState *bs, uint64_t file_cluster_offset,
>>> -                  uint64_t offset, void *buf, size_t len, Qcow2EncDecFunc 
>>> func)
>>> +qcow2_co_encdec(BlockDriverState *bs, uint64_t host_cluster_offset,
>>> +                uint64_t guest_offset, void *buf, size_t len,
>>> +                Qcow2EncDecFunc func)
>>>    {
>>>        BDRVQcow2State *s = bs->opaque;
>>> +
>>> +    uint64_t offset = s->crypt_physical_offset ?
>>> +        host_cluster_offset + offset_into_cluster(s, guest_offset) :
>>> +        guest_offset;
>>> +
>>>        Qcow2EncDecData arg = {
>>>            .block = s->crypto,
>>> -        .offset = s->crypt_physical_offset ?
>>> -                      file_cluster_offset + offset_into_cluster(s, offset) 
>>> :
>>> -                      offset,
>>> +        .offset = offset,
>>>            .buf = buf,
>>>            .len = len,
>>>            .func = func,
>>> @@ -251,18 +255,49 @@ qcow2_co_encdec(BlockDriverState *bs, uint64_t 
>>> file_cluster_offset,
>>>        return qcow2_co_process(bs, qcow2_encdec_pool_func, &arg);
>>>    }
>>>    
>>> +
>>> +/*
>>> + * qcow2_co_encrypt()
>>> + *
>>> + * Encrypts one or more contiguous aligned sectors
>>> + *
>>> + * @host_cluster_offset - on disk offset of the first cluster in which
>>> + * the encrypted data will be written
>>
>>
>> It's not quite right, it's not on disk, but on .file child of qcow2 node, 
>> which
>> may be any other format or protocol node.. So, I called it 
>> file_cluster_offset.
>> But I'm OK with new naming anyway. And it may be better for encryption 
>> related
>> logic..
> 
> Yes, the .file is the underlying storage for both qcow2 metadata and the data,
> and it is unlikely be another qcow2 file. Usually it will be a raw file,
> accessed with some protocol.
> I will change the wording to not include the 'disk' word though.
> 
> 
> To be really honest, the best naming here would be one that follows the 
> virtual memory concepts.
> A virtual block/cluster address and a physical block/cluster address.
> However we talked with Kevin recently and I also studied quite a lot of qcow2 
> code,
> and the usual convention is guest cluster offset and host cluster offset,
> and often guest offsets are just called offsets, which is very confusing IMHO.

It don't confuse me, as it's an interface of block-layer. Interface user just 
writes at some offset.
And the fact that internally, we consider interface parameter "offset" as 
"guest offset" and map it
to some physical offset - it's an implementation details. In other words, guest 
don't know that it
writes at "guest offset", it just writes at "offset" and don't care.

> 
> 
>>
>>   > + * Used as an initialization vector for encryption
>>
>> Hmm, is it default now?
> 
> 
> Most of block crypto implementations have IV which derive
> some way or another from the sector address.
> 
>  From what I see, the block address is either used as is,
> or encrypted itself with same encryption key,
> and the result is used as IV. Even the legacy qcow
> encryption uses this, although it uses the virtual block
> address, and apparently this is one of its security flaws.
> 
> If you don't use any IV, you end up with major security
> hole - sectors of the same content will be encrypted
> to the same cipertext.
> 
> I added this comment to clarify the usage of offset,
> since other that this aspect of IV generation,
> the crypto routines only need the data to be encrypted and
> the encryption key which is stored in the crypto context.
> 
> 
>>
>>> + *
>>> + * @guest_offset - guest (virtual) offset of the first sector of the
>>> + * data to be encrypted
>>
>> Hmm, stop. It's wrong. Data to be encrypted is in buffer, so, it's not first 
>> sector of
>> the data to be encrypted, but first sector in which guest writes data (to be 
>> encrypted
>> in meantime).
> No no no!
> 
> guest_offset is literally the guest's disk address of the first sector that 
> is in the buffer.
> The qcow2_co_encrypt is called from 2 places:
> 
> 1. qcow2_co_pwritev_part - here indeed the actually guest written data
> is encrypted, and the 'offset' is passed which is the offset on which pwritev 
> was called

I mean, that in this case your comment for me sounds like "data to be encrypted 
is on disk, at this offset".
but it's not actually on disk, data is in buffer. And what is on disk we don't 
know, we are going to
write...

But I don't really care. Hope it's actually obvious for averyone, where is data 
on write path.

> 
> 
> 2. do_perform_cow_encrypt - here the just read data from before or after 
> actually written guest
> data is encrypted, and the guest_offset represents the address of that data.
> I changed the do_perform_cow_encrypt, so that it receives from the caller the 
> host and guest offset
> of the data to be encrypted. It then aligns the host offset on start of the 
> cluster, and passes
> the guest offset as is, so that it does the same as qcow2_co_pwritev_part.
> 
> 
> So what is wrong here?
> 
>>   
>>
>>> + * Used as an initialization vector for older, qcow2 native encryption
>>> + *
>>> + * @buf - buffer with the data to encrypt
>>> + * @len - length of the buffer (in sector size multiplies)
>>> + *
>>> + * Note that the group of the sectors, don't have to be aligned
>>> + * on cluster boundary and can also cross a cluster boundary.
>>
>> And I doubt in it now. I'm afraid that if we call qcow2_co_encrypt for a 
>> group
>> of the sectors crossing a cluster boundary, we will finish up with similar 
>> bug: we'll
>> use first cluster offset as a vector for all the sectors. We still never do 
>> it.. So,
>> I think it worth assertion and corresponding comment.
>>
>> Or is it correct?
> 
> Crypto code receives the data to be encrypted and decrypted,
> and offset of first sector (512 bytes aligned) of that data (for IV 
> calculation).
> If the data spans multiple sectors (this happens already a lot)

Ah, yes, that's OK, sorry.

> then it able to handle this since the code works.
> The relevant code is in do_qcrypto_block_cipher_encdec, and
> is actually generic for all the crypto modes.
> 
> So there should not be anything special about crossing the cluster
> boundary, other that noting this here, just in case.
> 
> No only that code can cross a cluster boundary, but it can even be
> called for more that one cluster at once. It looks like qcow2 code limits all
> the IO in case crypto is used to QCOW_MAX_CRYPT_CLUSTERS, which is 32
> currently. I don't know why to be honest.
> 
> Remember that crypto code has no notion of clusters. It works purely
> on sector (512 bytes) level, and each sector will have its own IV calculated,
> based on its sector address.
> 
>>
>>> + *
>>> + *
>>> + */
>>>    int coroutine_fn
>>> -qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
>>> -                 uint64_t offset, void *buf, size_t len)
>>> +qcow2_co_encrypt(BlockDriverState *bs, uint64_t host_cluster_offset,
>>> +                 uint64_t guest_offset, void *buf, size_t len)
>>>    {
>>> -    return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
>>> -                             qcrypto_block_encrypt);
>>> +    return qcow2_co_encdec(bs, host_cluster_offset, guest_offset, buf, len,
>>> +                           qcrypto_block_encrypt);
>>>    }
>>>    
>>> +
>>> +/*
>>> + * qcow2_co_decrypt()
>>> + *
>>> + * Decrypts one or more contiguous aligned sectors
>>> + * Same function as qcow2_co_encrypt
>>
>> Hmm, not exactly same :)
> I'll fix that.
> 
> 
>>
>>> + *
>>> + */
>>> +
>>>    int coroutine_fn
>>> -qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
>>> -                 uint64_t offset, void *buf, size_t len)
>>> +qcow2_co_decrypt(BlockDriverState *bs, uint64_t host_cluster_offset,
>>> +                 uint64_t guest_offset, void *buf, size_t len)
>>>    {
>>> -    return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
>>> -                             qcrypto_block_decrypt);
>>> +    return qcow2_co_encdec(bs, host_cluster_offset, guest_offset, buf, len,
>>> +                           qcrypto_block_decrypt);
>>>    }
>>>
>>
> 
> 
> Best regards,
>       Thanks for the review,
>               Maxim Levitsky
> 
>>
>> -- 
>> Best regards,
>> Vladimir
> 
> 


-- 
Best regards,
Vladimir

Reply via email to