Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-18 Thread Alberto Garcia
On Wed 17 Jan 2018 05:57:32 PM CET, Anton Nefedov wrote:
> On 17/1/2018 6:42 PM, Alberto Garcia wrote:
>> On Tue 16 Jan 2018 05:52:36 PM CET, Anton Nefedov wrote:
>>>
>>> Cosmetic: maybe this l2_load() better be merged with the copied L2 case?
>>> e.g.
>>>
>>> if (!(l1_table[l1_index] & COPIED))
>>>   l2_allocate();
>>> l2_load();
>> 
>> I'm not sure about that, since there's also the qcow2_free_clusters()
>> call afterwards, so the final code might be harder to follow.
>
> actually shouldn't qcow2_free_clusters() be done before l2_load()?
> Otherwise we bail out on error and leak space.

Good catch, I think that you're right. I'll fix that.

Berto



Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-17 Thread Anton Nefedov



On 17/1/2018 6:42 PM, Alberto Garcia wrote:
> On Tue 16 Jan 2018 05:52:36 PM CET, Anton Nefedov wrote:
>>
>> Cosmetic: maybe this l2_load() better be merged with the copied L2 case?
>> e.g.
>>
>> if (!(l1_table[l1_index] & COPIED))
>>   l2_allocate();
>> l2_load();
>
> I'm not sure about that, since there's also the qcow2_free_clusters()
> call afterwards, so the final code might be harder to follow.
>
> Berto



actually shouldn't qcow2_free_clusters() be done before l2_load()?
Otherwise we bail out on error and leak space.

/Anton



Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-17 Thread Alberto Garcia
On Wed 17 Jan 2018 05:06:04 PM CET, Eric Blake wrote:
>>>>  /* allocate a new entry in the l2 cache */
>>>>
>>>> +slice_size = s->l2_slice_size * sizeof(uint64_t);
>>>
>>> Would this read any better if the earlier patch named it
>>> s->l2_slice_entries?
>> 
>> I had doubts with this. Like you, when I see size I tend to think about
>> bytes. However both s->l1_size and s->l2_size indicate entries, and the
>> documentation of the qcow2 format even describes the header field like
>> this:
>> 
>>  36 - 39:   l1_size
>> Number of entries in the active L1 table
>
> We're free to rename the field in the qcow2 format specification if it
> makes things easier to understand.  If l1_entries reads better than
> l1_size, maybe it's worth doing.

In my opinion it reads much better. We mix both kinds of meanings in the
BDRVQcow2State structure already, there's for example cluster_size
(bytes), and refcount_table_size (entries). With the latter we actually
have the exact same problem we're discussing here:

refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
s->refcount_table = g_try_malloc(refcount_table_size2);

So yeah, we can change those variables but it won't be a small patch. I
can do that if everyone thinks that it's a good idea.

Berto



Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-17 Thread Alberto Garcia
On Tue 16 Jan 2018 11:26:40 PM CET, Eric Blake wrote:
>>  /* allocate a new entry in the l2 cache */
>>  
>> +slice_size = s->l2_slice_size * sizeof(uint64_t);
>
> Would this read any better if the earlier patch named it
> s->l2_slice_entries?

I had doubts with this. Like you, when I see size I tend to think about
bytes. However both s->l1_size and s->l2_size indicate entries, and the
documentation of the qcow2 format even describes the header field like
this:

 36 - 39:   l1_size
Number of entries in the active L1 table

So I decided to follow that same convention for l2_slice_size.

For the local variable I could call it slice_size_bytes or try to come
up with a different alternative, but I'm open to suggestions.

Berto



Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-17 Thread Eric Blake
On 01/17/2018 09:55 AM, Alberto Garcia wrote:
> On Tue 16 Jan 2018 11:26:40 PM CET, Eric Blake wrote:
>>>  /* allocate a new entry in the l2 cache */
>>>  
>>> +slice_size = s->l2_slice_size * sizeof(uint64_t);
>>
>> Would this read any better if the earlier patch named it
>> s->l2_slice_entries?
> 
> I had doubts with this. Like you, when I see size I tend to think about
> bytes. However both s->l1_size and s->l2_size indicate entries, and the
> documentation of the qcow2 format even describes the header field like
> this:
> 
>  36 - 39:   l1_size
> Number of entries in the active L1 table

We're free to rename the field in the qcow2 format specification if it
makes things easier to understand.  If l1_entries reads better than
l1_size, maybe it's worth doing.

> 
> So I decided to follow that same convention for l2_slice_size.
> 
> For the local variable I could call it slice_size_bytes or try to come
> up with a different alternative, but I'm open to suggestions.
> 
> Berto
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-17 Thread Alberto Garcia
On Tue 16 Jan 2018 05:52:36 PM CET, Anton Nefedov wrote:
>> @@ -299,42 +300,50 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
>>   
> [..]
>> -/* write the l2 table to the file */
>> -BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
>> +trace_qcow2_l2_allocate_write_l2(bs, l1_index);
>> +qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>> +ret = qcow2_cache_flush(bs, s->l2_table_cache);
>> +if (ret < 0) {
>> +goto fail;
>> +}
>>   
>
> Might be an overoptimization but do we really have to flush each slice
> separately?
> Otherwise a number of write ops will remain the same but at least we
> would bdrv_flush() just once.

You're completely right, I'll fix that.

>>   } else {
>> +uint64_t new_l2_offset;
>>   /* First allocate a new L2 table (and do COW if needed) */
>> -ret = l2_allocate(bs, l1_index, &l2_table);
>> +ret = l2_allocate(bs, l1_index);
>> +if (ret < 0) {
>> +return ret;
>> +}
>> +
>> +/* Get the offset of the newly-allocated l2 table */
>> +new_l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
>> +assert(offset_into_cluster(s, new_l2_offset) == 0);
>> +/* Load the l2 table in memory */
>> +ret = l2_load(bs, offset, new_l2_offset, &l2_table);
>
> Cosmetic: maybe this l2_load() better be merged with the copied L2 case?
> e.g.
>
>if (!(l1_table[l1_index] & COPIED))
>  l2_allocate();
>l2_load();

I'm not sure about that, since there's also the qcow2_free_clusters()
call afterwards, so the final code might be harder to follow.

Berto



Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-16 Thread Eric Blake
On 12/15/2017 06:53 AM, Alberto Garcia wrote:
> This patch updates l2_allocate() to support the qcow2 cache returning
> L2 slices instead of full L2 tables.
> 
> The old code simply gets an L2 table from the cache and initializes it
> with zeroes or with the contents of an existing table. With a cache
> that returns slices instead of tables the idea remains the same, but
> the code must now iterate over all the slices that are contained in an
> L2 table.
> 
> Since now we're operating with slices the function can no longer
> return the newly-allocated table, so it's up to the caller to retrieve
> the appropriate L2 slice after calling l2_allocate() (note that with
> this patch the caller is still loading full L2 tables, but we'll deal
> with that in a separate patch).
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/qcow2-cluster.c | 86 +++
>  1 file changed, 52 insertions(+), 34 deletions(-)
> 

> @@ -299,42 +300,50 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
>  
>  /* allocate a new entry in the l2 cache */
>  
> +slice_size = s->l2_slice_size * sizeof(uint64_t);

Would this read any better if the earlier patch named it
s->l2_slice_entries?  (When I see size, I try to think bytes, even
though the earlier patch made l2_slice_size be the number of entries;
worse, you are multiplying it back to a local variable slice_size that
IS in bytes).


> +
> +/* write the l2 slice to the file */
> +BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
>  
> -/* write the l2 table to the file */
> -BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
> +trace_qcow2_l2_allocate_write_l2(bs, l1_index);
> +qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> +ret = qcow2_cache_flush(bs, s->l2_table_cache);

As Anton pointed out, flushing the cache seems like you could do it once
after the loop rather than on each iteration.






Re: [Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2018-01-16 Thread Anton Nefedov



On 15/12/2017 3:53 PM, Alberto Garcia wrote:

This patch updates l2_allocate() to support the qcow2 cache returning
L2 slices instead of full L2 tables.

The old code simply gets an L2 table from the cache and initializes it
with zeroes or with the contents of an existing table. With a cache
that returns slices instead of tables the idea remains the same, but
the code must now iterate over all the slices that are contained in an
L2 table.

Since now we're operating with slices the function can no longer
return the newly-allocated table, so it's up to the caller to retrieve
the appropriate L2 slice after calling l2_allocate() (note that with
this patch the caller is still loading full L2 tables, but we'll deal
with that in a separate patch).

Signed-off-by: Alberto Garcia 
---
  block/qcow2-cluster.c | 86 +++
  1 file changed, 52 insertions(+), 34 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8d92d623d8..ecb75b6be6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -264,11 +264,12 @@ int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
   *
   */
  
-static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)

+static int l2_allocate(BlockDriverState *bs, int l1_index)
  {
  BDRVQcow2State *s = bs->opaque;
  uint64_t old_l2_offset;
-uint64_t *l2_table = NULL;
+uint64_t *l2_slice = NULL;
+unsigned slice, slice_size, n_slices;
  int64_t l2_offset;
  int ret;
  
@@ -299,42 +300,50 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
  

[..]

-/* write the l2 table to the file */
-BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
+trace_qcow2_l2_allocate_write_l2(bs, l1_index);
+qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
+ret = qcow2_cache_flush(bs, s->l2_table_cache);
+if (ret < 0) {
+goto fail;
+}
  


Might be an overoptimization but do we really have to flush each slice
separately?
Otherwise a number of write ops will remain the same but at least we
would bdrv_flush() just once.



-trace_qcow2_l2_allocate_write_l2(bs, l1_index);
-qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
-ret = qcow2_cache_flush(bs, s->l2_table_cache);
-if (ret < 0) {
-goto fail;
+qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
  }
  
  /* update the L1 entry */

@@ -345,14 +354,13 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
  goto fail;
  }
  
-*table = l2_table;

  trace_qcow2_l2_allocate_done(bs, l1_index, 0);
  return 0;
  
  fail:

  trace_qcow2_l2_allocate_done(bs, l1_index, ret);
-if (l2_table != NULL) {
-qcow2_cache_put(s->l2_table_cache, (void **) table);
+if (l2_slice != NULL) {
+qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
  }
  s->l1_table[l1_index] = old_l2_offset;
  if (l2_offset > 0) {
@@ -696,8 +704,18 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
  return ret;
  }
  } else {
+uint64_t new_l2_offset;
  /* First allocate a new L2 table (and do COW if needed) */
-ret = l2_allocate(bs, l1_index, &l2_table);
+ret = l2_allocate(bs, l1_index);
+if (ret < 0) {
+return ret;
+}
+
+/* Get the offset of the newly-allocated l2 table */
+new_l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
+assert(offset_into_cluster(s, new_l2_offset) == 0);
+/* Load the l2 table in memory */
+ret = l2_load(bs, offset, new_l2_offset, &l2_table);


Cosmetic: maybe this l2_load() better be merged with the copied L2 case?
e.g.

  if (!(l1_table[l1_index] & COPIED))
l2_allocate();
  l2_load();



  if (ret < 0) {
  return ret;
  }





[Qemu-devel] [PATCH v2 16/32] qcow2: Update l2_allocate() to support L2 slices

2017-12-15 Thread Alberto Garcia
This patch updates l2_allocate() to support the qcow2 cache returning
L2 slices instead of full L2 tables.

The old code simply gets an L2 table from the cache and initializes it
with zeroes or with the contents of an existing table. With a cache
that returns slices instead of tables the idea remains the same, but
the code must now iterate over all the slices that are contained in an
L2 table.

Since now we're operating with slices the function can no longer
return the newly-allocated table, so it's up to the caller to retrieve
the appropriate L2 slice after calling l2_allocate() (note that with
this patch the caller is still loading full L2 tables, but we'll deal
with that in a separate patch).

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cluster.c | 86 +++
 1 file changed, 52 insertions(+), 34 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8d92d623d8..ecb75b6be6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -264,11 +264,12 @@ int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
  *
  */
 
-static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
+static int l2_allocate(BlockDriverState *bs, int l1_index)
 {
 BDRVQcow2State *s = bs->opaque;
 uint64_t old_l2_offset;
-uint64_t *l2_table = NULL;
+uint64_t *l2_slice = NULL;
+unsigned slice, slice_size, n_slices;
 int64_t l2_offset;
 int ret;
 
@@ -299,42 +300,50 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 
 /* allocate a new entry in the l2 cache */
 
+slice_size = s->l2_slice_size * sizeof(uint64_t);
+n_slices = s->cluster_size / slice_size;
+
 trace_qcow2_l2_allocate_get_empty(bs, l1_index);
-ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table);
-if (ret < 0) {
-goto fail;
-}
-
-l2_table = *table;
-
-if ((old_l2_offset & L1E_OFFSET_MASK) == 0) {
-/* if there was no old l2 table, clear the new table */
-memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
-} else {
-uint64_t* old_table;
-
-/* if there was an old l2 table, read it from the disk */
-BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
-ret = qcow2_cache_get(bs, s->l2_table_cache,
-old_l2_offset & L1E_OFFSET_MASK,
-(void**) &old_table);
+for (slice = 0; slice < n_slices; slice++) {
+ret = qcow2_cache_get_empty(bs, s->l2_table_cache,
+l2_offset + slice * slice_size,
+(void **) &l2_slice);
 if (ret < 0) {
 goto fail;
 }
 
-memcpy(l2_table, old_table, s->cluster_size);
+if ((old_l2_offset & L1E_OFFSET_MASK) == 0) {
+/* if there was no old l2 table, clear the new slice */
+memset(l2_slice, 0, slice_size);
+} else {
+uint64_t *old_slice;
+uint64_t old_l2_slice_offset =
+(old_l2_offset & L1E_OFFSET_MASK) + slice * slice_size;
+
+/* if there was an old l2 table, read a slice from the disk */
+BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
+ret = qcow2_cache_get(bs, s->l2_table_cache, old_l2_slice_offset,
+  (void **) &old_slice);
+if (ret < 0) {
+goto fail;
+}
+
+memcpy(l2_slice, old_slice, slice_size);
 
-qcow2_cache_put(s->l2_table_cache, (void **) &old_table);
-}
+qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
+}
+
+/* write the l2 slice to the file */
+BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
 
-/* write the l2 table to the file */
-BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
+trace_qcow2_l2_allocate_write_l2(bs, l1_index);
+qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
+ret = qcow2_cache_flush(bs, s->l2_table_cache);
+if (ret < 0) {
+goto fail;
+}
 
-trace_qcow2_l2_allocate_write_l2(bs, l1_index);
-qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
-ret = qcow2_cache_flush(bs, s->l2_table_cache);
-if (ret < 0) {
-goto fail;
+qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 }
 
 /* update the L1 entry */
@@ -345,14 +354,13 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 goto fail;
 }
 
-*table = l2_table;
 trace_qcow2_l2_allocate_done(bs, l1_index, 0);
 return 0;
 
 fail:
 trace_qcow2_l2_allocate_done(bs, l1_index, ret);
-if (l2_table != NULL) {
-qcow2_cache_put(s->l2_table_cache, (void **) table);
+if (l2_slice != NULL) {
+qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 }
 s->l1_table[l1_index] = old_l2_offset;
 if (l2_offset > 0) {
@@ -696,8 +704,18