[Qemu-block] [PATCH 0/4] block: Mirror discarded sectors

2015-05-05 Thread Fam Zheng
This fixes the mirror assert failure reported by wangxiaolong:

https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04458.html

The direct cause is that the hbitmap code couldn't handle bits being unset *after*
the iterator's current position. We could fix that, but the bdrv_reset_dirty() call
is more questionable:

Before this change, if the guest discarded some sectors during migration, it could
see different data after moving to the dest side, depending on the block backends of
the src and the dest. This is IMO worse than mirroring the actual read data as done
in this series, because we don't know what the guest is doing.

For example, if a guest first issues WRITE SAME to wipe out an area and then issues
UNMAP to discard it, just to get rid of some sensitive data completely, we may
miss both operations and leave stale data on the dest image.
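
(For illustration only, not part of the series: a minimal sketch against QEMU's
internal hbitmap API from include/qemu/hbitmap.h showing the pattern that trips
the iterator. The hbitmap function names are real; the demo harness is made up.)

    #include "qemu/hbitmap.h"   /* sketch only -- needs QEMU's internal headers */

    static void demo_iter_vs_reset(void)
    {
        HBitmap *map = hbitmap_alloc(1024, 0);   /* 1024 bits, granularity 0 */
        HBitmapIter iter;
        int64_t cur;

        hbitmap_set(map, 0, 1024);               /* all bits dirty */
        hbitmap_iter_init(&iter, map, 0);
        cur = hbitmap_iter_next(&iter);          /* iterator has advanced */

        /* Clearing bits *after* the iterator's position -- what
         * bdrv_reset_dirty() did on discard -- is the case the iterator
         * did not expect, and what triggered the mirror assertion. */
        hbitmap_reset(map, cur + 1, 512);

        while ((cur = hbitmap_iter_next(&iter)) >= 0) {
            /* may stumble over the range cleared behind its back */
        }
        hbitmap_free(map);
    }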


Fam Zheng (4):
  block: Fix dirty bitmap in bdrv_co_discard
  block: Remove bdrv_reset_dirty
  qemu-iotests: Make block job methods common
  qemu-iotests: Add test case for mirror with unmap

 block.c   | 12 
 block/io.c|  4 +--
 include/block/block_int.h |  2 --
 tests/qemu-iotests/041| 66 ++-
 tests/qemu-iotests/131| 59 ++
 tests/qemu-iotests/131.out|  5 
 tests/qemu-iotests/group  |  1 +
 tests/qemu-iotests/iotests.py | 28 ++
 8 files changed, 110 insertions(+), 67 deletions(-)
 create mode 100644 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

-- 
1.9.3




[Qemu-block] [PATCH 2/4] block: Remove bdrv_reset_dirty

2015-05-05 Thread Fam Zheng
Using this function would always be wrong because a dirty bitmap must
have a specific owner that consumes the dirty bits and calls
bdrv_reset_dirty_bitmap().

Remove the unused function to avoid future misuse.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block.c   | 12 
 include/block/block_int.h |  2 --
 2 files changed, 14 deletions(-)

diff --git a/block.c b/block.c
index 7904098..511e13c 100644
--- a/block.c
+++ b/block.c
@@ -3324,18 +3324,6 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
cur_sector,
 }
 }
 
-void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
-  int nr_sectors)
-{
-BdrvDirtyBitmap *bitmap;
-QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
-if (!bdrv_dirty_bitmap_enabled(bitmap)) {
-continue;
-}
-hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
-}
-}
-
 /**
  * Advance an HBitmapIter to an arbitrary offset.
  */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..aaed2fa 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -635,7 +635,5 @@ bool blk_dev_is_medium_locked(BlockBackend *blk);
 void blk_dev_resize_cb(BlockBackend *blk);
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
-void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
-  int nr_sectors);
 
 #endif /* BLOCK_INT_H */
-- 
1.9.3




[Qemu-block] [PATCH 4/4] qemu-iotests: Add test case for mirror with unmap

2015-05-05 Thread Fam Zheng
This checks that a discard on the mirror source that effectively zeroes the
data is also reflected in the data of the target.

Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/131 | 59 ++
 tests/qemu-iotests/131.out |  5 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 65 insertions(+)
 create mode 100644 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
new file mode 100644
index 000..e33ca72
--- /dev/null
+++ b/tests/qemu-iotests/131
@@ -0,0 +1,59 @@
+#!/usr/bin/env python
+#
+# Test mirror with unmap
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+import time
+import os
+import iotests
+from iotests import qemu_img, qemu_io
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+target_img = os.path.join(iotests.test_dir, 'target.img')
+
+class TestSingleDrive(iotests.QMPTestCase):
+    image_len = 2 * 1024 * 1024 # MB
+
+    def setUp(self):
+        # Write data to the image so we can compare later
+        qemu_img('create', '-f', iotests.imgfmt, test_img,
+                 str(TestSingleDrive.image_len))
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P0x5d 0 2M', test_img)
+
+        self.vm = iotests.VM().add_drive(test_img, 'discard=unmap')
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def test_mirror_discard(self):
+        result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+        self.vm.hmp_qemu_io('drive0', 'discard 0 64k')
+        self.complete_and_wait('drive0')
+        self.vm.shutdown()
+        self.assertTrue(iotests.compare_images(test_img, target_img),
+                        'target image does not match source after mirroring')
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw'])
diff --git a/tests/qemu-iotests/131.out b/tests/qemu-iotests/131.out
new file mode 100644
index 000..ae1213e
--- /dev/null
+++ b/tests/qemu-iotests/131.out
@@ -0,0 +1,5 @@
+.
+--
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 6ca3466..34b16cb 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -128,3 +128,4 @@
 128 rw auto quick
 129 rw auto quick
 130 rw auto quick
+131 rw auto quick
-- 
1.9.3




Re: [Qemu-block] [Qemu-devel] [PATCH 1/6] qcow2: use one single memory block for the L2/refcount cache tables

2015-05-05 Thread Alberto Garcia
On Thu 30 Apr 2015 05:08:05 PM CEST, Eric Blake ebl...@redhat.com wrote:

  typedef struct Qcow2CachedTable {
 -void*   table;
  int64_t offset;
  booldirty;
  int cache_hits;
 @@ -40,39 +39,34 @@ struct Qcow2Cache {
  struct Qcow2Cache*  depends;
  int size;
  booldepends_on_flush;
 +void   *table_array;
 +int table_size;

 Should this be size_t? [1]

The maximum supported table size is 2MB (MAX_CLUSTER_BITS == 21).

  c->entries = g_try_new0(Qcow2CachedTable, num_tables);
 -if (!c->entries) {
 -goto fail;
 -}
 +c->table_array = qemu_try_blockalign(bs->file, num_tables * c->table_size);

 Are we sure this won't overflow?

That's a good catch. I was running some numbers and I doubt that scenario
would happen in practice, but I think it's possible, so I'll fix it.
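
(Sketch only, not the actual fix: a hypothetical standalone helper showing the
kind of overflow guard meant here; the helper name and its placement are made
up, SIZE_MAX comes from <stdint.h>.)

    #include <stdint.h>
    #include <stddef.h>

    /* Refuse table counts whose total byte size would not fit in size_t
     * before handing num_tables * table_size to the allocator. */
    static int cache_alloc_size_ok(size_t num_tables, size_t table_size)
    {
        if (table_size != 0 && num_tables > SIZE_MAX / table_size) {
            return 0;   /* num_tables * table_size would overflow */
        }
        return 1;       /* safe to compute num_tables * table_size */
    }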

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH 2/6] qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty()

2015-05-05 Thread Alberto Garcia
On Fri 01 May 2015 04:31:52 PM CEST, Stefan Hajnoczi wrote:

  int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
  {
 -int i;
  +int i = (*table - c->table_array) / c->table_size;
   
  -for (i = 0; i < c->size; i++) {
  -if (table_addr(c, i) == *table) {
  -goto found;
  -}
  +if (c->entries[i].offset == 0) {
  +return -ENOENT;
   }
  -return -ENOENT;
   
  -found:
   c->entries[i].ref--;
  *table = NULL;
  

 When is this function called with a bogus table pointer?

I also could not imagine any such scenario, but I decided to be
conservative and keep the error handling code. I'll double-check all the
places where it's used and remove the relevant code.

Berto



Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description

2015-05-05 Thread Fam Zheng
On Wed, 05/06 02:26, Dong, Eddie wrote:
 
 
  -Original Message-
  From: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com]
  Sent: Tuesday, May 05, 2015 11:24 PM
  To: Stefan Hajnoczi
  Cc: Paolo Bonzini; Wen Congyang; Fam Zheng; Kevin Wolf; Lai Jiangshan; qemu
  block; Jiang, Yunhong; Dong, Eddie; qemu devel; Max Reitz; Gonglei; Yang
  Hongyang; zhanghailiang; arm...@redhat.com; jc...@redhat.com
  Subject: Re: [PATCH COLO v3 01/14] docs: block replication's description
  
  * Stefan Hajnoczi (stefa...@redhat.com) wrote:
   On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
   
   
On 24/04/2015 11:38, Wen Congyang wrote:
 
  That can be done with drive-mirror.  But I think it's too early 
  for that.
 Do you mean use drive-mirror instead of quorum?
   
Only before starting up a new secondary.  Basically you do a
migration with non-shared storage, and then start the secondary in colo
  mode.
   
But it's only for the failover case.  Quorum (or a new block/colo.c
driver or filter) is fine for normal colo operation.
  
   Perhaps this patch series should mirror the Secondary's disk to a
   Backup Secondary so that the system can be protected very quickly
   after failover.
  
   I think anyone serious about fault tolerance would deploy a Backup
   Secondary, otherwise the system cannot survive two failures unless a
   human administrator is lucky/fast enough to set up a new Secondary.
  
  I'd assumed that a higher level management layer would do the allocation of 
  a
  new secondary after the first failover, so no human need be involved.
  
 
 I agree. The cloud OS, such as OpenStack, will have the capability to handle
 this case, together with certain APIs on the VMM side for this (libvirt?). 

The question here is that the QMP API to switch from secondary mode to primary
mode is not mentioned in this series. I think that interface matters for this series.

Fam



Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description

2015-05-05 Thread Dong, Eddie


 -Original Message-
 From: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com]
 Sent: Tuesday, May 05, 2015 11:24 PM
 To: Stefan Hajnoczi
 Cc: Paolo Bonzini; Wen Congyang; Fam Zheng; Kevin Wolf; Lai Jiangshan; qemu
 block; Jiang, Yunhong; Dong, Eddie; qemu devel; Max Reitz; Gonglei; Yang
 Hongyang; zhanghailiang; arm...@redhat.com; jc...@redhat.com
 Subject: Re: [PATCH COLO v3 01/14] docs: block replication's description
 
 * Stefan Hajnoczi (stefa...@redhat.com) wrote:
  On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
  
  
   On 24/04/2015 11:38, Wen Congyang wrote:

 That can be done with drive-mirror.  But I think it's too early for 
 that.
Do you mean use drive-mirror instead of quorum?
  
   Only before starting up a new secondary.  Basically you do a
   migration with non-shared storage, and then start the secondary in colo
 mode.
  
   But it's only for the failover case.  Quorum (or a new block/colo.c
   driver or filter) is fine for normal colo operation.
 
  Perhaps this patch series should mirror the Secondary's disk to a
  Backup Secondary so that the system can be protected very quickly
  after failover.
 
  I think anyone serious about fault tolerance would deploy a Backup
  Secondary, otherwise the system cannot survive two failures unless a
  human administrator is lucky/fast enough to set up a new Secondary.
 
 I'd assumed that a higher level management layer would do the allocation of a
 new secondary after the first failover, so no human need be involved.
 

I agree. The cloud OS, such as OpenStack, will have the capability to handle 
this case, together with certain APIs on the VMM side for this (libvirt?). 

Thx Eddie



Re: [Qemu-block] [PATCH 3/4] qemu-iotests: Make block job methods common

2015-05-05 Thread Fam Zheng
On Tue, 05/05 18:17, John Snow wrote:
 
 
 On 05/05/2015 08:46 AM, Fam Zheng wrote:
  Signed-off-by: Fam Zheng f...@redhat.com
  ---
   tests/qemu-iotests/041| 66 
  ++-
   tests/qemu-iotests/iotests.py | 28 ++
   2 files changed, 43 insertions(+), 51 deletions(-)
  
  diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
  index 59a8f73..3d46ed7 100755
  --- a/tests/qemu-iotests/041
  +++ b/tests/qemu-iotests/041
  @@ -34,38 +34,8 @@ quorum_img3 = os.path.join(iotests.test_dir, 
  'quorum3.img')
   quorum_repair_img = os.path.join(iotests.test_dir, 'quorum_repair.img')
   quorum_snapshot_file = os.path.join(iotests.test_dir, 
  'quorum_snapshot.img')
   
  -class ImageMirroringTestCase(iotests.QMPTestCase):
  -'''Abstract base class for image mirroring test cases'''
   
  -def wait_ready(self, drive='drive0'):
  -'''Wait until a block job BLOCK_JOB_READY event'''
  -ready = False
  -while not ready:
  -for event in self.vm.get_qmp_events(wait=True):
  -if event['event'] == 'BLOCK_JOB_READY':
  -self.assert_qmp(event, 'data/type', 'mirror')
  -self.assert_qmp(event, 'data/device', drive)
  -ready = True
  -
  -def wait_ready_and_cancel(self, drive='drive0'):
  -self.wait_ready(drive=drive)
  -event = self.cancel_and_wait(drive=drive)
  -self.assertEquals(event['event'], 'BLOCK_JOB_COMPLETED')
  -self.assert_qmp(event, 'data/type', 'mirror')
  -self.assert_qmp(event, 'data/offset', event['data']['len'])
  -
  -def complete_and_wait(self, drive='drive0', wait_ready=True):
  -'''Complete a block job and wait for it to finish'''
  -if wait_ready:
  -self.wait_ready(drive=drive)
  -
  -result = self.vm.qmp('block-job-complete', device=drive)
  -self.assert_qmp(result, 'return', {})
  -
  -event = self.wait_until_completed(drive=drive)
  -self.assert_qmp(event, 'data/type', 'mirror')
  -
  -class TestSingleDrive(ImageMirroringTestCase):
  +class TestSingleDrive(iotests.QMPTestCase):
   image_len = 1 * 1024 * 1024 # MB
   
   def setUp(self):
  @@ -221,17 +191,9 @@ class TestSingleDriveUnalignedLength(TestSingleDrive):
   test_small_buffer2 = None
   test_large_cluster = None
   
  -class TestMirrorNoBacking(ImageMirroringTestCase):
  +class TestMirrorNoBacking(iotests.QMPTestCase):
   image_len = 2 * 1024 * 1024 # MB
   
  -def complete_and_wait(self, drive='drive0', wait_ready=True):
  -iotests.create_image(target_backing_img, 
  TestMirrorNoBacking.image_len)
  -return ImageMirroringTestCase.complete_and_wait(self, drive, 
  wait_ready)
  -
  -def compare_images(self, img1, img2):
  -iotests.create_image(target_backing_img, 
  TestMirrorNoBacking.image_len)
  -return iotests.compare_images(img1, img2)
  -
   def setUp(self):
   iotests.create_image(backing_img, TestMirrorNoBacking.image_len)
   qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % 
  backing_img, test_img)
  @@ -242,7 +204,10 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
   self.vm.shutdown()
   os.remove(test_img)
   os.remove(backing_img)
  -os.remove(target_backing_img)
  +try:
  +os.remove(target_backing_img)
  +except:
  +pass
   os.remove(target_img)
   
   def test_complete(self):
  @@ -257,7 +222,7 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
   result = self.vm.qmp('query-block')
   self.assert_qmp(result, 'return[0]/inserted/file', target_img)
   self.vm.shutdown()
  -self.assertTrue(self.compare_images(test_img, target_img),
  +self.assertTrue(iotests.compare_images(test_img, target_img),
   'target image does not match source after 
  mirroring')
   
   def test_cancel(self):
  @@ -272,7 +237,7 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
   result = self.vm.qmp('query-block')
   self.assert_qmp(result, 'return[0]/inserted/file', test_img)
   self.vm.shutdown()
  -self.assertTrue(self.compare_images(test_img, target_img),
  +self.assertTrue(iotests.compare_images(test_img, target_img),
   'target image does not match source after 
  mirroring')
   
   def test_large_cluster(self):
  @@ -283,7 +248,6 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
   %(TestMirrorNoBacking.image_len), 
  target_backing_img)
   qemu_img('create', '-f', iotests.imgfmt, '-o', 
  'cluster_size=%d,backing_file=%s'
   % (TestMirrorNoBacking.image_len, 
  target_backing_img), target_img)
  -os.remove(target_backing_img)
   
   result = 

[Qemu-block] [PATCH v2 0/6] block: Mirror discarded sectors

2015-05-05 Thread Fam Zheng
v2: Fix typo and add Eric's rev-by in patch 3.
Add patch 1 to discard target in mirror job. (Paolo)
Add patch 6 to improve iotests.wait_ready. (John)

This fixes the mirror assert failure reported by wangxiaolong:

https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04458.html

The direct cause is that the hbitmap code couldn't handle bits being unset *after*
the iterator's current position. We could fix that, but the bdrv_reset_dirty() call
is more questionable:

Before this change, if the guest discarded some sectors during migration, it could
see different data after moving to the dest side, depending on the block backends of
the src and the dest. This is IMO worse than mirroring the actual read data as done
in this series, because we don't know what the guest is doing.

For example, if a guest first issues WRITE SAME to wipe out an area and then issues
UNMAP to discard it, just to get rid of some sensitive data completely, we may
miss both operations and leave stale data on the dest image.


Fam Zheng (6):
  mirror: Discard target sectors if not allocated at source side
  block: Fix dirty bitmap in bdrv_co_discard
  block: Remove bdrv_reset_dirty
  qemu-iotests: Make block job methods common
  qemu-iotests: Add test case for mirror with unmap
  iotests: Use event_wait in wait_ready

 block.c   | 12 
 block/io.c|  4 +--
 block/mirror.c| 12 ++--
 include/block/block_int.h |  2 --
 tests/qemu-iotests/041| 66 ++-
 tests/qemu-iotests/131| 59 ++
 tests/qemu-iotests/131.out|  5 
 tests/qemu-iotests/group  |  1 +
 tests/qemu-iotests/iotests.py | 23 +++
 9 files changed, 115 insertions(+), 69 deletions(-)
 create mode 100644 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

-- 
1.9.3




[Qemu-block] [PATCH v2 1/6] mirror: Discard target sectors if not allocated at source side

2015-05-05 Thread Fam Zheng
If the guest discards a source cluster during mirror, we want to discard the
target side as well.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block/mirror.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..37a5b61 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -163,6 +163,7 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
 uint64_t delay_ns = 0;
 MirrorOp *op;
+int pnum;
 
 s->sector_num = hbitmap_iter_next(&s->hbi);
 if (s->sector_num < 0) {
@@ -289,8 +290,15 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 s->in_flight++;
 s->sectors_in_flight += nb_sectors;
 trace_mirror_one_iteration(s, sector_num, nb_sectors);
-bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
-   mirror_read_complete, op);
+
+if (!bdrv_is_allocated_above(source, NULL, sector_num,
+ nb_sectors, &pnum)) {
+bdrv_aio_discard(s->target, sector_num, nb_sectors,
+ mirror_write_complete, op);
+} else {
+bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
+   mirror_read_complete, op);
+}
 return delay_ns;
 }
 
-- 
1.9.3




[Qemu-block] [PATCH v2 2/6] block: Fix dirty bitmap in bdrv_co_discard

2015-05-05 Thread Fam Zheng
Unsetting the dirty bits globally on discard is not really correct. The discard may
zero out sectors (depending on can_write_zeroes_with_unmap), so we should replicate
this change to the destination side to make sure that the guest sees the same data.

Calling bdrv_reset_dirty also troubles the mirror job, because the hbitmap iterator
doesn't expect bits to be unset after its current position.

So let's do it the opposite way, which fixes both problems: set the dirty bits
if we are about to discard the data.

Reported-by: wangxiaol...@ucloud.cn
Signed-off-by: Fam Zheng f...@redhat.com
---
 block/io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 1ce62c4..809688b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2343,8 +2343,6 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 return -EROFS;
 }
 
-bdrv_reset_dirty(bs, sector_num, nb_sectors);
-
 /* Do nothing if disabled.  */
 if (!(bs->open_flags & BDRV_O_UNMAP)) {
 return 0;
@@ -2354,6 +2352,8 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 return 0;
 }
 
+bdrv_set_dirty(bs, sector_num, nb_sectors);
+
 max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
 while (nb_sectors > 0) {
 int ret;
-- 
1.9.3




[Qemu-block] [PATCH v2 6/6] iotests: Use event_wait in wait_ready

2015-05-05 Thread Fam Zheng
Only poll the specific type of event we are interested in, to avoid
stealing events that should be consumed by someone else.

Suggested-by: John Snow js...@redhat.com
Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/iotests.py | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 2e07cc4..0ddc513 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -328,13 +328,8 @@ class QMPTestCase(unittest.TestCase):
 
 def wait_ready(self, drive='drive0'):
 '''Wait until a block job BLOCK_JOB_READY event'''
-ready = False
-while not ready:
-for event in self.vm.get_qmp_events(wait=True):
-if event['event'] == 'BLOCK_JOB_READY':
-self.assert_qmp(event, 'data/type', 'mirror')
-self.assert_qmp(event, 'data/device', drive)
-ready = True
+f = {'data': {'type': 'mirror', 'device': drive } }
+event = self.vm.event_wait(name='BLOCK_JOB_READY', match=f)
 
 def wait_ready_and_cancel(self, drive='drive0'):
 self.wait_ready(drive=drive)
-- 
1.9.3




[Qemu-block] [PATCH v2 4/6] qemu-iotests: Make block job methods common

2015-05-05 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/041| 66 ++-
 tests/qemu-iotests/iotests.py | 28 ++
 2 files changed, 43 insertions(+), 51 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 59a8f73..3d46ed7 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -34,38 +34,8 @@ quorum_img3 = os.path.join(iotests.test_dir, 'quorum3.img')
 quorum_repair_img = os.path.join(iotests.test_dir, 'quorum_repair.img')
 quorum_snapshot_file = os.path.join(iotests.test_dir, 'quorum_snapshot.img')
 
-class ImageMirroringTestCase(iotests.QMPTestCase):
-'''Abstract base class for image mirroring test cases'''
 
-def wait_ready(self, drive='drive0'):
-'''Wait until a block job BLOCK_JOB_READY event'''
-ready = False
-while not ready:
-for event in self.vm.get_qmp_events(wait=True):
-if event['event'] == 'BLOCK_JOB_READY':
-self.assert_qmp(event, 'data/type', 'mirror')
-self.assert_qmp(event, 'data/device', drive)
-ready = True
-
-def wait_ready_and_cancel(self, drive='drive0'):
-self.wait_ready(drive=drive)
-event = self.cancel_and_wait(drive=drive)
-self.assertEquals(event['event'], 'BLOCK_JOB_COMPLETED')
-self.assert_qmp(event, 'data/type', 'mirror')
-self.assert_qmp(event, 'data/offset', event['data']['len'])
-
-def complete_and_wait(self, drive='drive0', wait_ready=True):
-'''Complete a block job and wait for it to finish'''
-if wait_ready:
-self.wait_ready(drive=drive)
-
-result = self.vm.qmp('block-job-complete', device=drive)
-self.assert_qmp(result, 'return', {})
-
-event = self.wait_until_completed(drive=drive)
-self.assert_qmp(event, 'data/type', 'mirror')
-
-class TestSingleDrive(ImageMirroringTestCase):
+class TestSingleDrive(iotests.QMPTestCase):
 image_len = 1 * 1024 * 1024 # MB
 
 def setUp(self):
@@ -221,17 +191,9 @@ class TestSingleDriveUnalignedLength(TestSingleDrive):
 test_small_buffer2 = None
 test_large_cluster = None
 
-class TestMirrorNoBacking(ImageMirroringTestCase):
+class TestMirrorNoBacking(iotests.QMPTestCase):
 image_len = 2 * 1024 * 1024 # MB
 
-def complete_and_wait(self, drive='drive0', wait_ready=True):
-iotests.create_image(target_backing_img, TestMirrorNoBacking.image_len)
-return ImageMirroringTestCase.complete_and_wait(self, drive, 
wait_ready)
-
-def compare_images(self, img1, img2):
-iotests.create_image(target_backing_img, TestMirrorNoBacking.image_len)
-return iotests.compare_images(img1, img2)
-
 def setUp(self):
 iotests.create_image(backing_img, TestMirrorNoBacking.image_len)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % 
backing_img, test_img)
@@ -242,7 +204,10 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
 self.vm.shutdown()
 os.remove(test_img)
 os.remove(backing_img)
-os.remove(target_backing_img)
+try:
+os.remove(target_backing_img)
+except:
+pass
 os.remove(target_img)
 
 def test_complete(self):
@@ -257,7 +222,7 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
 result = self.vm.qmp('query-block')
 self.assert_qmp(result, 'return[0]/inserted/file', target_img)
 self.vm.shutdown()
-self.assertTrue(self.compare_images(test_img, target_img),
+self.assertTrue(iotests.compare_images(test_img, target_img),
 'target image does not match source after mirroring')
 
 def test_cancel(self):
@@ -272,7 +237,7 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
 result = self.vm.qmp('query-block')
 self.assert_qmp(result, 'return[0]/inserted/file', test_img)
 self.vm.shutdown()
-self.assertTrue(self.compare_images(test_img, target_img),
+self.assertTrue(iotests.compare_images(test_img, target_img),
 'target image does not match source after mirroring')
 
 def test_large_cluster(self):
@@ -283,7 +248,6 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
 %(TestMirrorNoBacking.image_len), target_backing_img)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 
'cluster_size=%d,backing_file=%s'
 % (TestMirrorNoBacking.image_len, target_backing_img), 
target_img)
-os.remove(target_backing_img)
 
 result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
  mode='existing', target=target_img)
@@ -293,10 +257,10 @@ class TestMirrorNoBacking(ImageMirroringTestCase):
 result = self.vm.qmp('query-block')
 self.assert_qmp(result, 'return[0]/inserted/file', target_img)
 self.vm.shutdown()
-

[Qemu-block] [PATCH v2 5/6] qemu-iotests: Add test case for mirror with unmap

2015-05-05 Thread Fam Zheng
This checks that a discard on the mirror source that effectively zeroes the
data is also reflected in the data of the target.

Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/131 | 59 ++
 tests/qemu-iotests/131.out |  5 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 65 insertions(+)
 create mode 100644 tests/qemu-iotests/131
 create mode 100644 tests/qemu-iotests/131.out

diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
new file mode 100644
index 000..f53ef6e
--- /dev/null
+++ b/tests/qemu-iotests/131
@@ -0,0 +1,59 @@
+#!/usr/bin/env python
+#
+# Test mirror with unmap
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+import time
+import os
+import iotests
+from iotests import qemu_img, qemu_io
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+target_img = os.path.join(iotests.test_dir, 'target.img')
+
+class TestSingleDrive(iotests.QMPTestCase):
+    image_len = 2 * 1024 * 1024 # MB
+
+    def setUp(self):
+        # Write data to the image so we can compare later
+        qemu_img('create', '-f', iotests.imgfmt, test_img,
+                 str(TestSingleDrive.image_len))
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P0x5d 0 2M', test_img)
+
+        self.vm = iotests.VM().add_drive(test_img, 'discard=unmap')
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def test_mirror_discard(self):
+        result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+        self.vm.hmp_qemu_io('drive0', 'discard 0 64k')
+        self.complete_and_wait('drive0')
+        self.vm.shutdown()
+        self.assertTrue(iotests.compare_images(test_img, target_img),
+                        'target image does not match source after mirroring')
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/131.out b/tests/qemu-iotests/131.out
new file mode 100644
index 000..ae1213e
--- /dev/null
+++ b/tests/qemu-iotests/131.out
@@ -0,0 +1,5 @@
+.
+--
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 6ca3466..34b16cb 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -128,3 +128,4 @@
 128 rw auto quick
 129 rw auto quick
 130 rw auto quick
+131 rw auto quick
-- 
1.9.3




Re: [Qemu-block] [PATCH 1/6] qcow2: use one single memory block for the L2/refcount cache tables

2015-05-05 Thread Alberto Garcia
On Tue 05 May 2015 01:20:19 PM CEST, Kevin Wolf wrote:

 Though looking at the code again I see now that c->table_size isn't
 consistently used. The I/O requests still use s->cluster_size. We
 should either use it everywhere or not introduce it at all.

c->table_size is necessary in order to calculate the offset of a
particular table, because s->cluster_size is not always available
(e.g. in qcow2_cache_entry_mark_dirty()). We could make it a requirement of
the API, though.

Alternatively we could have c->bs instead of c->table_size. That would
spare us from passing the BlockDriverState pointer to all qcow2_cache_*
functions.

The assumption would be that the BlockDriverState pointer is the same
for the whole lifetime of the cache, but is that always guaranteed to be
the case?

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH 2/6] qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty()

2015-05-05 Thread Eric Blake
On 04/30/2015 04:11 AM, Alberto Garcia wrote:
 Since all tables are now stored together, it is possible to obtain
 the position of a particular table directly from its address, so the
 operation becomes O(1).
 
 Signed-off-by: Alberto Garcia be...@igalia.com
 ---
  block/qcow2-cache.c | 22 +-
  1 file changed, 5 insertions(+), 17 deletions(-)
 
 diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
 index 586880b..d3274f4 100644
 --- a/block/qcow2-cache.c
 +++ b/block/qcow2-cache.c
 @@ -335,16 +335,12 @@ int qcow2_cache_get_empty(BlockDriverState *bs, 
 Qcow2Cache *c, uint64_t offset,
  
  int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
  {
 -int i;
  +int i = (*table - c->table_array) / c->table_size;

Arithmetic on void* is not portable.

  void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table)
  {
 -int i;
 -
  -for (i = 0; i < c->size; i++) {
 -if (table_addr(c, i) == table) {
 -goto found;
 -}
 -}
 -abort();
 -
 -found:
  +int i = (table - c->table_array) / c->table_size;

and again.

  +assert(c->entries[i].offset != 0);
   c->entries[i].dirty = true;
  }
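
(Illustration of the portable alternative: cast to uint8_t * before the
pointer subtraction. The helper below is hypothetical; the parameter names
simply mirror the quoted Qcow2Cache fields.)

    #include <stdint.h>
    #include <stddef.h>

    /* Index of a cached table from its address, without arithmetic on void *. */
    static int table_index_from_addr(void *table_array, size_t table_size,
                                     void *table)
    {
        ptrdiff_t off = (uint8_t *) table - (uint8_t *) table_array;
        return (int) (off / (ptrdiff_t) table_size);
    }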
 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [PATCH COLO v3 01/14] docs: block replication's description

2015-05-05 Thread Dr. David Alan Gilbert
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
 On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
  
  
  On 24/04/2015 11:38, Wen Congyang wrote:

That can be done with drive-mirror.  But I think it's too early for 
that.
   Do you mean use drive-mirror instead of quorum?
  
  Only before starting up a new secondary.  Basically you do a migration
  with non-shared storage, and then start the secondary in colo mode.
  
  But it's only for the failover case.  Quorum (or a new block/colo.c
  driver or filter) is fine for normal colo operation.
 
 Perhaps this patch series should mirror the Secondary's disk to a Backup
 Secondary so that the system can be protected very quickly after
 failover.
 
 I think anyone serious about fault tolerance would deploy a Backup
 Secondary, otherwise the system cannot survive two failures unless a
 human administrator is lucky/fast enough to set up a new Secondary.

I'd assumed that a higher level management layer would do the allocation
of a new secondary after the first failover, so no human need be involved.

Dave

 Stefan


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-block] [Qemu-devel] [PATCH v3 0/9] ahci: enable migration

2015-05-05 Thread John Snow


On 04/30/2015 02:07 PM, John Snow wrote:
 The day we all feared is here, and I am proposing we allow the migration
 of the AHCI device tentatively for the 2.4 development window.
 
 Some more NCQ migration tests are still needed, but I felt that it was
 important to get migration enabled as close to the start of the 2.4
 development window as possible.
 
 If the NCQ patches don't pan out by the time the 2.4 freeze occurs, we can
 revert the migration boolean and add a conditional around the ahci tests
 that rely on the migration feature being enabled.
 
 I am justifying this checkin based on a series of ping-pong
 migration tests I ran under heavy load (using google's stressapptest)
 and saw over 300 successful migrations without a single failure.
 
 This series does a few things:
 (1) Add migration facilities to libqos
 (2) Enable AHCI and ICH9 migration
 (3) Add a series of migration tests to ahci-test
 
 v3:
  - Rebase and resend for 2.4.
  - Minor style guide fix.
 
 v2:
  - Added a URI parameter to the migrate() helper
  - Adjust ahci_shutdown to set qtest context for itself
  - Make sure verify() is part of ahci_migrate() and redundant
calls are eliminated
  - Add new helpers to make tests with blkdebug injections more
succint
  - Change the flush migrate test to not load the blkdebug rule
on the destination host
  - Modify the migrate() function so that it does not poll the
VM for migration status if it can rely on RESUME events.
  - New patch: Repair the ahci_command_set_offset helper.
  - New test: Test DMA halt and resume.
  - New test: Test DMA halt, migrate, and resume.
 
 ==
 For convenience, this branch is available at:
 https://github.com/jnsnow/qemu.git branch ahci-migration-test
 https://github.com/jnsnow/qemu/tree/ahci-migration-test
 
 This version is tagged ahci-migration-test-v3:
 https://github.com/jnsnow/qemu/releases/tag/ahci-migration-test-v3
 ==
 
 John Snow (9):
   libqos/ahci: Add halted command helpers
   libqos/ahci: Fix sector set method
   libqos: Add migration helpers
   ich9/ahci: Enable Migration
   qtest/ahci: Add migration test
   qtest/ahci: add migrate dma test
   qtest/ahci: add flush migrate test
   qtest/ahci: add halted dma test
   qtest/ahci: add migrate halted dma test
 
  hw/ide/ahci.c |   1 -
  hw/ide/ich.c  |   1 -
  tests/ahci-test.c | 318 
 +-
  tests/libqos/ahci.c   |  34 +-
  tests/libqos/ahci.h   |   3 +
  tests/libqos/libqos.c |  84 +
  tests/libqos/libqos.h |   2 +
  tests/libqos/malloc.c |  74 +---
  tests/libqos/malloc.h |   1 +
  9 files changed, 496 insertions(+), 22 deletions(-)
 

Staged: https://github.com/jnsnow/qemu/commits/ide
(with one edit to patch 3 as suggested by Kevin.)

--js



Re: [Qemu-block] [Qemu-devel] [PATCH v3 3/9] libqos: Add migration helpers

2015-05-05 Thread John Snow



On 05/05/2015 07:35 AM, Kevin Wolf wrote:

Am 04.05.2015 um 19:52 hat John Snow geschrieben:



On 05/04/2015 08:07 AM, Kevin Wolf wrote:

Am 30.04.2015 um 20:07 hat John Snow geschrieben:

+/* Otherwise, we need to wait: poll until migration is completed. */
+while (1) {
+rsp = qmp_execute("query-migrate");
+g_assert(qdict_haskey(rsp, "return"));
+sub = qdict_get_qdict(rsp, "return");
+g_assert(qdict_haskey(sub, "status"));
+st = qdict_get_str(sub, "status");
+
+/* setup, active, completed, failed, cancelled */
+if (strcmp(st, "completed") == 0) {
+QDECREF(rsp);
+break;
+}
+
+if ((strcmp(st, "setup") == 0) || (strcmp(st, "active") == 0)) {
+QDECREF(rsp);
+continue;


Wouldn't it be nicer to sleep a bit before retrying?



I actually figured that all the string and stream manipulation for
sending and receiving QMP queries was enough of a delay, given how
quickly a migration without any guest load should complete -- in practice
this loop doesn't ever seem to trigger more than once.


This surprised me a bit at first because there's no way that string
operations are _that_ slow. You would definitely spin a while in this
loop (and potentially slow down the migration by that).

I think what saves you is that you wait for the STOP event first, and
when qemu's migration thread sends that event, it happens to have
already taken the global mutex. This means that you effectively get your
sleep from the qemu monitor, which won't respond before migration has
completed.


If you still think sleep is necessary, I can add some very small
sleep in a separate patch, or when I merge the tree. Something like:

g_usleep(5000) /* 5 msec */


If I were you, I'd add it just to be nice (just applying it to your tree
instead of sending out a new version would be okay). If you don't want
to, I won't insist, though. I mean, I already gave my R-b...

Kevin



It's worth finding out if my reasoning is sane, and you cared enough to 
comment.


I'll add the sleep when I merge, no problem :)

Thanks!
--js
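
(For reference, a standalone sketch of the poll-with-a-small-sleep pattern
being discussed; query_migrate_status() is a made-up stand-in for the real
QMP round trip, only g_usleep()/g_assert() are real glib calls.)

    #include <glib.h>
    #include <string.h>

    /* Stand-in for the real 'query-migrate' QMP round trip. */
    static const char *query_migrate_status(void)
    {
        return "completed";
    }

    static void wait_for_migration_completed(void)
    {
        for (;;) {
            const char *st = query_migrate_status();

            if (strcmp(st, "completed") == 0) {
                break;
            }
            g_assert(strcmp(st, "failed") != 0);
            g_usleep(5000);   /* 5 ms: don't hammer the monitor while polling */
        }
    }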



Re: [Qemu-block] [Qemu-devel] [RFC] Differential Backups

2015-05-05 Thread John Snow



On 05/05/2015 06:25 AM, Stefan Hajnoczi wrote:

On Wed, Apr 29, 2015 at 06:51:08PM -0400, John Snow wrote:

This is a feature that should be very easy to add on top of the existing
incremental feature, since it's just a difference in how the bitmap is
treated:

Incremental
- Links to the last incremental (managed by libvirt)
- Clears the bitmap after creation

Differential:
- Links to the last full backup always (managed by libvirt)
- Does not clear the bitmap after creation

No biggie.


Differential backups can be done using incremental backup functionality
in QEMU:

The client application points QEMU to the same target repeatedly instead
of keeping separate incremental backups.

Stefan



Oh, so you're saying:

[anchor]--[diff1]

And then when making a new incremental, we re-use diff1 as a target and 
overwrite it so that it becomes:


[anchor]--[diff2]

In effect giving us a differential.

OK, so it's possible, but we still lose out on some flexibility that a 
slightly different mode would provide us, like the ability to keep 
multiple differentials if desired. (Well, I suppose we *can* create 
those by manually copying differentials after we create them, but that 
seems hackier than necessary.)


Still, it would be such a paltry few lines of code and introduce no real 
complexity to the subsystem, and it might make libvirt's job a little 
easier when managing such things.


--js



[Qemu-block] [PATCH] vmdk: Fix overflow if l1_size is 0x20000000

2015-05-05 Thread Fam Zheng
Richard Jones caught this bug with afl fuzzer.

In fact, that's the only possible value that overflows l1_size
(extent->l1_size = 0x20000000):

l1_size = extent->l1_size * sizeof(long) = 0x80000000;

g_try_malloc returns NULL because l1_size is interpreted as negative
during the type cast from 'int' to 'gsize', which yields an enormous
value. Hence, by coincidence, we get a not-too-bad behavior:

qemu-img: Could not open '/tmp/afl6.img': Could not open
'/tmp/afl6.img': Cannot allocate memory

Values larger than 0x20000000 will be refused by the validation in
vmdk_add_extent.

Values smaller than 0x20000000 will not overflow l1_size.
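
(Illustration only, not part of the patch: a standalone program spelling out
the arithmetic above; the constant 0x20000000 is the l1_size value from the
subject line.)

    #include <inttypes.h>
    #include <limits.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t entries = 0x20000000;                    /* extent->l1_size */
        uint64_t true_size = entries * sizeof(uint32_t);  /* 0x80000000 */

        /* 0x80000000 does not fit in a 32-bit 'int' (INT_MAX is 0x7fffffff);
         * stored in an int it typically ends up negative, and converting that
         * negative value to gsize yields an enormous number, so g_try_malloc
         * fails instead of allocating. */
        printf("true size = 0x%" PRIx64 ", fits in int: %s, as int32: %" PRId32 "\n",
               true_size, true_size <= INT_MAX ? "yes" : "no",
               (int32_t) true_size);
        return 0;
    }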

Reported-by: Richard W.M. Jones rjo...@redhat.com
Signed-off-by: Fam Zheng f...@redhat.com
---
 block/vmdk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 1c5e2ef..e095156 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -451,7 +451,8 @@ static int vmdk_init_tables(BlockDriverState *bs, 
VmdkExtent *extent,
 Error **errp)
 {
 int ret;
-int l1_size, i;
+size_t l1_size;
+int i;
 
 /* read the L1 table */
 l1_size = extent->l1_size * sizeof(uint32_t);
-- 
1.9.3




Re: [Qemu-block] [PATCH] vmdk: Fix overflow if l1_size is 0x20000000

2015-05-05 Thread Richard W.M. Jones
On Tue, May 05, 2015 at 05:28:13PM +0800, Fam Zheng wrote:
 Richard Jones caught this bug with afl fuzzer.
 
 In fact, that's the only possible value that overflows l1_size
 (extent->l1_size = 0x20000000):
 
 l1_size = extent->l1_size * sizeof(long) = 0x80000000;
 
 g_try_malloc returns NULL because l1_size is interpreted as negative
 during the type cast from 'int' to 'gsize', which yields an enormous
 value. Hence, by coincidence, we get a not-too-bad behavior:
 
 qemu-img: Could not open '/tmp/afl6.img': Could not open
 '/tmp/afl6.img': Cannot allocate memory
 
 Values larger than 0x20000000 will be refused by the validation in
 vmdk_add_extent.
 
 Values smaller than 0x20000000 will not overflow l1_size.
 
 Reported-by: Richard W.M. Jones rjo...@redhat.com
 Signed-off-by: Fam Zheng f...@redhat.com

ACK, and:

Tested-by: Richard W.M. Jones rjo...@redhat.com

Rich.

  block/vmdk.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/block/vmdk.c b/block/vmdk.c
 index 1c5e2ef..e095156 100644
 --- a/block/vmdk.c
 +++ b/block/vmdk.c
 @@ -451,7 +451,8 @@ static int vmdk_init_tables(BlockDriverState *bs, 
 VmdkExtent *extent,
  Error **errp)
  {
  int ret;
 -int l1_size, i;
 +size_t l1_size;
 +int i;
  
  /* read the L1 table */
  l1_size = extent->l1_size * sizeof(uint32_t);
 -- 
 1.9.3

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/3] block: Warn about usage of growing formats over non-growable protocols

2015-05-05 Thread Stefan Hajnoczi
On Thu, Mar 19, 2015 at 03:03:18PM -0400, Max Reitz wrote:
 Some image formats (e.g. qcow2) require the underlying file to grow on
 write accesses, but this is in fact not supported by all protocols (e.g.
 nbd does not). If such a format requiring file growth is used
 non-read-only over a protocol which does not support this, a warning
 should be issued.
 
 This warning is issued for example whenever one tries to export a qcow2
 image over nbd-server and use the export from qemu.

The warning implies that the user should switch to read-only or a
different protocol, but this configuration is perfectly normal.  For
example, oVirt uses qcow2 on LVM volumes.

Introducing a warning for a normal QEMU invocation is a bit weird.

What is the point of this series?  Were users confused that they hit
ENOSPC?

Stefan




Re: [Qemu-block] [PATCH 1/6] qcow2: use one single memory block for the L2/refcount cache tables

2015-05-05 Thread Kevin Wolf
Am 05.05.2015 um 12:28 hat Stefan Hajnoczi geschrieben:
 On Mon, May 04, 2015 at 12:58:13PM +0200, Kevin Wolf wrote:
  Am 01.05.2015 um 16:23 hat Stefan Hajnoczi geschrieben:
   On Thu, Apr 30, 2015 at 01:11:40PM +0300, Alberto Garcia wrote:
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
 {
 BDRVQcowState *s = bs->opaque;
 Qcow2Cache *c;
-int i;
 
 c = g_new0(Qcow2Cache, 1);
 c->size = num_tables;
+c->table_size = s->cluster_size;
   
   This assumes c->table_size meets bs' memory alignment requirements.  The
   following would be safer:
   
   c->table_size = QEMU_ALIGN_UP(c->table_size, bdrv_opt_mem_align(bs->file));
  
  You can't just access more than one cluster. You might be caching data
  and later write it back when it's stale.
 
 I don't mean I/O alignment, just memory alignment (i.e. the start
 address of the data buffer in memory).

The start address is already taken care of by qemu_blockalign(). With
rounding c->table_size, you'd align the length, and that would be wrong.

Though looking at the code again I see now that c->table_size isn't
consistently used. The I/O requests still use s->cluster_size. We should
either use it everywhere or not introduce it at all.
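
(Generic sketch of the distinction above: the alignment constraint applies to
the buffer's start address, not to its length. posix_memalign() stands in for
qemu_try_blockalign() here; the helper itself is made up.)

    #include <stdlib.h>

    /* Start address aligned to 'alignment'; the length stays exactly
     * num_tables * table_size, so no cache entry spills past its table. */
    static void *alloc_table_array(size_t alignment, size_t num_tables,
                                   size_t table_size)
    {
        void *buf = NULL;

        if (posix_memalign(&buf, alignment, num_tables * table_size) != 0) {
            return NULL;
        }
        return buf;
    }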

Kevin

