from:"Benoît Canet"

Re: [Qemu-devel] [PATCH 8/9] throttle: Update throttle infrastructure copyright

2015-02-24 Thread Benoît Canet

On Tue, Feb 24, 2015 at 09:49:26AM -0700, Eric Blake wrote:
 On 02/13/2015 09:06 AM, Alberto Garcia wrote:
  From: Benoît Canet benoit.ca...@nodalink.com
  
  Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
  Signed-off-by: Alberto Garcia be...@igalia.com
  ---
   include/qemu/throttle.h | 4 ++--
   tests/test-throttle.c   | 4 ++--
   util/throttle.c | 4 ++--
   3 files changed, 6 insertions(+), 6 deletions(-)
  
  diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
  index f846e5a..6174332 100644
  --- a/include/qemu/throttle.h
  +++ b/include/qemu/throttle.h
  @@ -1,10 +1,10 @@
   /*
* QEMU throttling infrastructure
*
  - * Copyright (C) Nodalink, SARL. 2013
  + * Copyright (C) Nodalink, EURL. 2013-2014
 
 Don't you also want to claim 2015 now?

I didn't worked on it on 2015.
Maybe Alberto will want to claim his work on it.

Best regards

Benoît

 
 -- 
 Eric Blake   eblake redhat com+1-919-301-3266
 Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] Block filter status

2014-12-15 Thread Benoît Canet

On Mon, Dec 15, 2014 at 04:32:19PM +, Farr, Kaitlin M. wrote:
 Hello all,
 
 

Hello,

1) I once wanted to write block filters.
2) I am not a QEMU contributor anymore.
3) If you need block filters you also need to write them.

Best regards

Benoît

 I would like to inquire about the status of block filters in QEMU. I work for 
 a team at the Johns Hopkins University Applied Physics Laboratory, and we 
 would be interested in a QEMU encryption feature, such as described in the 
 to-do list [1]. However, it seems the method described relies on block 
 filters, also on the to-do list. Does development work on block filters live 
 in a public branch somewhere? If so, I'd greatly appreciate it if someone 
 could point me to it so that I could take a look. Does anyone know of an 
 estimated date for when the block filters might be in state so that one could 
 write an encryption filter? Additionally, does anyone have an estimate for 
 time and amount of effort if might take to implement such an encryption 
 filter? If someone could assist me in planning out what steps need to be 
 taken to support encryption, I would appreciate it.
 
 Thanks,
 
 Kaitlin Farr
 
 Software engineer
 
 JHU/APL
 
 
 1. 
 http://qemu-project.org/Features/Block/Todo#New_design_for_better_encryption_support_.5Barmbru.5D

Re: [Qemu-devel] [PATCH] block: qemu-iotests change _supported_proto to file once more.

2014-10-20 Thread Benoît Canet

  
  _supported_fmt qcow2
 -_supported_proto file
 +_supported_proto file nfs
  _supported_os Linux
  
  IMG_SIZE=128K
 diff --git a/tests/qemu-iotests/092 b/tests/qemu-iotests/092
 index a8c0c9c..52c529b 100755
 --- a/tests/qemu-iotests/092
 +++ b/tests/qemu-iotests/092
 @@ -40,7 +40,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt qcow
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  offset_backing_file_offset=8
 diff --git a/tests/qemu-iotests/103 b/tests/qemu-iotests/103
 index 0f1dc9f..ccab551 100755
 --- a/tests/qemu-iotests/103
 +++ b/tests/qemu-iotests/103
 @@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt qcow2
 -_supported_proto file
 +_supported_proto file nfs
  _supported_os Linux
  
  IMG_SIZE=64K
 -- 
 1.7.9.5
 
 

Seems legit.

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [EXTERNAL] Re: how to dynamically add a block device using qmp?

2014-10-14 Thread Benoît Canet

The Monday 13 Oct 2014 à 14:08:18 (-0700), Ken Chiang wrote :
 I also tried to add a virt io device to the frontend but still no disc in the 
 VM.  Am I missing anything else?
 
 
 {QMP: {version: {qemu: {micro: 0, minor: 0, major: 2}, package: 
  (Debian 2.0.0+dfsg-2ubuntu1.2)}, capabilities: []}}
 {execute:qmp_capabilities}
 {return: {}}
 { execute: blockdev-add,
arguments: { options : {
 id: tc1,
 driver: qcow2,
 file: { driver: file,
 filename: test.qc2 } } } }
 
 {return: {}}
 { execute: device_add, arguments: {
 driver: virtio-blk-pci, id: vda, drive: tc1  } }
 {return: {}}
 
 Thanks!

Maybe try
for m in acpiphp pci_hotplug; do sudo modprobe ${m}; done
in the guest before adding the pci device.

Best regards

Benoît

 
 Ken
 
 
 

 Date: Mon, 13 Oct 2014 12:28:51 -0700
 From: Ken Chiang kchi...@sandia.gov
 To: Markus Armbruster arm...@redhat.com
 Subject: Re: [EXTERNAL] Re: [Qemu-devel] how to dynamically add a block
  device using qmp?
 Message-ID: 20141013192851.ga12...@sandia.gov
 In-Reply-To: 87k347f0do@blackfin.pond.sub.org
 User-Agent: Mutt/1.5.18 (2008-05-17)
 
 Markus,
 
 Nevermind my previous question, I see now that you add a scsi bus prior to 
 adding the scsi disk.  
 
 However, I did this and qmp returned {[]} in both cases and I am still not 
 seeing the disk in the vm.
 
 
 here's the qmp session:
 
 {execute:qmp_capabilities}
 {return: {}}
 { execute: blockdev-add,  
   
arguments: { options : {   
   
 id: tc1,  
   
 driver: qcow2,
   
 file: { driver: file,   
   
 filename: /home/kchiang/kvmimages/xpsp2_b-0012.qc2 } } } 
 } 
 {return: {}}
 { execute: device_add, arguments: {
 driver: virtio-scsi-pci, id: vs-hba } }
 {return: {}}
 { execute: device_add, arguments: {
 driver: scsi-disk, id: sdb, drive: tc1,
 bus: vs-hba.0  } }
 {return: {}}
 { execute: query-block }
 {return: [{io-status: ok, device: ide0-hd0, locked: false, 
 removable: false, inserted: {iops_rd: 0, image: {virtual-size: 
 551872, filename: kvmimages/vts-6g-u12-inetsim.image, format: 
 raw, actual-size: 555968, dirty-flag: false}, iops_wr: 0, ro: 
 false, backing_file_depth: 0, drv: raw, iops: 0, bps_wr: 0, 
 encrypted: false, bps: 0, bps_rd: 0, file: 
 kvmimages/vts-6g-u12-inetsim.image, encryption_key_missing: false}, 
 type: unknown}, {io-status: ok, device: ide1-cd0, locked: 
 false, removable: true, tray_open: false, type: unknown}, {device: 
 floppy0, locked: false, removable: true, tray_open: false, type: 
 unknown}, {device: sd0, locked: false, removable: true, 
 tray_open: false, type: unknown}, {io-status: ok, device: tc1, 
 locked: false, removable: false, inserted: {iops_rd: 0, image: 
 {virtual-size: 5368709120, filename: 
 /home/kchiang/kvmimages/xpsp2_b-0012.qc2, cluster-size: 65536, 
 format: qcow2, actual-size: 1550454784, format-specific: {type: 
 qcow2, data: {compat: 0.10}}, dirty-flag: false}, iops_wr: 0, 
 ro: false, backing_file_depth: 0, drv: qcow2, iops: 0, bps_wr: 0, 
 encrypted: false, bps: 0, bps_rd: 0, file: 
 /home/kchiang/kvmimages/xpsp2_b-0012.qc2, encryption_key_missing: 
 false}, type: unknown}]}
 
 am I missing anything else?
 
 I've also tried rebooting the VM and running rescan-scsi-bus to no avail.
 
 Ken
 
 On Sat, Oct 11, 2014 at 10:58:59AM +0200, Markus Armbruster wrote:
  Ken Chiang kchi...@sandia.gov writes:
  
   Hello,
  
   I am trying to add a block device dynamically using qmp and are having
   some issues.
  
   After successfully adding the block device using blockdev-add and
   verifying that it has been added using query-block, I am unable to
   see the block device in the VM under /dev/sdXX
  
  Block devices consist of a frontend (a.k.a. device model) and a backend.
  You added only a backend.  To add the frontend, use device_add.
  
   I am using ubuntu14.04LTS: qmp version: 
   {QMP: {version: {qemu: {micro: 0, minor: 0, major: 2}, 
   package:  (Debian 2.0.0+dfsg-2ubuntu1.2)}, capabilities: []}}
  
   Here's what I am doing:
   1.  /usr/bin/kvm-spice -hda kvmimages/ubuntu.image -m 768 -usbdevice 
   tablet -vnc :5 -qmp tcp:localhost:,server,nowait
  
   2.  telnet localhost 
  
   3.  In the telnet session run:
   {execute:qmp_capabilities}
   {return: {}}
   { execute: blockdev-add,
  arguments: { options : {
   id: disk1,
   driver: qcow2,
 file: { driver: file,
   filename: testvm.qc2 } } } }
   {return: {}}
  
   { execute: query-block }
  
   4. query-block returns the device disk1, but when I check the vm:
  # vncviewer :5
  
   there are no new devices.
  
  For a virtio-blk disk, try:
  
  { execute: device_add,

Re: [Qemu-devel] [PATCH v3 3/3] util: Infrastructure for computing recent averages

2014-10-13 Thread Benoît Canet

On Mon, Oct 13, 2014 at 02:05:34PM +0200, Paolo Bonzini wrote:
 Il 24/09/2014 17:21, Benoît Canet ha scritto:
  The module takes care of computing minimal and maximal
  values over the time slice duration.
 
 The code looks good, just two comments:
 
  +/* Get the average value
  + *
  + * @ta:  the timed average structure used
  + * @ret: the average value
  + */
  +uint64_t timed_average_avg(TimedAverage *ta)
  +{
  +Window *w;
  +check_expirations(ta);
  +
  +w = current_window(ta);
  +
  +if (w-count) {
  +return w-sum / w-count;
 
 First, do we want this to return double?
 
 Second, this will return the min/max/avg in an unknown amount of time
 between period/2 and period---on average period*3/4, e.g. 0.75 seconds
 for a period equal to one second.
 
 Would it make sense to tweak the TimedAverage period to be higher, e.g.
 1.33 seconds/80 seconds/80 minutes, so that the _average_ period is 1
 second/1 minute/1 hour?
 
 This only applies to how the code is used, not to TimedAverage itself;
 hence, feel free to post the patch with Reviewed-by once
 timed_average_avg's return type is changed to a double.

This would require either changing _init period units or making it a float
or baking the 1.33 ratio in timed average.

Which one would you prefer ?

Best regards

Benoît

Thanks for reviewing.

 
 Paolo
 
  +}
  +
  +return 0;
  +}
  +
  +/* Get the maximum value
  + *
  + * @ta:  the timed average structure used
  + * @ret: the maximal value
  + */
  +uint64_t timed_average_max(TimedAverage *ta)
  +{
  +check_expirations(ta);
  +return current_window(ta)-max;
  +}

Re: [Qemu-devel] [PATCH 2/3] qemu-io: Respect early image end for map

2014-10-11 Thread Benoît Canet

The Saturday 11 Oct 2014 à 11:53:40 (+0200), Max Reitz wrote :
 Am 10.10.2014 um 14:03 schrieb Benoît Canet:
 +} else if (!num) {
 +error_report(Unexpected end of image);
 +return 0;
 I think this test can miss some case of Unexpected end of image.
 
 For example supose that in map_is_allocated the first bdrv_is_allocated
 actually succeed then *pnum = num. Then the bottom loop has exit on failure
 and the function return.
 
 in map_f num is map_is_allocated *pnum so map_f's num != 0 and this very 
 test
 fails to see the unexpected end of image error.
 
 num != 0 because some sectors where successfully queried. In my opinion, we
 should print the information about them we have. Then, the do-while loop is
 repeated; and this time, map_is_allocated() either again returns num  0
 (for whatever reason, but I'd be fine with it) or, more probably, it'll be
 num == 0 this time. So the error is not missed, it's just printed one
 iteration later.

Ok that make sense.

Best regards

Benoît

 
 Max
 
 Best regards
 
 Benoît
 
   }
   retstr = ret ? allocated : not allocated;
 -- 
 2.0.4

Re: [Qemu-devel] [PATCH 2/3] qemu-io: Respect early image end for map

2014-10-11 Thread Benoît Canet

The Saturday 16 Aug 2014 à 20:54:17 (+0200), Max Reitz wrote :
 bdrv_is_allocated() may report zero clusters which most probably means
 the image (file) is shorter than expected. Respect this case in order to
 avoid an infinite loop.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 ---
  qemu-io-cmds.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
 index c503fc6..860b834 100644
 --- a/qemu-io-cmds.c
 +++ b/qemu-io-cmds.c
 @@ -1909,7 +1909,7 @@ static int map_is_allocated(BlockDriverState *bs, 
 int64_t sector_num,
  
  num_checked = MIN(nb_sectors, INT_MAX);
  ret = bdrv_is_allocated(bs, sector_num, num_checked, num);
 -if (ret == firstret) {
 +if (ret == firstret  num) {
  *pnum += num;
  } else {
  break;
 @@ -1936,6 +1936,9 @@ static int map_f(BlockDriverState *bs, int argc, char 
 **argv)
  if (ret  0) {
  error_report(Failed to get allocation status: %s, 
 strerror(-ret));
  return 0;
 +} else if (!num) {
 +error_report(Unexpected end of image);
 +return 0;
  }
  
  retstr = ret ? allocated : not allocated;
 -- 
 2.0.4
 
 

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH 1/3] block: Ignore allocation size in underlying file

2014-10-11 Thread Benoît Canet

The Saturday 11 Oct 2014 à 11:44:20 (+0200), Max Reitz wrote :
 Am 10.10.2014 um 13:50 schrieb Benoît Canet:
 The Saturday 16 Aug 2014 à 20:54:16 (+0200), Max Reitz wrote :
 When falling through to the underlying file in
 bdrv_co_get_block_status(), do not let the number of sectors for which
 information could be obtained be overwritten.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 ---
   block.c | 6 --
   1 file changed, 4 insertions(+), 2 deletions(-)
 
 diff --git a/block.c b/block.c
 index 3e252a2..c922664 100644
 --- a/block.c
 +++ b/block.c
 @@ -3991,9 +3991,11 @@ static int64_t coroutine_fn 
 bdrv_co_get_block_status(BlockDriverState *bs,
   if (bs-file 
   (ret  BDRV_BLOCK_DATA)  !(ret  BDRV_BLOCK_ZERO) 
   (ret  BDRV_BLOCK_OFFSET_VALID)) {
 +int backing_pnum;
 +
   ret2 = bdrv_co_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
 -*pnum, pnum);
 -if (ret2 = 0) {
 +*pnum, backing_pnum);
 +if (ret2 = 0  backing_pnum = *pnum) {
 About backing_pnum = *pnum.
 
 The documentation of bdrv_co_get_block_status says:
 
   * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors 
  goes
   * beyond the end of the disk image it will be clamped.
   */
 static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
   int64_t sector_num,
   int nb_sectors, int 
  *pnum)
 
 So clearly after the bdrv_co_get_block_status *pnum = backing_pnum.
 
 This means that  backing_pnum  *pnum will never happen.
 
 I think either this test is wrong or the doc is wrong.
 
 Thank you for confusing me, I had to think quite a while about this. *g*
 
 The condition is not for error checking. If it was, it would be the wrong
 order (the condition should be true on success, that's why it's ret2 = 0
 and not ret2  0, so it should then be backing_pnum = *pnum). So what
 this is testing is whether all sectors in the underlying file in the queried
 range are read as zero. But if backing_pnum  *pnum that is not the case,
 some clusters are not zero. So we may not set the zero flag if backing_pnum
  *pnum; or as it reads in the code, we may only set it if backing_pnum =
 *pnum. This is not about whether *pnum  backing_pnum, but more about
 whether backing_pnum == *pnum (but = would be fine, too, if
 bdrv_co_get_block_status() supported it, so that's why I wrote it that way).
 
 However, I'm starting to think about whether it would be better, for the
 backing_pnum  *pnum case, not to not set the zero flag, but rather simply
 set *pnum = backing_pnum. And this in turn would be pretty equivalent to
 just omitting this patch, because:
 
 If we get to this point where we query the underlying file and it returns a
 certain number of sectors is zero; then we therefore want to set *pnum =
 backing_pnum (both if backing_pnum  *pnum and if backing_pnum == *pnum;
 backing_pnum  *pnum cannot happen, as you pointed out). On the other hand,
 if the sectors are not reported to be zero, but backing_pnum  *pnum, we
 want to shorten *pnum accordingly as well because this may indicate that
 after another backing_pnum sectors, we arrive at a hole in the file.
 
 There is only one point I can imagine where it makes sense not to let
 backing_pnum overwrite *pnum: And that's if bdrv_co_get_block_status()
 reported BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID with an offset beyond the
 EOF. I think this might actually happen with qcow2, if one cluster simply
 lies beyond the EOF (which is perfectly valid). So I conclude that this
 patch has its use after all but needs to be modified so that backing_pnum
 always overwrites *pnum; except for when backing_pnum is zero (which should
 only happen at or after the EOF) in which case the zero flag should be set
 and *pnum should be left as it was.
 
 And now in all honesty: Thanks for confusing me, I guess I can think better
 when I'm confused. :-)
 

You better have killer english skills to sumarize this in a nice commit message 
:)
I'll read the next version.

Best regards

Benoît

 Max
 
 Best regards
 
 Benoît
 
 
   /* Ignore errors.  This is just providing extra information, 
  it
* is useful but not necessary.
*/
 -- 
 2.0.4

Re: [Qemu-devel] [PATCH v5 08/11] qcow2: Rebuild refcount structure during check

2014-10-11 Thread Benoît Canet


 +int64_t first_free_cluster = 0, rt_ofs = -1, cluster = 0;
 +int64_t rb_ofs, rb_start, rb_index;

Everytime a few day pass and I read rb_ofs and rt_ofs again I found these names
obfuscated. I know Linus says that C is a spartan language but these look odd.
Maybe refcount_block_offset, refcount_block_* and refcount_table_offset would be
better.

I'll try to read this patch for real before you repost it.

Best regards

Benoît

Re: [Qemu-devel] [PATCH v1 7/8] throttle: Add throttle group support

2014-10-10 Thread Benoît Canet

On Thu, Oct 09, 2014 at 04:58:22PM +0800, Fam Zheng wrote:
 On Wed, 10/08 11:05, Benoît Canet wrote:
  On Wed, Oct 08, 2014 at 02:53:38PM +0800, Fam Zheng wrote:
   
   Does this mean that after this series, all the throttle_states must be
   contained inside its own throttle group? If so, we could embed 
   ThrottleGroup
   fields in ThrottleState.
   
   It's weird when a function called throttle_group_compare takes a 
   parameter of
   ThrottleState pointer, and cast it back to ThrottleGroup with 
   container_of.
  
  It's done like this to fullfill a design goal: the throttle should be 
  reusable
  without the groups and any reference to block related stuff.
  So it's just a way to split the responsabilities.
 
 I see.
 
 Having both ThrottleGroup and ThrottleState interfaces is more complicated 
 than
 just use ThrottleGroup, where a one-member group is exactly the same as
 ThrottleState.

Here my interest conflict between a short term strategy (comply and getting 
these
patches merged asap) and the technical advantage of keeping ThrottleState and
ThrottleGroup separated.

orthogonality
-

An advantage to keep ThrottleState and ThrottleGroup separated is that
the two roles of implementing the throttle itself and implementing the behavior
of a group are keep separated.

As a result each piece of code is simpler to write, review and test.

genericity
--

When writing the initial throttle code I carefully designed it to be independant
of BlockDriverState.

As a result the throttle code lives in util/ and is easilly reusable into 
whatever
we want (a device model or network throttling).

It is mean, designed and written to be as generic as possible.

So yes merging ThrottleState and ThrottleGroup would result in the seemlingly 
nice
thing that one structure is simpler than two structure.

But it would also imply the fact that I really hate that this pesky
BlockDriverState structure would become tied to ThrottleState.

If we do this ThrottleState would move out of util/ to block/ and we would
have lost the ability to reuse this infrastructure. It would be a short term 
gain
and a long term loss.

Another fact is that the rationale 1 structure is simpler than 2 is not alway
relevant: look for BlockBackend that we are desesperately trying to move out of
BlockDriverState.

Best regards

Benoît
 
 Fam

Re: [Qemu-devel] [PATCH 1/2] raw-posix: Fix raw_co_get_block_status() after EOF

2014-10-10 Thread Benoît Canet

The Monday 22 Sep 2014 à 17:23:44 (+0200), Max Reitz wrote :
 As its comment states, raw_co_get_block_status() should unconditionally
 return 0 and set *pnum to 0 for after EOF.
 
 An assertion after lseek(..., SEEK_HOLE) tried to catch this case by
 asserting that errno != -ENXIO (which would indicate a position after
 the EOF); but it should be errno != ENXIO instead. Fix this, too.
 
 Reported-by: Kevin Wolf kw...@redhat.com
 Signed-off-by: Max Reitz mre...@redhat.com
 ---
  block/raw-posix.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)
 
 diff --git a/block/raw-posix.c b/block/raw-posix.c
 index a253697..de4e3f3 100644
 --- a/block/raw-posix.c
 +++ b/block/raw-posix.c
 @@ -1509,9 +1509,9 @@ static int64_t try_seek_hole(BlockDriverState *bs, 
 off_t start, off_t *data,
  
  *hole = lseek(s-fd, start, SEEK_HOLE);
  if (*hole == -1) {
 -/* -ENXIO indicates that sector_num was past the end of the file.
 +/* ENXIO indicates that sector_num was past the end of the file.
   * There is a virtual hole there.  */
 -assert(errno != -ENXIO);
 +assert(errno != ENXIO);
  
  return -errno;
  }
 @@ -1560,6 +1560,10 @@ static int64_t coroutine_fn 
 raw_co_get_block_status(BlockDriverState *bs,
  }
  
  start = sector_num * BDRV_SECTOR_SIZE;
 +if (start = bdrv_getlength(bs)) {
 +*pnum = 0;
 +return 0;
 +}
  
  ret = try_fiemap(bs, start, data, hole, nb_sectors, pnum);
  if (ret  0) {
 -- 
 2.1.0
 
 
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH 1/3] block: Ignore allocation size in underlying file

2014-10-10 Thread Benoît Canet

The Saturday 16 Aug 2014 à 20:54:16 (+0200), Max Reitz wrote :
 When falling through to the underlying file in
 bdrv_co_get_block_status(), do not let the number of sectors for which
 information could be obtained be overwritten.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 ---
  block.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)
 
 diff --git a/block.c b/block.c
 index 3e252a2..c922664 100644
 --- a/block.c
 +++ b/block.c
 @@ -3991,9 +3991,11 @@ static int64_t coroutine_fn 
 bdrv_co_get_block_status(BlockDriverState *bs,
  if (bs-file 
  (ret  BDRV_BLOCK_DATA)  !(ret  BDRV_BLOCK_ZERO) 
  (ret  BDRV_BLOCK_OFFSET_VALID)) {
 +int backing_pnum;
 +
  ret2 = bdrv_co_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
 -*pnum, pnum);
 -if (ret2 = 0) {
 +*pnum, backing_pnum);
 +if (ret2 = 0  backing_pnum = *pnum) {

About backing_pnum = *pnum.

The documentation of bdrv_co_get_block_status says:

 * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes   
 * beyond the end of the disk image it will be clamped. 
 */ 
static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,  
 int64_t sector_num,
 int nb_sectors, int *pnum) 

So clearly after the bdrv_co_get_block_status *pnum = backing_pnum.

This means that  backing_pnum  *pnum will never happen.

I think either this test is wrong or the doc is wrong.

Best regards

Benoît


  /* Ignore errors.  This is just providing extra information, it
   * is useful but not necessary.
   */
 -- 
 2.0.4

Re: [Qemu-devel] [PATCH 2/3] qemu-io: Respect early image end for map

2014-10-10 Thread Benoît Canet

 +} else if (!num) {
 +error_report(Unexpected end of image);
 +return 0;

I think this test can miss some case of Unexpected end of image.

For example supose that in map_is_allocated the first bdrv_is_allocated
actually succeed then *pnum = num. Then the bottom loop has exit on failure
and the function return.

in map_f num is map_is_allocated *pnum so map_f's num != 0 and this very test
fails to see the unexpected end of image error.

Best regards

Benoît

  }
  
  retstr = ret ? allocated : not allocated;
 -- 
 2.0.4

Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count

2014-10-10 Thread Benoît Canet

On Fri, Aug 29, 2014 at 11:40:53PM +0200, Max Reitz wrote:
 The size of a refblock entry is (in theory) variable; calculate
 therefore the number of entries per refblock and the according bit shift
 (1  x == entry count) when opening an image.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 ---
  block/qcow2.c | 2 ++
  block/qcow2.h | 2 ++
  2 files changed, 4 insertions(+)
 
 diff --git a/block/qcow2.c b/block/qcow2.c
 index f9e045f..172ad00 100644
 --- a/block/qcow2.c
 +++ b/block/qcow2.c
 @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict 
 *options, int flags,
  
  s-l2_bits = s-cluster_bits - 3; /* L2 is always one cluster */
  s-l2_size = 1  s-l2_bits;
 +s-refcount_block_bits = s-cluster_bits - (s-refcount_order - 3);

After carefull examination (s-refcount_order - 3) == REFCOUNT_SHIFT.
Why not also creating s-refcount_shift and make use of it in order to avoid
torturing the mind of the next reader.

Or simply recall in a comment that there is 2^3 bits in a byte.

Best regards

Benoît

 +s-refcount_block_size = 1  s-refcount_block_bits;
  bs-total_sectors = header.size / 512;
  s-csize_shift = (62 - (s-cluster_bits - 8));
  s-csize_mask = (1  (s-cluster_bits - 8)) - 1;
 diff --git a/block/qcow2.h b/block/qcow2.h
 index 6aeb7ea..7c01fb7 100644
 --- a/block/qcow2.h
 +++ b/block/qcow2.h
 @@ -222,6 +222,8 @@ typedef struct BDRVQcowState {
  int l2_size;
  int l1_size;
  int l1_vm_state_index;
 +int refcount_block_bits;
 +int refcount_block_size;
  int csize_shift;
  int csize_mask;
  uint64_t cluster_offset_mask;
 -- 
 2.1.0

Re: [Qemu-devel] [PATCH v5 08/11] qcow2: Rebuild refcount structure during check

2014-10-10 Thread Benoît Canet

 +*nb_clusters = cluster + cluster_count - contiguous_free_clusters;
 +*refcount_table = g_try_realloc(*refcount_table,
 +*nb_clusters * sizeof(uint16_t));

Something tells me that these sizeof(uint16_t) are connected to 
s-refcount_order
and indirectely to REFCOUNT_SHIFT and that this code could benefit from this
relationship thus probably saving you work in the future.


 +if (!*refcount_table) {
 +return -ENOMEM;
 +}
 +
 +memset(*refcount_table + old_nb_clusters, 0,
 +   (*nb_clusters - old_nb_clusters) * sizeof(uint16_t));
 +}
 +
 +/* Go back to the first free cluster */
 +cluster -= contiguous_free_clusters;
 +for (i = 0; i  cluster_count; i++) {
 +(*refcount_table)[cluster + i] = 1;
 +}
 +
 +return cluster  s-cluster_bits;
 +}
 +
 +/*
 + * Creates a new refcount structure based solely on the in-memory information
 + * given through *refcount_table. All necessary allocations will be reflected
 + * in that array.
 + *
 + * On success, the old refcount structure is leaked (it will be covered by 
 the
 + * new refcount structure).
 + */
 +static int rebuild_refcount_structure(BlockDriverState *bs,
 +  BdrvCheckResult *res,
 +  uint16_t **refcount_table,
 +  int64_t *nb_clusters)
 +{
 +BDRVQcowState *s = bs-opaque;
 +int64_t first_free_cluster = 0, rt_ofs = -1, cluster = 0;
 +int64_t rb_ofs, rb_start, rb_index;
 +uint32_t reftable_size = 0;
 +uint64_t *reftable = NULL;
 +uint16_t *on_disk_rb;
 +int i, ret = 0;
 +struct {
 +uint64_t rt_offset;
 +uint32_t rt_clusters;
 +} QEMU_PACKED rt_offset_and_clusters;
 +
 +qcow2_cache_empty(bs, s-refcount_block_cache);
 +
 +write_refblocks:
 +for (; cluster  *nb_clusters; cluster++) {
 +if (!(*refcount_table)[cluster]) {
 +continue;
 +}
 +
 +rb_index = cluster  s-refcount_block_bits;
 +rb_start = rb_index  s-refcount_block_bits;
 +
 +/* Don't allocate a cluster in a refblock already written to disk */
 +if (first_free_cluster  rb_start) {
 +first_free_cluster = rb_start;
 +}
 +rb_ofs = alloc_clusters_imrt(bs, 1, refcount_table, nb_clusters,
 + first_free_cluster);
 +if (rb_ofs  0) {
 +fprintf(stderr, ERROR allocating refblock: %s\n, 
 strerror(-ret));
 +res-check_errors++;
 +ret = rb_ofs;
 +goto fail;
 +}
 +
 +if (reftable_size = rb_index) {
 +uint32_t old_rt_size = reftable_size;
 +reftable_size = ROUND_UP((rb_index + 1) * sizeof(uint64_t),
 + s-cluster_size) / sizeof(uint64_t);
 +reftable = g_try_realloc(reftable,
 + reftable_size * sizeof(uint64_t));
 +if (!reftable) {
 +res-check_errors++;
 +ret = -ENOMEM;
 +goto fail;
 +}
 +
 +memset(reftable + old_rt_size, 0,
 +   (reftable_size - old_rt_size) * sizeof(uint64_t));
 +
 +/* The offset we have for the reftable is now no longer valid;
 + * this will leak that range, but we can easily fix that by 
 running
 + * a leak-fixing check after this rebuild operation */
 +rt_ofs = -1;
 +}
 +reftable[rb_index] = rb_ofs;
 +
 +/* If this is apparently the last refblock (for now), try to squeeze 
 the
 + * reftable in */
 +if (rb_index == (*nb_clusters - 1)  s-refcount_block_bits 
 +rt_ofs  0)
 +{
 +rt_ofs = alloc_clusters_imrt(bs, size_to_clusters(s, 
 reftable_size *
 +  
 sizeof(uint64_t)),
 + refcount_table, nb_clusters,
 + first_free_cluster);
 +if (rt_ofs  0) {
 +fprintf(stderr, ERROR allocating reftable: %s\n,
 +strerror(-ret));
 +res-check_errors++;
 +ret = rt_ofs;
 +goto fail;
 +}
 +}
 +
 +ret = qcow2_pre_write_overlap_check(bs, 0, rb_ofs, s-cluster_size);
 +if (ret  0) {
 +fprintf(stderr, ERROR writing refblock: %s\n, strerror(-ret));
 +goto fail;
 +}
 +
 +on_disk_rb = g_malloc0(s-cluster_size);
 +for (i = 0; i  s-cluster_size / sizeof(uint16_t) 
 +rb_start + i  *nb_clusters; i++)
 +{
 +on_disk_rb[i] = cpu_to_be16((*refcount_table)[rb_start + i]);
 +}
 +
 +ret = bdrv_write(bs-file, rb_ofs / BDRV_SECTOR_SIZE,
 +

Re: [Qemu-devel] [PATCH v2 0/2] raw-posix: Fix raw_co_get_block_status()

2014-10-09 Thread Benoît Canet

On Wed, Oct 08, 2014 at 09:43:19PM +0200, Max Reitz wrote:
 On 22.09.2014 17:36, Max Reitz wrote:
 raw_co_get_block_status() should return 0 and set *pnum to 0 after the
 EOF; currently it does this merely by accident, so implement it
 directly. Also, nb_sectors should be clamped against the image end.
 
 While doing that, centralize the generation of
 raw_co_get_block_status()'s return value along the way.
 
 
 v2:
 - Patch 1: Clamp nb_sectors against image end
 - Patch 2: Fix alignment issue
 
 
 Max Reitz (2):
raw-posix: Fix raw_co_get_block_status() after EOF
raw-posix: raw_co_get_block_status() return value
 
   block/raw-posix.c | 36 ++--
   1 file changed, 22 insertions(+), 14 deletions(-)
 
 Ping. (This should be rather simple to review)

Hi Max,

I will review these tomorow.

Best regards

Benoît

Re: [Qemu-devel] [PATCH v1 7/8] throttle: Add throttle group support

2014-10-08 Thread Benoît Canet

On Wed, Oct 08, 2014 at 02:53:38PM +0800, Fam Zheng wrote:
 
 Does this mean that after this series, all the throttle_states must be
 contained inside its own throttle group? If so, we could embed ThrottleGroup
 fields in ThrottleState.
 
 It's weird when a function called throttle_group_compare takes a parameter of
 ThrottleState pointer, and cast it back to ThrottleGroup with container_of.

It's done like this to fullfill a design goal: the throttle should be reusable
without the groups and any reference to block related stuff.
So it's just a way to split the responsabilities.

Best regards

Benoît

Thanks for reviewing.

 
 Fam

[Qemu-devel] [PATCH v1 2/8] throttle: Add throttle group infrastructure

2014-10-07 Thread Benoît Canet

The throttle_group_incref increment the refcount of a throttle group given it's
name and return the associated throttle state.

The throttle_group_unref is the mirror function for cleaning up.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block/Makefile.objs |   1 +
 block/throttle-groups.c | 212 
 include/block/block_int.h   |   1 +
 include/block/throttle-groups.h |  45 +
 4 files changed, 259 insertions(+)
 create mode 100644 block/throttle-groups.c
 create mode 100644 include/block/throttle-groups.h

diff --git a/block/Makefile.objs b/block/Makefile.objs
index a833ed5..d257b05 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -10,6 +10,7 @@ block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += raw-posix.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 block-obj-y += null.o
+block-obj-y += throttle-groups.o
 
 block-obj-y += nbd.o nbd-client.o sheepdog.o
 block-obj-$(CONFIG_LIBISCSI) += iscsi.o
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
new file mode 100644
index 000..ea5baca
--- /dev/null
+++ b/block/throttle-groups.c
@@ -0,0 +1,212 @@
+/*
+ * QEMU block throttling group infrastructure
+ *
+ * Copyright (C) Nodalink, EURL. 2014
+ *
+ * Author:
+ *   Benoît Canet benoit.ca...@nodalink.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include block/throttle-groups.h
+#include qemu/queue.h
+#include qemu/thread.h
+
+typedef struct ThrottleGroup {
+char name[32];
+ThrottleState ts;
+uint64_t refcount;
+QTAILQ_ENTRY(ThrottleGroup) list;
+QLIST_HEAD(, BlockDriverState) head;
+BlockDriverState *tokens[2]; /* current round-robin tokens */
+QemuMutex lock; /* Used to synchronize all elements belonging to a group */
+} ThrottleGroup;
+
+static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
+QTAILQ_HEAD_INITIALIZER(throttle_groups);
+
+/* increments a ThrottleGroup reference count given it's name
+ *
+ * If no ThrottleGroup is found with the given name a new one is created.
+ *
+ * @name: the name of the ThrottleGroup
+ * @ret:  the ThrottleGroup's ThrottleState address
+ */
+ThrottleState *throttle_group_incref(const char *name)
+{
+ThrottleGroup *tg;
+
+/* return the correct ThrottleState if a group with this name exists */
+QTAILQ_FOREACH(tg, throttle_groups, list) {
+/* group not found - continue */
+if (strcmp(name, tg-name)) {
+continue;
+}
+/* group found - increment it's refcount and return ThrottleState */
+tg-refcount++;
+return tg-ts;
+}
+
+/* throttle group not found - prepare new entry */
+tg = g_new0(ThrottleGroup, 1);
+pstrcpy(tg-name, sizeof(tg-name), name);
+qemu_mutex_init(tg-lock);
+throttle_init(tg-ts);
+QLIST_INIT(tg-head);
+tg-refcount = 1;
+
+/* insert new entry in the list */
+QTAILQ_INSERT_TAIL(throttle_groups, tg, list);
+
+/* return newly allocated ThrottleState */
+return tg-ts;
+}
+
+/* decrement a ThrottleGroup given it's ThrottleState address
+ *
+ * When the refcount reach zero the ThrottleGroup is destroyed
+ *
+ * @ts:  The ThrottleState address belonging to the ThrottleGroup to unref
+ * @ret: true on success else false
+ */
+bool throttle_group_unref(ThrottleState *ts)
+{
+ThrottleGroup *tg;
+bool found = false;
+
+/* Find the ThrottleGroup of the given ThrottleState */
+QTAILQ_FOREACH(tg, throttle_groups, list) {
+/* correct group found stop iterating */
+if (tg-ts == ts) {
+qemu_mutex_lock(tg-lock);
+found = true;
+break;
+}
+}
+
+/* If the ThrottleState was not found something is seriously broken */
+if (!found) {
+return false;
+}
+
+tg-refcount--;
+
+/* If ThrottleGroup is used keep it. */
+if (tg-refcount) {
+qemu_mutex_unlock(tg-lock);
+return true;
+}
+
+/* Else destroy it */
+QTAILQ_REMOVE(throttle_groups, tg, list);
+qemu_mutex_unlock(tg-lock);
+qemu_mutex_destroy(tg-lock);
+g_free(tg);
+return true;
+}
+
+/* Compare a name with a given ThrottleState group name
+ *
+ * @ts:   the throttle state whose group we are inspecting
+ * @name: the name to compare
+ * @ret:  true if names are equal else false
+ */
+bool

[Qemu-devel] [PATCH v1 8/8] throttle: Update throttle infrastructure copyright

2014-10-07 Thread Benoît Canet

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 include/qemu/throttle.h | 4 ++--
 tests/test-throttle.c   | 4 ++--
 util/throttle.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index 3b9d1b8..8abd94d 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -1,10 +1,10 @@
 /*
  * QEMU throttling infrastructure
  *
- * Copyright (C) Nodalink, SARL. 2013
+ * Copyright (C) Nodalink, EURL. 2013-2014
  *
  * Author:
- *   Benoît Canet benoit.ca...@irqsave.net
+ *   Benoît Canet benoit.ca...@nodalink.com
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
diff --git a/tests/test-throttle.c b/tests/test-throttle.c
index ecb5504..0efd372 100644
--- a/tests/test-throttle.c
+++ b/tests/test-throttle.c
@@ -1,10 +1,10 @@
 /*
  * Throttle infrastructure tests
  *
- * Copyright Nodalink, SARL. 2013
+ * Copyright Nodalink, EURL. 2013-2014
  *
  * Authors:
- *  Benoît Canet benoit.ca...@irqsave.net
+ *  Benoît Canet benoit.ca...@nodalink.com
  *
  * This work is licensed under the terms of the GNU LGPL, version 2 or later.
  * See the COPYING.LIB file in the top-level directory.
diff --git a/util/throttle.c b/util/throttle.c
index 163b9d0..4884c08 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -1,10 +1,10 @@
 /*
  * QEMU throttling infrastructure
  *
- * Copyright (C) Nodalink, SARL. 2013
+ * Copyright (C) Nodalink, EURL. 2013-2014
  *
  * Author:
- *   Benoît Canet benoit.ca...@irqsave.net
+ *   Benoît Canet benoit.ca...@nodalink.com
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
-- 
2.1.1

[Qemu-devel] [PATCH v1 5/8] throttle: Add a way to know if throttle_schedule_timer had armed a timer

2014-10-07 Thread Benoît Canet

This will be needed by the group throttling patches for the algorithm to be
accurate.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block.c |  7 +--
 include/qemu/throttle.h |  3 ++-
 util/throttle.c | 12 +++-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index 079b0dc..527ea48 100644
--- a/block.c
+++ b/block.c
@@ -202,10 +202,13 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
  unsigned int bytes,
  bool is_write)
 {
+bool armed;
+
 /* does this io must wait */
 bool must_wait = throttle_schedule_timer(bs-throttle_state,
  bs-throttle_timers,
- is_write);
+ is_write,
+ armed);
 
 /* if must wait or any request of this type throttled queue the IO */
 if (must_wait ||
@@ -219,7 +222,7 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
 
 /* if the next request must wait - do nothing */
 if (throttle_schedule_timer(bs-throttle_state, bs-throttle_timers,
-is_write)) {
+is_write, armed)) {
 return;
 }
 
diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index 3aece3a..3a16c48 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -124,7 +124,8 @@ void throttle_get_config(ThrottleState *ts, ThrottleConfig 
*cfg);
 /* usage */
 bool throttle_schedule_timer(ThrottleState *ts,
  ThrottleTimers *tt,
- bool is_write);
+ bool is_write,
+ bool *armed);
 
 void throttle_timer_fired(ThrottleState *ts, bool is_write);
 
diff --git a/util/throttle.c b/util/throttle.c
index 0e305e3..a273acb 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -375,11 +375,13 @@ void throttle_get_config(ThrottleState *ts, 
ThrottleConfig *cfg)
  */
 bool throttle_schedule_timer(ThrottleState *ts,
  ThrottleTimers *tt,
- bool is_write)
+ bool is_write,
+ bool *armed)
 {
 int64_t now = qemu_clock_get_ns(tt-clock_type);
 int64_t next_timestamp;
 bool must_wait;
+*armed = false;
 
 must_wait = throttle_compute_timer(ts,
is_write,
@@ -392,12 +394,12 @@ bool throttle_schedule_timer(ThrottleState *ts,
 }
 
 /* request throttled and any timer pending - do nothing */
-if (ts-any_timer_armed[is_write]) {
-return true;
+if (!ts-any_timer_armed[is_write]) {
+*armed = true;
+ts-any_timer_armed[is_write] = true;
+timer_mod(tt-timers[is_write], next_timestamp);
 }
 
-ts-any_timer_armed[is_write] = true;
-timer_mod(tt-timers[is_write], next_timestamp);
 return true;
 }
 
-- 
2.1.1

[Qemu-devel] [PATCH v1 0/8] Block Throttle Group Support

2014-10-07 Thread Benoît Canet

Hi,

For the user interface I implemented Stefanha's idea proposed in Stuttgart.

For the throttling algorithm I use a cooperative round robin scheduler.

Classical round robin works with a fixed HZ ticks and it's totaly incompatible
with the throttling algorithm.

So the cooperative round robin scheduler is a way for each block device to 
decide
if a pause must be done and a timer be armed and most important of all which
other block device of the group must resume the work once the timer is fired.

The advantages of this algorigthm are:

-only one timer active at a given time (no more cpu usage than regular 
throttling)
-no central place didacting the sheduling policy like a didactureship:
 we love collaboration isn't it ?:)
-No need to deal with  incoming queues to collect requests before scheduling
 then with and dispatchs queues
-Compatible with the throttling code with almost no changes
-As you go scheduling

Best regards

Benoît

Benoît Canet (8):
  throttle: Extract timers from ThrottleState into a separate
ThrottleTimers structure
  throttle: Add throttle group infrastructure
  throttle: Add throttle group infrastructure tests
  throttle: Prepare to have multiple timers for one ThrottleState
  throttle: Add a way to know if throttle_schedule_timer had armed a
timer
  throttle: Add a way to fire one of the timers asap like a bottom half
  throttle: Add throttle group support
  throttle: Update throttle infrastructure copyright

 block.c | 211 ++-
 block/Makefile.objs |   1 +
 block/qapi.c|   7 +-
 block/throttle-groups.c | 212 
 blockdev.c  |  19 +++-
 hmp.c   |   4 +-
 include/block/block.h   |   3 +-
 include/block/block_int.h   |   9 +-
 include/block/throttle-groups.h |  45 +
 include/qemu/throttle.h |  46 ++---
 qapi/block-core.json|   5 +-
 qemu-options.hx |   1 +
 qmp-commands.hx |   3 +-
 tests/test-throttle.c   | 137 +++---
 util/throttle.c | 107 +---
 15 files changed, 685 insertions(+), 125 deletions(-)
 create mode 100644 block/throttle-groups.c
 create mode 100644 include/block/throttle-groups.h

-- 
2.1.1

[Qemu-devel] [PATCH v1 6/8] throttle: Add a way to fire one of the timers asap like a bottom half

2014-10-07 Thread Benoît Canet

This will be needed by the group throttling algorithm.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 include/qemu/throttle.h |  2 ++
 util/throttle.c | 11 +++
 2 files changed, 13 insertions(+)

diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index 3a16c48..3b9d1b8 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -127,6 +127,8 @@ bool throttle_schedule_timer(ThrottleState *ts,
  bool is_write,
  bool *armed);
 
+void throttle_fire_timer(ThrottleTimers *tt, bool is_write);
+
 void throttle_timer_fired(ThrottleState *ts, bool is_write);
 
 void throttle_account(ThrottleState *ts, bool is_write, uint64_t size);
diff --git a/util/throttle.c b/util/throttle.c
index a273acb..163b9d0 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -403,6 +403,17 @@ bool throttle_schedule_timer(ThrottleState *ts,
 return true;
 }
 
+/* Schedule a throttle timer like a BH
+ *
+ * @tt:   The timers structure
+ * @is_write: the type of operation (read/write)
+ */
+void throttle_fire_timer(ThrottleTimers *tt, bool is_write)
+{
+int64_t now = qemu_clock_get_ns(tt-clock_type);
+timer_mod(tt-timers[is_write], now + 1);
+}
+
 /* Remember that now timers are currently armed
  *
  * @ts:   the throttle state we are working on
-- 
2.1.1

[Qemu-devel] [PATCH v1 3/8] throttle: Add throttle group infrastructure tests

2014-10-07 Thread Benoît Canet

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 tests/test-throttle.c | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/tests/test-throttle.c b/tests/test-throttle.c
index 3e52df3..ecb5504 100644
--- a/tests/test-throttle.c
+++ b/tests/test-throttle.c
@@ -15,6 +15,7 @@
 #include block/aio.h
 #include qemu/throttle.h
 #include qemu/error-report.h
+#include block/throttle-groups.h
 
 static AioContext *ctx;
 static LeakyBucketbkt;
@@ -500,6 +501,55 @@ static void test_accounting(void)
 (64.0 / 13)));
 }
 
+static void test_groups(void)
+{
+bool removed;
+
+ThrottleState *ts_foo, *ts_bar, *tmp;
+
+ts_bar = throttle_group_incref(bar);
+throttle_group_set_token(ts_bar, (BlockDriverState *) 0x5, false);
+ts_foo = throttle_group_incref(foo);
+
+tmp = throttle_group_incref(foo);
+throttle_group_set_token(tmp, (BlockDriverState *) 0x7, true);
+g_assert(tmp == ts_foo);
+
+tmp = throttle_group_incref(bar);
+g_assert(tmp == ts_bar);
+
+tmp = throttle_group_incref(bar);
+g_assert(tmp == ts_bar);
+
+g_assert((int64_t) throttle_group_token(ts_bar, false) == 0x5);
+g_assert((int64_t) throttle_group_token(ts_foo, true) == 0x7);
+
+removed = throttle_group_unref(ts_foo);
+g_assert(removed);
+removed = throttle_group_unref(ts_bar);
+g_assert(removed);
+
+g_assert((int64_t) throttle_group_token(ts_foo, true) == 0x7);
+
+removed = throttle_group_unref(ts_foo);
+g_assert(removed);
+removed = throttle_group_unref(ts_bar);
+g_assert(removed);
+
+/* foo group should be destroyed when reaching this */
+removed = throttle_group_unref(ts_foo);
+g_assert(!removed);
+
+g_assert((int64_t) throttle_group_token(ts_bar, false) == 0x5);
+
+removed = throttle_group_unref(ts_bar);
+g_assert(removed);
+
+/* bar group should be destroyed when reaching this */
+removed = throttle_group_unref(ts_bar);
+g_assert(!removed);
+}
+
 int main(int argc, char **argv)
 {
 GSource *src;
@@ -533,6 +583,7 @@ int main(int argc, char **argv)
 g_test_add_func(/throttle/config/is_valid,test_is_valid);
 g_test_add_func(/throttle/config_functions,   test_config_functions);
 g_test_add_func(/throttle/accounting, test_accounting);
+g_test_add_func(/throttle/groups, test_groups);
 return g_test_run();
 }
 
-- 
2.1.1

[Qemu-devel] [PATCH v1 7/8] throttle: Add throttle group support

2014-10-07 Thread Benoît Canet

The throttle group support use a cooperative round robin scheduling algorithm.

The principle of the algorithm are simple:
- Each BDS of the group is used as a token in a circular way.
- The active BDS compute if a wait must be done and arm the right timer.
- If a wait must be done the token timer will be armed so the token will become
  the next active BDS.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block.c   | 191 --
 block/qapi.c  |   7 +-
 block/throttle-groups.c   |   2 +-
 blockdev.c|  19 -
 hmp.c |   4 +-
 include/block/block.h |   3 +-
 include/block/block_int.h |   9 ++-
 qapi/block-core.json  |   5 +-
 qemu-options.hx   |   1 +
 qmp-commands.hx   |   3 +-
 10 files changed, 209 insertions(+), 35 deletions(-)

diff --git a/block.c b/block.c
index 527ea48..e7e5607 100644
--- a/block.c
+++ b/block.c
@@ -36,6 +36,7 @@
 #include qmp-commands.h
 #include qemu/timer.h
 #include qapi-event.h
+#include block/throttle-groups.h
 
 #ifdef CONFIG_BSD
 #include sys/types.h
@@ -129,7 +130,9 @@ void bdrv_set_io_limits(BlockDriverState *bs,
 {
 int i;
 
-throttle_config(bs-throttle_state, bs-throttle_timers, cfg);
+throttle_group_lock(bs-throttle_state);
+throttle_config(bs-throttle_state, bs-throttle_timers, cfg);
+throttle_group_unlock(bs-throttle_state);
 
 for (i = 0; i  2; i++) {
 qemu_co_enter_next(bs-throttled_reqs[i]);
@@ -156,34 +159,99 @@ static bool bdrv_start_throttled_reqs(BlockDriverState 
*bs)
 return drained;
 }
 
+static void bdrv_throttle_group_add(BlockDriverState *bs)
+{
+int i;
+BlockDriverState *token;
+
+for (i = 0; i  2; i++) {
+/* Get the BlockDriverState having the round robin token */
+token = throttle_group_token(bs-throttle_state, i);
+
+/* If the ThrottleGroup is new set the current BlockDriverState as
+ * token
+ */
+if (!token) {
+throttle_group_set_token(bs-throttle_state, bs, i);
+}
+
+}
+
+throttle_group_register_bs(bs-throttle_state, bs);
+}
+
+static void bdrv_throttle_group_remove(BlockDriverState *bs)
+{
+BlockDriverState *token;
+int i;
+
+for (i = 0; i  2; i++) {
+/* Get the BlockDriverState having the round robin token */
+token = throttle_group_token(bs-throttle_state, i);
+/* if this bs is the current token set the next bs as token */
+if (token == bs) {
+token = throttle_group_next_bs(token);
+/* take care of the case where bs is the only bs of the group */
+if (token == bs) {
+token = NULL;
+}
+throttle_group_set_token(bs-throttle_state, token, i);
+}
+}
+
+/* remove the current bs from the list */
+QLIST_REMOVE(bs, round_robin);
+}
+
 void bdrv_io_limits_disable(BlockDriverState *bs)
 {
+
+throttle_group_lock(bs-throttle_state);
 bs-io_limits_enabled = false;
+throttle_group_unlock(bs-throttle_state);
 
 bdrv_start_throttled_reqs(bs);
 
+throttle_group_lock(bs-throttle_state);
+bdrv_throttle_group_remove(bs);
+throttle_group_unlock(bs-throttle_state);
+
+throttle_group_unref(bs-throttle_state);
+bs-throttle_state = NULL;
+
 throttle_timers_destroy(bs-throttle_timers);
 }
 
 static void bdrv_throttle_read_timer_cb(void *opaque)
 {
 BlockDriverState *bs = opaque;
-throttle_timer_fired(bs-throttle_state, false);
+
+throttle_group_lock(bs-throttle_state);
+throttle_timer_fired(bs-throttle_state, false);
+throttle_group_unlock(bs-throttle_state);
+
 qemu_co_enter_next(bs-throttled_reqs[0]);
 }
 
 static void bdrv_throttle_write_timer_cb(void *opaque)
 {
 BlockDriverState *bs = opaque;
-throttle_timer_fired(bs-throttle_state, true);
+
+throttle_group_lock(bs-throttle_state);
+throttle_timer_fired(bs-throttle_state, true);
+throttle_group_unlock(bs-throttle_state);
+
 qemu_co_enter_next(bs-throttled_reqs[1]);
 }
 
 /* should be called before bdrv_set_io_limits if a limit is set */
-void bdrv_io_limits_enable(BlockDriverState *bs)
+void bdrv_io_limits_enable(BlockDriverState *bs, const char *group)
 {
 assert(!bs-io_limits_enabled);
-throttle_init(bs-throttle_state);
+bs-throttle_state = throttle_group_incref(group ? group : 
bs-device_name);
+
+throttle_group_lock(bs-throttle_state);
+bdrv_throttle_group_add(bs);
 throttle_timers_init(bs-throttle_timers,
  bdrv_get_aio_context(bs),
  QEMU_CLOCK_VIRTUAL,
@@ -191,6 +259,53 @@ void bdrv_io_limits_enable(BlockDriverState *bs)
  bdrv_throttle_write_timer_cb,
  bs);
 bs-io_limits_enabled = true;
+throttle_group_unlock(bs-throttle_state);
+}
+
+void bdrv_io_limits_update_group(BlockDriverState *bs, const char

[Qemu-devel] [PATCH v1 4/8] throttle: Prepare to have multiple timers for one ThrottleState

2014-10-07 Thread Benoît Canet

This patch transform the timer_pending call into two boolean values in the
ThrottleState structure.

This way we are sure that when multiple timers will be used only
one can be armed at a time.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block.c |  2 ++
 include/qemu/throttle.h |  3 +++
 util/throttle.c | 17 ++---
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index f209f55..079b0dc 100644
--- a/block.c
+++ b/block.c
@@ -168,12 +168,14 @@ void bdrv_io_limits_disable(BlockDriverState *bs)
 static void bdrv_throttle_read_timer_cb(void *opaque)
 {
 BlockDriverState *bs = opaque;
+throttle_timer_fired(bs-throttle_state, false);
 qemu_co_enter_next(bs-throttled_reqs[0]);
 }
 
 static void bdrv_throttle_write_timer_cb(void *opaque)
 {
 BlockDriverState *bs = opaque;
+throttle_timer_fired(bs-throttle_state, true);
 qemu_co_enter_next(bs-throttled_reqs[1]);
 }
 
diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index b89d4d8..3aece3a 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -65,6 +65,7 @@ typedef struct ThrottleConfig {
 typedef struct ThrottleState {
 ThrottleConfig cfg;   /* configuration */
 int64_t previous_leak;/* timestamp of the last leak done */
+bool any_timer_armed[2];  /* is any timer armed for this throttle state */
 } ThrottleState;
 
 typedef struct ThrottleTimers {
@@ -125,6 +126,8 @@ bool throttle_schedule_timer(ThrottleState *ts,
  ThrottleTimers *tt,
  bool is_write);
 
+void throttle_timer_fired(ThrottleState *ts, bool is_write);
+
 void throttle_account(ThrottleState *ts, bool is_write, uint64_t size);
 
 #endif
diff --git a/util/throttle.c b/util/throttle.c
index 4d7c8d4..0e305e3 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -172,6 +172,7 @@ void throttle_timers_attach_aio_context(ThrottleTimers *tt,
 void throttle_init(ThrottleState *ts)
 {
 memset(ts, 0, sizeof(ThrottleState));
+ts-any_timer_armed[0] = ts-any_timer_armed[1] = false;
 }
 
 /* To be called first on the ThrottleTimers */
@@ -390,16 +391,26 @@ bool throttle_schedule_timer(ThrottleState *ts,
 return false;
 }
 
-/* request throttled and timer pending - do nothing */
-if (timer_pending(tt-timers[is_write])) {
+/* request throttled and any timer pending - do nothing */
+if (ts-any_timer_armed[is_write]) {
 return true;
 }
 
-/* request throttled and timer not pending - arm timer */
+ts-any_timer_armed[is_write] = true;
 timer_mod(tt-timers[is_write], next_timestamp);
 return true;
 }
 
+/* Remember that now timers are currently armed
+ *
+ * @ts:   the throttle state we are working on
+ * @is_write: the type of operation (read/write)
+ */
+void throttle_timer_fired(ThrottleState *ts, bool is_write)
+{
+ts-any_timer_armed[is_write] = false;
+}
+
 /* do the accounting for this operation
  *
  * @is_write: the type of operation (read/write)
-- 
2.1.1

[Qemu-devel] [PATCH v1 1/8] throttle: Extract timers from ThrottleState into a separate ThrottleTimers structure

2014-10-07 Thread Benoît Canet

Group throttling will share ThrottleState between multiple bs.
As a consequence the ThrottleState will be accessed by multiple aio context.

Timers are tied to their aio context so they must go out of the ThrottleState 
structure.

This commit pave the way for each bs of a common ThrottleState to have it's own
timer.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block.c   | 35 
 include/block/block_int.h |  1 +
 include/qemu/throttle.h   | 36 +
 tests/test-throttle.c | 82 ++-
 util/throttle.c   | 73 -
 5 files changed, 134 insertions(+), 93 deletions(-)

diff --git a/block.c b/block.c
index d3aebeb..f209f55 100644
--- a/block.c
+++ b/block.c
@@ -129,7 +129,7 @@ void bdrv_set_io_limits(BlockDriverState *bs,
 {
 int i;
 
-throttle_config(bs-throttle_state, cfg);
+throttle_config(bs-throttle_state, bs-throttle_timers, cfg);
 
 for (i = 0; i  2; i++) {
 qemu_co_enter_next(bs-throttled_reqs[i]);
@@ -162,7 +162,7 @@ void bdrv_io_limits_disable(BlockDriverState *bs)
 
 bdrv_start_throttled_reqs(bs);
 
-throttle_destroy(bs-throttle_state);
+throttle_timers_destroy(bs-throttle_timers);
 }
 
 static void bdrv_throttle_read_timer_cb(void *opaque)
@@ -181,12 +181,13 @@ static void bdrv_throttle_write_timer_cb(void *opaque)
 void bdrv_io_limits_enable(BlockDriverState *bs)
 {
 assert(!bs-io_limits_enabled);
-throttle_init(bs-throttle_state,
-  bdrv_get_aio_context(bs),
-  QEMU_CLOCK_VIRTUAL,
-  bdrv_throttle_read_timer_cb,
-  bdrv_throttle_write_timer_cb,
-  bs);
+throttle_init(bs-throttle_state);
+throttle_timers_init(bs-throttle_timers,
+ bdrv_get_aio_context(bs),
+ QEMU_CLOCK_VIRTUAL,
+ bdrv_throttle_read_timer_cb,
+ bdrv_throttle_write_timer_cb,
+ bs);
 bs-io_limits_enabled = true;
 }
 
@@ -200,7 +201,9 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
  bool is_write)
 {
 /* does this io must wait */
-bool must_wait = throttle_schedule_timer(bs-throttle_state, is_write);
+bool must_wait = throttle_schedule_timer(bs-throttle_state,
+ bs-throttle_timers,
+ is_write);
 
 /* if must wait or any request of this type throttled queue the IO */
 if (must_wait ||
@@ -213,7 +216,8 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
 
 
 /* if the next request must wait - do nothing */
-if (throttle_schedule_timer(bs-throttle_state, is_write)) {
+if (throttle_schedule_timer(bs-throttle_state, bs-throttle_timers,
+is_write)) {
 return;
 }
 
@@ -1990,6 +1994,9 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 memcpy(bs_dest-throttle_state,
bs_src-throttle_state,
sizeof(ThrottleState));
+memcpy(bs_dest-throttle_timers,
+   bs_src-throttle_timers,
+   sizeof(ThrottleTimers));
 bs_dest-throttled_reqs[0]  = bs_src-throttled_reqs[0];
 bs_dest-throttled_reqs[1]  = bs_src-throttled_reqs[1];
 bs_dest-io_limits_enabled  = bs_src-io_limits_enabled;
@@ -2052,7 +2059,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState 
*bs_old)
 assert(bs_new-job == NULL);
 assert(bs_new-dev == NULL);
 assert(bs_new-io_limits_enabled == false);
-assert(!throttle_have_timer(bs_new-throttle_state));
+assert(!throttle_timers_are_init(bs_new-throttle_timers));
 
 tmp = *bs_new;
 *bs_new = *bs_old;
@@ -2070,7 +2077,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState 
*bs_old)
 assert(bs_new-dev == NULL);
 assert(bs_new-job == NULL);
 assert(bs_new-io_limits_enabled == false);
-assert(!throttle_have_timer(bs_new-throttle_state));
+assert(!throttle_timers_are_init(bs_new-throttle_timers));
 
 /* insert the nodes back into the graph node list if needed */
 if (bs_new-node_name[0] != '\0') {
@@ -5746,7 +5753,7 @@ void bdrv_detach_aio_context(BlockDriverState *bs)
 }
 
 if (bs-io_limits_enabled) {
-throttle_detach_aio_context(bs-throttle_state);
+throttle_timers_detach_aio_context(bs-throttle_timers);
 }
 if (bs-drv-bdrv_detach_aio_context) {
 bs-drv-bdrv_detach_aio_context(bs);
@@ -5782,7 +5789,7 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 bs-drv-bdrv_attach_aio_context(bs, new_context);
 }
 if (bs-io_limits_enabled) {
-throttle_attach_aio_context(bs-throttle_state, new_context);
+throttle_timers_attach_aio_context(bs-throttle_timers, new_context);
 }
 
 QLIST_FOREACH(ban,

Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive

2014-10-01 Thread Benoît Canet


Thanks a lot for reviewing this patch.

Since the code is not trivial I will give my arguments for writing it
this way.


  +/* block recursively a BDS
  + *
  + * If base != NULL the caller must make sure that there is no multiple 
  child
  + * BDS in the subtree pointed to by bs
 
 In the case when base != NULL, should that really matter?  In a
 driver's .bdrv_op_recursive_block/unblock, if that driver has private
 children (multiple or not), shouldn't the private children be treated
 as one black box, and blocked / unblocked just like the parent
 BDS?

This is a stale comment. My bad.

 
 For instance, what if we had a tree such as this:
 
[quorum1]  [active]
|
| (private)
|
v
 [node2]-- [node1] --- [imag 
 if 'quorum1' was to be op blocked, and 'image1' and its children all
 comprise 'quorum1', wouldn't we always want to lock 'image1' all the
 way down to 'node2'?

That's what the patch does.

 
 Or do we expect that someone will intentionally pass a 'base' pointer
 along that matches somewhere in the private child tree?

This is not expected by the caller but I wrote the patch so it will also work.

 
  + *
  + * @bs: the BDS to start to recurse from
  + * @base:   the BDS where the recursion should end
  + *  could be NULL if the recursion should go down everywhere
  + * @op: the type of op blocker to block
  + * @reason: the error message associated with the blocking
  + */
  +void bdrv_op_block(BlockDriverState *bs, BlockDriverState *base,
  +   BlockOpType op, Error *reason)
  +{
  +if (!bs) {
  +return;
  +}
  +
  +/* did we recurse down to base ? */
  +if (bs == base) {
  +return;
  +}
  +
  +/* prevent recursion loop */
  +if (bdrv_op_is_blocked_by(bs, op, reason)) {
  +return;
  +}
 
 Should we handle this somehow (isn't this effectively an error, that
 will prematurely end the blocking of this particular line)? 

The main purpose of this is mirror.c and commit.c would form BDS loops on 
completion.
These callers could break the look manually but the code would fail
if a loop is not breaked and the blocker function are called on it.
So the blocker code have to handle recursion loops.

The bdrv_op_is_blocked_by match a reason being the criteria to test the blocker.

So this test will trigger only if a BDS is already blocked by the same reason
pointer.

This make the bdrv_op_block function idempotent.

note that it is the only sane way I found to make the blockers function handle
loops.

 
 If we hit this, we are going to end up in a scenario where we haven't
 blocked the chain as requested, and we don't know the state of the
 blocking below this failure point.  Seems like the caller should know,
 and we should have a way of cleaning up.

If we hit this the chain was already blocked with the same reason pointer.

 
 Actually, now I am wondering if we perhaps shouldn't be building a
 list to block, and then blocking everything atomically once we know
 there are no failure points.


I don't think this particular test is a failure point.

  +
  +/* block first for recursion loop protection to work */
  +bdrv_do_op_block(bs, op, reason);
  +
  +bdrv_op_block(bs-file, base, op, reason);
  +
  +if (bs-drv  bs-drv-supports_backing) {
  +bdrv_op_block(bs-backing_hd, base, op, reason);
  +}
  +
  +if (bs-drv  bs-drv-bdrv_op_recursive_block) {
  +bs-drv-bdrv_op_recursive_block(bs, base, op, reason);
 
 Do we need to allow .bdrv_op_recursive_block() to fail and return
 error (and handle it, of course)?

I don't know yet: but lets let this question float a little more in the mail 
thread.

 
 
 
 s/block/unblock
Thanks

 
 
  + * @reason: the error message associated with the blocking
  + */
  +void bdrv_op_unblock(BlockDriverState *bs, BlockDriverState *base,
  + BlockOpType op, Error *reason)
  +{
  +if (!bs) {
  +return;
  +}
  +
  +/* did we recurse down to base ? */
  +if (bs == base) {
  +return;
  +}
  +
  +/* prevent recursion loop */
  +if (!bdrv_op_is_blocked_by(bs, op, reason)) {
  +return;
  +}
 
 Do we want to do this negative check here?  Even if a given node is
 not blocked, its children may be, and I would assume (since this is
 recursive) that if I called bdrv_op_unblock() all of the chain down to
 'base' would be unblocked for the given reason.
 
 E.g. 
 
[image1] -- [image2] -- [image3]
blocked  unblocked blocked

To reach this state the caller code would have to do the following twisted 
sequence.

block(image3, with_reason1)
unblock(image2, with_reason1)
block(image1, with_reason1)

There is no such sequence in the code thanks to the base argument and we could
enforce that no such sequence ever get written.

 
 I would assume that

Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive

2014-10-01 Thread Benoît Canet

  
  The main purpose of this is mirror.c and commit.c would form BDS loops on 
  completion.
  These callers could break the look manually but the code would fail
  if a loop is not breaked and the blocker function are called on it.
  So the blocker code have to handle recursion loops.
 
 I think commit/mirror/etc should break any loops prior to calling
 recursive functions on those loops (just like it should do before
 calling bdrv_unref(), etc..).  Otherwise, I think the recursive
 functions make assumptions that may be true in certain contexts, but
 not necessarily all.
 
 (Hmm, I believe that Fam had a series that did some nice cleanup on
 bdrv_drop_intermediate() and related areas that did some loop
 breaking, IIRC).

Ok I could use that as a basis.

  
  I don't think this particular test is a failure point.
 
 
 With the way it is written, the individual BDS is blocked with the
 same reason pointer, but not necessarily the whole chain - unless you
 make the assumption that blockers will only be used via the recursive
 interface, and never individually on a node.

there is no more a no recursive version with this patch so this assumption
will be respected.

 
 The caller doesn't have an interface to check that the chain is not
 blocked - bdrv_op_is_blocked_by() doesn't operate recursively.  
 
 If we tried to block a chain that has some portion already blocked for
 the same reason, shouldn't that be an error?

Why should we allow this ?
My understanding is that blocking something should be associated to a
single operation whatever they are.
So one operation to block implying one different reason is not so strange.

+
+/* block first for recursion loop protection to work */
+bdrv_do_op_block(bs, op, reason);
+
+bdrv_op_block(bs-file, base, op, reason);
+
+if (bs-drv  bs-drv-supports_backing) {
+bdrv_op_block(bs-backing_hd, base, op, reason);
+}
+
+if (bs-drv  bs-drv-bdrv_op_recursive_block) {
+bs-drv-bdrv_op_recursive_block(bs, base, op, reason);
   
   Do we need to allow .bdrv_op_recursive_block() to fail and return
   error (and handle it, of course)?
  
  I don't know yet: but lets let this question float a little more in the 
  mail thread.
  
   
   
   


  
  To reach this state the caller code would have to do the following twisted 
  sequence.
  
  block(image3, with_reason1)
  unblock(image2, with_reason1)
  block(image1, with_reason1)
 
  There is no such sequence in the code thanks to the base argument and we 
  could
  enforce that no such sequence ever get written.
 
 
 If we ignore blockdev-add and scenarios where an image node may have
 multiple overlays, we might be able to assume that such a sequence
 could not exist.
 
 But in that case, should this negative check result in an error?
 
 It would seem at this point we would have encountered one of these
 scenarios:
 
 1.) An invalid block/unblock state in the chain, if we assume that no
 such sequence should exist
 
 2.) We tried to unblock more than we originally blocked

 
   
   I would assume that bdrv_op_unblock(image2, NULL, reason) would still
   unblock image1, even if image2 was unblocked.
  
  Should we also assume that bdrv_op_unblock(image4, NULL, reason) with 
  image4 being
  image3 parent unblock everything underneath ?
  
 
 I think we either do that, or return an error.  But to have
 bdrv_op_unblock() (or bdrv_op_block()) silently stop at some point in
 the chain prior to reaching 'base' doesn't seem correct to me.
 

Maybe you are right.
I don't mind rewriting the patchset with error handling and without the 
recursion
loop avoidance code given I find Fam's loop breaking patches on the list.

I remember trying to write loop breaking by myself just before 2.1 and it was
annoying.

Best regards

Benoît

Re: [Qemu-devel] [PATCH 0/2] Virtio-9p live migration patchset

2014-09-30 Thread Benoît Canet

The Tuesday 30 Sep 2014 à 22:08:12 (+0300), Boris Sukholitko wrote :
 On Tue, Sep 30, 2014 at 12:46 AM, Benoît Canet benoit.ca...@irqsave.net 
 wrote:
  The Friday 26 Sep 2014 à 18:19:55 (+0300), Boris Sukholitko wrote :
  This patchset is a small rebase of the 9p live migration patches made a 
  year
  ago by Benoit Canet.
 
  See http://lists.gnu.org/archive/html/qemu-devel/2013-04/msg02190.html
  for the previous thread.
 
  I took the liberty to drop the second patch (waiting for completion of 9p
  operations) as it wasn't working in my testing.
 
  It's probable that the second patch has bitrot but I remember I was asked to
  write it for a meaningfull reason.
 
 AFAICT, the reason was to drain requests queue before saving the state.
 
 Unfortunately, releasing BQL haven't led to the callbacks being executed.
 Therefore deadlock ensued.
 
  Maybe you should give it a bit more love to resurect it properly.
 
 
 I probably should. Still, IMHO, the two patches work good enough
 to deserve merging on their own right :)

I am afraid nobody will want to merge a patchset where there is a
theorical potential bug.

It should work as well on paper as on silicon.

Best regards

Benoît

 
 Thanks,
 Boris.
 
  Best regards
 
  Benoît
 
 
  Thanks,
  Boris.
 
  Boris Sukholitko (2):
virtio-9p: Add support for 9p migration.
virtio-9p: Remove migration blockers.
 
   hw/9pfs/virtio-9p-device.c | 152 
  +
   hw/9pfs/virtio-9p.c|  24 ---
   hw/9pfs/virtio-9p.h|   2 -
   3 files changed, 152 insertions(+), 26 deletions(-)
 
  --
  2.0.2

Re: [Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive

2014-09-30 Thread Benoît Canet

 
 Seems like these new functions would be better named '.bdrv_op_block'
 and '.bdrv_op_unblock'?  That way, recursive or not, it is clear block
 drivers can implement whatever blocking is appropriate for themselves.
 
   QLIST_ENTRY(BlockDriver) list;
   };

Hi,

Thanks a lot for review.
I will mull this and respond in the begining of the afternoon.

Best regards

Benoît

Re: [Qemu-devel] [PATCH v3 2/3] timers: Move NANOSECONDS_PER_SECONDS to timer.h for future reuse

2014-09-29 Thread Benoît Canet

On Wed, Sep 24, 2014 at 05:33:45PM +0200, Paolo Bonzini wrote:
 Il 24/09/2014 17:21, Benoît Canet ha scritto:
  Reviewed-by: Eric Blake ebl...@redhat.com
  Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
  ---
   include/qemu/throttle.h | 2 --
   include/qemu/timer.h| 2 ++
   2 files changed, 2 insertions(+), 2 deletions(-)
  
  diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
  index 8f9e611..1c639d2 100644
  --- a/include/qemu/throttle.h
  +++ b/include/qemu/throttle.h
  @@ -27,8 +27,6 @@
   #include qemu-common.h
   #include qemu/timer.h
   
  -#define NANOSECONDS_PER_SECOND  10
  -
   typedef enum {
   THROTTLE_BPS_TOTAL,
   THROTTLE_BPS_READ,
  diff --git a/include/qemu/timer.h b/include/qemu/timer.h
  index 5f5210d..0884e72 100644
  --- a/include/qemu/timer.h
  +++ b/include/qemu/timer.h
  @@ -5,6 +5,8 @@
   #include qemu-common.h
   #include qemu/notify.h
   
  +#define NANOSECONDS_PER_SECOND  10
  +
   /* timers */
   
   #define SCALE_MS 100
  
 
 Reviewed-by: Paolo Bonzini pbonz...@redhat.com
 
 :)

Two rev by for one commit are better than one but does this belong to
the third commit ? :)

Best regards

Benoît

Re: [Qemu-devel] [PATCH v2 0/2] Recursive op blockers

2014-09-29 Thread Benoît Canet

On Mon, Sep 22, 2014 at 09:00:50PM +0200, Benoît Canet wrote:
 in v2:
 Added a base parameters [Kevin]
 Remove multiple blockers [Kevin]
 
 Addressed Kevin's comments except BLOCK_OP_TYPE_MIRROR_REPLACE 
 relaxing
 and You need to unblock all of the BDSes that are going to be 
 removed.
 Aren't you unblocking more than that here?.
 
 
 pass ./check -qcow2
 
 Benoît Canet (2):
   block: Rename BLOCK_OP_TYPE_REPLACE to BLOCK_OP_TYPE_MIRROR_REPLACE
   block: Make op blockers recursive
 
  block-migration.c |   4 +-
  block.c   | 123 
 ++
  block/blkverify.c |  21 
  block/commit.c|   2 +
  block/mirror.c|  26 +++---
  block/quorum.c|  29 +++
  block/stream.c|   2 +
  block/vmdk.c  |  32 
  blockjob.c|   6 +--
  include/block/block.h |  14 --
  include/block/block_int.h |  10 
  11 files changed, 242 insertions(+), 27 deletions(-)
 
 -- 
 2.1.0
 

Hello people,

Gentle ping,

I would really like these patches to get rev-by and merged for the next release
since they actually fix QEMU.

Under the new rule of code review I choose Markus or Jeff to do the initial
revision.
I could nag Kevin into reviewing it but it's not the process anymore.

Best regards

Benoît

Re: [Qemu-devel] [PATCH 0/2] Virtio-9p live migration patchset

2014-09-29 Thread Benoît Canet

The Friday 26 Sep 2014 à 18:19:55 (+0300), Boris Sukholitko wrote :
 This patchset is a small rebase of the 9p live migration patches made a year
 ago by Benoit Canet.
 
 See http://lists.gnu.org/archive/html/qemu-devel/2013-04/msg02190.html
 for the previous thread.
 
 I took the liberty to drop the second patch (waiting for completion of 9p
 operations) as it wasn't working in my testing.

It's probable that the second patch has bitrot but I remember I was asked to
write it for a meaningfull reason.
Maybe you should give it a bit more love to resurect it properly.

Best regards

Benoît

 
 Thanks,
 Boris.
 
 Boris Sukholitko (2):
   virtio-9p: Add support for 9p migration.
   virtio-9p: Remove migration blockers.
 
  hw/9pfs/virtio-9p-device.c | 152 
 +
  hw/9pfs/virtio-9p.c|  24 ---
  hw/9pfs/virtio-9p.h|   2 -
  3 files changed, 152 insertions(+), 26 deletions(-)
 
 -- 
 2.0.2

Re: [Qemu-devel] [PATCH 1/2] virtio-9p: Add support for 9p migration.

2014-09-29 Thread Benoît Canet

The Friday 26 Sep 2014 à 18:19:56 (+0300), Boris Sukholitko wrote :


 This patch is a rebase of Aneesh Kumar's and Benoit Canet's previous
 work.
 
 Signed-off-by: Boris Sukholitko bor...@gmail.com

If Aneesh and me worked on this you should also keep our signed-off in addition
of yours.

Best regards

Benoît

 ---
  hw/9pfs/virtio-9p-device.c | 152 
 +
  1 file changed, 152 insertions(+)
 
 diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
 index 2572747..078ad39 100644
 --- a/hw/9pfs/virtio-9p-device.c
 +++ b/hw/9pfs/virtio-9p-device.c
 @@ -21,6 +21,149 @@
  #include virtio-9p-coth.h
  #include hw/virtio/virtio-access.h
  
 +static void virtio_9p_save_path(QEMUFile *f, V9fsPath *path)
 +{
 +qemu_put_be16(f, path-size);
 +qemu_put_buffer(f, (const uint8_t *) path-data, path-size);
 +}
 +
 +static void virtio_9p_save_string(QEMUFile *f, V9fsString *s)
 +{
 +qemu_put_be16(f, s-size);
 +qemu_put_buffer(f, (const uint8_t *) s-data, s-size);
 +}
 +
 +static void virtio_9p_save_xattr(QEMUFile *f, V9fsXattr *xattr)
 +{
 +qemu_put_be64(f, xattr-copied_len);
 +qemu_put_be64(f, xattr-len);
 +qemu_put_buffer(f, (const uint8_t *) xattr-value, xattr-len);
 +virtio_9p_save_string(f, xattr-name);
 +qemu_put_be32(f, xattr-flags);
 +}
 +
 +static void virtio_9p_save_fid(QEMUFile *f, V9fsFidState *fid)
 +{
 +/* First close the fid and mark it for reopen if migration fail */
 +if (fid-fid_type == P9_FID_FILE) {
 +close(fid-fs.fd);
 +fid-fs.fd = -1;
 +} else if (fid-fid_type == P9_FID_DIR) {
 +closedir(fid-fs.dir);
 +fid-fs.dir = NULL;
 +}
 +
 +qemu_put_be32(f, fid-fid_type);
 +if (fid-fid_type == P9_FID_XATTR) {
 +/* we don't save fs_reclaim */
 +virtio_9p_save_xattr(f, fid-fs.xattr);
 +}
 +qemu_put_be32(f, fid-fid);
 +virtio_9p_save_path(f, fid-path);
 +qemu_put_be32(f, fid-flags);
 +qemu_put_be32(f, fid-open_flags);
 +qemu_put_be32(f, fid-uid);
 +qemu_put_be32(f, fid-ref);
 +qemu_put_be32(f, fid-clunked);
 +}
 +
 +static void virtio_9p_save(QEMUFile *f, void *opaque)
 +{
 +int fidcount = 0;
 +V9fsState *s = opaque;
 +V9fsFidState *fid;
 +
 +virtio_save(VIRTIO_DEVICE(s), f);
 +
 +for (fid = s-fid_list; fid; fid = fid-next) {
 +fidcount++;
 +}
 +/* Write the total number of fid structure */
 +qemu_put_be32(f, fidcount);
 +
 +for (fid = s-fid_list; fid; fid = fid-next) {
 +virtio_9p_save_fid(f, fid);
 +}
 +
 +qemu_put_be32(f, s-proto_version);
 +qemu_put_be32(f, s-msize);
 +}
 +
 +static void virtio_9p_load_path(QEMUFile *f, V9fsPath *path)
 +{
 +path-size = qemu_get_be16(f);
 +path-data = g_malloc0(path-size);
 +qemu_get_buffer(f, (uint8_t *) path-data, path-size);
 +}
 +
 +static void virtio_9p_load_string(QEMUFile *f, V9fsString *s)
 +{
 +s-size = qemu_get_be16(f);
 +s-data = g_malloc0(s-size);
 +qemu_get_buffer(f, (uint8_t *) s-data, s-size);
 +}
 +
 +static void virtio_9p_load_xattr(QEMUFile *f, V9fsXattr *xattr)
 +{
 +xattr-copied_len = qemu_get_be64(f);
 +xattr-len = qemu_get_be64(f);
 +qemu_get_buffer(f, (uint8_t *) xattr-value, xattr-len);
 +virtio_9p_load_string(f, xattr-name);
 +xattr-flags = qemu_get_be32(f);
 +}
 +
 +static V9fsFidState *virtio_9p_load_fid(QEMUFile *f)
 +{
 +V9fsFidState *fid;
 +fid = g_new0(V9fsFidState, 1);
 +
 +fid-fid_type= qemu_get_be32(f);
 +if (fid-fid_type == P9_FID_XATTR) {
 +virtio_9p_load_xattr(f, fid-fs.xattr);
 +}
 +fid-fid = qemu_get_be32(f);
 +virtio_9p_load_path(f, fid-path);
 +fid-flags   = qemu_get_be32(f);
 +fid-open_flags  = qemu_get_be32(f);
 +fid-uid = qemu_get_be32(f);
 +fid-ref = qemu_get_be32(f);
 +fid-clunked = qemu_get_be32(f);
 +
 +/* If it's a file fid mark the file descriptors as closed.
 + * DIR is null thanks to g_new0.
 + * When doing get_fid v9fs_reopen_fid will reopen the file or the 
 directory.
 + */
 +if (fid-fid_type == P9_FID_FILE) {
 +fid-fs.fd = -1;
 +}
 +return fid;
 +}
 +
 +static int virtio_9p_load(QEMUFile *f, void *opaque, int version_id)
 +{
 +int fidcount;
 +V9fsState *s = opaque;
 +V9fsFidState **fid;
 +
 +if (version_id != 2) {
 +return -EINVAL;
 +}
 +virtio_load(VIRTIO_DEVICE(s), f, version_id);
 +fidcount = qemu_get_be32(f);
 +
 +fid = s-fid_list;
 +while (fidcount) {
 +*fid = virtio_9p_load_fid(f);
 +fid = ((*fid)-next);
 +fidcount--;
 +}
 +
 +s-proto_version  = qemu_get_be32(f);
 +s-msize  = qemu_get_be32(f);
 +
 +return 0;
 +}
 +
  static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
  {
  features |= 1  VIRTIO_9P_MOUNT_TAG;
 @@ -50,6 +193,7 @@ static void

Re: [Qemu-devel] [PATCH v3 2/3] timers: Move NANOSECONDS_PER_SECONDS to timer.h for future reuse

2014-09-29 Thread Benoît Canet

On Mon, Sep 29, 2014 at 11:54:47PM +0200, Paolo Bonzini wrote:
 Il 29/09/2014 18:04, Benoît Canet ha scritto:
  On Wed, Sep 24, 2014 at 05:33:45PM +0200, Paolo Bonzini wrote:
  Il 24/09/2014 17:21, Benoît Canet ha scritto:
  Reviewed-by: Eric Blake ebl...@redhat.com
  Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
  ---
   include/qemu/throttle.h | 2 --
   include/qemu/timer.h| 2 ++
   2 files changed, 2 insertions(+), 2 deletions(-)
 
  diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
  index 8f9e611..1c639d2 100644
  --- a/include/qemu/throttle.h
  +++ b/include/qemu/throttle.h
  @@ -27,8 +27,6 @@
   #include qemu-common.h
   #include qemu/timer.h
   
  -#define NANOSECONDS_PER_SECOND  10
  -
   typedef enum {
   THROTTLE_BPS_TOTAL,
   THROTTLE_BPS_READ,
  diff --git a/include/qemu/timer.h b/include/qemu/timer.h
  index 5f5210d..0884e72 100644
  --- a/include/qemu/timer.h
  +++ b/include/qemu/timer.h
  @@ -5,6 +5,8 @@
   #include qemu-common.h
   #include qemu/notify.h
   
  +#define NANOSECONDS_PER_SECOND  10
  +
   /* timers */
   
   #define SCALE_MS 100
 
 
  Reviewed-by: Paolo Bonzini pbonz...@redhat.com
 
  :)
  
  Two rev by for one commit are better than one but does this belong to
  the third commit ? :)
 
 The rev-bys belong to the first and second.  The third's on my list...
 

I misread (freudian slip) the 20 seconds delta between the two rev-by as 20 min.
20 min was credible but 20 seconds is not despite you being fast :)

Best regards

Benoît

 Paolo

Re: [Qemu-devel] [PATCH v3 03/23] block: Connect BlockBackend to BlockDriverState

2014-09-25 Thread Benoît Canet

On Tue, Sep 23, 2014 at 03:36:03PM +0200, Kevin Wolf wrote:
 Do we have a KVM Forum block layer agenda yet? I think this thread could
 already contain a few topics to discuss there.

Being the guy who constantly bring back painfull issues
(Block filters, Block Backend) on the table I think we should also do a BOFH
(Stefan's idea on a private discussion) about how we can further tweak and
improve the review process.

I experienced on some other open sources projects as complex as the QEMU block
layer a feeling of reactivity while contributing patches and while the new
QEMU block layer review process is starting we are not here yet.

One idea I have is that we could benefit from this event to establish semi
informal peering review contracts between contributors like the ISP does for
bandwith.

Of course we should be carefull to avoid to go the academic review circle route
and left some for unknown people.

Best regards

Benoît

 
 Kevin

Re: [Qemu-devel] [PATCH v2 3/3] util: Add an utility infrastructure used to compute an average on a time slice

2014-09-24 Thread Benoît Canet

On Mon, Sep 08, 2014 at 04:29:26PM +0200, Paolo Bonzini wrote:
 Il 08/09/2014 14:18, Benoît Canet ha scritto:
  The algorithm used was defined on the list while discussing the new IO 
  accounting
  overhaul.
  See http://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg04954.html
  
  Also the module takes care of computing minimal and maximal values over the 
  time
  slice duration.
  
  Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
 
 If you add
 
 int64_t cpu_get_clock(void)
 {
 return my_clock_value;
 }
 
 to the test, and use a QEMU_CLOCK_VIRTUAL-based average, you should be
 able to advance the clock directly in the test with no need for sleep()
 and with 100% deterministic results.

That's a really neat trick. Thanks

[Qemu-devel] [PATCH v3 2/3] timers: Move NANOSECONDS_PER_SECONDS to timer.h for future reuse

2014-09-24 Thread Benoît Canet

Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 include/qemu/throttle.h | 2 --
 include/qemu/timer.h| 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index 8f9e611..1c639d2 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -27,8 +27,6 @@
 #include qemu-common.h
 #include qemu/timer.h
 
-#define NANOSECONDS_PER_SECOND  10
-
 typedef enum {
 THROTTLE_BPS_TOTAL,
 THROTTLE_BPS_READ,
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 5f5210d..0884e72 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -5,6 +5,8 @@
 #include qemu-common.h
 #include qemu/notify.h
 
+#define NANOSECONDS_PER_SECOND  10
+
 /* timers */
 
 #define SCALE_MS 100
-- 
2.1.1

[Qemu-devel] [PATCH v3 1/3] throttle: Make NANOSECONDS_PER_SECOND an integer

2014-09-24 Thread Benoît Canet

We will want to reuse this define in the future by making it common to multiple
QEMU modules.
It would be safer that this define be an integer so we avoid strange float
rounding errors.
Do this conversion.

Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 include/qemu/throttle.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
index b890613..8f9e611 100644
--- a/include/qemu/throttle.h
+++ b/include/qemu/throttle.h
@@ -27,7 +27,7 @@
 #include qemu-common.h
 #include qemu/timer.h
 
-#define NANOSECONDS_PER_SECOND  10.0
+#define NANOSECONDS_PER_SECOND  10
 
 typedef enum {
 THROTTLE_BPS_TOTAL,
-- 
2.1.1

Re: [Qemu-devel] [PATCH v2 1/2] block: Specify -drive legacy option aliases in array

2014-09-24 Thread Benoît Canet

The Wednesday 24 Sep 2014 à 16:46:28 (+0200), Kevin Wolf wrote :
 Instead of a series of qemu_opt_rename() calls, use an array that
 contains all of the renames and call qemu_opt_rename() in a loop. This
 will keep the code readable even when we add an error return to
 qemu_opt_rename().
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  blockdev.c | 39 ---
  1 file changed, 24 insertions(+), 15 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index a194d04..6a33fd2 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -641,28 +641,37 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
 BlockInterfaceType block_default_type)
  const char *serial;
  const char *filename;
  Error *local_err = NULL;
 +int i;
  
  /* Change legacy command line options into QMP ones */
 -qemu_opt_rename(all_opts, iops, throttling.iops-total);
 -qemu_opt_rename(all_opts, iops_rd, throttling.iops-read);
 -qemu_opt_rename(all_opts, iops_wr, throttling.iops-write);
 +static const struct {
 +const char *from;
 +const char *to;
 +} opt_renames[] = {
 +{ iops,   throttling.iops-total },
 +{ iops_rd,throttling.iops-read },
 +{ iops_wr,throttling.iops-write },
  
 -qemu_opt_rename(all_opts, bps, throttling.bps-total);
 -qemu_opt_rename(all_opts, bps_rd, throttling.bps-read);
 -qemu_opt_rename(all_opts, bps_wr, throttling.bps-write);
 +{ bps,throttling.bps-total },
 +{ bps_rd, throttling.bps-read },
 +{ bps_wr, throttling.bps-write },
  
 -qemu_opt_rename(all_opts, iops_max, throttling.iops-total-max);
 -qemu_opt_rename(all_opts, iops_rd_max, throttling.iops-read-max);
 -qemu_opt_rename(all_opts, iops_wr_max, throttling.iops-write-max);
 +{ iops_max,   throttling.iops-total-max },
 +{ iops_rd_max,throttling.iops-read-max },
 +{ iops_wr_max,throttling.iops-write-max },
  
 -qemu_opt_rename(all_opts, bps_max, throttling.bps-total-max);
 -qemu_opt_rename(all_opts, bps_rd_max, throttling.bps-read-max);
 -qemu_opt_rename(all_opts, bps_wr_max, throttling.bps-write-max);
 +{ bps_max,throttling.bps-total-max },
 +{ bps_rd_max, throttling.bps-read-max },
 +{ bps_wr_max, throttling.bps-write-max },
  
 -qemu_opt_rename(all_opts,
 -iops_size, throttling.iops-size);
 +{ iops_size,  throttling.iops-size },
  
 -qemu_opt_rename(all_opts, readonly, read-only);
 +{ readonly,   read-only },
 +};
 +
 +for (i = 0; i  ARRAY_SIZE(opt_renames); i++) {
 +qemu_opt_rename(all_opts, opt_renames[i].from, opt_renames[i].to);
 +}
  
  value = qemu_opt_get(all_opts, cache);
  if (value) {
 -- 
 1.8.3.1
 
 

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

[Qemu-devel] [PATCH v3 0/3] timed average infrastructure for use in block statistics

2014-09-24 Thread Benoît Canet

Hi,

Here is the new iteration of the timed average module.

A better algorithm avoiding unwanted noise has been implemented and the test
have been rewritten as suggested.

Best regards

in V3:
Add Eric rev by of patch 2
drop unneeded (double) casts [Eric]
Implement new algorithm [Paolo]
rewrite commit message [Markus]
stub cpu_get_clock in tests [Paolo]

Benoît

Benoît Canet (3):
  throttle: Make NANOSECONDS_PER_SECOND an integer
  timers: Move NANOSECONDS_PER_SECONDS to timer.h for future reuse
  util: Infrastructure for computing recent averages

 include/qemu/throttle.h  |   2 -
 include/qemu/timed-average.h |  60 +
 include/qemu/timer.h |   2 +
 tests/Makefile   |   2 +
 tests/test-timed-average.c   |  89 ++
 util/Makefile.objs   |   1 +
 util/timed-average.c | 208 +++
 7 files changed, 362 insertions(+), 2 deletions(-)
 create mode 100644 include/qemu/timed-average.h
 create mode 100644 tests/test-timed-average.c
 create mode 100644 util/timed-average.c

-- 
2.1.1

[Qemu-devel] [PATCH v3 3/3] util: Infrastructure for computing recent averages

2014-09-24 Thread Benoît Canet

The module takes care of computing minimal and maximal
values over the time slice duration.

Suggested-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 include/qemu/timed-average.h |  60 +
 tests/Makefile   |   2 +
 tests/test-timed-average.c   |  89 ++
 util/Makefile.objs   |   1 +
 util/timed-average.c | 208 +++
 5 files changed, 360 insertions(+)
 create mode 100644 include/qemu/timed-average.h
 create mode 100644 tests/test-timed-average.c
 create mode 100644 util/timed-average.c

diff --git a/include/qemu/timed-average.h b/include/qemu/timed-average.h
new file mode 100644
index 000..8150ead
--- /dev/null
+++ b/include/qemu/timed-average.h
@@ -0,0 +1,60 @@
+/*
+ * QEMU timed average computation
+ *
+ * Copyright (C) Nodalink, EURL. 2014
+ *
+ * Author:
+ *   Benoît Canet benoit.ca...@nodalink.com
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) version 3 or any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see http://www.gnu.org/licenses/.
+ */
+
+#ifndef TIMED_AVERAGE_H
+#define TIMED_AVERAGE_H
+
+#include stdint.h
+
+#include qemu/timer.h
+
+typedef struct Window {
+uint64_t  min;   /* minimal value accounted in the window */
+uint64_t  max;   /* maximal value accounted in the window */
+uint64_t  sum;   /* sum of the discrete values */
+uint64_t  count; /* number of values */
+int64_t   expiration;/* the end of the current window in ns */
+} Window;
+
+typedef struct TimedAverage {
+Windowwindows[2];/* two overlapping windows of period
+  * nanoseconds with an offset of period / 2
+  * between them
+  */
+
+uint64_t  period;/* period in nanoseconds */
+uint64_t  current;   /* the current window index: it's also the
+  * oldest window index
+  */
+QEMUClockType clock_type;/* the clock used */
+} TimedAverage;
+
+void timed_average_init(TimedAverage *ta, QEMUClockType clock_type,
+uint64_t period);
+
+void timed_average_account(TimedAverage *ta, uint64_t value);
+
+uint64_t timed_average_min(TimedAverage *ta);
+uint64_t timed_average_avg(TimedAverage *ta);
+uint64_t timed_average_max(TimedAverage *ta);
+
+#endif
diff --git a/tests/Makefile b/tests/Makefile
index 469c0a5..d3669f4 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -64,6 +64,7 @@ gcov-files-check-qom-interface-y = qom/object.c
 check-unit-$(CONFIG_POSIX) += tests/test-vmstate$(EXESUF)
 check-unit-y += tests/test-qemu-opts$(EXESUF)
 gcov-files-test-qemu-opts-y = qom/test-qemu-opts.c
+check-unit-y += tests/test-timed-average$(EXESUF)
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -259,6 +260,7 @@ tests/test-qdev-global-props$(EXESUF): 
tests/test-qdev-global-props.o \
 tests/test-vmstate$(EXESUF): tests/test-vmstate.o \
vmstate.o qemu-file.o \
libqemuutil.a
+tests/test-timed-average$(EXESUF): tests/test-timed-average.o qemu-timer.o 
libqemuutil.a stubs/clock-warp.o stubs/cpu-get-icount.o stubs/notify-event.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/tests/qapi-schema/qapi-schema-test.json 
$(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-timed-average.c b/tests/test-timed-average.c
new file mode 100644
index 000..5091589
--- /dev/null
+++ b/tests/test-timed-average.c
@@ -0,0 +1,89 @@
+/*
+ * Timed average computation tests
+ *
+ * Copyright Nodalink, EURL. 2014
+ *
+ * Authors:
+ *  Benoît Canet benoit.ca...@nodalink.com
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include glib.h
+#include unistd.h
+
+#include qemu/timed-average.h
+
+static int64_t my_clock_value;
+
+int64_t cpu_get_clock(void)
+{
+return my_clock_value;
+}
+
+static void account(TimedAverage *ta)
+{
+timed_average_account(ta, 1);
+timed_average_account(ta, 5);
+timed_average_account(ta, 2);
+timed_average_account(ta, 4);
+timed_average_account(ta, 3);
+}
+
+static void test_average(void)
+{
+TimedAverage ta;
+uint64_t result;
+int i;
+
+/* we will compute some average on a period of 1

Re: [Qemu-devel] [PATCH v2 2/2] block: Catch simultaneous usage of options and their aliases

2014-09-24 Thread Benoît Canet

The Wednesday 24 Sep 2014 à 16:46:29 (+0200), Kevin Wolf wrote :
 While thinking about precedence of conflicting block device options from
 different sources, I noticed that you can specify both an option and its
 legacy alias at the same time (e.g. readonly=on,read-only=off). Rather
 than specifying the order of precedence, we should simply forbid such
 combinations.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  blockdev.c | 16 ++--
  tests/qemu-iotests/051 | 23 +++
  tests/qemu-iotests/051.out | 45 +
  3 files changed, 82 insertions(+), 2 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index 6a33fd2..7adecae 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -532,12 +532,18 @@ err_no_opts:
  return NULL;
  }
  
 -static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to)
 +static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to,
 +Error **errp)
  {
  const char *value;
  
  value = qemu_opt_get(opts, from);
  if (value) {
 +if (qemu_opt_find(opts, to)) {
 +error_setg(errp, '%s' and its alias '%s' can't be used at the 

Maybe specify we are talking about options in the error message so the user is
provided with a bit of context about what's happening.

 +error_setg(errp, the option '%s' and its alias '%s' can't be 
 used at the 
 +   same time, to, from);
 +return;
 +}

Anyway:

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3] docs: add blkdebug block driver documentation

2014-09-24 Thread Benoît Canet

The Wednesday 24 Sep 2014 à 10:44:14 (+0100), Stefan Hajnoczi wrote :
 The blkdebug block driver is undocumented.  Documenting it is worthwhile
 since it offers powerful error injection features that are used by
 qemu-iotests test cases.
 
 This document will make it easier for people to learn about and use
 blkdebug.
 
 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
 ---
 v3:
  * Fix tab space damage [Eric]
  * Rephrase event_names[] as full list of events [Eric]
  * Explain that blkdebug state is not observable from outside [Eric]
  * Clarify state 0 and state 1 [Eric]
 
 v2:
  * Added GPL v2 or later license and Red Hat copyright [Eric]
  * Expanded ini rules file explanation [Paolo]
  * Added note that errno values depend on the host [Eric]
 
  docs/blkdebug.txt | 161 
 ++
  1 file changed, 161 insertions(+)
  create mode 100644 docs/blkdebug.txt
 
 diff --git a/docs/blkdebug.txt b/docs/blkdebug.txt
 new file mode 100644
 index 000..5dde072
 --- /dev/null
 +++ b/docs/blkdebug.txt
 @@ -0,0 +1,161 @@
 +Block I/O error injection using blkdebug
 +
 +Copyright (C) 2014 Red Hat Inc
 +
 +This work is licensed under the terms of the GNU GPL, version 2 or later.  
 See
 +the COPYING file in the top-level directory.
 +
 +The blkdebug block driver is a rule-based error injection engine.  It can be
 +used to exercise error code paths in block drivers including ENOSPC (out of
 +space) and EIO.
 +
 +This document gives an overview of the features available in blkdebug.
 +
 +Background
 +--
 +Block drivers have many error code paths that handle I/O errors.  Image 
 formats
 +are especially complex since metadata I/O errors during cluster allocation or
 +while updating tables happen halfway through request processing and require
 +discipline to keep image files consistent.
 +
 +Error injection allows test cases to trigger I/O errors at specific points.
 +This way, all error paths can be tested to make sure they are correct.
 +
 +Rules
 +-
 +The blkdebug block driver takes a list of rules that tell the error 
 injection
 +engine when to fail an I/O request.
 +
 +Each I/O request is evaluated against the rules.  If a rule matches the 
 request
 +then its action is executed.
 +
 +Rules can be placed in a configuration file; the configuration file
 +follows the same .ini-like format used by QEMU's -readconfig option, and
 +each section of the file represents a rule.
 +
 +The following configuration file defines a single rule:
 +
 +  $ cat blkdebug.conf
 +  [inject-error]
 +  event = read_aio
 +  errno = 28
 +
 +This rule fails all aio read requests with ENOSPC (28).  Note that the errno
 +value depends on the host.  On Linux, see
 +/usr/include/asm-generic/errno-base.h for errno values.
 +
 +Invoke QEMU as follows:
 +
 +  $ qemu-system-x86_64
 +-drive 
 if=none,cache=none,file=blkdebug:blkdebug.conf:test.img,id=drive0 \
 +-device virtio-blk-pci,drive=drive0,id=virtio-blk-pci0
 +
 +Rules support the following attributes:
 +
 +  event - which type of operation to match (e.g. read_aio, write_aio,
 +  flush_to_os, flush_to_disk).  See the Events section for
 +  information on events.
 +
 +  state - (optional) the engine must be in this state number in order for 
 this
 +  rule to match.  See the State transitions section for information
 +  on states.
 +
 +  errno - the numeric errno value to return when a request matches this rule.
 +  The errno values depend on the host since the numeric values are 
 not
 +  standarized in the POSIX specification.
 +
 +  sector - (optional) a sector number that the request must overlap in order 
 to
 +   match this rule
 +
 +  once - (optional, default off) only execute this action on the first
 + matching request
 +
 +  immediately - (optional, default off) return a NULL BlockDriverAIOCB
 +pointer and fail without an errno instead.  This exercises 
 the
 +code path where BlockDriverAIOCB fails and the caller's
 +BlockDriverCompletionFunc is not invoked.
 +
 +Events
 +--
 +Block drivers provide information about the type of I/O request they are 
 about
 +to make so rules can match specific types of requests.  For example, the 
 qcow2
 +block driver tells blkdebug when it accesses the L1 table so rules can match
 +only L1 table accesses and not other metadata or guest data requests.
 +
 +The core events are:
 +
 +  read_aio - guest data read
 +
 +  write_aio - guest data write
 +
 +  flush_to_os - write out unwritten block driver state (e.g. cached metadata)
 +
 +  flush_to_disk - flush the host block device's disk cache
 +
 +See block/blkdebug.c:event_names[] for the full list of events.  You may need
 +to grep block driver source code to understand the meaning of specific 
 events.
 +
 +State transitions
 +-
 +There are cases where

Re: [Qemu-devel] [PATCH v3 23/23] block: Make device model's references to BlockBackend strong

2014-09-22 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:28PM +0200, Markus Armbruster wrote:
 Doesn't make a difference just yet, but it's the right thing to do.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block/block-backend.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)
 
 diff --git a/block/block-backend.c b/block/block-backend.c
 index d49c988..5646628 100644
 --- a/block/block-backend.c
 +++ b/block/block-backend.c
 @@ -253,6 +253,7 @@ int blk_attach_dev(BlockBackend *blk, void *dev)
  if (blk-dev) {
  return -EBUSY;
  }
 +blk_ref(blk);
  blk-dev = dev;
  bdrv_iostatus_reset(blk-bs);
  
 @@ -281,9 +282,10 @@ void blk_detach_dev(BlockBackend *blk, void *dev)
  /* TODO change to DeviceState *dev when all users are qdevified */
  {
  assert(blk-dev == dev);
 -blk-dev = NULL;
  blk-dev_ops = NULL;
  blk-dev_opaque = NULL;
 +blk-dev = NULL;
 +blk_unref(blk);
  bdrv_set_guest_block_size(blk-bs, 512);
  qemu_coroutine_adjust_pool_size(-COROUTINE_POOL_RESERVATION);
  }
 -- 
 1.9.3
 
Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 21/23] blockdev: Convert qmp_eject(), qmp_change_blockdev() to BlockBackend

2014-09-22 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:26PM +0200, Markus Armbruster wrote:
 Much more command code needs conversion.  I'm converting these now
 because they's using bdrv_dev_* functions, which I'm about to lift
 into BlockBackend.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  blockdev.c | 20 
  1 file changed, 12 insertions(+), 8 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index e218c59..e115bde 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -1502,8 +1502,10 @@ exit:
  }
  
  
 -static void eject_device(BlockDriverState *bs, int force, Error **errp)
 +static void eject_device(BlockBackend *blk, int force, Error **errp)
  {
 +BlockDriverState *bs = blk_bs(blk);
 +
  if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
  return;
  }
 @@ -1527,15 +1529,15 @@ static void eject_device(BlockDriverState *bs, int 
 force, Error **errp)
  
  void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
  {
 -BlockDriverState *bs;
 +BlockBackend *blk;
  
 -bs = bdrv_find(device);
 -if (!bs) {
 +blk = blk_by_name(device);
 +if (!blk) {
  error_set(errp, QERR_DEVICE_NOT_FOUND, device);
  return;
  }
  
 -eject_device(bs, force, errp);
 +eject_device(blk, force, errp);
  }
  
  void qmp_block_passwd(bool has_device, const char *device,
 @@ -1594,16 +1596,18 @@ static void qmp_bdrv_open_encrypted(BlockDriverState 
 *bs, const char *filename,
  void qmp_change_blockdev(const char *device, const char *filename,
   const char *format, Error **errp)
  {
 +BlockBackend *blk;
  BlockDriverState *bs;
  BlockDriver *drv = NULL;
  int bdrv_flags;
  Error *err = NULL;
  
 -bs = bdrv_find(device);
 -if (!bs) {
 +blk = blk_by_name(device);
 +if (!blk) {
  error_set(errp, QERR_DEVICE_NOT_FOUND, device);
  return;
  }
 +bs = blk_bs(blk);
  
  if (format) {
  drv = bdrv_find_whitelisted_format(format, bs-read_only);
 @@ -1613,7 +1617,7 @@ void qmp_change_blockdev(const char *device, const char 
 *filename,
  }
  }
  
 -eject_device(bs, 0, err);
 +eject_device(blk, 0, err);
  if (err) {
  error_propagate(errp, err);
  return;
 -- 
 1.9.3
 
Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 22/23] block: Lift device model API into BlockBackend

2014-09-22 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:27PM +0200, Markus Armbruster wrote:
 Move device model attachment / detachment and the BlockDevOps device
 model callbacks and their wrappers from BlockDriverState to
 BlockBackend.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block.c| 126 --
  block/block-backend.c  | 151 
 ++---
  block/qapi.c   |   8 +--
  blockdev.c |   8 +--
  include/block/block.h  |  45 
  include/block/block_int.h  |  12 ++--
  include/sysemu/block-backend.h |  35 ++
  7 files changed, 203 insertions(+), 182 deletions(-)
 
 diff --git a/block.c b/block.c
 index 1d9a680..fac1211 100644
 --- a/block.c
 +++ b/block.c
 @@ -58,9 +58,6 @@ struct BdrvDirtyBitmap {
  
  #define NOT_DONE 0x7fff /* used while emulated sync operation in 
 progress */
  
 -#define COROUTINE_POOL_RESERVATION 64 /* number of coroutines to reserve */
 -
 -static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
  static BlockAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
  int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
  BlockCompletionFunc *cb, void *opaque);
 @@ -1527,7 +1524,9 @@ int bdrv_open(BlockDriverState **pbs, const char 
 *filename,
  }
  
  if (!bdrv_key_required(bs)) {
 -bdrv_dev_change_media_cb(bs, true);
 +if (bs-blk) {
 +blk_dev_change_media_cb(bs-blk, true);
 +}
  } else if (!runstate_check(RUN_STATE_PRELAUNCH)
  !runstate_check(RUN_STATE_INMIGRATE)
  !runstate_check(RUN_STATE_PAUSED)) { /* HACK */
 @@ -1852,7 +1851,9 @@ void bdrv_close(BlockDriverState *bs)
  }
  }
  
 -bdrv_dev_change_media_cb(bs, false);
 +if (bs-blk) {
 +blk_dev_change_media_cb(bs-blk, false);
 +}
  
  /*throttling disk I/O limits*/
  if (bs-io_limits_enabled) {
 @@ -1971,9 +1972,6 @@ static void bdrv_move_feature_fields(BlockDriverState 
 *bs_dest,
  /* move some fields that need to stay attached to the device */
  
  /* dev info */
 -bs_dest-dev_ops= bs_src-dev_ops;
 -bs_dest-dev_opaque = bs_src-dev_opaque;
 -bs_dest-dev= bs_src-dev;
  bs_dest-guest_block_size   = bs_src-guest_block_size;
  bs_dest-copy_on_read   = bs_src-copy_on_read;
  
 @@ -2041,7 +2039,6 @@ void bdrv_swap(BlockDriverState *bs_new, 
 BlockDriverState *bs_old)
  assert(!bs_new-blk);
  assert(QLIST_EMPTY(bs_new-dirty_bitmaps));
  assert(bs_new-job == NULL);
 -assert(bs_new-dev == NULL);
  assert(bs_new-io_limits_enabled == false);
  assert(!throttle_have_timer(bs_new-throttle_state));
  
 @@ -2058,7 +2055,6 @@ void bdrv_swap(BlockDriverState *bs_new, 
 BlockDriverState *bs_old)
  assert(!bs_new-blk);
  
  /* Check a few fields that should remain attached to the device */
 -assert(bs_new-dev == NULL);
  assert(bs_new-job == NULL);
  assert(bs_new-io_limits_enabled == false);
  assert(!throttle_have_timer(bs_new-throttle_state));
 @@ -2097,7 +2093,6 @@ void bdrv_append(BlockDriverState *bs_new, 
 BlockDriverState *bs_top)
  
  static void bdrv_delete(BlockDriverState *bs)
  {
 -assert(!bs-dev);
  assert(!bs-job);
  assert(bdrv_op_blocker_is_empty(bs));
  assert(!bs-refcnt);
 @@ -2111,105 +2106,6 @@ static void bdrv_delete(BlockDriverState *bs)
  g_free(bs);
  }
  
 -int bdrv_attach_dev(BlockDriverState *bs, void *dev)
 -/* TODO change to DeviceState *dev when all users are qdevified */
 -{
 -if (bs-dev) {
 -return -EBUSY;
 -}
 -bs-dev = dev;
 -bdrv_iostatus_reset(bs);
 -
 -/* We're expecting I/O from the device so bump up coroutine pool size */
 -qemu_coroutine_adjust_pool_size(COROUTINE_POOL_RESERVATION);
 -return 0;
 -}
 -
 -/* TODO qdevified devices don't use this, remove when devices are qdevified 
 */
 -void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev)
 -{
 -if (bdrv_attach_dev(bs, dev)  0) {
 -abort();
 -}
 -}
 -
 -void bdrv_detach_dev(BlockDriverState *bs, void *dev)
 -/* TODO change to DeviceState *dev when all users are qdevified */
 -{
 -assert(bs-dev == dev);
 -bs-dev = NULL;
 -bs-dev_ops = NULL;
 -bs-dev_opaque = NULL;
 -bs-guest_block_size = 512;
 -qemu_coroutine_adjust_pool_size(-COROUTINE_POOL_RESERVATION);
 -}
 -
 -/* TODO change to return DeviceState * when all users are qdevified */
 -void *bdrv_get_attached_dev(BlockDriverState *bs)
 -{
 -return bs-dev;
 -}
 -
 -void bdrv_set_dev_ops(BlockDriverState *bs, const BlockDevOps *ops,
 -  void *opaque)
 -{
 -bs-dev_ops = ops;
 -bs-dev_opaque = opaque;
 -}
 -
 -static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load)
 -{
 -if (bs-dev_ops  bs-dev_ops-change_media_cb) {
 -bool tray_was_closed =

Re: [Qemu-devel] [PATCH v3 20/23] block/qapi: Convert qmp_query_block() to BlockBackend

2014-09-22 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:25PM +0200, Markus Armbruster wrote:
 Much more command code needs conversion.  I start with this one
 because it's using bdrv_dev_* functions, which I'm about to lift into
 BlockBackend.
 
 While there, give bdrv_query_info() internal linkage.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block/qapi.c | 15 ---
  include/block/qapi.h |  3 ---
  2 files changed, 8 insertions(+), 10 deletions(-)
 
 diff --git a/block/qapi.c b/block/qapi.c
 index d071ee5..fca981d 100644
 --- a/block/qapi.c
 +++ b/block/qapi.c
 @@ -28,6 +28,7 @@
  #include qapi-visit.h
  #include qapi/qmp-output-visitor.h
  #include qapi/qmp/types.h
 +#include sysemu/block-backend.h
  #ifdef __linux__
  #include linux/fs.h
  #include sys/ioctl.h
 @@ -264,15 +265,15 @@ void bdrv_query_image_info(BlockDriverState *bs,
  }
  
  /* @p_info will be set only on success. */
 -void bdrv_query_info(BlockDriverState *bs,
 - BlockInfo **p_info,
 - Error **errp)
 +static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
 +Error **errp)
  {
  BlockInfo *info = g_malloc0(sizeof(*info));
 +BlockDriverState *bs = blk_bs(blk);
  BlockDriverState *bs0;
  ImageInfo **p_image_info;
  Error *local_err = NULL;
 -info-device = g_strdup(bdrv_get_device_name(bs));
 +info-device = g_strdup(blk_name(blk));
  info-type = g_strdup(unknown);
  info-locked = bdrv_dev_is_medium_locked(bs);
  info-removable = bdrv_dev_has_removable_media(bs);
 @@ -360,12 +361,12 @@ static BlockStats *bdrv_query_stats(const 
 BlockDriverState *bs)
  BlockInfoList *qmp_query_block(Error **errp)
  {
  BlockInfoList *head = NULL, **p_next = head;
 -BlockDriverState *bs = NULL;
 +BlockBackend *blk;
  Error *local_err = NULL;
  
 - while ((bs = bdrv_next(bs))) {
 +for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
  BlockInfoList *info = g_malloc0(sizeof(*info));
 -bdrv_query_info(bs, info-value, local_err);
 +bdrv_query_info(blk, info-value, local_err);
  if (local_err) {
  error_propagate(errp, local_err);
  goto err;
 diff --git a/include/block/qapi.h b/include/block/qapi.h
 index 0374546..168d788 100644
 --- a/include/block/qapi.h
 +++ b/include/block/qapi.h
 @@ -36,9 +36,6 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs,
  void bdrv_query_image_info(BlockDriverState *bs,
 ImageInfo **p_info,
 Error **errp);
 -void bdrv_query_info(BlockDriverState *bs,
 - BlockInfo **p_info,
 - Error **errp);
  
  void bdrv_snapshot_dump(fprintf_function func_fprintf, void *f,
  QEMUSnapshotInfo *sn);
 -- 
 1.9.3
 
Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH] block: Make op blockers recursive

2014-09-22 Thread Benoît Canet

On Thu, Sep 18, 2014 at 10:57:48AM +0800, Fam Zheng wrote:

BLOCK_OP_TYPE_MIRROR_REPLACE is checked and blocked by 
block-job-complete
during the time the mirror finish when an arbitrary node of the graph 
must be
replaced.
   
   It seems to me mirror unblocks this operation from the job-blocker when 
   job
   starts, and never block it (with the job-blocker) again. It's leaked.
   
  
  block-job-complete will block it in mirror_complete.
  
  BLOCK_OP_TYPE_MIRROR_REPLACE is blocked by driver-mirror code triggered by
  block-job complete to block the replaces BDS during the end of the 
  mirroring.
  
  If you find silly that block-job-complete prevent itself from running twice 
  on
  the same BDS by checking the blocker then blocking it then the existing 
  code is
  wrong.
  
  Else the code in this current patch is correct.
  
  Could you have a glance at static void mirror_complete(BlockJob *job, 
  Error **errp)
  and tell me what you think about the situation. You should also look at
  check_to_replace_node.
  
 
 I'd prefer early error from user's point of view, so maybe checking and
 blocking can be done during mirror_start instead, then we don't need the
 relaxation. What's the advantage to delay the check to block-job-complete?
 
 Fam

BLOCK_OP_TYPE_MIRROR_REPLACE is the blocker for the replacement work so I 
blocked
it where the replacement happen.
Another point is that until block-job-complete happen we are not completely sure
that this replacement operation will happen.

Best regards

Benoît

[Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive

2014-09-22 Thread Benoît Canet

Since the block layer code is starting to modify the BDS graph right in the
middle of BDS chains (block-mirror's replace parameter for example) QEMU needs
to properly block and unblock whole BDS subtrees; recursion is a neat way to
achieve this task.

An optional base arguments was added to the API to optionally allow blocking
only a part of a BDS chain.

This patch also takes care of modifying the op blockers users.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block-migration.c |   4 +-
 block.c   | 121 ++
 block/blkverify.c |  21 
 block/commit.c|   2 +
 block/mirror.c|  26 +++---
 block/quorum.c|  29 +++
 block/stream.c|   2 +
 block/vmdk.c  |  32 
 blockjob.c|   6 +--
 include/block/block.h |  12 +++--
 include/block/block_int.h |  10 
 11 files changed, 240 insertions(+), 25 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 3ad31a2..ad7aa03 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -362,7 +362,7 @@ static void init_blk_migration_it(void *opaque, 
BlockDriverState *bs)
 bmds-shared_base = block_mig_state.shared_base;
 alloc_aio_bitmap(bmds);
 error_setg(bmds-blocker, block device is in use by migration);
-bdrv_op_block_all(bs, bmds-blocker);
+bdrv_op_block_all(bs, NULL, bmds-blocker);
 bdrv_ref(bs);
 
 block_mig_state.total_sector_sum += sectors;
@@ -600,7 +600,7 @@ static void blk_mig_cleanup(void)
 blk_mig_lock();
 while ((bmds = QSIMPLEQ_FIRST(block_mig_state.bmds_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(block_mig_state.bmds_list, entry);
-bdrv_op_unblock_all(bmds-bs, bmds-blocker);
+bdrv_op_unblock_all(bmds-bs, NULL, bmds-blocker);
 error_free(bmds-blocker);
 bdrv_unref(bmds-bs);
 g_free(bmds-aio_bitmap);
diff --git a/block.c b/block.c
index 9a9f8a0..b2d6953 100644
--- a/block.c
+++ b/block.c
@@ -1148,7 +1148,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 
 if (bs-backing_hd) {
 assert(bs-backing_blocker);
-bdrv_op_unblock_all(bs-backing_hd, bs-backing_blocker);
+bdrv_op_unblock_all(bs-backing_hd, NULL, bs-backing_blocker);
 } else if (backing_hd) {
 error_setg(bs-backing_blocker,
device is used as backing hd of '%s',
@@ -1166,9 +1166,9 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 pstrcpy(bs-backing_format, sizeof(bs-backing_format),
 backing_hd-drv ? backing_hd-drv-format_name : );
 
-bdrv_op_block_all(bs-backing_hd, bs-backing_blocker);
+bdrv_op_block_all(bs-backing_hd, NULL, bs-backing_blocker);
 /* Otherwise we won't be able to commit due to check in bdrv_commit */
-bdrv_op_unblock(bs-backing_hd, BLOCK_OP_TYPE_COMMIT,
+bdrv_op_unblock(bs-backing_hd, NULL, BLOCK_OP_TYPE_COMMIT,
 bs-backing_blocker);
 out:
 bdrv_refresh_limits(bs, NULL);
@@ -5482,7 +5482,9 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType 
op, Error **errp)
 return false;
 }
 
-void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
+/* do the real work of blocking a BDS */
+static void bdrv_do_op_block(BlockDriverState *bs, BlockOpType op,
+ Error *reason)
 {
 BdrvOpBlocker *blocker;
 assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
@@ -5492,7 +5494,9 @@ void bdrv_op_block(BlockDriverState *bs, BlockOpType op, 
Error *reason)
 QLIST_INSERT_HEAD(bs-op_blockers[op], blocker, list);
 }
 
-void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
+/* do the real work of unblocking a BDS */
+static void bdrv_do_op_unblock(BlockDriverState *bs, BlockOpType op,
+   Error *reason)
 {
 BdrvOpBlocker *blocker, *next;
 assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
@@ -5504,19 +5508,118 @@ void bdrv_op_unblock(BlockDriverState *bs, BlockOpType 
op, Error *reason)
 }
 }
 
-void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
+static bool bdrv_op_is_blocked_by(BlockDriverState *bs, BlockOpType op,
+  Error *reason)
+{
+BdrvOpBlocker *blocker;
+assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
+QLIST_FOREACH(blocker, bs-op_blockers[op], list) {
+if (blocker-reason == reason) {
+return true;
+}
+}
+return false;
+}
+
+/* block recursively a BDS
+ *
+ * If base != NULL the caller must make sure that there is no multiple child
+ * BDS in the subtree pointed to by bs
+ *
+ * @bs: the BDS to start to recurse from
+ * @base:   the BDS where the recursion should end
+ *  could be NULL if the recursion should go down everywhere
+ * @op: the type of op blocker to block
+ * @reason: the error

[Qemu-devel] [PATCH v2 0/2] Recursive op blockers

2014-09-22 Thread Benoît Canet

in v2:
Added a base parameters [Kevin]
Remove multiple blockers [Kevin]

Addressed Kevin's comments except BLOCK_OP_TYPE_MIRROR_REPLACE relaxing
and You need to unblock all of the BDSes that are going to be removed.
Aren't you unblocking more than that here?.


pass ./check -qcow2

Benoît Canet (2):
  block: Rename BLOCK_OP_TYPE_REPLACE to BLOCK_OP_TYPE_MIRROR_REPLACE
  block: Make op blockers recursive

 block-migration.c |   4 +-
 block.c   | 123 ++
 block/blkverify.c |  21 
 block/commit.c|   2 +
 block/mirror.c|  26 +++---
 block/quorum.c|  29 +++
 block/stream.c|   2 +
 block/vmdk.c  |  32 
 blockjob.c|   6 +--
 include/block/block.h |  14 --
 include/block/block_int.h |  10 
 11 files changed, 242 insertions(+), 27 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH v2 1/2] block: Rename BLOCK_OP_TYPE_REPLACE to BLOCK_OP_TYPE_MIRROR_REPLACE

2014-09-22 Thread Benoît Canet

This operation blocker is really specific to the mirroring code so its name
should reflect this.

Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 block.c   | 2 +-
 include/block/block.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index bcd952a..9a9f8a0 100644
--- a/block.c
+++ b/block.c
@@ -5912,7 +5912,7 @@ BlockDriverState *check_to_replace_node(const char 
*node_name, Error **errp)
 return NULL;
 }
 
-if (bdrv_op_is_blocked(to_replace_bs, BLOCK_OP_TYPE_REPLACE, errp)) {
+if (bdrv_op_is_blocked(to_replace_bs, BLOCK_OP_TYPE_MIRROR_REPLACE, errp)) 
{
 return NULL;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 07d6d8e..10952ed 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -176,7 +176,7 @@ typedef enum BlockOpType {
 BLOCK_OP_TYPE_MIRROR,
 BLOCK_OP_TYPE_RESIZE,
 BLOCK_OP_TYPE_STREAM,
-BLOCK_OP_TYPE_REPLACE,
+BLOCK_OP_TYPE_MIRROR_REPLACE,
 BLOCK_OP_TYPE_MAX,
 } BlockOpType;
 
-- 
2.1.0

[Qemu-devel] [PATCH v2 2/2] block: Make op blockers recursive

2014-09-22 Thread Benoît Canet

Since the block layer code is starting to modify the BDS graph right in the
middle of BDS chains (block-mirror's replace parameter for example) QEMU needs
to properly block and unblock whole BDS subtrees; recursion is a neat way to
achieve this task.

An optional base arguments was added to the API to optionally allow blocking
only a part of a BDS chain.

This patch also takes care of modifying the op blockers users.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block-migration.c |   4 +-
 block.c   | 121 ++
 block/blkverify.c |  21 
 block/commit.c|   2 +
 block/mirror.c|  26 +++---
 block/quorum.c|  29 +++
 block/stream.c|   2 +
 block/vmdk.c  |  32 
 blockjob.c|   6 +--
 include/block/block.h |  12 +++--
 include/block/block_int.h |  10 
 11 files changed, 240 insertions(+), 25 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 3ad31a2..ad7aa03 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -362,7 +362,7 @@ static void init_blk_migration_it(void *opaque, 
BlockDriverState *bs)
 bmds-shared_base = block_mig_state.shared_base;
 alloc_aio_bitmap(bmds);
 error_setg(bmds-blocker, block device is in use by migration);
-bdrv_op_block_all(bs, bmds-blocker);
+bdrv_op_block_all(bs, NULL, bmds-blocker);
 bdrv_ref(bs);
 
 block_mig_state.total_sector_sum += sectors;
@@ -600,7 +600,7 @@ static void blk_mig_cleanup(void)
 blk_mig_lock();
 while ((bmds = QSIMPLEQ_FIRST(block_mig_state.bmds_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(block_mig_state.bmds_list, entry);
-bdrv_op_unblock_all(bmds-bs, bmds-blocker);
+bdrv_op_unblock_all(bmds-bs, NULL, bmds-blocker);
 error_free(bmds-blocker);
 bdrv_unref(bmds-bs);
 g_free(bmds-aio_bitmap);
diff --git a/block.c b/block.c
index 9a9f8a0..b2d6953 100644
--- a/block.c
+++ b/block.c
@@ -1148,7 +1148,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 
 if (bs-backing_hd) {
 assert(bs-backing_blocker);
-bdrv_op_unblock_all(bs-backing_hd, bs-backing_blocker);
+bdrv_op_unblock_all(bs-backing_hd, NULL, bs-backing_blocker);
 } else if (backing_hd) {
 error_setg(bs-backing_blocker,
device is used as backing hd of '%s',
@@ -1166,9 +1166,9 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 pstrcpy(bs-backing_format, sizeof(bs-backing_format),
 backing_hd-drv ? backing_hd-drv-format_name : );
 
-bdrv_op_block_all(bs-backing_hd, bs-backing_blocker);
+bdrv_op_block_all(bs-backing_hd, NULL, bs-backing_blocker);
 /* Otherwise we won't be able to commit due to check in bdrv_commit */
-bdrv_op_unblock(bs-backing_hd, BLOCK_OP_TYPE_COMMIT,
+bdrv_op_unblock(bs-backing_hd, NULL, BLOCK_OP_TYPE_COMMIT,
 bs-backing_blocker);
 out:
 bdrv_refresh_limits(bs, NULL);
@@ -5482,7 +5482,9 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType 
op, Error **errp)
 return false;
 }
 
-void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
+/* do the real work of blocking a BDS */
+static void bdrv_do_op_block(BlockDriverState *bs, BlockOpType op,
+ Error *reason)
 {
 BdrvOpBlocker *blocker;
 assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
@@ -5492,7 +5494,9 @@ void bdrv_op_block(BlockDriverState *bs, BlockOpType op, 
Error *reason)
 QLIST_INSERT_HEAD(bs-op_blockers[op], blocker, list);
 }
 
-void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
+/* do the real work of unblocking a BDS */
+static void bdrv_do_op_unblock(BlockDriverState *bs, BlockOpType op,
+   Error *reason)
 {
 BdrvOpBlocker *blocker, *next;
 assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
@@ -5504,19 +5508,118 @@ void bdrv_op_unblock(BlockDriverState *bs, BlockOpType 
op, Error *reason)
 }
 }
 
-void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
+static bool bdrv_op_is_blocked_by(BlockDriverState *bs, BlockOpType op,
+  Error *reason)
+{
+BdrvOpBlocker *blocker;
+assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
+QLIST_FOREACH(blocker, bs-op_blockers[op], list) {
+if (blocker-reason == reason) {
+return true;
+}
+}
+return false;
+}
+
+/* block recursively a BDS
+ *
+ * If base != NULL the caller must make sure that there is no multiple child
+ * BDS in the subtree pointed to by bs
+ *
+ * @bs: the BDS to start to recurse from
+ * @base:   the BDS where the recursion should end
+ *  could be NULL if the recursion should go down everywhere
+ * @op: the type of op blocker to block
+ * @reason: the error

[Qemu-devel] [PATCH v2 1/2] block: Rename BLOCK_OP_TYPE_REPLACE to BLOCK_OP_TYPE_MIRROR_REPLACE

2014-09-22 Thread Benoît Canet

This operation blocker is really specific to the mirroring code so its name
should reflect this.

Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 block.c   | 2 +-
 include/block/block.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index bcd952a..9a9f8a0 100644
--- a/block.c
+++ b/block.c
@@ -5912,7 +5912,7 @@ BlockDriverState *check_to_replace_node(const char 
*node_name, Error **errp)
 return NULL;
 }
 
-if (bdrv_op_is_blocked(to_replace_bs, BLOCK_OP_TYPE_REPLACE, errp)) {
+if (bdrv_op_is_blocked(to_replace_bs, BLOCK_OP_TYPE_MIRROR_REPLACE, errp)) 
{
 return NULL;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 07d6d8e..10952ed 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -176,7 +176,7 @@ typedef enum BlockOpType {
 BLOCK_OP_TYPE_MIRROR,
 BLOCK_OP_TYPE_RESIZE,
 BLOCK_OP_TYPE_STREAM,
-BLOCK_OP_TYPE_REPLACE,
+BLOCK_OP_TYPE_MIRROR_REPLACE,
 BLOCK_OP_TYPE_MAX,
 } BlockOpType;
 
-- 
2.1.0

[Qemu-devel] [PATCH v2 0/2] Recursive op blockers

2014-09-22 Thread Benoît Canet

in v2:
Added a base parameters [Kevin]
Remove multiple blockers [Kevin]

Addressed Kevin's comments except BLOCK_OP_TYPE_MIRROR_REPLACE relaxing
and You need to unblock all of the BDSes that are going to be removed.
Aren't you unblocking more than that here?.


pass ./check -qcow2

Benoît Canet (2):
  block: Rename BLOCK_OP_TYPE_REPLACE to BLOCK_OP_TYPE_MIRROR_REPLACE
  block: Make op blockers recursive

 block-migration.c |   4 +-
 block.c   | 123 ++
 block/blkverify.c |  21 
 block/commit.c|   2 +
 block/mirror.c|  26 +++---
 block/quorum.c|  29 +++
 block/stream.c|   2 +
 block/vmdk.c  |  32 
 blockjob.c|   6 +--
 include/block/block.h |  14 --
 include/block/block_int.h |  10 
 11 files changed, 242 insertions(+), 27 deletions(-)

-- 
2.1.0

Re: [Qemu-devel] [PATCH 0/2] Handle interferences between multiple children drivers and stream and commit

2014-09-19 Thread Benoît Canet

On Fri, Sep 19, 2014 at 12:35:36AM -0400, Jeff Cody wrote:

 On Mon, Sep 15, 2014 at 04:21:18PM +0200, Benoît Canet wrote:
  We do not want to try to stream or commit with a base argument through
  a multiple children driver.
  
  Handle this case.
  
  Benoît Canet (2):
block: Introduce a BlockDriver field to flag drivers supporting
  multiple children
block: commit and stream return an error when a subtree is found
  
   block.c   | 17 +++--
   block/blkverify.c |  1 +
   block/quorum.c|  1 +
   block/vmdk.c  |  1 +
   blockdev.c| 18 --
   include/block/block.h |  3 ++-
   include/block/block_int.h |  6 ++
   7 files changed, 38 insertions(+), 9 deletions(-)
  
  -- 
  2.1.0
 
 
 
 I had some specific comments on the implementation, but once I stepped
 back some I am not sure this series is needed.
 
 When we talked on irc, you mentioned it was needed because your
 recursive blocker code would assert in quorum/blkverify/vmdk - but I
 think that might mean the recursive code needs refactoring.
 
 Here is why I am not sure this is needed: we should be able to commit
 through/to an image that has multiple child nodes, if we can treat
 that image as an actual BDS.
 
 I view these as a black box, like so:
 
(imageE)
  
  | [childA] |
  | [childB] |
 [base]  |  | - [imageF] --- [active]
  | [childC] |
  ^   | [childD] |
  |   
  |
  |
 (Perhaps base isn't support by the format, in which case it will always
 be NULL anyway)
 
 
 ImageE may have multiple children (and perhaps each of those children
 are an entire chain themselves), but ultimately imageE is a 'BDS'
 entry.
 
 Perhaps the format, like quorum, will not have a backing_hd pointer.
 But in that case, backing_hd will be NULL.
 
 Thinking about your recursive blocker code, why assert when the 'base'
 termination argument is NULL?  If the multi-child format does not
 support a backing_hd, then bdrv_find_base_image() will just find it as
 the end.
 
 The .bdrv_op_recursive_block() in that case could still navigate down
 each private child node, because if the original 'base' blocker
 applied to the multi-child parent (imageE), then that same blocker
 should apply to each private child (and their children), regardless of
 'base' - because they are essentially part of the pseudo-atomic entity
 of (imageE).
 
 Does this make sense, or am I missing a piece?

Thanks for the reply Jeff.

I think I'll drop the entry I CCed you, remove the assert then work to post
the new recursive blocker code.

Best regards

Benoît

 
 Thanks,
 Jeff

Re: [Qemu-devel] [PATCH v3 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-17 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:15PM +0200, Markus Armbruster wrote:

 -s-enabled = dinfo ? bdrv_is_inserted(dinfo-bdrv) : 0;
 +s-enabled = bs  bdrv_is_inserted(bs);

This is not so mechanical but seems correct anyway.

Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 12/23] virtio-blk: Drop redundant VirtIOBlock member conf

2014-09-17 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:17PM +0200, Markus Armbruster wrote:
 Commit 12c5674 turned it into a pointer to member blk.conf.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  hw/block/virtio-blk.c  | 28 ++--
  include/hw/virtio/virtio-blk.h |  1 -
  2 files changed, 14 insertions(+), 15 deletions(-)
 
 diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
 index 38ad38f..5943af5 100644
 --- a/hw/block/virtio-blk.c
 +++ b/hw/block/virtio-blk.c
 @@ -298,7 +298,7 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock *dev,
  if (sector  dev-sector_mask) {
  return false;
  }
 -if (size % dev-conf-logical_block_size) {
 +if (size % dev-blk.conf.logical_block_size) {
  return false;
  }
  bdrv_get_geometry(dev-bs, total_sectors);
 @@ -519,19 +519,20 @@ static void virtio_blk_reset(VirtIODevice *vdev)
  static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
  {
  VirtIOBlock *s = VIRTIO_BLK(vdev);
 +BlockConf *conf = s-blk.conf;
  struct virtio_blk_config blkcfg;
  uint64_t capacity;
 -int blk_size = s-conf-logical_block_size;
 +int blk_size = conf-logical_block_size;
  
  bdrv_get_geometry(s-bs, capacity);
  memset(blkcfg, 0, sizeof(blkcfg));
  virtio_stq_p(vdev, blkcfg.capacity, capacity);
  virtio_stl_p(vdev, blkcfg.seg_max, 128 - 2);
 -virtio_stw_p(vdev, blkcfg.cylinders, s-conf-cyls);
 +virtio_stw_p(vdev, blkcfg.cylinders, conf-cyls);
  virtio_stl_p(vdev, blkcfg.blk_size, blk_size);
 -virtio_stw_p(vdev, blkcfg.min_io_size, s-conf-min_io_size / blk_size);
 -virtio_stw_p(vdev, blkcfg.opt_io_size, s-conf-opt_io_size / blk_size);
 -blkcfg.heads = s-conf-heads;
 +virtio_stw_p(vdev, blkcfg.min_io_size, conf-min_io_size / blk_size);
 +virtio_stw_p(vdev, blkcfg.opt_io_size, conf-opt_io_size / blk_size);
 +blkcfg.heads = conf-heads;
  /*
   * We must ensure that the block device capacity is a multiple of
   * the logical block size. If that is not the case, let's use
 @@ -543,13 +544,13 @@ static void virtio_blk_update_config(VirtIODevice 
 *vdev, uint8_t *config)
   * divided by 512 - instead it is the amount of blk_size blocks
   * per track (cylinder).
   */
 -if (bdrv_getlength(s-bs) /  s-conf-heads / s-conf-secs % blk_size) {
 -blkcfg.sectors = s-conf-secs  ~s-sector_mask;
 +if (bdrv_getlength(s-bs) /  conf-heads / conf-secs % blk_size) {
 +blkcfg.sectors = conf-secs  ~s-sector_mask;
  } else {
 -blkcfg.sectors = s-conf-secs;
 +blkcfg.sectors = conf-secs;
  }
  blkcfg.size_max = 0;
 -blkcfg.physical_block_exp = get_physical_block_exp(s-conf);
 +blkcfg.physical_block_exp = get_physical_block_exp(s-blk.conf);
  blkcfg.alignment_offset = 0;
  blkcfg.wce = bdrv_enable_write_cache(s-bs);
  memcpy(config, blkcfg, sizeof(struct virtio_blk_config));
 @@ -756,9 +757,8 @@ static void virtio_blk_device_realize(DeviceState *dev, 
 Error **errp)
  sizeof(struct virtio_blk_config));
  
  s-bs = blk-conf.bs;
 -s-conf = blk-conf;
  s-rq = NULL;
 -s-sector_mask = (s-conf-logical_block_size / BDRV_SECTOR_SIZE) - 1;
 +s-sector_mask = (s-blk.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
  
  s-vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
  s-complete_request = virtio_blk_complete_request;
 @@ -777,11 +777,11 @@ static void virtio_blk_device_realize(DeviceState *dev, 
 Error **errp)
  register_savevm(dev, virtio-blk, virtio_blk_id++, 2,
  virtio_blk_save, virtio_blk_load, s);
  bdrv_set_dev_ops(s-bs, virtio_block_ops, s);
 -bdrv_set_guest_block_size(s-bs, s-conf-logical_block_size);
 +bdrv_set_guest_block_size(s-bs, s-blk.conf.logical_block_size);
  
  bdrv_iostatus_enable(s-bs);
  
 -add_boot_device_path(s-conf-bootindex, dev, /disk@0,0);
 +add_boot_device_path(s-blk.conf.bootindex, dev, /disk@0,0);
  }
  
  static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
 diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
 index cf61154..18967c8 100644
 --- a/include/hw/virtio/virtio-blk.h
 +++ b/include/hw/virtio/virtio-blk.h
 @@ -125,7 +125,6 @@ typedef struct VirtIOBlock {
  VirtQueue *vq;
  void *rq;
  QEMUBH *bh;
 -BlockConf *conf;
  VirtIOBlkConf blk;
  unsigned short sector_mask;
  bool original_wce;
 -- 
 1.9.3
 

Look good

Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 13/23] virtio-blk: Rename VirtIOBlkConf variables to conf

2014-09-17 Thread Benoît Canet

 -if (size % dev-blk.conf.logical_block_size) {
 +if (size % dev-conf.conf.logical_block_size) {

This look strange (conf.conf)

Anyway this seems correct.

Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 17/23] blockdev: Drop superfluous DriveInfo member id

2014-09-17 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:22PM +0200, Markus Armbruster wrote:
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block/block-backend.c | 1 -
  blockdev.c| 3 +--
  include/sysemu/blockdev.h | 1 -
  3 files changed, 1 insertion(+), 4 deletions(-)
 
 diff --git a/block/block-backend.c b/block/block-backend.c
 index 5f796b4..141a31b 100644
 --- a/block/block-backend.c
 +++ b/block/block-backend.c
 @@ -104,7 +104,6 @@ static void drive_info_del(DriveInfo *dinfo)
  if (dinfo-opts) {
  qemu_opts_del(dinfo-opts);
  }
 -g_free(dinfo-id);
  g_free(dinfo-serial);
  g_free(dinfo);
  }
 diff --git a/blockdev.c b/blockdev.c
 index e87bf4b..2def256 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -458,7 +458,6 @@ static BlockBackend *blockdev_init(const char *file, 
 QDict *bs_opts,
  }
  
  dinfo = g_malloc0(sizeof(*dinfo));
 -dinfo-id = g_strdup(qemu_opts_id(opts));
  blk_set_legacy_dinfo(blk, dinfo);
  
  if (!file || !*file) {
 @@ -492,7 +491,7 @@ static BlockBackend *blockdev_init(const char *file, 
 QDict *bs_opts,
  
  if (ret  0) {
  error_setg(errp, could not open disk image %s: %s,
 -   file ?: dinfo-id, error_get_pretty(error));
 +   file ?: blk_name(blk), error_get_pretty(error));
  error_free(error);
  goto err;
  }
 diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
 index f66b89a..27a40d5 100644
 --- a/include/sysemu/blockdev.h
 +++ b/include/sysemu/blockdev.h
 @@ -30,7 +30,6 @@ typedef enum {
  } BlockInterfaceType;
  
  struct DriveInfo {
 -char *id;
  const char *devaddr;
  BlockInterfaceType type;
  int bus;
 -- 
 1.9.3
 
Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 16/23] pc87312: Drop unused members of PC87312State

2014-09-17 Thread Benoît Canet

On Tue, Sep 16, 2014 at 08:12:21PM +0200, Markus Armbruster wrote:
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  include/hw/isa/pc87312.h | 3 ---
  1 file changed, 3 deletions(-)
 
 diff --git a/include/hw/isa/pc87312.h b/include/hw/isa/pc87312.h
 index befc8bd..bf74470 100644
 --- a/include/hw/isa/pc87312.h
 +++ b/include/hw/isa/pc87312.h
 @@ -47,13 +47,10 @@ typedef struct PC87312State {
  
  struct {
  ISADevice *dev;
 -BlockDriverState *drive[2];
 -uint32_t base;
  } fdc;
  
  struct {
  ISADevice *dev;
 -uint32_t base;
  } ide;
  
  MemoryRegion io;
 -- 
 1.9.3
 
Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH] block: Validate node-name

2014-09-17 Thread Benoît Canet



 +int qemu_opts_id_wellformed(const char *id)

This return 0 and 1 as a bool.
Could we make the function return bool in the same series ?

I wonder what are the possible interferences between !strchr(-._, id[i])
and Jeff's node name auto naming series.

Re: [Qemu-devel] [PATCH v3 18/23] blockdev: Fix blockdev-add not to create IDE drive (0, 0)

2014-09-17 Thread Benoît Canet

 -- 
 1.9.3
 
Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v3 19/23] blockdev: Drop DriveInfo member enable_auto_del

2014-09-17 Thread Benoît Canet

 1.9.3
 

Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH] block: Validate node-name

2014-09-17 Thread Benoît Canet

The Wednesday 17 Sep 2014 à 13:31:06 (+0200), Kevin Wolf wrote :
 The device_name of a BlockDriverState is currently checked because it is
 always used as a QemuOpts ID and qemu_opts_create() checks whether such
 IDs are wellformed.
 
 node-name is supposed to share the same namespace, but it isn't checked
 currently. This patch adds explicit checks both for device_name and
 node-name so that the same rules will still apply even if QemuOpts won't
 be used any more at some point.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  block.c   | 16 +---
  include/qemu/option.h |  1 +
  util/qemu-option.c|  4 ++--
  3 files changed, 16 insertions(+), 5 deletions(-)
 
 diff --git a/block.c b/block.c
 index e144fd5..bddf1a0 100644
 --- a/block.c
 +++ b/block.c
 @@ -335,12 +335,22 @@ void bdrv_register(BlockDriver *bdrv)
  QLIST_INSERT_HEAD(bdrv_drivers, bdrv, list);
  }
  
 +static bool bdrv_is_valid_name(const char *name)
 +{
 +return qemu_opts_id_wellformed(name);
 +}
 +
  /* create a new block device (by default it is empty) */
  BlockDriverState *bdrv_new(const char *device_name, Error **errp)
  {
  BlockDriverState *bs;
  int i;
  
 +if (*device_name  !bdrv_is_valid_name(device_name)) {
 +error_setg(errp, Invalid device name);
 +return NULL;
 +}
 +
  if (bdrv_find(device_name)) {
  error_setg(errp, Device with id '%s' already exists,
 device_name);
 @@ -903,9 +913,9 @@ static void bdrv_assign_node_name(BlockDriverState *bs,
  return;
  }
  
 -/* empty string node name is invalid */
 -if (node_name[0] == '\0') {
 -error_setg(errp, Empty node name);
 +/* Check for empty string or invalid characters */
 +if (!bdrv_is_valid_name(node_name)) {
 +error_setg(errp, Invalid node name);
  return;
  }
  
 diff --git a/include/qemu/option.h b/include/qemu/option.h
 index 59bea75..945347c 100644
 --- a/include/qemu/option.h
 +++ b/include/qemu/option.h
 @@ -103,6 +103,7 @@ typedef int (*qemu_opt_loopfunc)(const char *name, const 
 char *value, void *opaq
  int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
   int abort_on_failure);
  
 +int qemu_opts_id_wellformed(const char *id);
  QemuOpts *qemu_opts_find(QemuOptsList *list, const char *id);
  QemuOpts *qemu_opts_create(QemuOptsList *list, const char *id,
 int fail_if_exists, Error **errp);
 diff --git a/util/qemu-option.c b/util/qemu-option.c
 index 6dc27ce..0cf9960 100644
 --- a/util/qemu-option.c
 +++ b/util/qemu-option.c
 @@ -641,7 +641,7 @@ QemuOpts *qemu_opts_find(QemuOptsList *list, const char 
 *id)
  return NULL;
  }
  
 -static int id_wellformed(const char *id)
 +int qemu_opts_id_wellformed(const char *id)
  {
  int i;
  
 @@ -662,7 +662,7 @@ QemuOpts *qemu_opts_create(QemuOptsList *list, const char 
 *id,
  QemuOpts *opts = NULL;
  
  if (id) {
 -if (!id_wellformed(id)) {
 +if (!qemu_opts_id_wellformed(id)) {
  error_set(errp,QERR_INVALID_PARAMETER_VALUE, id, an 
 identifier);
  #if 0 /* conversion from qerror_report() to error_set() broke this: */
  error_printf_unless_qmp(Identifiers consist of letters, digits, 
 '-', '.', '_', starting with a letter.\n);
 -- 
 1.8.3.1
 

Reviewed-by: Benoit Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2 04/23] block: Connect BlockBackend and DriveInfo

2014-09-16 Thread Benoît Canet

 +DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
 +{
 +return blk-legacy_dinfo;
 +}
 +
 +/*
 + * Set @blk's DriveInfo to @dinfo, and return it.
 + * @blk must not have a DriveInfo set already.

 + * No other BlockBackend may have the same DriveInfo set.
Must we try to assert this ?
This would require an extra linked list traversal but I don't think it's a
performance path anyway.

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2 05/23] block: Code motion to get rid of stubs/blockdev.c

2014-09-16 Thread Benoît Canet

On Sat, Sep 13, 2014 at 05:00:09PM +0200, Markus Armbruster wrote:
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block/block-backend.c | 15 +++
  blockdev.c| 13 -
  include/sysemu/blockdev.h |  1 -
  stubs/Makefile.objs   |  1 -
  stubs/blockdev.c  | 12 
  5 files changed, 15 insertions(+), 27 deletions(-)
  delete mode 100644 stubs/blockdev.c
 
 diff --git a/block/block-backend.c b/block/block-backend.c
 index 7b8c062..0842abe 100644
 --- a/block/block-backend.c
 +++ b/block/block-backend.c
 @@ -22,6 +22,8 @@ struct BlockBackend {
  QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
  };
  
 +static void drive_info_del(DriveInfo *dinfo);
 +
  /* All the BlockBackends (except for hidden ones) */
  static QTAILQ_HEAD(, BlockBackend) blk_backends =
  QTAILQ_HEAD_INITIALIZER(blk_backends);
 @@ -92,6 +94,19 @@ static void blk_delete(BlockBackend *blk)
  g_free(blk);
  }
  
 +static void drive_info_del(DriveInfo *dinfo)
 +{
 +if (!dinfo) {
 +return;
 +}
 +if (dinfo-opts) {
 +qemu_opts_del(dinfo-opts);
 +}
 +g_free(dinfo-id);
 +g_free(dinfo-serial);
 +g_free(dinfo);
 +}
 +
  /*
   * Increment @blk's reference count.
   * @blk must not be null.
 diff --git a/blockdev.c b/blockdev.c
 index aec9f0e..0ed108d 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -223,19 +223,6 @@ void drive_del(DriveInfo *dinfo)
  blk_unref(blk_by_legacy_dinfo(dinfo));
  }
  
 -void drive_info_del(DriveInfo *dinfo)
 -{
 -if (!dinfo) {
 -return;
 -}
 -if (dinfo-opts) {
 -qemu_opts_del(dinfo-opts);
 -}
 -g_free(dinfo-id);
 -g_free(dinfo-serial);
 -g_free(dinfo);
 -}
 -
  typedef struct {
  QEMUBH *bh;
  BlockDriverState *bs;
 diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
 index 1dc5906..2ed297b 100644
 --- a/include/sysemu/blockdev.h
 +++ b/include/sysemu/blockdev.h
 @@ -60,7 +60,6 @@ QemuOpts *drive_add(BlockInterfaceType type, int index, 
 const char *file,
  const char *optstr);
  DriveInfo *drive_new(QemuOpts *arg, BlockInterfaceType block_default_type);
  void drive_del(DriveInfo *dinfo);
 -void drive_info_del(DriveInfo *dinfo);
  
  /* device-hotplug */
  
 diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
 index c0b1f6a..5e347d0 100644
 --- a/stubs/Makefile.objs
 +++ b/stubs/Makefile.objs
 @@ -1,6 +1,5 @@
  stub-obj-y += arch-query-cpu-def.o
  stub-obj-y += bdrv-commit-all.o
 -stub-obj-y += blockdev.o
  stub-obj-y += chr-baum-init.o
  stub-obj-y += chr-msmouse.o
  stub-obj-y += chr-testdev.o
 diff --git a/stubs/blockdev.c b/stubs/blockdev.c
 deleted file mode 100644
 index 5d0a79c..000
 --- a/stubs/blockdev.c
 +++ /dev/null
 @@ -1,12 +0,0 @@
 -#include assert.h
 -#include sysemu/blockdev.h
 -
 -DriveInfo *drive_get_by_blockdev(BlockDriverState *bs)
 -{
 -return NULL;
 -}
 -
 -void drive_info_del(DriveInfo *dinfo)
 -{
 -assert(!dinfo);
 -}
 -- 
 1.9.3
 
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2 06/23] block: Make BlockBackend own its BlockDriverState

2014-09-16 Thread Benoît Canet

  /* blkdev-bs is not create by us, we get a reference
   * so we can bdrv_unref() unconditionally */
 -bdrv_ref(blkdev-bs);

 +/* Except we don't bdrv_unref() anymore, we blk_unref().
Is this dot extra ?   ^
The following line seems to make better sense without this dot and the initial 
cap.
 + * Conditionally, because we can't easily blk_ref() here.

 @@ -712,9 +711,7 @@ static int img_check(int argc, char **argv)
  
  fail:
  qapi_free_ImageCheck(check);
 -bdrv_unref(bs);
  blk_unref(blk);
 -
Spurious blank line removal here.

 @@ -483,9 +481,6 @@ int main(int argc, char **argv)
   */
  bdrv_drain_all();
  
 -if (qemuio_bs) {
 -bdrv_unref(qemuio_bs);
 -}
  blk_unref(qemuio_blk);
  g_free(readline_state);
  return 0;
 diff --git a/qemu-nbd.c b/qemu-nbd.c
 index fa8a7d0..0c496af 100644
 --- a/qemu-nbd.c
 +++ b/qemu-nbd.c
 @@ -771,7 +771,6 @@ int main(int argc, char **argv)

Well that does seems to simplify things a lot.

Re: [Qemu-devel] [PATCH v2 07/23] block: Eliminate bdrv_iterate(), use bdrv_next()

2014-09-16 Thread Benoît Canet

 @@ -4717,10 +4699,14 @@ static void monitor_find_completion_by_table(Monitor 
 *mon,
  break;
  case 'B':
  /* block device name completion */
 -mbs.mon = mon;
 -mbs.input = str;
  readline_set_completion_index(mon-rs, strlen(str));
 -bdrv_iterate(block_completion_it, mbs);
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
 +name = bdrv_get_device_name(bs);
 +if (str[0] == '\0' ||
 +!strncmp(name, str, strlen(str))) {
 +readline_add_completion(mon-rs, name);
 +}
 +}
  break;
  case 's':
  case 'S':
 -- 
 1.9.3
 
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-16 Thread Benoît Canet



  #include qemu-common.h
 -#include block/block_int.h
 +#include block/block.h
 +#include qemu/error-report.h
 +#include qemu/main-loop.h

That's a lot of include fiddling I am not sure to understand them while looking
at the following diff.

  #include hw/hw.h
  #include qemu/queue.h
  #include qemu/timer.h
 @@ -130,9 +132,9 @@ static void blk_send(QEMUFile *f, BlkMigBlock * blk)
   | flags);
  
  /* device name */
 -len = strlen(blk-bmds-bs-device_name);
 +len = strlen(bdrv_get_device_name(blk-bmds-bs));
  qemu_put_byte(f, len);
 -qemu_put_buffer(f, (uint8_t *)blk-bmds-bs-device_name, len);
 +qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk-bmds-bs), len);
  
  /* if a block is zero we need to flush here since the network
   * bandwidth is now a lot higher than the storage device bandwidth.
 @@ -382,9 +384,9 @@ static void init_blk_migration(QEMUFile *f)
  
  if (bmds-shared_base) {
  DPRINTF(Start migration for %s with shared base image\n,
 -bs-device_name);
 +bdrv_get_device_name(bs));
  } else {
 -DPRINTF(Start full migration for %s\n, bs-device_name);
 +DPRINTF(Start full migration for %s\n, 
 bdrv_get_device_name(bs));
  }
  
  QSIMPLEQ_INSERT_TAIL(block_mig_state.bmds_list, bmds, entry);


 @@ -1959,10 +1941,10 @@ void bdrv_drain_all(void)
   V  Also, NULL terminate the device_name to prevent double remove */
  void bdrv_make_anon(BlockDriverState *bs)
  {
 -if (bs-device_name[0] != '\0') {


 +if (bs-device_list.tqe_prev) {
  QTAILQ_REMOVE(bdrv_states, bs, device_list);
 +bs-device_list.tqe_prev = NULL;

I think a comments explaining the trick you are doing here would be worthy:
after all you are touching directly the inner parts of a linked list and
bypassing the list API.


  }
 -bs-device_name[0] = '\0';
  if (bs-node_name[0] != '\0') {
  QTAILQ_REMOVE(graph_bdrv_states, bs, node_list);
  }

Re: [Qemu-devel] [PATCH v2 09/23] block: Merge BlockBackend and BlockDriverState name spaces

2014-09-16 Thread Benoît Canet

 1.9.3
 
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2] async: aio_context_new(): Handle event_notifier_init failure

2014-09-16 Thread Benoît Canet

  static void iothread_complete(UserCreatable *obj, Error **errp)
  {
 +Error *local_error = NULL;
  IOThread *iothread = IOTHREAD(obj);
  
  iothread-stopping = false;
 -iothread-ctx = aio_context_new();
 +iothread-ctx = aio_context_new(local_error);
 +if (!iothread-ctx) {
 +error_report(%s, error_get_pretty(local_error));
 +error_report(Failed to create AIO context);

I think reporting one line is sufficient.

You could do something in the vein of.
error_report(Failed to create AIO Context: \'%s\', 
error_get_pretty(local_error));


 +exit(1);
Also here I don't know if exiting in the middle of a class initialization
completion function is good taste.
You should ask to someone knowing QOM.

  signal(SIGPIPE, SIG_IGN);
 @@ -444,7 +446,13 @@ int main(int argc, char **argv)
  exit(1);
  }
  
 -qemu_init_main_loop();
 +ret = qemu_init_main_loop(local_error);
 +if (ret  0) {
 +error_report(%s, error_get_pretty(local_error));
 +error_report(qemu_init_main_loop failed);
 +error_free(local_error);

 +return ret;

We are in main here.
Should it be exit(something) like above ?


  if (local_err) {
 -errx(EXIT_FAILURE, Failed to parse detect_zeroes mode: %s, 
 +errx(EXIT_FAILURE, Failed to parse detect_zeroes mode: %s,
   error_get_pretty(local_err));
  }
  if (detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP 
  !(flags  BDRV_O_UNMAP)) {
  errx(EXIT_FAILURE, setting detect-zeroes to unmap is not 
 allowed 
 -   without setting discard operation to 
 unmap); 
 +   without setting discard operation to 
 unmap);

No cosmetic fixes on parts of the code unmodified by a patch please.
Your editor is probably removing these extra EOL spaces on your behalf. Fix it.

  }
  break;
  case 'b':
 @@ -674,7 +674,13 @@ int main(int argc, char **argv)
  snprintf(sockpath, 128, SOCKET_PATH, basename(device));
  }
  
 -qemu_init_main_loop();
 +ret = qemu_init_main_loop(local_err);
 +if (ret  0) {
 +error_report(%s, error_get_pretty(local_err));
 +error_report(qemu_init_main_loop failed);
 +error_free(local_err);
 +exit(EXIT_FAILURE);
 +}
  bdrv_init();
  atexit(bdrv_close_all);
  
 diff --git a/tests/test-aio.c b/tests/test-aio.c
 index c6a8713..4cb3783 100644
 --- a/tests/test-aio.c
 +++ b/tests/test-aio.c
 @@ -14,6 +14,7 @@
  #include block/aio.h
  #include qemu/timer.h
  #include qemu/sockets.h
 +#include qemu/error-report.h
  
  static AioContext *ctx;
  
 @@ -810,11 +811,18 @@ static void test_source_timer_schedule(void)
  
  int main(int argc, char **argv)
  {
 +Error *local_error;
  GSource *src;
  
  init_clocks();
  
 -ctx = aio_context_new();
 +ctx = aio_context_new(local_error);
 +if (!ctx) {
 +error_report(%s, error_get_pretty(local_error));
 +error_report(Failed to create AIO context);
 +error_free(local_error);
 +exit(1);
 +}
  src = aio_get_g_source(ctx);
  g_source_attach(src, NULL);
  g_source_unref(src);
 diff --git a/tests/test-thread-pool.c b/tests/test-thread-pool.c
 index f40b7fc..ff6a8b0 100644
 --- a/tests/test-thread-pool.c
 +++ b/tests/test-thread-pool.c
 @@ -4,6 +4,7 @@
  #include block/thread-pool.h
  #include block/block.h
  #include qemu/timer.h
 +#include qemu/error-report.h
  
  static AioContext *ctx;
  static ThreadPool *pool;
 @@ -205,10 +206,17 @@ static void test_cancel(void)
  int main(int argc, char **argv)
  {
  int ret;
 +Error *local_error;
  
  init_clocks();
  
 -ctx = aio_context_new();
 +ctx = aio_context_new(local_error);
 +if (!ctx) {
 +error_report(%s, error_get_pretty(local_error));
 +error_report(Failed to create AIO context);
 +error_free(local_error);
 +exit(1);
 +}
  pool = aio_get_thread_pool(ctx);
  
  g_test_init(argc, argv, NULL);
 diff --git a/tests/test-throttle.c b/tests/test-throttle.c
 index 000ae31..7be607a 100644
 --- a/tests/test-throttle.c
 +++ b/tests/test-throttle.c
 @@ -14,6 +14,7 @@
  #include math.h
  #include block/aio.h
  #include qemu/throttle.h
 +#include qemu/error-report.h
  
  static AioContext *ctx;
  static LeakyBucketbkt;
 @@ -492,10 +493,17 @@ static void test_accounting(void)
  int main(int argc, char **argv)
  {
  GSource *src;
 +Error *local_error;
  
  init_clocks();
  
 -ctx = aio_context_new();
 +ctx = aio_context_new(local_error);
 +if (!ctx) {
 +error_report(%s, error_get_pretty(local_error));
 +error_report(Failed to create AIO context);
 +error_free(local_error);
 +exit(1);
 +}
  src = aio_get_g_source(ctx);
  g_source_attach(src, NULL);

Re: [Qemu-devel] [PATCH v3] async: aio_context_new(): Handle event_notifier_init failure

2014-09-16 Thread Benoît Canet

The Tuesday 16 Sep 2014 à 13:40:24 (-0600), Eric Blake wrote :
 On 09/16/2014 12:04 PM, Chrysostomos Nanakos wrote:
  If event_notifier_init fails QEMU exits without printing
  any error information to the user. This commit adds an error
  message on failure:
  
   # qemu [...]
 
 Showing the actual command line you used would be helpful.
 
   qemu: Failed to initialize event notifier: Too many open files in system
  
  Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
  ---
   async.c  |   16 +++-
   include/block/aio.h  |2 +-
   include/qemu/main-loop.h |2 +-
   iothread.c   |   11 ++-
   main-loop.c  |9 +++--
   qemu-img.c   |8 +++-
   qemu-io.c|7 ++-
   qemu-nbd.c   |6 +-
   tests/test-aio.c |   10 +-
   tests/test-thread-pool.c |   10 +-
   tests/test-throttle.c|   10 +-
   vl.c |5 +++--
   12 files changed, 78 insertions(+), 18 deletions(-)
  
 
  -AioContext *aio_context_new(void)
  +AioContext *aio_context_new(Error **errp)
   {
  +int ret;
   AioContext *ctx;
   ctx = (AioContext *) g_source_new(aio_source_funcs, 
  sizeof(AioContext));
  +ret = event_notifier_init(ctx-notifier, false);
  +if (ret  0) {
  +g_source_destroy(ctx-source);
 
 Does g_source_destroy() guarantee that errno is unmolested? If not,
 
  +error_setg_errno(errp, -ret, Failed to initialize event 
  notifier);

Actually -ret is passed to error_setg_errno.

See.

void error_set_errno(Error **errp, int os_errno, ErrorClass err_class,  
 const char *fmt, ...) 
and
if (os_errno != 0) {
err-msg = g_strdup_printf(%s: %s, msg1, strerror(os_errno));  

 Use of \' inside  is unusual. (Multiple times in your patch)

My fault.

Re: [Qemu-devel] [PATCH 1/4] blockdev: Disentangle BlockDriverState and DriveInfo creation

2014-09-15 Thread Benoît Canet

The Friday 12 Sep 2014 à 21:26:21 (+0200), Markus Armbruster wrote :
 blockdev_init() mixes up BlockDriverState and DriveInfo initialization
 Finish the BlockDriverState job before starting to mess with
 DriveInfo.  Easier on the eyes.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  blockdev.c | 43 +++
  1 file changed, 23 insertions(+), 20 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index b361fbb..5ec4635 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -301,6 +301,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  int ro = 0;
  int bdrv_flags = 0;
  int on_read_error, on_write_error;
 +BlockDriverState *bs;
  DriveInfo *dinfo;
  ThrottleConfig cfg;
  int snapshot = 0;
 @@ -456,26 +457,27 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  }
  
  /* init */
 +bs = bdrv_new(qemu_opts_id(opts), errp);
 +if (!bs) {
 +goto early_err;
 +}
 +bs-open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
 +bs-read_only = ro;
 +bs-detect_zeroes = detect_zeroes;
 +
 +bdrv_set_on_error(bs, on_read_error, on_write_error);
 +
 +/* disk I/O throttling */
 +if (throttle_enabled(cfg)) {
 +bdrv_io_limits_enable(bs);
 +bdrv_set_io_limits(bs, cfg);
 +}
 +
  dinfo = g_malloc0(sizeof(*dinfo));
  dinfo-id = g_strdup(qemu_opts_id(opts));
 -dinfo-bdrv = bdrv_new(dinfo-id, error);
 -if (error) {
 -error_propagate(errp, error);
 -goto bdrv_new_err;
 -}
 -dinfo-bdrv-open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
 -dinfo-bdrv-read_only = ro;
 -dinfo-bdrv-detect_zeroes = detect_zeroes;
 +dinfo-bdrv = bs;
  QTAILQ_INSERT_TAIL(drives, dinfo, next);
  
 -bdrv_set_on_error(dinfo-bdrv, on_read_error, on_write_error);
 -
 -/* disk I/O throttling */
 -if (throttle_enabled(cfg)) {
 -bdrv_io_limits_enable(dinfo-bdrv);
 -bdrv_set_io_limits(dinfo-bdrv, cfg);
 -}
 -
  if (!file || !*file) {
  if (has_driver_specific_opts) {
  file = NULL;
 @@ -502,7 +504,8 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
  
  QINCREF(bs_opts);
 -ret = bdrv_open(dinfo-bdrv, file, NULL, bs_opts, bdrv_flags, drv, 
 error);
 +ret = bdrv_open(bs, file, NULL, bs_opts, bdrv_flags, drv, error);
 +assert(bs == dinfo-bdrv);
  
  if (ret  0) {
  error_setg(errp, could not open disk image %s: %s,
 @@ -511,8 +514,9 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  goto err;
  }
  
 -if (bdrv_key_required(dinfo-bdrv))
 +if (bdrv_key_required(bs)) {
  autostart = 0;
 +}
  
  QDECREF(bs_opts);
  qemu_opts_del(opts);
 @@ -520,9 +524,8 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  return dinfo;
  
  err:
 -bdrv_unref(dinfo-bdrv);

 +bdrv_unref(bs);
I would have moved this.

  QTAILQ_REMOVE(drives, dinfo, next);
 -bdrv_new_err:
  g_free(dinfo-id);
  g_free(dinfo);

To Here.

No functional difference but it would reflect it's goto chain role:
being in the reverse order of the allocations.

  early_err:
 -- 
 1.9.3

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH 2/4] block: Keep DriveInfo alive until BlockDriverState dies

2014-09-15 Thread Benoît Canet

The Friday 12 Sep 2014 à 21:26:22 (+0200), Markus Armbruster wrote :
 If the BDS's refcnt  0, drive_del() destroys the DriveInfo, but not
 the BDS.  This can happen in three places:
 
 * Device model destruction during unplug: blockdev_auto_del()
 
 * Xen IDE unplug: pci_piix3_xen_ide_unplug()
 
 * drive_del command when no device model is attached: do_drive_del()
 
 The other callers of drive_del are on error paths where refcnt == 1.
 
 If the user somehow manages to plug in a device model using a BDS that
 has gone through drive_del(), the legacy configuration passed in
 DriveInfo doesn't reach the device model, and automatic deletion on
 unplug doesn't work.  Worse, some device models such as scsi-disk
 crash when DriveInfo doesn't exist.
 
 This is theoretical; I didn't research an actual reproducer.
 
 Fix by keeping DriveInfo alive until its BDS dies.
 
 This affects qemu_drive_opts: now you can't reuse the same ID for new
 drive options until the BDS dies.  Before, you could, but since the
 code always attempts to create a BDS with the same ID next, the
 enclosing operation create a new drive failed anyway.  Different
 error path, same result.
 
 Unfortunately, the fix involves use of blockdev.c stuff from block.c,
 which is a layering violation.  Fortunately, my forthcoming
 BlockBackend work will get rid of it again.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block.c   |  2 ++
  blockdev.c| 13 -
  include/sysemu/blockdev.h |  1 +
  stubs/Makefile.objs   |  1 +
  stubs/blockdev.c  | 12 
  5 files changed, 24 insertions(+), 5 deletions(-)
  create mode 100644 stubs/blockdev.c
 
 diff --git a/block.c b/block.c
 index d06dd51..6faf36f 100644
 --- a/block.c
 +++ b/block.c
 @@ -29,6 +29,7 @@
  #include qemu/module.h
  #include qapi/qmp/qjson.h
  #include sysemu/sysemu.h
 +#include sysemu/blockdev.h/* FIXME layering violation */
  #include qemu/notify.h
  #include block/coroutine.h
  #include block/qapi.h
 @@ -2110,6 +2111,7 @@ static void bdrv_delete(BlockDriverState *bs)
  /* remove from list, if necessary */
  bdrv_make_anon(bs);
  
 +drive_info_del(drive_get_by_blockdev(bs));
  g_free(bs);
  }
  
 diff --git a/blockdev.c b/blockdev.c
 index 5ec4635..450f95c 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -216,11 +216,17 @@ static void bdrv_format_print(void *opaque, const char 
 *name)
  
  void drive_del(DriveInfo *dinfo)
  {
 +bdrv_unref(dinfo-bdrv);
 +}
 +
 +void drive_info_del(DriveInfo *dinfo)
 +{
 +if (!dinfo) {
 +return;
 +}
  if (dinfo-opts) {
  qemu_opts_del(dinfo-opts);
  }
 -
 -bdrv_unref(dinfo-bdrv);
  g_free(dinfo-id);
  QTAILQ_REMOVE(drives, dinfo, next);
  g_free(dinfo-serial);
 @@ -525,9 +531,6 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  
  err:
  bdrv_unref(bs);
 -QTAILQ_REMOVE(drives, dinfo, next);
 -g_free(dinfo-id);
 -g_free(dinfo);
  early_err:
  qemu_opts_del(opts);
  err_no_opts:
 diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
 index 23a5d10..abec381 100644
 --- a/include/sysemu/blockdev.h
 +++ b/include/sysemu/blockdev.h
 @@ -56,6 +56,7 @@ QemuOpts *drive_add(BlockInterfaceType type, int index, 
 const char *file,
  const char *optstr);
  DriveInfo *drive_new(QemuOpts *arg, BlockInterfaceType block_default_type);
  void drive_del(DriveInfo *dinfo);
 +void drive_info_del(DriveInfo *dinfo);
  
  /* device-hotplug */
  
 diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
 index 5e347d0..c0b1f6a 100644
 --- a/stubs/Makefile.objs
 +++ b/stubs/Makefile.objs
 @@ -1,5 +1,6 @@
  stub-obj-y += arch-query-cpu-def.o
  stub-obj-y += bdrv-commit-all.o
 +stub-obj-y += blockdev.o
  stub-obj-y += chr-baum-init.o
  stub-obj-y += chr-msmouse.o
  stub-obj-y += chr-testdev.o
 diff --git a/stubs/blockdev.c b/stubs/blockdev.c
 new file mode 100644
 index 000..5d0a79c
 --- /dev/null
 +++ b/stubs/blockdev.c
 @@ -0,0 +1,12 @@
 +#include assert.h
 +#include sysemu/blockdev.h
 +
 +DriveInfo *drive_get_by_blockdev(BlockDriverState *bs)
 +{
 +return NULL;
 +}
 +
 +void drive_info_del(DriveInfo *dinfo)
 +{
 +assert(!dinfo);
 +}
 -- 
 1.9.3



Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2 3/3] util: Add an utility infrastructure used to compute an average on a time slice

2014-09-15 Thread Benoît Canet

On Mon, Sep 15, 2014 at 01:13:08PM +0200, Markus Armbruster wrote:
 Benoît Canet benoit.ca...@nodalink.com writes:
 
  On Mon, Sep 08, 2014 at 05:09:38PM +0200, Paolo Bonzini wrote:
  Il 08/09/2014 16:49, Benoît Canet ha scritto:
- create two windows, with twice the suggested expiration period, and
return min/avg/max from the oldest window.  Example

   t=0  |t=1  |t=2  |t=3  |t=4
   wnd0: [0,1)  |wnd0: [1,3)  | |wnd0: [3,5)  |
   wnd1: [0,2)  | |wnd1: [2,4)  | |

Values are returned from:

   wnd0-|wnd1-|wnd0-|wnd1-|
   
   This is neat.
  
  Alternatively, you can make it probabilistically correct:
  
  t=0|t=0.66   |t=1.33 |t=2  
  |t=2.66
 |wnd0: [0.66,2)   |   |wnd0: [2,3.33)   
  |
  wnd1: [0,0.66) | |wnd1: [1.33,2.66)  | 
  |
  
  Return from:
  
  
  wnd1---|wnd1-|wnd0---|wnd1-|wnd0
  
  So you always have 2/3 seconds worth of data, and on average exactly 1 
  second
  worth of data.
  
  The problem is the delay in getting data, which can be big for the minute-
  and hour-based statistics.  Suppose you have a spike that lasts 10 seconds,
  it might not show in the minute-based statistics for as much as 30 seconds
  after it ends (the window switches every 40 seconds).
  
  For min/max you could return min(min0, min1) and max(max0, max1).  Only the
  average has this problem.
  
  Exponential smoothing doesn't have this problem.  IIRC uptime uses that.
 
  I am writing this so cloud end users can programatically get informations 
  about
  their vms disk statistics.
 
  Cloud end users are known to use their cloud API to script the
  elasticity of their
  architecture.
  Some code will poll system statistics to decide if new instances must
  be launched
  or existing instances must be pruned.
  This means introducing a delay in the accounting code would slow down their
  decisions.
 
  min and max is also useful to know since it gives an idea of the deviation.
 
 For what it's worth, the algorithm in the Dr. Dobb's Paolo referenced
 can compute a standard deviation.  Can we figure out what users really
 want, standard deviation, min/max, or both?

I'll ask to my test subject.

 
 [...]

Re: [Qemu-devel] [PATCH v2 3/3] util: Add an utility infrastructure used to compute an average on a time slice

2014-09-15 Thread Benoît Canet

On Mon, Sep 15, 2014 at 01:13:08PM +0200, Markus Armbruster wrote:
 Benoît Canet benoit.ca...@nodalink.com writes:
 
  On Mon, Sep 08, 2014 at 05:09:38PM +0200, Paolo Bonzini wrote:
  Il 08/09/2014 16:49, Benoît Canet ha scritto:
- create two windows, with twice the suggested expiration period, and
return min/avg/max from the oldest window.  Example

   t=0  |t=1  |t=2  |t=3  |t=4
   wnd0: [0,1)  |wnd0: [1,3)  | |wnd0: [3,5)  |
   wnd1: [0,2)  | |wnd1: [2,4)  | |

Values are returned from:

   wnd0-|wnd1-|wnd0-|wnd1-|
   
   This is neat.
  
  Alternatively, you can make it probabilistically correct:
  
  t=0|t=0.66   |t=1.33 |t=2  
  |t=2.66
 |wnd0: [0.66,2)   |   |wnd0: [2,3.33)   
  |
  wnd1: [0,0.66) | |wnd1: [1.33,2.66)  | 
  |
  
  Return from:
  
  
  wnd1---|wnd1-|wnd0---|wnd1-|wnd0
  
  So you always have 2/3 seconds worth of data, and on average exactly 1 
  second
  worth of data.
  
  The problem is the delay in getting data, which can be big for the minute-
  and hour-based statistics.  Suppose you have a spike that lasts 10 seconds,
  it might not show in the minute-based statistics for as much as 30 seconds
  after it ends (the window switches every 40 seconds).
  
  For min/max you could return min(min0, min1) and max(max0, max1).  Only the
  average has this problem.
  
  Exponential smoothing doesn't have this problem.  IIRC uptime uses that.
 
  I am writing this so cloud end users can programatically get informations 
  about
  their vms disk statistics.
 
  Cloud end users are known to use their cloud API to script the
  elasticity of their
  architecture.
  Some code will poll system statistics to decide if new instances must
  be launched
  or existing instances must be pruned.
  This means introducing a delay in the accounting code would slow down their
  decisions.
 
  min and max is also useful to know since it gives an idea of the deviation.
 
 For what it's worth, the algorithm in the Dr. Dobb's Paolo referenced
 can compute a standard deviation.  Can we figure out what users really
 want, standard deviation, min/max, or both?

My test subject think that min/max convey more information than standard 
deviation
and that anyway the second can be computed with the formers.

Does Red Hat have other tests subjects at hand ?

Best regards

Benoît

 
 [...]

Re: [Qemu-devel] [PATCH v2 01/23] block: Split bdrv_new_root() off bdrv_new()

2014-09-15 Thread Benoît Canet

 *opts, Error **errp);
  int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp);
 -BlockDriverState *bdrv_new(const char *device_name, Error **errp);
 +BlockDriverState *bdrv_new_root(const char *device_name, Error **errp);
 +BlockDriverState *bdrv_new(void);
  void bdrv_make_anon(BlockDriverState *bs);
  void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old);
  void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top);
 diff --git a/qemu-img.c b/qemu-img.c
 index 91d1ac3..58d59d0 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
 @@ -296,7 +296,7 @@ static BlockDriverState *bdrv_new_open(const char *id,
  Error *local_err = NULL;
  int ret;
  
 -bs = bdrv_new(id, error_abort);
 +bs = bdrv_new_root(id, error_abort);
  
  if (fmt) {
  drv = bdrv_find_format(fmt);
 @@ -2425,7 +2425,7 @@ static int img_rebase(int argc, char **argv)
  if (!unsafe) {
  char backing_name[1024];
  
 -bs_old_backing = bdrv_new(old_backing, error_abort);
 +bs_old_backing = bdrv_new_root(old_backing, error_abort);
  bdrv_get_backing_filename(bs, backing_name, sizeof(backing_name));
  ret = bdrv_open(bs_old_backing, backing_name, NULL, NULL, src_flags,
  old_backing_drv, local_err);
 @@ -2436,7 +2436,7 @@ static int img_rebase(int argc, char **argv)
  goto out;
  }
  if (out_baseimg[0]) {
 -bs_new_backing = bdrv_new(new_backing, error_abort);
 +bs_new_backing = bdrv_new_root(new_backing, error_abort);
  ret = bdrv_open(bs_new_backing, out_baseimg, NULL, NULL, 
 src_flags,
  new_backing_drv, local_err);
  if (ret) {
 diff --git a/qemu-io.c b/qemu-io.c
 index d2ab694..24ca59c 100644
 --- a/qemu-io.c
 +++ b/qemu-io.c
 @@ -58,7 +58,7 @@ static int openfile(char *name, int flags, int growable, 
 QDict *opts)
  return 1;
  }
  
 -qemuio_bs = bdrv_new(hda, error_abort);
 +qemuio_bs = bdrv_new_root(hda, error_abort);
  
  if (growable) {
  flags |= BDRV_O_PROTOCOL;
 diff --git a/qemu-nbd.c b/qemu-nbd.c
 index f3cf8a2..24b8454 100644
 --- a/qemu-nbd.c
 +++ b/qemu-nbd.c
 @@ -687,7 +687,7 @@ int main(int argc, char **argv)
  drv = NULL;
  }
  
 -bs = bdrv_new(hda, error_abort);
 +bs = bdrv_new_root(hda, error_abort);
  
  srcpath = argv[optind];
  ret = bdrv_open(bs, srcpath, NULL, NULL, flags, drv, local_err);
 -- 
 1.9.3
 
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v2 02/23] block: New BlockBackend

2014-09-15 Thread Benoît Canet

 --- a/block.c
 +++ b/block.c
 @@ -2119,10 +2119,11 @@ static void bdrv_delete(BlockDriverState *bs)
  
  bdrv_close(bs);
  


 +drive_info_del(drive_get_by_blockdev(bs));
 +
  /* remove from list, if necessary */
  bdrv_make_anon(bs);
  
 -drive_info_del(drive_get_by_blockdev(bs));

Do we really want this move ?
Or is it an artefact of your work on the preparation series ?

  g_free(bs);
  }
  

 + * Return the BlockBackend with name @name if it exists, else null.

 + * @name must not be null.
The contract is that no one will call blk_by_name(NULL);

 + */
 +BlockBackend *blk_by_name(const char *name)
 +{
 +BlockBackend *blk;

Can we ?

 assert(name);

 +
 +QTAILQ_FOREACH(blk, blk_backends, link) {
 +if (!strcmp(name, blk-name)) {
 +return blk;
 +}
 +}
 +return NULL;
 +}


 +/*
 + * Hairy special case: if do_drive_del() has made dinfo-bdrv
 + * anonymous, it also unref'ed the associated BlockBackend.
 + */

Then you are filling the other case here.

maybe

/*
 * Hairy special case: if do_drive_del() has made dinfo-bdrv
 * anonymous, it also unref'ed the associated BlockBackend.
 * Process the other case here.
 */

would further explain you intents.


 +if (dinfo-bdrv-device_name[0]) {
 +blk_unref(blk_by_name(dinfo-bdrv-device_name));
 +}
 +
  g_free(dinfo-id);
  QTAILQ_REMOVE(drives, dinfo, next);
  g_free(dinfo-serial);
 @@ -307,6 +317,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  int ro = 0;
  int bdrv_flags = 0;
  int on_read_error, on_write_error;
 +BlockBackend *blk;
  BlockDriverState *bs;
  DriveInfo *dinfo;
  ThrottleConfig cfg;
 @@ -463,9 +474,13 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  }

  const char *fmt1 = NULL, *fmt2 = NULL, *cache, *filename1, *filename2;
 +BlockBackend *blk1, *blk2;
  BlockDriverState *bs1, *bs2;
  int64_t total_sectors1, total_sectors2;
  uint8_t *buf1 = NULL, *buf2 = NULL;
 @@ -1011,18 +1022,20 @@ static int img_compare(int argc, char **argv)
  goto out3;
  }
  
 +blk1 = blk_new(image 1, error_abort);
  bs1 = bdrv_new_open(image 1, filename1, fmt1, flags, true, quiet);
  if (!bs1) {
  error_report(Can't open file %s, filename1);
  ret = 2;
 -goto out3;
 +goto out2;

Here we allocate blk1 and bs1 an if bs1 is null (error) we jump to out2 which 
is:

  out2:
  bdrv_unref(bs1);
 +blk_unref(blk1);

It's a bit strange that we will bdrv_unref(NULL) in this case.

This goto chain seems odd.

  }
  
 +blk2 = blk_new(image 2, error_abort);
  bs2 = bdrv_new_open(image 2, filename2, fmt2, flags, true, quiet);
  if (!bs2) {
  error_report(Can't open file %s, filename2);
  ret = 2;
 -goto out2;
 +goto out1;
  }
  
  buf1 = qemu_blockalign(bs1, IO_BUF_SIZE);
 @@ -1183,11 +1196,14 @@ static int img_compare(int argc, char **argv)
  ret = 0;
  
  out:
 -bdrv_unref(bs2);
  qemu_vfree(buf1);
  qemu_vfree(buf2);
 +out1:
 +bdrv_unref(bs2);
 +blk_unref(blk2);
  out2:
  bdrv_unref(bs1);
 +blk_unref(blk1);
  out3:
  qemu_progress_end();
  return ret;
 @@ -1200,6 +1216,7 @@ static int img_convert(int argc, char **argv)
  int progress = 0, flags, src_flags;
  const char *fmt, *out_fmt, *cache, *src_cache, *out_baseimg, 
 *out_filename;
  BlockDriver *drv, *proto_drv;
 +BlockBackend **blk = NULL, *out_blk = NULL;

Should blk be blks or blkes ? They are multiple.

  BlockDriverState **bs = NULL, *out_bs = NULL;
  int64_t total_sectors, nb_sectors, sector_num, bs_offset;
  int64_t *bs_sectors = NULL;
 @@ -1354,6 +1371,7 @@ static int img_convert(int argc, char **argv)
  
  qemu_progress_print(0, 100);
  
 +blk = g_new0(BlockBackend *, bs_n);
  bs = g_new0(BlockDriverState *, bs_n);
  bs_sectors = g_new(int64_t, bs_n);
  
 @@ -1361,6 +1379,7 @@ static int img_convert(int argc, char **argv)
  for (bs_i = 0; bs_i  bs_n; bs_i++) {
  char *id = bs_n  1 ? g_strdup_printf(source %d, bs_i)
  : g_strdup(source);
 +blk[bs_i] = blk_new(id, error_abort);
  bs[bs_i] = bdrv_new_open(id, argv[optind + bs_i], fmt, src_flags,
   true, quiet);
  g_free(id);
 @@ -1486,6 +1505,7 @@ static int img_convert(int argc, char **argv)
  goto out;
  }
  
 +out_blk = blk_new(target, error_abort);
  out_bs = bdrv_new_open(target, out_filename, out_fmt, flags, true, 
 quiet);
  if (!out_bs) {
  ret = -1;
 @@ -1742,6 +1762,7 @@ out:
  if (out_bs) {
  bdrv_unref(out_bs);
  }
 +blk_unref(out_blk);
  if (bs) {
  for (bs_i = 0; bs_i  bs_n; bs_i++) {
  if (bs[bs_i]) {
 @@ -1750,6 +1771,12 @@ out:
  }

[Qemu-devel] [PATCH 0/2] Handle interferences between multiple children drivers and stream and commit

2014-09-15 Thread Benoît Canet

We do not want to try to stream or commit with a base argument through
a multiple children driver.

Handle this case.

Benoît Canet (2):
  block: Introduce a BlockDriver field to flag drivers supporting
multiple children
  block: commit and stream return an error when a subtree is found

 block.c   | 17 +++--
 block/blkverify.c |  1 +
 block/quorum.c|  1 +
 block/vmdk.c  |  1 +
 blockdev.c| 18 --
 include/block/block.h |  3 ++-
 include/block/block_int.h |  6 ++
 7 files changed, 38 insertions(+), 9 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH 2/2] block: commit and stream return an error when a subtree is found

2014-09-15 Thread Benoît Canet

In bdrv_find_backing_image when looking for the base BDS of a backing chain and
a BDS having a multiple children block driver is found return NULL and set an
error message.

This will hopefully remove any unwanted interference between stream, commit
and drivers like quorum.

CC: Jeff Cody jc...@redhat.com
Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 block.c   | 17 +++--
 blockdev.c| 18 --
 include/block/block.h |  3 ++-
 3 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index d06dd51..ea86252 100644
--- a/block.c
+++ b/block.c
@@ -4339,9 +4339,13 @@ int bdrv_is_snapshot(BlockDriverState *bs)
 /* backing_file can either be relative, or absolute, or a protocol.  If it is
  * relative, it must be relative to the chain.  So, passing in bs-filename
  * from a BDS as backing_file should not be done, as that may be relative to
- * the CWD rather than the chain. */
+ * the CWD rather than the chain.
+ * If a multiple children BDS driver is found down the backing chain path then
+ * return NULL and set an error message.
+ */
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
-const char *backing_file)
+  const char *backing_file,
+  Error **errp)
 {
 char *filename_full = NULL;
 char *backing_file_full = NULL;
@@ -4362,6 +4366,15 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 
 for (curr_bs = bs; curr_bs-backing_hd; curr_bs = curr_bs-backing_hd) {
 
+/* If a BDS having a multiple children block driver is found then the
+ * function should fail.
+ */
+if (curr_bs-drv  curr_bs-drv-supports_multiple_children) {
+error_setg(errp, multiple children block driver found down the 
+ backing chain);
+break;
+}
+
 /* If either of the filename paths is actually a protocol, then
  * compare unmodified paths; otherwise make paths relative */
 if (is_protocol || path_has_protocol(curr_bs-backing_file)) {
diff --git a/blockdev.c b/blockdev.c
index e919566..4c5c28e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1894,9 +1894,11 @@ void qmp_block_stream(const char *device,
 }
 
 if (has_base) {
-base_bs = bdrv_find_backing_image(bs, base);
+base_bs = bdrv_find_backing_image(bs, base, errp);
 if (base_bs == NULL) {
-error_set(errp, QERR_BASE_NOT_FOUND, base);
+if (!*errp) {
+error_set(errp, QERR_BASE_NOT_FOUND, base);
+}
 return;
 }
 base_name = base;
@@ -1965,23 +1967,27 @@ void qmp_block_commit(const char *device,
 
 if (has_top  top) {
 if (strcmp(bs-filename, top) != 0) {
-top_bs = bdrv_find_backing_image(bs, top);
+top_bs = bdrv_find_backing_image(bs, top, errp);
 }
 }
 
 if (top_bs == NULL) {
-error_setg(errp, Top image file %s not found, top ? top : NULL);
+if (!*errp) {
+error_setg(errp, Top image file %s not found, top ? top : 
NULL);
+}
 return;
 }
 
 if (has_base  base) {
-base_bs = bdrv_find_backing_image(top_bs, base);
+base_bs = bdrv_find_backing_image(top_bs, base, errp);
 } else {
 base_bs = bdrv_find_base(top_bs);
 }
 
 if (base_bs == NULL) {
-error_set(errp, QERR_BASE_NOT_FOUND, base ? base : NULL);
+if (!*errp) {
+error_set(errp, QERR_BASE_NOT_FOUND, base ? base : NULL);
+}
 return;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 8f4ad16..65cde65 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -272,7 +272,8 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, 
int64_t sector_num,
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
 int nb_sectors, BdrvRequestFlags flags);
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
-const char *backing_file);
+  const char *backing_file,
+  Error **errp);
 int bdrv_get_backing_file_depth(BlockDriverState *bs);
 void bdrv_refresh_filename(BlockDriverState *bs);
 int bdrv_truncate(BlockDriverState *bs, int64_t offset);
-- 
2.1.0

[Qemu-devel] [PATCH 1/2] block: Introduce a BlockDriver field to flag drivers supporting multiple children

2014-09-15 Thread Benoît Canet

The recursive op blocker patch to come will take an optional base argument
which will not have any meaning when an intermediary BDS driver of the graph
support multiple children.

Flag such drivers to be able to handle this case.

CC: Jeff Cody jc...@redhat.com
Signed-off-by: Benoît Canet benoit.ca...@nodalink.com
---
 block/blkverify.c | 1 +
 block/quorum.c| 1 +
 block/vmdk.c  | 1 +
 include/block/block_int.h | 6 ++
 4 files changed, 9 insertions(+)

diff --git a/block/blkverify.c b/block/blkverify.c
index 163064c..13f8359 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -368,6 +368,7 @@ static BlockDriver bdrv_blkverify = {
 .bdrv_detach_aio_context  = blkverify_detach_aio_context,
 
 .is_filter= true,
+.supports_multiple_children   = true,
 .bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
 };
 
diff --git a/block/quorum.c b/block/quorum.c
index 093382e..a4d11eb 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1087,6 +1087,7 @@ static BlockDriver bdrv_quorum = {
 .bdrv_attach_aio_context= quorum_attach_aio_context,
 
 .is_filter  = true,
+.supports_multiple_children = true,
 .bdrv_recurse_is_first_non_filter   = quorum_recurse_is_first_non_filter,
 };
 
diff --git a/block/vmdk.c b/block/vmdk.c
index a1cb911..933a57d 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -2244,6 +2244,7 @@ static BlockDriver bdrv_vmdk = {
 .bdrv_attach_aio_context  = vmdk_attach_aio_context,
 
 .supports_backing = true,
+.supports_multiple_children   = true,
 .create_opts  = vmdk_create_opts,
 };
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8a61215..b653e48 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -104,6 +104,12 @@ struct BlockDriver {
 /* Set if a driver can support backing files */
 bool supports_backing;
 
+/* Set to true if the BlockDriver supports multiple children.
+ * (quorum, blkverify, vmdk)
+ * This field being true will disable some block layer operations.
+ */
+bool supports_multiple_children;
+
 /* For handling image reopen for split or non-split files */
 int (*bdrv_reopen_prepare)(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp);
-- 
2.1.0

Re: [Qemu-devel] [PATCH v2 03/23] block: Connect BlockBackend to BlockDriverState

2014-09-15 Thread Benoît Canet

 index ff95da6..fa8a7d0 100644
 --- a/qemu-nbd.c
 +++ b/qemu-nbd.c
 @@ -689,7 +689,7 @@ int main(int argc, char **argv)
  }
  

  blk = blk_new(hda, error_abort);
Is a blk_new_with_bs converssion missing here ?

 -bs = bdrv_new_root(hda, error_abort);
 +bs = blk_bs(blk);
  
  srcpath = argv[optind];
  ret = bdrv_open(bs, srcpath, NULL, NULL, flags, drv, local_err);
 -- 
 1.9.3

Re: [Qemu-devel] [PATCH] block: Make op blockers recursive

2014-09-15 Thread Benoît Canet

On Fri, Sep 12, 2014 at 11:48:33AM +0800, Fam Zheng wrote:
 On Tue, 09/09 14:28, Benoît Canet wrote:
  On Tue, Sep 09, 2014 at 01:56:46PM +0200, Kevin Wolf wrote:
   Am 22.08.2014 um 18:11 hat Benoît Canet geschrieben:
Since the block layer code is starting to modify the BDS graph right in 
the
middle of BDS chains (block-mirror's replace parameter for example) 
QEMU needs
to properly block and unblock whole BDS subtrees; recursion is a neat 
way to
achieve this task.

This patch also takes care of modifying the op blockers users.

Signed-off-by: Benoit Canet benoit.ca...@nodalink.com
---
 block.c   | 69 
---
 block/blkverify.c | 21 +++
 block/commit.c|  3 +++
 block/mirror.c| 17 
 block/quorum.c| 25 +
 block/stream.c|  3 +++
 block/vmdk.c  | 34 +++
 include/block/block.h |  2 +-
 include/block/block_int.h |  6 +
 9 files changed, 171 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 6fa0201..d964b6c 100644
--- a/block.c
+++ b/block.c
@@ -5446,7 +5446,9 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, 
BlockOpType op, Error **errp)
 return false;
 }
 
-void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
+/* do the real work of blocking a BDS */
+static void bdrv_do_op_block(BlockDriverState *bs, BlockOpType op,
+ Error *reason)
 {
 BdrvOpBlocker *blocker;
 assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
@@ -5456,7 +5458,9 @@ void bdrv_op_block(BlockDriverState *bs, 
BlockOpType op, Error *reason)
 QLIST_INSERT_HEAD(bs-op_blockers[op], blocker, list);
 }
 
-void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error 
*reason)
+/* do the real work of unblocking a BDS */
+static void bdrv_do_op_unblock(BlockDriverState *bs, BlockOpType op,
+   Error *reason)
 {
 BdrvOpBlocker *blocker, *next;
 assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
@@ -5468,6 +5472,65 @@ void bdrv_op_unblock(BlockDriverState *bs, 
BlockOpType op, Error *reason)
 }
 }
 
+static bool bdrv_op_is_blocked_by(BlockDriverState *bs, BlockOpType op,
+  Error *reason)
+{
+BdrvOpBlocker *blocker, *next;
+assert((int) op = 0  op  BLOCK_OP_TYPE_MAX);
+QLIST_FOREACH_SAFE(blocker, bs-op_blockers[op], list, next) {
   
   This doesn't actually need the SAFE version.
   
+if (blocker-reason == reason) {
+return true;
+}
+}
+return false;
+}
+
+/* block recursively a BDS */
+void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
+{
+if (!bs) {
+return;
 
+/* remove BLOCK_OP_TYPE_MIRROR_REPLACE from the list of
+ * blocked operations so the replaces parameter can work
+ */
+bdrv_op_unblock(bs, BLOCK_OP_TYPE_MIRROR_REPLACE, 
bs-job-blocker);
   
   What purpose does a blocker serve when it's disabled before it is
   checked? I would only ever expect a bdrv_op_unblock() after some
   operation on the BDS has finished, but not when starting an operation
   that needs it.
 
 I agree. It makes no sense if the blocker is also the checker.
 
  
  BLOCK_OP_TYPE_MIRROR_REPLACE is checked and blocked by block-job-complete
  during the time the mirror finish when an arbitrary node of the graph must 
  be
  replaced.
 
 It seems to me mirror unblocks this operation from the job-blocker when job
 starts, and never block it (with the job-blocker) again. It's leaked.
 

block-job-complete will block it in mirror_complete.

BLOCK_OP_TYPE_MIRROR_REPLACE is blocked by driver-mirror code triggered by
block-job complete to block the replaces BDS during the end of the mirroring.

If you find silly that block-job-complete prevent itself from running twice on
the same BDS by checking the blocker then blocking it then the existing code is
wrong.

Else the code in this current patch is correct.

Could you have a glance at static void mirror_complete(BlockJob *job, Error 
**errp)
and tell me what you think about the situation. You should also look at
check_to_replace_node.

Best regards

Benoît

 Fam

 
  
  The start of the mirroring block everything including
  BLOCK_OP_TYPE_MIRROR_REPLACE so this hunk relax the blocking so 
  block-job-complete
  can have a chance of being able to block it.
  
  The comment of this hunk seems to be deficient and the code not self 
  explanatory.
  
  On the other way maybe block-job-complete is done wrong from the start but
  relaxing BLOCK_OP_TYPE_MIRROR_REPLACE when the mirror start

Re: [Qemu-devel] [PATCH v7 0/3] block: Introduce null drivers

2014-09-15 Thread Benoît Canet

The Thursday 11 Sep 2014 à 14:09:55 (+0800), Fam Zheng wrote :
 v7: Add Benoît's rev-by line in patch 2.
 Improved help text in patch 1. (Benoît)
 
 v6: Don't inherit from BlockdevOptionsFile. (Stefan)
 Use .bdrv_co_readv instead of .bdrv_read. (Kevin)
 Sort items in qapi schema definitions in patch 2  3.
 
 
 Fam Zheng (3):
   block: Introduce null drivers
   qapi: Sort BlockdevDriver enum data list
   qapi: Sort items in BlockdevOptions definition
 
  block/Makefile.objs  |   1 +
  block/null.c | 178 
 +++
  qapi/block-core.json |  57 +++--
  3 files changed, 215 insertions(+), 21 deletions(-)
  create mode 100644 block/null.c
 
 -- 
 1.9.3
 

I don't think I will be able to break it further.

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH v1] async: aio_context_new(): Handle event_notifier_init failure

2014-09-15 Thread Benoît Canet

The Sunday 14 Sep 2014 à 13:23:13 (+0300), Chrysostomos Nanakos wrote :
 If event_notifier_init fails QEMU exits without printing
 any error information to the user. This commit adds an error
 message on failure:
 
  # qemu [...]
  qemu: event_notifier_init failed: Too many open files in system
  qemu: qemu_init_main_loop failed
 
 Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
 ---
  async.c  |   19 +--
  include/block/aio.h  |2 +-
  include/qemu/main-loop.h |2 +-
  iothread.c   |   12 +++-
  main-loop.c  |   11 +--
  qemu-img.c   |   11 ++-
  qemu-io.c|   10 +-
  qemu-nbd.c   |   12 +---
  tests/test-aio.c |   12 +++-
  tests/test-thread-pool.c |   10 +-
  tests/test-throttle.c|   12 +++-
  vl.c |7 +--
  12 files changed, 99 insertions(+), 21 deletions(-)
 
 diff --git a/async.c b/async.c
 index a99e7f6..d4b6687 100644
 --- a/async.c
 +++ b/async.c
 @@ -289,21 +289,28 @@ static void aio_rfifolock_cb(void *opaque)
  aio_notify(opaque);
  }
  
 -AioContext *aio_context_new(void)
 +int aio_context_new(AioContext **context, Error **errp)
  {
 +int ret;
  AioContext *ctx;
  ctx = (AioContext *) g_source_new(aio_source_funcs, sizeof(AioContext));
 +ret = event_notifier_init(ctx-notifier, false);
 +if (ret  0) {
 +g_source_destroy(ctx-source);
 +error_setg_errno(errp, -ret, event_notifier_init failed);

aio_context_new does not seems to be guest triggered so it may be actually 
correct
to bake an error message in it. I don't have enough AIO knowledge to juge this.

However your current error message: event_notifier_init failed look more like
a trace than an actual QEMU error message.

Please grep the code for example error messages: they are usually more plaintext
than this one

Also switching to returning a -errno make the caller's code convoluted.
Maybe returning NULL would be enough.

I see further in the patch that the return code is not really used on the
top most level maybe it's a hint that return NULL here would be better.

 +return ret;
 +}
 +aio_set_event_notifier(ctx, ctx-notifier,
 +   (EventNotifierHandler *)
 +   event_notifier_test_and_clear);
  iothread-stopping = false;
 -iothread-ctx = aio_context_new();
 +ret = aio_context_new(iothread-ctx, local_error);
 +if (ret  0) {

 +errno = -ret;
You don't seem to reuse errno further down in the code.

  gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
 -qemu_aio_context = aio_context_new();
 +ret = aio_context_new(qemu_aio_context, local_error);
 +if (ret  0) {

 +errno = -ret;

Same here errno does not seems used by error propagate.

Also you seems to leak gpollfds but probably you count
on the fact that the kernel will collect it for you
because QEMU will die.

I think you could move down the gpollfds = g_array_new
after the end of the test.

 +error_propagate(errp, local_error);
 +return ret;
 +}
 +
  
 -qemu_init_main_loop();
 +ret = qemu_init_main_loop(local_error);
 +if (ret  0) {
 +errno = -ret;
 +error_report(%s, error_get_pretty(local_error));
 +error_free(local_error);
 +error_exit(qemu_init_main_loop failed);
 +}
 +
  bdrv_init();
  if (argc  2) {
  error_exit(Not enough arguments);
 diff --git a/qemu-io.c b/qemu-io.c
 index 33c96c4..f10dab5 100644
 --- a/qemu-io.c
 +++ b/qemu-io.c
 @@ -387,8 +387,10 @@ int main(int argc, char **argv)
  { NULL, 0, NULL, 0 }
  };
  int c;
 +int ret;
  int opt_index = 0;
  int flags = BDRV_O_UNMAP;
 +Error *local_error = NULL;
  
  #ifdef CONFIG_POSIX
  signal(SIGPIPE, SIG_IGN);
 @@ -454,7 +456,13 @@ int main(int argc, char **argv)
  exit(1);
  }
  
 -qemu_init_main_loop();
 +ret = qemu_init_main_loop(local_error);
 +if (ret  0) {
 +error_report(%s, error_get_pretty(local_error));
 +error_report(qemu_init_main_loop failed);
 +error_free(local_error);
 +return ret;
 +}
  bdrv_init();
  
  /* initialize commands */
 diff --git a/qemu-nbd.c b/qemu-nbd.c
 index 9bc152e..4ba9635 100644
 --- a/qemu-nbd.c
 +++ b/qemu-nbd.c
 @@ -498,13 +498,13 @@ int main(int argc, char **argv)
  BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
  local_err);
  if (local_err) {
 -errx(EXIT_FAILURE, Failed to parse detect_zeroes mode: %s, 
 +errx(EXIT_FAILURE, Failed to parse detect_zeroes mode: %s,
   error_get_pretty(local_err));
  }
  if (detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP 
  !(flags  BDRV_O_UNMAP)) {

Re: [Qemu-devel] [PATCH v1] async: aio_context_new(): Handle event_notifier_init failure

2014-09-15 Thread Benoît Canet

The Monday 15 Sep 2014 à 21:03:18 (+0200), Benoît Canet wrote :
 The Sunday 14 Sep 2014 à 13:23:13 (+0300), Chrysostomos Nanakos wrote :
  If event_notifier_init fails QEMU exits without printing
  any error information to the user. This commit adds an error
  message on failure:
  
   # qemu [...]
   qemu: event_notifier_init failed: Too many open files in system
   qemu: qemu_init_main_loop failed
  
  Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
  ---
   async.c  |   19 +--
   include/block/aio.h  |2 +-
   include/qemu/main-loop.h |2 +-
   iothread.c   |   12 +++-
   main-loop.c  |   11 +--
   qemu-img.c   |   11 ++-
   qemu-io.c|   10 +-
   qemu-nbd.c   |   12 +---
   tests/test-aio.c |   12 +++-
   tests/test-thread-pool.c |   10 +-
   tests/test-throttle.c|   12 +++-
   vl.c |7 +--
   12 files changed, 99 insertions(+), 21 deletions(-)
  
  diff --git a/async.c b/async.c
  index a99e7f6..d4b6687 100644
  --- a/async.c
  +++ b/async.c
  @@ -289,21 +289,28 @@ static void aio_rfifolock_cb(void *opaque)
   aio_notify(opaque);
   }
   
  -AioContext *aio_context_new(void)
  +int aio_context_new(AioContext **context, Error **errp)
   {
  +int ret;
   AioContext *ctx;
   ctx = (AioContext *) g_source_new(aio_source_funcs, 
  sizeof(AioContext));
  +ret = event_notifier_init(ctx-notifier, false);
  +if (ret  0) {
  +g_source_destroy(ctx-source);
  +error_setg_errno(errp, -ret, event_notifier_init failed);
 
 aio_context_new does not seems to be guest triggered so it may be actually 
 correct
 to bake an error message in it. I don't have enough AIO knowledge to juge 
 this.
 
 However your current error message: event_notifier_init failed look more 
 like
 a trace than an actual QEMU error message.
 
 Please grep the code for example error messages: they are usually more 
 plaintext
 than this one
 
 Also switching to returning a -errno make the caller's code convoluted.
 Maybe returning NULL would be enough.
 
 I see further in the patch that the return code is not really used on the
 top most level maybe it's a hint that return NULL here would be better.

I though a bit more about this.

You could just do

if (ret  0) {
 g_source_destroy(ctx-source);
 error_setg_errno(errp, -ret, event_notifier_init failed);
 errno = -ret;
 return NULL;
}

when the test fail.

This way you have both a straightforward return path _and_
a regular errno usage to propagate it to the callers if you need so.

 
  +return ret;
  +}
  +aio_set_event_notifier(ctx, ctx-notifier,
  +   (EventNotifierHandler *)
  +   event_notifier_test_and_clear);
   iothread-stopping = false;
  -iothread-ctx = aio_context_new();
  +ret = aio_context_new(iothread-ctx, local_error);
  +if (ret  0) {
 
  +errno = -ret;
 You don't seem to reuse errno further down in the code.
 
   gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
  -qemu_aio_context = aio_context_new();
  +ret = aio_context_new(qemu_aio_context, local_error);
  +if (ret  0) {
 
  +errno = -ret;
 
 Same here errno does not seems used by error propagate.
 
 Also you seems to leak gpollfds but probably you count
 on the fact that the kernel will collect it for you
 because QEMU will die.
 
 I think you could move down the gpollfds = g_array_new
 after the end of the test.
 
  +error_propagate(errp, local_error);
  +return ret;
  +}
  +
   
  -qemu_init_main_loop();
  +ret = qemu_init_main_loop(local_error);
  +if (ret  0) {
  +errno = -ret;
  +error_report(%s, error_get_pretty(local_error));
  +error_free(local_error);
  +error_exit(qemu_init_main_loop failed);
  +}
  +
   bdrv_init();
   if (argc  2) {
   error_exit(Not enough arguments);
  diff --git a/qemu-io.c b/qemu-io.c
  index 33c96c4..f10dab5 100644
  --- a/qemu-io.c
  +++ b/qemu-io.c
  @@ -387,8 +387,10 @@ int main(int argc, char **argv)
   { NULL, 0, NULL, 0 }
   };
   int c;
  +int ret;
   int opt_index = 0;
   int flags = BDRV_O_UNMAP;
  +Error *local_error = NULL;
   
   #ifdef CONFIG_POSIX
   signal(SIGPIPE, SIG_IGN);
  @@ -454,7 +456,13 @@ int main(int argc, char **argv)
   exit(1);
   }
   
  -qemu_init_main_loop();
  +ret = qemu_init_main_loop(local_error);
  +if (ret  0) {
  +error_report(%s, error_get_pretty(local_error));
  +error_report(qemu_init_main_loop failed);
  +error_free(local_error);
  +return ret;
  +}
   bdrv_init();
   
   /* initialize commands */
  diff --git a/qemu-nbd.c b/qemu-nbd.c
  index 9bc152e..4ba9635

Re: [Qemu-devel] [PATCH 4/4] block: Improve message for device name clashing with node name

2014-09-13 Thread Benoît Canet

The Friday 12 Sep 2014 à 21:26:24 (+0200), Markus Armbruster wrote :
 Suggested-by: Benoit Canet benoit.ca...@nodalink.com
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block.c| 3 ++-
  tests/qemu-iotests/087.out | 2 +-
  2 files changed, 3 insertions(+), 2 deletions(-)
 
 diff --git a/block.c b/block.c
 index 6faf36f..02ea90f 100644
 --- a/block.c
 +++ b/block.c
 @@ -347,7 +347,8 @@ BlockDriverState *bdrv_new(const char *device_name, Error 
 **errp)
  return NULL;
  }
  if (bdrv_find_node(device_name)) {
 -error_setg(errp, Device with node-name '%s' already exists,
 +error_setg(errp,
 +   Device name '%s' conflicts with an existing node name,
 device_name);
  return NULL;
  }
 diff --git a/tests/qemu-iotests/087.out b/tests/qemu-iotests/087.out
 index 7fbee3f..75a54e0 100644
 --- a/tests/qemu-iotests/087.out
 +++ b/tests/qemu-iotests/087.out
 @@ -20,7 +20,7 @@ QMP_VERSION
  {return: {}}
  {return: {}}
  {error: {class: GenericError, desc: Device with id 'disk' already 
 exists}}
 -{error: {class: GenericError, desc: Device with node-name 
 'test-node' already exists}}
 +{error: {class: GenericError, desc: Device name 'test-node' 
 conflicts with an existing node name}}
  main-loop: WARNING: I/O thread spun for 1000 iterations
  {error: {class: GenericError, desc: could not open disk image 
 disk2: node-name=disk is conflicting with a device id}}
  {error: {class: GenericError, desc: could not open disk image 
 disk2: Duplicate node name}}
 -- 
 1.9.3
 

Reviewed-by: Benoît Canet benoit.ca...@nodalink.com

Re: [Qemu-devel] [PATCH 06/23] block: Eliminate bdrv_states, use block_next() instead

2014-09-11 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:35 (+0200), Markus Armbruster wrote :
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block.c   | 43 +++
  block/block-backend.c |  4 
  include/block/block_int.h |  2 --
  3 files changed, 23 insertions(+), 26 deletions(-)
 
 diff --git a/block.c b/block.c
 index a6c03da..89f9cf0 100644
 --- a/block.c
 +++ b/block.c
 @@ -24,6 +24,7 @@
  #include config-host.h
  #include qemu-common.h
  #include trace.h
 +#include sysemu/block-backend.h
  #include block/block_int.h
  #include block/blockjob.h
  #include qemu/module.h
 @@ -90,9 +91,6 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque);
  static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
  int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
  
 -static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
 -QTAILQ_HEAD_INITIALIZER(bdrv_states);
 -
  static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
  QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
  
 @@ -355,9 +353,6 @@ BlockDriverState *bdrv_new_named(const char *device_name, 
 Error **errp)
  bs = bdrv_new();
  
  pstrcpy(bs-device_name, sizeof(bs-device_name), device_name);
 -if (device_name[0] != '\0') {
 -QTAILQ_INSERT_TAIL(bdrv_states, bs, device_list);
 -}
  
  return bs;
  }
 @@ -1888,7 +1883,7 @@ void bdrv_close_all(void)
  {
  BlockDriverState *bs;
  
 -QTAILQ_FOREACH(bs, bdrv_states, device_list) {
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
  AioContext *aio_context = bdrv_get_aio_context(bs);
  
  aio_context_acquire(aio_context);
 @@ -1939,7 +1934,7 @@ void bdrv_drain_all(void)
  while (busy) {
  busy = false;
  
 -QTAILQ_FOREACH(bs, bdrv_states, device_list) {
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
  AioContext *aio_context = bdrv_get_aio_context(bs);
  bool bs_busy;
  
 @@ -1960,9 +1955,6 @@ void bdrv_drain_all(void)
 Also, NULL terminate the device_name to prevent double remove */
  void bdrv_make_anon(BlockDriverState *bs)
  {
 -if (bs-device_name[0] != '\0') {
 -QTAILQ_REMOVE(bdrv_states, bs, device_list);
 -}
  bs-device_name[0] = '\0';
  if (bs-node_name[0] != '\0') {
  QTAILQ_REMOVE(graph_bdrv_states, bs, node_list);
 @@ -2016,10 +2008,9 @@ static void bdrv_move_feature_fields(BlockDriverState 
 *bs_dest,
  /* job */
  bs_dest-job= bs_src-job;
  
 -/* keep the same entry in bdrv_states */
  pstrcpy(bs_dest-device_name, sizeof(bs_dest-device_name),
  bs_src-device_name);
 -bs_dest-device_list = bs_src-device_list;
 +
  memcpy(bs_dest-op_blockers, bs_src-op_blockers,
 sizeof(bs_dest-op_blockers));
  }
 @@ -2363,7 +2354,7 @@ int bdrv_commit_all(void)
  {
  BlockDriverState *bs;
  
 -QTAILQ_FOREACH(bs, bdrv_states, device_list) {
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
  AioContext *aio_context = bdrv_get_aio_context(bs);
  
  aio_context_acquire(aio_context);
 @@ -3807,7 +3798,7 @@ BlockDriverState *bdrv_find(const char *name)
  {
  BlockDriverState *bs;
  
 -QTAILQ_FOREACH(bs, bdrv_states, device_list) {
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
  if (!strcmp(name, bs-device_name)) {
  return bs;
  }
 @@ -3888,17 +3879,21 @@ bool bdrv_chain_contains(BlockDriverState *top, 
 BlockDriverState *base)
  
  BlockDriverState *bdrv_next(BlockDriverState *bs)
  {
 -if (!bs) {
 -return QTAILQ_FIRST(bdrv_states);
 -}
 -return QTAILQ_NEXT(bs, device_list);

 +BlockBackend *blk;
 +
 +for (blk = blk_next(bs ? bs-blk : NULL);
 + blk  !blk_bs(blk);
 + blk = blk_next(blk))
 +;
 +
 +return blk ? blk_bs(blk) : NULL;

I find ?: hard to read and I know at least one maintainer
prefers simples ifs to it.

BlockDriverState *bdrv_next(BlockDriverState *bs)
{
BlockBackend *blk = NULL;

/* if a BDS is provided use it's block backend as a starting point */
if (bs) {
blk = bl-blk;
}

/* looking for the next block backend having a BDS attached */
for (blk = blk_next(blk);
 blk  !blk_bs(blk);
 blk = blk_next(blk))
 ;

   /* the search was successfull */
   if (blk) {
   return blk_bs(blk);
   }

   return NULL;
}

I think getting rid of ?: spread the brain load sequentially by being less 
compact
but anyway your code is correct so it's up to you.

  }
  
  void bdrv_iterate(void (*it)(void *opaque, BlockDriverState *bs), void 
 *opaque)
  {
  BlockDriverState *bs;
  
 -QTAILQ_FOREACH(bs, bdrv_states, device_list) {
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
  it(opaque, bs);
  }
  }
 @@ -3918,7 +3913,7 @@ int bdrv_flush_all(void)
  BlockDriverState *bs;
  int result = 0;
  
 -

Re: [Qemu-devel] [PATCH 07/23] block: Eliminate bdrv_iterate(), use bdrv_next()

2014-09-11 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:36 (+0200), Markus Armbruster wrote :
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block-migration.c | 30 +++---
  block.c   |  9 -
  blockdev.c| 31 +--
  include/block/block.h |  2 --
  monitor.c | 33 +
  5 files changed, 37 insertions(+), 68 deletions(-)
 
 diff --git a/block-migration.c b/block-migration.c
 index 3ad31a2..cb3e16c 100644
 --- a/block-migration.c
 +++ b/block-migration.c
 @@ -343,12 +343,25 @@ static void unset_dirty_tracking(void)
  }
  }
  
 -static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
 +static void init_blk_migration(QEMUFile *f)
  {
 +BlockDriverState *bs;
  BlkMigDevState *bmds;
  int64_t sectors;
  
 -if (!bdrv_is_read_only(bs)) {
 +block_mig_state.submitted = 0;
 +block_mig_state.read_done = 0;
 +block_mig_state.transferred = 0;
 +block_mig_state.total_sector_sum = 0;
 +block_mig_state.prev_progress = -1;
 +block_mig_state.bulk_completed = 0;
 +block_mig_state.zero_blocks = migrate_zero_blocks();
 +
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
 +if (bdrv_is_read_only(bs)) {
 +continue;
 +}
 +
  sectors = bdrv_nb_sectors(bs);
  if (sectors = 0) {
  return;
 @@ -378,19 +391,6 @@ static void init_blk_migration_it(void *opaque, 
 BlockDriverState *bs)
  }
  }
  
 -static void init_blk_migration(QEMUFile *f)
 -{
 -block_mig_state.submitted = 0;
 -block_mig_state.read_done = 0;
 -block_mig_state.transferred = 0;
 -block_mig_state.total_sector_sum = 0;
 -block_mig_state.prev_progress = -1;
 -block_mig_state.bulk_completed = 0;
 -block_mig_state.zero_blocks = migrate_zero_blocks();
 -
 -bdrv_iterate(init_blk_migration_it, NULL);
 -}
 -
  /* Called with no lock taken.  */
  
  static int blk_mig_save_bulked_block(QEMUFile *f)
 diff --git a/block.c b/block.c
 index 89f9cf0..593d89b 100644
 --- a/block.c
 +++ b/block.c
 @@ -3889,15 +3889,6 @@ BlockDriverState *bdrv_next(BlockDriverState *bs)
  return blk ? blk_bs(blk) : NULL;
  }
  
 -void bdrv_iterate(void (*it)(void *opaque, BlockDriverState *bs), void 
 *opaque)
 -{
 -BlockDriverState *bs;
 -
 -for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
 -it(opaque, bs);
 -}
 -}
 -
  const char *bdrv_get_device_name(BlockDriverState *bs)
  {
  return bs-device_name;
 diff --git a/blockdev.c b/blockdev.c
 index 791f6d9..353563e 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -2520,26 +2520,21 @@ fail:
  qmp_output_visitor_cleanup(ov);
  }
  
 -static void do_qmp_query_block_jobs_one(void *opaque, BlockDriverState *bs)
 -{
 -BlockJobInfoList **prev = opaque;
 -BlockJob *job = bs-job;
 -
 -if (job) {
 -BlockJobInfoList *elem = g_new0(BlockJobInfoList, 1);
 -elem-value = block_job_query(bs-job);
 -(*prev)-next = elem;
 -*prev = elem;
 -}
 -}
 -
  BlockJobInfoList *qmp_query_block_jobs(Error **errp)
  {
 -/* Dummy is a fake list element for holding the head pointer */
 -BlockJobInfoList dummy = {};
 -BlockJobInfoList *prev = dummy;
 -bdrv_iterate(do_qmp_query_block_jobs_one, prev);
 -return dummy.next;
 +BlockJobInfoList *head = NULL, **p_next = head;
 +BlockDriverState *bs;
 +
 +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
 +if (bs-job) {
 +BlockJobInfoList *elem = g_new0(BlockJobInfoList, 1);
 +elem-value = block_job_query(bs-job);
 +*p_next = elem;
 +p_next = elem-next;
 +}
 +}
 +
 +return head;
  }
  
  QemuOptsList qemu_common_drive_opts = {
 diff --git a/include/block/block.h b/include/block/block.h
 index 95139c0..8cf9ea3 100644
 --- a/include/block/block.h
 +++ b/include/block/block.h
 @@ -409,8 +409,6 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
   Error **errp);
  bool bdrv_chain_contains(BlockDriverState *top, BlockDriverState *base);
  BlockDriverState *bdrv_next(BlockDriverState *bs);
 -void bdrv_iterate(void (*it)(void *opaque, BlockDriverState *bs),
 -  void *opaque);
  int bdrv_is_encrypted(BlockDriverState *bs);
  int bdrv_key_required(BlockDriverState *bs);
  int bdrv_set_key(BlockDriverState *bs, const char *key);
 diff --git a/monitor.c b/monitor.c
 index 34cee74..4ae66df 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -4208,24 +4208,6 @@ static void file_completion(Monitor *mon, const char 
 *input)
  closedir(ffs);
  }
  
 -typedef struct MonitorBlockComplete {
 -Monitor *mon;
 -const char *input;
 -} MonitorBlockComplete;
 -
 -static void block_completion_it(void *opaque, BlockDriverState *bs)
 -{
 -const char *name = bdrv_get_device_name(bs);
 -MonitorBlockComplete *mbc = opaque;
 -Monitor *mon = mbc-mon;
 -

Re: [Qemu-devel] [PATCH 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-11 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:37 (+0200), Markus Armbruster wrote :
 device_name[] is can become non-empty only in bdrv_new_named() and
 bdrv_move_feature_fields().  The latter is used only to undo damage
 done by bdrv_swap().  The former is called only by blk_new_with_bs().
 Therefore, when a BlockDriverState's device_name[] is non-empty, then
 it's owned by a BlockBackend.
 
 The converse is also true, because blk_attach_bs() is called only by
 blk_new_with_bs() so far.
 
 Furthermore, blk_new_with_bs() keeps the two names equal.
 
 Therefore, device_name[] is redundant.  Eliminate it.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block-migration.c | 12 +
  block.c   | 63 
 ++-
  block/block-backend.c | 12 -
  block/cow.c   |  2 +-
  block/mirror.c|  3 ++-
  block/qapi.c  |  6 ++---
  block/qcow.c  |  4 +--
  block/qcow2.c |  4 +--
  block/qed.c   |  2 +-
  block/quorum.c|  4 +--
  block/vdi.c   |  2 +-
  block/vhdx.c  |  2 +-
  block/vmdk.c  |  4 +--
  block/vpc.c   |  2 +-
  block/vvfat.c |  2 +-
  blockjob.c|  3 ++-
  include/block/block.h |  3 +--
  include/block/block_int.h |  2 --
  18 files changed, 53 insertions(+), 79 deletions(-)
 
 diff --git a/block-migration.c b/block-migration.c
 index cb3e16c..da30e93 100644
 --- a/block-migration.c
 +++ b/block-migration.c
 @@ -14,7 +14,9 @@
   */
  
  #include qemu-common.h
 -#include block/block_int.h
 +#include block/block.h
 +#include qemu/error-report.h
 +#include qemu/main-loop.h
  #include hw/hw.h
  #include qemu/queue.h
  #include qemu/timer.h
 @@ -130,9 +132,9 @@ static void blk_send(QEMUFile *f, BlkMigBlock * blk)
   | flags);
  
  /* device name */
 -len = strlen(blk-bmds-bs-device_name);
 +len = strlen(bdrv_get_device_name(blk-bmds-bs));
  qemu_put_byte(f, len);
 -qemu_put_buffer(f, (uint8_t *)blk-bmds-bs-device_name, len);
 +qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk-bmds-bs), len);
  
  /* if a block is zero we need to flush here since the network
   * bandwidth is now a lot higher than the storage device bandwidth.
 @@ -382,9 +384,9 @@ static void init_blk_migration(QEMUFile *f)
  
  if (bmds-shared_base) {
  DPRINTF(Start migration for %s with shared base image\n,
 -bs-device_name);
 +bdrv_get_device_name(bs));
  } else {
 -DPRINTF(Start full migration for %s\n, bs-device_name);
 +DPRINTF(Start full migration for %s\n, 
 bdrv_get_device_name(bs));
  }
  
  QSIMPLEQ_INSERT_TAIL(block_mig_state.bmds_list, bmds, entry);
 diff --git a/block.c b/block.c
 index 593d89b..61ea15d 100644
 --- a/block.c
 +++ b/block.c
 @@ -332,31 +332,6 @@ void bdrv_register(BlockDriver *bdrv)
  QLIST_INSERT_HEAD(bdrv_drivers, bdrv, list);
  }
  
 -/* create a new block device (by default it is empty) */
 -BlockDriverState *bdrv_new_named(const char *device_name, Error **errp)
 -{
 -BlockDriverState *bs;
 -
 -assert(*device_name);
 -
 -if (bdrv_find(device_name)) {
 -error_setg(errp, Device with id '%s' already exists,
 -   device_name);
 -return NULL;
 -}
 -if (bdrv_find_node(device_name)) {
 -error_setg(errp, Device with node-name '%s' already exists,
 -   device_name);
 -return NULL;
 -}
 -
 -bs = bdrv_new();
 -
 -pstrcpy(bs-device_name, sizeof(bs-device_name), device_name);
 -
 -return bs;
 -}
 -
  BlockDriverState *bdrv_new(void)
  {
  BlockDriverState *bs;
 @@ -1159,7 +1134,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
 BlockDriverState *backing_hd)
  } else if (backing_hd) {
  error_setg(bs-backing_blocker,
 device is used as backing hd of '%s',
 -   bs-device_name);
 +   bdrv_get_device_name(bs));
  }
  
  bs-backing_hd = backing_hd;
 @@ -1533,7 +1508,7 @@ int bdrv_open(BlockDriverState **pbs, const char 
 *filename,
  } else {
  error_setg(errp, Block format '%s' used by device '%s' doesn't 
 support the option '%s', drv-format_name,
 -   bs-device_name, entry-key);
 +   bdrv_get_device_name(bs), entry-key);
  }
  
  ret = -EINVAL;
 @@ -1740,7 +1715,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
 BlockReopenQueue *queue,
  if (!(reopen_state-bs-open_flags  BDRV_O_ALLOW_RDWR) 
  reopen_state-flags  BDRV_O_RDWR) {
  error_set(errp, QERR_DEVICE_IS_READ_ONLY,
 -  reopen_state-bs-device_name);
 +  bdrv_get_device_name(reopen_state-bs));
  goto error;
  }
  
 @@ -1767,7

Re: [Qemu-devel] [PATCH 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-11 Thread Benoît Canet

The Thursday 11 Sep 2014 à 13:34:33 (+0200), Benoît Canet wrote :
 The Wednesday 10 Sep 2014 à 10:13:37 (+0200), Markus Armbruster wrote :
  device_name[] is can become non-empty only in bdrv_new_named() and
  bdrv_move_feature_fields().  The latter is used only to undo damage
  done by bdrv_swap().  The former is called only by blk_new_with_bs().
  Therefore, when a BlockDriverState's device_name[] is non-empty, then
  it's owned by a BlockBackend.
  
  The converse is also true, because blk_attach_bs() is called only by
  blk_new_with_bs() so far.
  
  Furthermore, blk_new_with_bs() keeps the two names equal.
  
  Therefore, device_name[] is redundant.  Eliminate it.
  
  Signed-off-by: Markus Armbruster arm...@redhat.com
  ---
   block-migration.c | 12 +
   block.c   | 63 
  ++-
   block/block-backend.c | 12 -
   block/cow.c   |  2 +-
   block/mirror.c|  3 ++-
   block/qapi.c  |  6 ++---
   block/qcow.c  |  4 +--
   block/qcow2.c |  4 +--
   block/qed.c   |  2 +-
   block/quorum.c|  4 +--
   block/vdi.c   |  2 +-
   block/vhdx.c  |  2 +-
   block/vmdk.c  |  4 +--
   block/vpc.c   |  2 +-
   block/vvfat.c |  2 +-
   blockjob.c|  3 ++-
   include/block/block.h |  3 +--
   include/block/block_int.h |  2 --
   18 files changed, 53 insertions(+), 79 deletions(-)
  
  diff --git a/block-migration.c b/block-migration.c
  index cb3e16c..da30e93 100644
  --- a/block-migration.c
  +++ b/block-migration.c
  @@ -14,7 +14,9 @@
*/
   
   #include qemu-common.h
  -#include block/block_int.h
  +#include block/block.h
  +#include qemu/error-report.h
  +#include qemu/main-loop.h
   #include hw/hw.h
   #include qemu/queue.h
   #include qemu/timer.h
  @@ -130,9 +132,9 @@ static void blk_send(QEMUFile *f, BlkMigBlock * blk)
| flags);
   
   /* device name */
  -len = strlen(blk-bmds-bs-device_name);
  +len = strlen(bdrv_get_device_name(blk-bmds-bs));
   qemu_put_byte(f, len);
  -qemu_put_buffer(f, (uint8_t *)blk-bmds-bs-device_name, len);
  +qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk-bmds-bs), 
  len);
   
   /* if a block is zero we need to flush here since the network
* bandwidth is now a lot higher than the storage device bandwidth.
  @@ -382,9 +384,9 @@ static void init_blk_migration(QEMUFile *f)
   
   if (bmds-shared_base) {
   DPRINTF(Start migration for %s with shared base image\n,
  -bs-device_name);
  +bdrv_get_device_name(bs));
   } else {
  -DPRINTF(Start full migration for %s\n, bs-device_name);
  +DPRINTF(Start full migration for %s\n, 
  bdrv_get_device_name(bs));
   }
   
   QSIMPLEQ_INSERT_TAIL(block_mig_state.bmds_list, bmds, entry);
  diff --git a/block.c b/block.c
  index 593d89b..61ea15d 100644
  --- a/block.c
  +++ b/block.c
  @@ -332,31 +332,6 @@ void bdrv_register(BlockDriver *bdrv)
   QLIST_INSERT_HEAD(bdrv_drivers, bdrv, list);
   }
   
  -/* create a new block device (by default it is empty) */
  -BlockDriverState *bdrv_new_named(const char *device_name, Error **errp)
  -{
  -BlockDriverState *bs;
  -
  -assert(*device_name);
  -
  -if (bdrv_find(device_name)) {
  -error_setg(errp, Device with id '%s' already exists,
  -   device_name);
  -return NULL;
  -}
  -if (bdrv_find_node(device_name)) {
  -error_setg(errp, Device with node-name '%s' already exists,
  -   device_name);
  -return NULL;
  -}
  -
  -bs = bdrv_new();
  -
  -pstrcpy(bs-device_name, sizeof(bs-device_name), device_name);
  -
  -return bs;
  -}
  -
   BlockDriverState *bdrv_new(void)
   {
   BlockDriverState *bs;
  @@ -1159,7 +1134,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
  BlockDriverState *backing_hd)
   } else if (backing_hd) {
   error_setg(bs-backing_blocker,
  device is used as backing hd of '%s',
  -   bs-device_name);
  +   bdrv_get_device_name(bs));
   }
   
   bs-backing_hd = backing_hd;
  @@ -1533,7 +1508,7 @@ int bdrv_open(BlockDriverState **pbs, const char 
  *filename,
   } else {
   error_setg(errp, Block format '%s' used by device '%s' 
  doesn't 
  support the option '%s', drv-format_name,
  -   bs-device_name, entry-key);
  +   bdrv_get_device_name(bs), entry-key);
   }
   
   ret = -EINVAL;
  @@ -1740,7 +1715,7 @@ int bdrv_reopen_prepare(BDRVReopenState 
  *reopen_state, BlockReopenQueue *queue,
   if (!(reopen_state-bs-open_flags  BDRV_O_ALLOW_RDWR) 
   reopen_state-flags  BDRV_O_RDWR

Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:39 (+0200), Markus Armbruster wrote :
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  blockdev.c   |  3 +--
  hw/arm/collie.c  |  9 +
  hw/arm/gumstix.c |  5 +++--
  hw/arm/mainstone.c   |  8 
  hw/arm/musicpal.c| 11 ++-
  hw/arm/nseries.c |  6 --
  hw/arm/omap1.c   |  4 +++-
  hw/arm/omap2.c   |  4 +++-
  hw/arm/omap_sx1.c|  9 +
  hw/arm/pxa2xx.c  |  7 +--
  hw/arm/spitz.c   |  4 +++-
  hw/arm/versatilepb.c |  4 +++-
  hw/arm/vexpress.c|  4 +++-
  hw/arm/xilinx_zynq.c |  4 +++-
  hw/arm/z2.c  |  7 ---
  hw/block/fdc.c   | 16 +++-
  hw/block/m25p80.c|  5 +++--
  hw/block/xen_disk.c  |  2 +-
  hw/cris/axis_dev88.c |  3 ++-
  hw/display/tc6393xb.c|  4 +++-
  hw/i386/pc_sysfw.c   |  3 ++-
  hw/ide/piix.c|  6 --
  hw/ide/qdev.c|  4 +++-
  hw/isa/pc87312.c |  7 +--
  hw/lm32/lm32_boards.c| 13 +++--
  hw/lm32/milkymist.c  |  7 ---
  hw/microblaze/petalogix_ml605_mmu.c  |  5 +++--
  hw/microblaze/petalogix_s3adsp1800_mmu.c |  5 +++--
  hw/mips/mips_malta.c |  4 +++-
  hw/mips/mips_r4k.c   |  5 +++--
  hw/pci/pci-hotplug-old.c |  9 ++---
  hw/ppc/ppc405_boards.c   | 25 -
  hw/ppc/spapr.c   |  4 +++-
  hw/ppc/virtex_ml507.c|  5 +++--
  hw/scsi/scsi-bus.c   |  5 +++--
  hw/sd/milkymist-memcard.c|  7 +--
  hw/sd/pl181.c|  3 ++-
  hw/sd/sdhci.c|  3 ++-
  hw/sd/ssi-sd.c   |  3 ++-
  hw/sh4/r2d.c |  5 +++--
  hw/usb/dev-storage.c |  4 +++-
  hw/xtensa/xtfpga.c   |  4 +++-
  include/sysemu/blockdev.h|  1 -
  43 files changed, 163 insertions(+), 93 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index 353563e..5c75abd 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -219,7 +219,7 @@ static void bdrv_format_print(void *opaque, const char 
 *name)
  
  void drive_del(DriveInfo *dinfo)
  {
 -blk_unref(dinfo-bdrv-blk);
 +blk_unref(blk_by_legacy_dinfo(dinfo));
  }
  
  typedef struct {
 @@ -472,7 +472,6 @@ static BlockBackend *blockdev_init(const char *file, 
 QDict *bs_opts,
  
  dinfo = g_malloc0(sizeof(*dinfo));
  dinfo-id = g_strdup(qemu_opts_id(opts));
 -dinfo-bdrv = bs;
  blk_set_legacy_dinfo(blk, dinfo);
  
  if (!file || !*file) {
 diff --git a/hw/arm/collie.c b/hw/arm/collie.c
 index ed7851f..0247290 100644
 --- a/hw/arm/collie.c
 +++ b/hw/arm/collie.c
 @@ -15,6 +15,7 @@
  #include strongarm.h
  #include hw/arm/arm.h
  #include hw/block/flash.h
 +#include sysemu/block-backend.h
  #include sysemu/blockdev.h
  #include exec/address-spaces.h
  
 @@ -41,13 +42,13 @@ static void collie_init(MachineState *machine)
  
  dinfo = drive_get(IF_PFLASH, 0, 0);
  pflash_cfi01_register(SA_CS0, NULL, collie.fl1, 0x0200,
 -dinfo ? dinfo-bdrv : NULL, (64 * 1024),
 -512, 4, 0x00, 0x00, 0x00, 0x00, 0);
 +dinfo ? blk_bs(blk_by_legacy_dinfo(dinfo)) : NULL,
 +(64 * 1024), 512, 4, 0x00, 0x00, 0x00, 0x00, 0);
  
  dinfo = drive_get(IF_PFLASH, 0, 1);
  pflash_cfi01_register(SA_CS1, NULL, collie.fl2, 0x0200,
 -dinfo ? dinfo-bdrv : NULL, (64 * 1024),
 -512, 4, 0x00, 0x00, 0x00, 0x00, 0);
 +dinfo ? blk_bs(blk_by_legacy_dinfo(dinfo)) : NULL,
 +(64 * 1024), 512, 4, 0x00, 0x00, 0x00, 0x00, 0);
  
  sysbus_create_simple(scoop, 0x4080, NULL);
  
 diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c
 index 3f8465e..49f9339 100644
 --- a/hw/arm/gumstix.c
 +++ b/hw/arm/gumstix.c
 @@ -40,6 +40,7 @@
  #include hw/block/flash.h
  #include hw/devices.h
  #include hw/boards.h
 +#include sysemu/block-backend.h
  #include sysemu/blockdev.h
  #include exec/address-spaces.h
  #include sysemu/qtest.h
 @@ -71,7 +72,7 @@ static void connex_init(MachineState *machine)
  be = 0;
  #endif
  if (!pflash_cfi01_register(0x, NULL, connext.rom, connex_rom,
 -   dinfo ? dinfo-bdrv : NULL,
 +   dinfo ?

Re: [Qemu-devel] [PATCH 11/23] block: Rename BlockDriverAIOCB* to BlockAIOCB*

2014-09-11 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:40 (+0200), Markus Armbruster wrote :
 I'll use BlockDriverAIOCB with block backends shortly, and the name is
 going to fit badly there.  It's a block layer thing anyway, not just a
 block driver thing.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block-migration.c   |   2 +-
  block.c | 151 
 ++--
  block/archipelago.c |  30 -
  block/backup.c  |   2 +-
  block/blkdebug.c|  22 +++
  block/blkverify.c   |  20 +++---
  block/commit.c  |   2 +-
  block/curl.c|   8 +--
  block/iscsi.c   |   8 +--
  block/linux-aio.c   |   8 +--
  block/mirror.c  |   6 +-
  block/qed-gencb.c   |   4 +-
  block/qed-table.c   |  10 +--
  block/qed.c |  46 +++---
  block/qed.h |  12 ++--
  block/quorum.c  |  38 +--
  block/raw-aio.h |   8 +--
  block/raw-posix.c   |  32 +-
  block/raw-win32.c   |  16 ++---
  block/raw_bsd.c |   8 +--
  block/rbd.c |  58 -
  block/sheepdog.c|   4 +-
  block/stream.c  |   2 +-
  block/win32-aio.c   |   8 +--
  blockjob.c  |   4 +-
  dma-helpers.c   |  24 +++
  hw/block/nvme.h |   2 +-
  hw/ide/ahci.c   |   2 +-
  hw/ide/ahci.h   |   2 +-
  hw/ide/core.c   |  12 ++--
  hw/ide/internal.h   |  12 ++--
  hw/ide/macio.c  |   2 +-
  hw/ide/pci.c|   2 +-
  hw/ide/pci.h|   2 +-
  hw/ppc/mac.h|   2 +-
  hw/scsi/scsi-generic.c  |   2 +-
  include/block/aio.h |  12 ++--
  include/block/block.h   |  36 +--
  include/block/block_int.h   |  30 -
  include/block/blockjob.h|   4 +-
  include/block/thread-pool.h |   4 +-
  include/hw/scsi/scsi.h  |   2 +-
  include/monitor/monitor.h   |   4 +-
  include/sysemu/dma.h|  26 
  monitor.c   |   6 +-
  tests/test-thread-pool.c|   2 +-
  thread-pool.c   |   8 +--
  47 files changed, 353 insertions(+), 354 deletions(-)
 
 diff --git a/block-migration.c b/block-migration.c
 index da30e93..08db01a 100644
 --- a/block-migration.c
 +++ b/block-migration.c
 @@ -72,7 +72,7 @@ typedef struct BlkMigBlock {
  int nr_sectors;
  struct iovec iov;
  QEMUIOVector qiov;
 -BlockDriverAIOCB *aiocb;
 +BlockAIOCB *aiocb;
  
  /* Protected by block migration lock.  */
  int ret;
 diff --git a/block.c b/block.c
 index 34c8f8c..f71b87c 100644
 --- a/block.c
 +++ b/block.c
 @@ -61,12 +61,12 @@ struct BdrvDirtyBitmap {
  #define COROUTINE_POOL_RESERVATION 64 /* number of coroutines to reserve */
  
  static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
 -static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
 +static BlockAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
  int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 -BlockDriverCompletionFunc *cb, void *opaque);
 -static BlockDriverAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
 +BlockCompletionFunc *cb, void *opaque);
 +static BlockAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
  int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 -BlockDriverCompletionFunc *cb, void *opaque);
 +BlockCompletionFunc *cb, void *opaque);
  static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
   int64_t sector_num, int nb_sectors,
   QEMUIOVector *iov);
 @@ -79,14 +79,14 @@ static int coroutine_fn 
 bdrv_co_do_preadv(BlockDriverState *bs,
  static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
  int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
  BdrvRequestFlags flags);
 -static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
 -   int64_t sector_num,
 -   QEMUIOVector *qiov,
 -   int nb_sectors,
 -   BdrvRequestFlags flags,
 -   BlockDriverCompletionFunc *cb,
 -   void *opaque,
 -   bool is_write);
 +static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
 + int64_t sector_num,
 + QEMUIOVector *qiov,
 + int nb_sectors,
 + BdrvRequestFlags flags,
 + BlockCompletionFunc

Re: [Qemu-devel] [PATCH 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-11 Thread Benoît Canet

The Thursday 11 Sep 2014 à 07:00:41 (-0600), Eric Blake wrote :
 On 09/11/2014 05:34 AM, Benoît Canet wrote:
  The Wednesday 10 Sep 2014 à 10:13:37 (+0200), Markus Armbruster wrote :
  device_name[] is can become non-empty only in bdrv_new_named() and
  bdrv_move_feature_fields().  The latter is used only to undo damage
  done by bdrv_swap().  The former is called only by blk_new_with_bs().
  Therefore, when a BlockDriverState's device_name[] is non-empty, then
  it's owned by a BlockBackend.
 
 [lots of lines trimmed - it's not only okay, but desirable to trim out
 portions of a patch that you are okay with, in order to call attention
 to the problem spots that you are commenting on without making the
 reader have to scroll through pages of quoted context]
 
   
  -const char *bdrv_get_device_name(BlockDriverState *bs)
  +const char *bdrv_get_device_name(const BlockDriverState *bs)
   {
  -return bs-device_name;
  +const char *name = bs-blk ? blk_name(bs-blk) : NULL;
  +return name ?: ;
   }
  
  Why not ?
  
  return bs-blk ? blk_name(bs-blk) : ;
 
 If I understand right, it was because blk_name(bs-blk) may return NULL,

It think it can't: see patch 2 extract:

 +BlockBackend *blk_new(const char *name, Error **errp)
 +{
 +BlockBackend *blk = g_new0(BlockBackend, 1);
 + 
 +assert(name  name[0]);

 but this function is guaranteed to return non-NULL.

Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Benoît Canet


 +   blk_bs(blk_by_legacy_dinfo(dinfo)));

This seems to be a fairly common pattern: blk_bs(blk_by_legacy_dinfo()).
How about a helper function ?

Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Benoît Canet

The Thursday 11 Sep 2014 à 21:12:44 (+0200), Markus Armbruster wrote :
 Benoît Canet benoit.ca...@irqsave.net writes:
 
  +   blk_bs(blk_by_legacy_dinfo(dinfo)));
 
  This seems to be a fairly common pattern: blk_bs(blk_by_legacy_dinfo()).
  How about a helper function ?
 
 Yes, except the pattern is going to evaporate in patch 14 :)

haha

Re: [Qemu-devel] New requirement for getting block layer patches merged

2014-09-11 Thread Benoît Canet

 EOF
 ---
 If you have feedback or questions, let us know.  The process can be
 tweaked as time goes on so we can continue to improve.

Great mail.

Now we need a wiki entry describing the process.
Also we need something reminding who is the maintainer of the current week.

Best regards

Benoît

Re: [Qemu-devel] [PATCH 01/23] block: Split bdrv_new_named() off bdrv_new()

2014-09-10 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:30 (+0200), Markus Armbruster wrote :
 Creating an anonymous BDS can't fail.  Make that obvious.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block.c   | 26 +++---
  block/iscsi.c |  2 +-
  block/vvfat.c |  2 +-
  blockdev.c|  2 +-
  hw/block/xen_disk.c   |  2 +-
  include/block/block.h |  3 ++-
  qemu-img.c|  6 +++---
  qemu-io.c |  2 +-
  qemu-nbd.c|  2 +-
  9 files changed, 30 insertions(+), 17 deletions(-)
 
 diff --git a/block.c b/block.c
 index d06dd51..4b3bcd4 100644
 --- a/block.c
 +++ b/block.c
 @@ -335,10 +335,11 @@ void bdrv_register(BlockDriver *bdrv)
  }
  
  /* create a new block device (by default it is empty) */
 -BlockDriverState *bdrv_new(const char *device_name, Error **errp)
 +BlockDriverState *bdrv_new_named(const char *device_name, Error **errp)
  {
  BlockDriverState *bs;
 -int i;
 +
 +assert(*device_name);

This assert that device_name is not a null pointer.
But here we are pretty sure that the BDS should be named given the name of the 
function.
Should we bake an assert on device_name[0] != '\0' to avoid bdrv_new_named 
being called
with  as device_name ?

  
  if (bdrv_find(device_name)) {
  error_setg(errp, Device with id '%s' already exists,
 @@ -351,12 +352,23 @@ BlockDriverState *bdrv_new(const char *device_name, 
 Error **errp)
  return NULL;
  }
  
 -bs = g_new0(BlockDriverState, 1);
 -QLIST_INIT(bs-dirty_bitmaps);
 +bs = bdrv_new();
 +
  pstrcpy(bs-device_name, sizeof(bs-device_name), device_name);

  if (device_name[0] != '\0') {
Then we could remove this test.

  QTAILQ_INSERT_TAIL(bdrv_states, bs, device_list);
Because here would be too late anyway: an unammed BDS would have been created 
if device_name == .


  }
 +
 +return bs;
 +}
 +
 +BlockDriverState *bdrv_new(void)
 +{
 +BlockDriverState *bs;
 +int i;
 +
 +bs = g_new0(BlockDriverState, 1);
 +QLIST_INIT(bs-dirty_bitmaps);
  for (i = 0; i  BLOCK_OP_TYPE_MAX; i++) {
  QLIST_INIT(bs-op_blockers[i]);
  }
 @@ -1217,7 +1229,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
 *options, Error **errp)
  goto free_exit;
  }
  
 -backing_hd = bdrv_new(, errp);
 +backing_hd = bdrv_new();
  
  if (bs-backing_format[0] != '\0') {
  back_drv = bdrv_find_format(bs-backing_format);
 @@ -1346,7 +1358,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int 
 flags, Error **errp)
  qdict_put(snapshot_options, file.filename,
qstring_from_str(tmp_filename));
  
 -bs_snapshot = bdrv_new(, error_abort);
 +bs_snapshot = bdrv_new();
  
  ret = bdrv_open(bs_snapshot, NULL, NULL, snapshot_options,
  flags, bdrv_qcow2, local_err);
 @@ -1417,7 +1429,7 @@ int bdrv_open(BlockDriverState **pbs, const char 
 *filename,
  if (*pbs) {
  bs = *pbs;
  } else {
 -bs = bdrv_new(, error_abort);
 +bs = bdrv_new();
  }
  
  /* NULL means an empty set of options */
 diff --git a/block/iscsi.c b/block/iscsi.c
 index 3e19202..af3d0f6 100644
 --- a/block/iscsi.c
 +++ b/block/iscsi.c
 @@ -1528,7 +1528,7 @@ static int iscsi_create(const char *filename, QemuOpts 
 *opts, Error **errp)
  IscsiLun *iscsilun = NULL;
  QDict *bs_options;
  
 -bs = bdrv_new(, error_abort);
 +bs = bdrv_new();
  
  /* Read out options */
  total_size =
 diff --git a/block/vvfat.c b/block/vvfat.c
 index 731e591..6c9fde0 100644
 --- a/block/vvfat.c
 +++ b/block/vvfat.c
 @@ -2939,7 +2939,7 @@ static int enable_write_target(BDRVVVFATState *s, Error 
 **errp)
  unlink(s-qcow_filename);
  #endif
  
 -bdrv_set_backing_hd(s-bs, bdrv_new(, error_abort));
 +bdrv_set_backing_hd(s-bs, bdrv_new());
  s-bs-backing_hd-drv = vvfat_write_target;
  s-bs-backing_hd-opaque = g_new(void *, 1);
  *(void**)s-bs-backing_hd-opaque = s;
 diff --git a/blockdev.c b/blockdev.c
 index e919566..9fbd888 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -458,7 +458,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
 *bs_opts,
  /* init */
  dinfo = g_malloc0(sizeof(*dinfo));
  dinfo-id = g_strdup(qemu_opts_id(opts));
 -dinfo-bdrv = bdrv_new(dinfo-id, error);
 +dinfo-bdrv = bdrv_new_named(dinfo-id, error);
  if (error) {
  error_propagate(errp, error);
  goto bdrv_new_err;
 diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
 index 2dcef07..8bac7ff 100644
 --- a/hw/block/xen_disk.c
 +++ b/hw/block/xen_disk.c
 @@ -856,7 +856,7 @@ static int blk_connect(struct XenDevice *xendev)
  
  /* setup via xenbus - create new block driver instance */
  xen_be_printf(blkdev-xendev, 2, create new bdrv (xenbus 
 setup)\n);
 -blkdev-bs = bdrv_new(blkdev-dev, NULL);
 +blkdev-bs = bdrv_new_named(blkdev-dev, NULL);
  if

Re: [Qemu-devel] [PATCH 02/23] block: New BlockBackend

2014-09-10 Thread Benoît Canet

The Wednesday 10 Sep 2014 à 10:13:31 (+0200), Markus Armbruster wrote :
 A block device consists of a frontend device model and a backend.
 
 A block backend has a tree of block drivers doing the actual work.
 The tree is managed by the block layer.
 
 We currently use a single abstraction BlockDriverState both for tree
 nodes and the backend as a whole.  Drawbacks:
 
 * Its API includes both stuff that makes sense only at the block
   backend level (root of the tree) and stuff that's only for use
   within the block layer.  This makes the API bigger and more complex
   than necessary.  Moreover, it's not obvious which interfaces are
   meant for device models, and which really aren't.
 
 * Since device models keep a reference to their backend, the backend
   object can't just be destroyed.  But for media change, we need to
   replace the tree.  Our solution is to make the BlockDriverState
   generic, with actual driver state in a separate object, pointed to
   by member opaque.  That lets us replace the tree by deinitializing
   and reinitializing its root.  This special need of the root makes
   the data structure awkward everywhere in the tree.
 
 The general plan is to separate the APIs into block backend, for use
 by device models, monitor and whatever other code dealing with block
 backends, and block driver, for use by the block layer and whatever
 other code (if any) dealing with trees and tree nodes.
 
 Code dealing with block backends, device models in particular, should
 become completely oblivious of BlockDriverState.  This should let us
 clean up both APIs, and the tree data structures.
 
 This commit is a first step.  It creates a minimal block backend
 API: type BlockBackend and functions to create, destroy and find them.
 BlockBackend objects are created and destroyed, but not yet used for
 anything; that'll come shortly.
 
 BlockBackend is reference-counted.  Its reference count never exceeds
 one so far, but that's going to change.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  block/Makefile.objs|   2 +-
  block/block-backend.c  | 110 
 +
  blockdev.c |  10 +++-
  hw/block/xen_disk.c|  11 +
  include/qemu/typedefs.h|   1 +
  include/sysemu/block-backend.h |  26 ++
  qemu-img.c |  46 +
  qemu-io.c  |   8 +++
  qemu-nbd.c |   3 +-
  9 files changed, 214 insertions(+), 3 deletions(-)
  create mode 100644 block/block-backend.c
  create mode 100644 include/sysemu/block-backend.h
 
 diff --git a/block/Makefile.objs b/block/Makefile.objs
 index f45f939..a70140b 100644
 --- a/block/Makefile.objs
 +++ b/block/Makefile.objs
 @@ -5,7 +5,7 @@ block-obj-y += qed-check.o
  block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
  block-obj-$(CONFIG_QUORUM) += quorum.o
  block-obj-y += parallels.o blkdebug.o blkverify.o
 -block-obj-y += snapshot.o qapi.o
 +block-obj-y += block-backend.o snapshot.o qapi.o
  block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
  block-obj-$(CONFIG_POSIX) += raw-posix.o
  block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 diff --git a/block/block-backend.c b/block/block-backend.c
 new file mode 100644
 index 000..833f7d9
 --- /dev/null
 +++ b/block/block-backend.c
 @@ -0,0 +1,110 @@
 +/*
 + * QEMU Block backends
 + *
 + * Copyright (C) 2014 Red Hat, Inc.
 + *
 + * Authors:
 + *  Markus Armbruster arm...@redhat.com,
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or
 + * later.  See the COPYING file in the top-level directory.
 + */
 +
 +#include sysemu/block-backend.h
 +#include block/block_int.h
 +
 +struct BlockBackend {
 +char *name;
 +int refcnt;
 +QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
 +};
 +
 +static QTAILQ_HEAD(, BlockBackend) blk_backends =
 +QTAILQ_HEAD_INITIALIZER(blk_backends);
 +
 +/**
 + * blk_new:
 + * @name: name, must not be %NULL or empty
 + * @errp: return location for an error to be set on failure, or %NULL
 + *
 + * Create a new BlockBackend, with a reference count of one.  Fail if
 + * @name already exists.
 + *
 + * Returns: the BlockBackend on success, %NULL on failure
 + */
 +BlockBackend *blk_new(const char *name, Error **errp)

I am responding for the easy part first.

So here the blockbackend is identified by a name

 +{
 +BlockBackend *blk = g_new0(BlockBackend, 1);
 +
 +assert(name  name[0]);
 +if (blk_by_name(name)) {

 +error_setg(errp, Device with id '%s' already exists, name);

But here is it an id or a name ?
Do we need to make a choice everywhere in the code between id and name ?

 +return NULL;
 +}
 +blk-name = g_strdup(name);
 +blk-refcnt = 1;
 +QTAILQ_INSERT_TAIL(blk_backends, blk, link);
 +return blk;
 +}
 +
 +static void blk_delete(BlockBackend *blk)
 +{
 +assert(!blk-refcnt);
 +QTAILQ_REMOVE(blk_backends, blk,

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2243 matches

Mail list logo