Re: [Qemu-block] [PATCH 3/3] block: remove legacy I/O throttling

2017-08-02 Thread Manos Pitsidianakis

On Wed, Aug 02, 2017 at 12:33:19PM +0200, Kevin Wolf wrote:

Am 02.08.2017 um 12:07 hat Stefan Hajnoczi geschrieben:

On Tue, Aug 01, 2017 at 04:49:07PM +0300, Manos Pitsidianakis wrote:
> diff --git a/block.c b/block.c
> index 9ebdba28b0..c6aad25286 100644
> --- a/block.c
> +++ b/block.c
> @@ -1975,6 +1975,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
>  child = g_new(BdrvChild, 1);
>  *child = (BdrvChild) {
>  .bs = NULL,
> +.parent_bs  = NULL,
>  .name   = g_strdup(child_name),
>  .role   = child_role,
>  .perm   = perm,
> @@ -2009,6 +2010,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
>  if (child == NULL) {
>  return NULL;
>  }
> +child->parent_bs = parent_bs;
>
>  QLIST_INSERT_HEAD(_bs->children, child, next);
>  return child;
> @@ -3729,6 +3731,12 @@ const char *bdrv_get_parent_name(const 
BlockDriverState *bs)
>  return name;
>  }
>  }
> +if (c->parent_bs && c->parent_bs->implicit) {
> +name = bdrv_get_parent_name(c->parent_bs);
> +if (name && *name) {
> +return name;
> +}
> +}
>  }
>
>  return NULL;

This should be a separate patch.

Who updates parent_bs if the parent is changed (e.g.
bdrv_replace_node())?

We already have bs->parents.  Why is BdrvChild->parent_bs needed?


I haven't look at the whole patch yet, but BdrvChild->parent_bs is a
thing that intentionally doesn't exist. A node simply has no business
knowing its parents - which may or may not be BlockDriverStates (the
obvious example where they aren't BDSes are BlockBackends, but block
jobs own some BdrvChild objects, too).

Usually the replacement is a BdrvChildRole callback.

Kevin


I think accessing the parent bs is necessary for the reasons I specified 
in my reply to Stefan. Will a callback that returns BdrvChild->opaque 
(parent_bs) for child_file and child_backing be okay?




signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v3 4/7] block: convert ThrottleGroup to object with QOM

2017-08-02 Thread Stefan Hajnoczi
On Wed, Aug 02, 2017 at 01:57:04PM +0300, Manos Pitsidianakis wrote:
> On Wed, Aug 02, 2017 at 11:39:22AM +0100, Stefan Hajnoczi wrote:
> > On Tue, Aug 01, 2017 at 07:49:33PM +0300, Manos Pitsidianakis wrote:
> > > On Tue, Aug 01, 2017 at 04:47:03PM +0100, Stefan Hajnoczi wrote:
> > > > On Mon, Jul 31, 2017 at 12:54:40PM +0300, Manos Pitsidianakis wrote:
> > > > > ThrottleGroup is converted to an object. This will allow the future
> > > > > throttle block filter drive easy creation and configuration of 
> > > > > throttle
> > > > > groups in QMP and cli.
> > > > >
> > > > > A new QAPI struct, ThrottleLimits, is introduced to provide a shared
> > > > > struct for all throttle configuration needs in QMP.
> > > > >
> > > > > ThrottleGroups can be created via CLI as
> > > > > -object throttle-group,id=foo,x-iops-total=100,x-..
> > > > > where x-* are individual limit properties. Since we can't add 
> > > > > non-scalar
> > > > > properties in -object this interface must be used instead. However,
> > > > > setting these properties must be disabled after initialization because
> > > > > certain combinations of limits are forbidden and thus configuration
> > > > > changes should be done in one transaction. The individual properties
> > > > > will go away when support for non-scalar values in CLI is implemented
> > > > > and thus are marked as experimental.
> > > > >
> > > > > ThrottleGroup also has a `limits` property that uses the 
> > > > > ThrottleLimits
> > > > > struct.  It can be used to create ThrottleGroups or set the
> > > > > configuration in existing groups as follows:
> > > > >
> > > > > { "execute": "object-add",
> > > > >   "arguments": {
> > > > > "qom-type": "throttle-group",
> > > > > "id": "foo",
> > > > > "props" : {
> > > > >   "limits": {
> > > > >   "iops-total": 100
> > > > >   }
> > > > > }
> > > > >   }
> > > > > }
> > > > > { "execute" : "qom-set",
> > > > > "arguments" : {
> > > > > "path" : "foo",
> > > > > "property" : "limits",
> > > > > "value" : {
> > > > > "iops-total" : 99
> > > > > }
> > > > > }
> > > > > }
> > > > >
> > > > > This also means a group's configuration can be fetched with qom-get.
> > > > >
> > > > > ThrottleGroups can be anonymous which means they can't get accessed by
> > > > > other users ie they will always be units instead of group (Because 
> > > > > they
> > > > > have one ThrottleGroupMember).
> > > >
> > > > blockdev.c automatically assigns -drive id= to the group name if
> > > > throttling.group= wasn't given.  So who will use anonymous single-drive
> > > > mode?
> > > 
> > > Manual filter nodes. It's possible to not pass a group name value and the
> > > resulting group will be anonymous. Are you suggesting to move this change 
> > > to
> > > the throttle filter patch?
> > 
> > What is the use case?  Human users will stick to the legacy syntax
> > because it's convenient.  Management tools will use the filter
> > explicitly in the future, and it's easy for them to choose a name.
> > 
> > Unless there is a need for this case I'd prefer to make the group name
> > mandatory.  That way there are less code paths to worry about.
> 
> I think Kevin requested this though I don't really remember the use case.

Kevin: Do you still want this?

> > 
> > > > > @@ -87,32 +99,30 @@ ThrottleState *throttle_group_incref(const char 
> > > > > *name)
> > > > >
> > > > >  qemu_mutex_lock(_groups_lock);
> > > > >
> > > > > -/* Look for an existing group with that name */
> > > > > -QTAILQ_FOREACH(iter, _groups, list) {
> > > > > -if (!strcmp(name, iter->name)) {
> > > > > -tg = iter;
> > > > > -break;
> > > > > +if (name) {
> > > > > +/* Look for an existing group with that name */
> > > > > +QTAILQ_FOREACH(iter, _groups, list) {
> > > > > +if (!g_strcmp0(name, iter->name)) {
> > > > > +tg = iter;
> > > > > +break;
> > > > > +}
> > > > >  }
> > > > >  }
> > > > >
> > > > >  /* Create a new one if not found */
> > > > >  if (!tg) {
> > > > > -tg = g_new0(ThrottleGroup, 1);
> > > > > +/* new ThrottleGroup obj will have a refcnt = 1 */
> > > > > +tg = THROTTLE_GROUP(object_new(TYPE_THROTTLE_GROUP));
> > > > >  tg->name = g_strdup(name);
> > > > > -tg->clock_type = QEMU_CLOCK_REALTIME;
> > > > > -
> > > > > -if (qtest_enabled()) {
> > > > > -/* For testing block IO throttling only */
> > > > > -tg->clock_type = QEMU_CLOCK_VIRTUAL;
> > > > > -}
> > > > > -qemu_mutex_init(>lock);
> > > > > -throttle_init(>ts);
> > > > > -QLIST_INIT(>head);
> > > > > -
> > > > > -QTAILQ_INSERT_TAIL(_groups, tg, list);
> > > > > +throttle_group_obj_complete((UserCreatable *)tg, 
> > > > > _abort);
> > > > >  }
> > > > >
> > > > > -tg->refcount++;
> > 

Re: [Qemu-block] [PATCH 3/3] block: remove legacy I/O throttling

2017-08-02 Thread Stefan Hajnoczi
On Wed, Aug 02, 2017 at 01:34:46PM +0300, Manos Pitsidianakis wrote:
> On Wed, Aug 02, 2017 at 11:07:24AM +0100, Stefan Hajnoczi wrote:
> > On Tue, Aug 01, 2017 at 04:49:07PM +0300, Manos Pitsidianakis wrote:
> > > @@ -3729,6 +3731,12 @@ const char *bdrv_get_parent_name(const 
> > > BlockDriverState *bs)
> > >  return name;
> > >  }
> > >  }
> > > +if (c->parent_bs && c->parent_bs->implicit) {
> > > +name = bdrv_get_parent_name(c->parent_bs);
> > > +if (name && *name) {
> > > +return name;
> > > +}
> > > +}
> > >  }
> > > 
> > >  return NULL;
> > 
> > This should be a separate patch.
> > 
> > Who updates parent_bs if the parent is changed (e.g.
> > bdrv_replace_node())?
> > 
> > We already have bs->parents.  Why is BdrvChild->parent_bs needed?
> > 
> 
> If I haven't misunderstood this, BdrvChild holds only the child part of the
> parent-child relationship and there's no way to access a parent from
> bs->parents. bdrv_replace_node() will thus only replace the child part in
> BdrvChild from the aspect of the parent. In the old child bs's perspective,
> one of the nodes of bs->parents is removed and in the new child bs's
> perspective a new node in bs->parents was inserted. parent_bs thus remains
> immutable.
> 
> child->parent_bs is needed in this patch because in jobs if a job-ID is not
> specified the parent name is used, but this fails if the parent is an
> implicit node instead of BlockBackend and causes a regression (certain job
> setups suddenly need an explicit job ID instead of just working).

Please see Kevin's reply to my email.

> > > -throttle_group_unregister_tgm(>public.throttle_group_member);
> > > -bdrv_drained_end(blk_bs(blk));
> > > +BlockDriverState *bs, *throttle_node;
> > > +
> > > +throttle_node = blk_get_public(blk)->throttle_node;
> > 
> > Is blk_get_public() still necessary?  Perhaps we can do away with the
> > concept of the public struct now.  It doesn't need to be done in this
> > patch though.
> 
> I can include a patch to move throttle_node to BlockBackend and remove all
> BlockBackendPublic code, is that okay?

That would be a nice cleanup, thanks!


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2 4/4] maint: Include bug-reporting info in --help output.

2017-08-02 Thread Daniel P. Berrange
On Fri, Jul 28, 2017 at 11:47:08AM -0500, Eric Blake wrote:
> These days, many programs are including a bug-reporting address,
> or better yet, a link to the project web site, at the tail of
> their --help output.  However, we were not very consistent at
> doing so: only qemu-nbd and qemu-qa mentioned anything, with the
> latter pointing to an individual person instead of the project.
> 
> Add a new #define that sets up a uniform string, mentioning both
> bug reporting instructions and overall project details, and which
> a downstream vendor could tweak if they want bugs to go to a
> downstream database.  Then use it in all of our binaries which
> have --help output.
> 
> The canned text intentionally references http:// instead of https://
> because our https website currently causes certificate errors in
> some browsers.  That can be tweaked later once we have resolved the
> web site issued.
> 
> Signed-off-by: Eric Blake 
> ---
> 
> v2: tweak text to capitalize QEMU and use consistent trailing .
> 
>  include/qemu-common.h | 5 +
>  vl.c  | 4 +++-
>  bsd-user/main.c   | 2 ++
>  linux-user/main.c | 4 +++-
>  qemu-img.c| 2 +-
>  qemu-io.c | 5 +++--
>  qemu-nbd.c| 2 +-
>  qga/main.c| 2 +-
>  8 files changed, 19 insertions(+), 7 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH v2 3/4] qga: Give more --version information

2017-08-02 Thread Daniel P. Berrange
On Fri, Jul 28, 2017 at 11:47:07AM -0500, Eric Blake wrote:
> Include the package version information (useful for detecting
> builds from git or downstream backports), and the copyright notice.
> 
> Signed-off-by: Eric Blake 
> ---
>  qga/main.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

Reviewed-by: Daniel P. Berrange 
 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/4] qemu-io: Give more --version information

2017-08-02 Thread Daniel P. Berrange
On Fri, Jul 28, 2017 at 11:47:06AM -0500, Eric Blake wrote:
> Include the package version information (useful for detecting
> builds from git or downstream backports), and the copyright notice.
> 
> Signed-off-by: Eric Blake 
> ---
>  qemu-io.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/4] qemu-img: Sort sub-command names in --help

2017-08-02 Thread Daniel P. Berrange
On Fri, Jul 28, 2017 at 11:47:05AM -0500, Eric Blake wrote:
> 'amend' was the only sub-command not listed alphabetically; hoist
> it earlier, and separate the @end table block to make it easier
> to copy-and-paste the addition of future sub-commands.
> 
> Signed-off-by: Eric Blake 
> ---
>  qemu-img-cmds.hx | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH v3 4/7] block: convert ThrottleGroup to object with QOM

2017-08-02 Thread Manos Pitsidianakis

On Wed, Aug 02, 2017 at 11:39:22AM +0100, Stefan Hajnoczi wrote:

On Tue, Aug 01, 2017 at 07:49:33PM +0300, Manos Pitsidianakis wrote:

On Tue, Aug 01, 2017 at 04:47:03PM +0100, Stefan Hajnoczi wrote:
> On Mon, Jul 31, 2017 at 12:54:40PM +0300, Manos Pitsidianakis wrote:
> > ThrottleGroup is converted to an object. This will allow the future
> > throttle block filter drive easy creation and configuration of throttle
> > groups in QMP and cli.
> >
> > A new QAPI struct, ThrottleLimits, is introduced to provide a shared
> > struct for all throttle configuration needs in QMP.
> >
> > ThrottleGroups can be created via CLI as
> > -object throttle-group,id=foo,x-iops-total=100,x-..
> > where x-* are individual limit properties. Since we can't add non-scalar
> > properties in -object this interface must be used instead. However,
> > setting these properties must be disabled after initialization because
> > certain combinations of limits are forbidden and thus configuration
> > changes should be done in one transaction. The individual properties
> > will go away when support for non-scalar values in CLI is implemented
> > and thus are marked as experimental.
> >
> > ThrottleGroup also has a `limits` property that uses the ThrottleLimits
> > struct.  It can be used to create ThrottleGroups or set the
> > configuration in existing groups as follows:
> >
> > { "execute": "object-add",
> >   "arguments": {
> > "qom-type": "throttle-group",
> > "id": "foo",
> > "props" : {
> >   "limits": {
> >   "iops-total": 100
> >   }
> > }
> >   }
> > }
> > { "execute" : "qom-set",
> > "arguments" : {
> > "path" : "foo",
> > "property" : "limits",
> > "value" : {
> > "iops-total" : 99
> > }
> > }
> > }
> >
> > This also means a group's configuration can be fetched with qom-get.
> >
> > ThrottleGroups can be anonymous which means they can't get accessed by
> > other users ie they will always be units instead of group (Because they
> > have one ThrottleGroupMember).
>
> blockdev.c automatically assigns -drive id= to the group name if
> throttling.group= wasn't given.  So who will use anonymous single-drive
> mode?

Manual filter nodes. It's possible to not pass a group name value and the
resulting group will be anonymous. Are you suggesting to move this change to
the throttle filter patch?


What is the use case?  Human users will stick to the legacy syntax
because it's convenient.  Management tools will use the filter
explicitly in the future, and it's easy for them to choose a name.

Unless there is a need for this case I'd prefer to make the group name
mandatory.  That way there are less code paths to worry about.


I think Kevin requested this though I don't really remember the use 
case.



> > @@ -87,32 +99,30 @@ ThrottleState *throttle_group_incref(const char *name)
> >
> >  qemu_mutex_lock(_groups_lock);
> >
> > -/* Look for an existing group with that name */
> > -QTAILQ_FOREACH(iter, _groups, list) {
> > -if (!strcmp(name, iter->name)) {
> > -tg = iter;
> > -break;
> > +if (name) {
> > +/* Look for an existing group with that name */
> > +QTAILQ_FOREACH(iter, _groups, list) {
> > +if (!g_strcmp0(name, iter->name)) {
> > +tg = iter;
> > +break;
> > +}
> >  }
> >  }
> >
> >  /* Create a new one if not found */
> >  if (!tg) {
> > -tg = g_new0(ThrottleGroup, 1);
> > +/* new ThrottleGroup obj will have a refcnt = 1 */
> > +tg = THROTTLE_GROUP(object_new(TYPE_THROTTLE_GROUP));
> >  tg->name = g_strdup(name);
> > -tg->clock_type = QEMU_CLOCK_REALTIME;
> > -
> > -if (qtest_enabled()) {
> > -/* For testing block IO throttling only */
> > -tg->clock_type = QEMU_CLOCK_VIRTUAL;
> > -}
> > -qemu_mutex_init(>lock);
> > -throttle_init(>ts);
> > -QLIST_INIT(>head);
> > -
> > -QTAILQ_INSERT_TAIL(_groups, tg, list);
> > +throttle_group_obj_complete((UserCreatable *)tg, _abort);
> >  }
> >
> > -tg->refcount++;
> > +qemu_mutex_lock(>lock);
> > +if (!QLIST_EMPTY(>head)) {
> > +/* only ref if the group is not empty */
> > +object_ref(OBJECT(tg));
> > +}
> > +qemu_mutex_unlock(>lock);
>
> How do the refcounts work in various cases (legacy -drive
> throttling.group and -object throttle-group with 0, 1, or multiple
> drives)?
>
> It's not obvious to me that this code works in all cases.

Object is created with object_new(): ref count is 1
Each time we call throttle_group_incref() to add a new member to the group,
we increase the ref count by 1. We skip the first time we do that because
there's already a reference. When all members are removed, object is
deleted.


If the ThrottleGroup was created with -object throttle-group it
shouldn't disappear when the last 

Re: [Qemu-block] [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-08-02 Thread Stefan Hajnoczi
On Fri, Jul 28, 2017 at 01:55:38PM +0200, Kevin Wolf wrote:
> Am 28.07.2017 um 00:09 hat John Snow geschrieben:
> > On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote:
> > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > I think I agree with Stefan's concerns that this is more of a
> > temporary workaround for the benefit of jobs, but that there may well
> > be other reasons (especially in the future) where graph manipulations
> > may occur invisibly to the user.
> 
> We should do our best to avoid this. Implicit graph changes only make
> libvirt's life hard. I agree that we'll have many more graph changes in
> the future, but let's keep them as explicit as possible.

Speaking of implicit graph changes: the COLO replication driver
(block/replication.c) uses the backup job internally and that requires
before write notifiers.  That means there is an implicit graph change
when COLO decides to complete the backup job.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v3 4/7] block: convert ThrottleGroup to object with QOM

2017-08-02 Thread Stefan Hajnoczi
On Tue, Aug 01, 2017 at 07:49:33PM +0300, Manos Pitsidianakis wrote:
> On Tue, Aug 01, 2017 at 04:47:03PM +0100, Stefan Hajnoczi wrote:
> > On Mon, Jul 31, 2017 at 12:54:40PM +0300, Manos Pitsidianakis wrote:
> > > ThrottleGroup is converted to an object. This will allow the future
> > > throttle block filter drive easy creation and configuration of throttle
> > > groups in QMP and cli.
> > > 
> > > A new QAPI struct, ThrottleLimits, is introduced to provide a shared
> > > struct for all throttle configuration needs in QMP.
> > > 
> > > ThrottleGroups can be created via CLI as
> > > -object throttle-group,id=foo,x-iops-total=100,x-..
> > > where x-* are individual limit properties. Since we can't add non-scalar
> > > properties in -object this interface must be used instead. However,
> > > setting these properties must be disabled after initialization because
> > > certain combinations of limits are forbidden and thus configuration
> > > changes should be done in one transaction. The individual properties
> > > will go away when support for non-scalar values in CLI is implemented
> > > and thus are marked as experimental.
> > > 
> > > ThrottleGroup also has a `limits` property that uses the ThrottleLimits
> > > struct.  It can be used to create ThrottleGroups or set the
> > > configuration in existing groups as follows:
> > > 
> > > { "execute": "object-add",
> > >   "arguments": {
> > > "qom-type": "throttle-group",
> > > "id": "foo",
> > > "props" : {
> > >   "limits": {
> > >   "iops-total": 100
> > >   }
> > > }
> > >   }
> > > }
> > > { "execute" : "qom-set",
> > > "arguments" : {
> > > "path" : "foo",
> > > "property" : "limits",
> > > "value" : {
> > > "iops-total" : 99
> > > }
> > > }
> > > }
> > > 
> > > This also means a group's configuration can be fetched with qom-get.
> > > 
> > > ThrottleGroups can be anonymous which means they can't get accessed by
> > > other users ie they will always be units instead of group (Because they
> > > have one ThrottleGroupMember).
> > 
> > blockdev.c automatically assigns -drive id= to the group name if
> > throttling.group= wasn't given.  So who will use anonymous single-drive
> > mode?
> 
> Manual filter nodes. It's possible to not pass a group name value and the
> resulting group will be anonymous. Are you suggesting to move this change to
> the throttle filter patch?

What is the use case?  Human users will stick to the legacy syntax
because it's convenient.  Management tools will use the filter
explicitly in the future, and it's easy for them to choose a name.

Unless there is a need for this case I'd prefer to make the group name
mandatory.  That way there are less code paths to worry about.

> > > @@ -87,32 +99,30 @@ ThrottleState *throttle_group_incref(const char *name)
> > > 
> > >  qemu_mutex_lock(_groups_lock);
> > > 
> > > -/* Look for an existing group with that name */
> > > -QTAILQ_FOREACH(iter, _groups, list) {
> > > -if (!strcmp(name, iter->name)) {
> > > -tg = iter;
> > > -break;
> > > +if (name) {
> > > +/* Look for an existing group with that name */
> > > +QTAILQ_FOREACH(iter, _groups, list) {
> > > +if (!g_strcmp0(name, iter->name)) {
> > > +tg = iter;
> > > +break;
> > > +}
> > >  }
> > >  }
> > > 
> > >  /* Create a new one if not found */
> > >  if (!tg) {
> > > -tg = g_new0(ThrottleGroup, 1);
> > > +/* new ThrottleGroup obj will have a refcnt = 1 */
> > > +tg = THROTTLE_GROUP(object_new(TYPE_THROTTLE_GROUP));
> > >  tg->name = g_strdup(name);
> > > -tg->clock_type = QEMU_CLOCK_REALTIME;
> > > -
> > > -if (qtest_enabled()) {
> > > -/* For testing block IO throttling only */
> > > -tg->clock_type = QEMU_CLOCK_VIRTUAL;
> > > -}
> > > -qemu_mutex_init(>lock);
> > > -throttle_init(>ts);
> > > -QLIST_INIT(>head);
> > > -
> > > -QTAILQ_INSERT_TAIL(_groups, tg, list);
> > > +throttle_group_obj_complete((UserCreatable *)tg, _abort);
> > >  }
> > > 
> > > -tg->refcount++;
> > > +qemu_mutex_lock(>lock);
> > > +if (!QLIST_EMPTY(>head)) {
> > > +/* only ref if the group is not empty */
> > > +object_ref(OBJECT(tg));
> > > +}
> > > +qemu_mutex_unlock(>lock);
> > 
> > How do the refcounts work in various cases (legacy -drive
> > throttling.group and -object throttle-group with 0, 1, or multiple
> > drives)?
> > 
> > It's not obvious to me that this code works in all cases.
> 
> Object is created with object_new(): ref count is 1
> Each time we call throttle_group_incref() to add a new member to the group,
> we increase the ref count by 1. We skip the first time we do that because
> there's already a reference. When all members are removed, object is
> deleted.

If the 

Re: [Qemu-block] [PATCH 3/3] block: remove legacy I/O throttling

2017-08-02 Thread Manos Pitsidianakis

On Wed, Aug 02, 2017 at 11:07:24AM +0100, Stefan Hajnoczi wrote:

On Tue, Aug 01, 2017 at 04:49:07PM +0300, Manos Pitsidianakis wrote:

diff --git a/block.c b/block.c
index 9ebdba28b0..c6aad25286 100644
--- a/block.c
+++ b/block.c
@@ -1975,6 +1975,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
 child = g_new(BdrvChild, 1);
 *child = (BdrvChild) {
 .bs = NULL,
+.parent_bs  = NULL,
 .name   = g_strdup(child_name),
 .role   = child_role,
 .perm   = perm,
@@ -2009,6 +2010,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
 if (child == NULL) {
 return NULL;
 }
+child->parent_bs = parent_bs;

 QLIST_INSERT_HEAD(_bs->children, child, next);
 return child;
@@ -3729,6 +3731,12 @@ const char *bdrv_get_parent_name(const BlockDriverState 
*bs)
 return name;
 }
 }
+if (c->parent_bs && c->parent_bs->implicit) {
+name = bdrv_get_parent_name(c->parent_bs);
+if (name && *name) {
+return name;
+}
+}
 }

 return NULL;


This should be a separate patch.

Who updates parent_bs if the parent is changed (e.g.
bdrv_replace_node())?

We already have bs->parents.  Why is BdrvChild->parent_bs needed?



If I haven't misunderstood this, BdrvChild holds only the child part of 
the parent-child relationship and there's no way to access a parent from 
bs->parents. bdrv_replace_node() will thus only replace the child part 
in BdrvChild from the aspect of the parent. In the old child bs's 
perspective, one of the nodes of bs->parents is removed and in the new 
child bs's perspective a new node in bs->parents was inserted. parent_bs 
thus remains immutable.


child->parent_bs is needed in this patch because in jobs if a job-ID is 
not specified the parent name is used, but this fails if the parent is 
an implicit node instead of BlockBackend and causes a regression 
(certain job setups suddenly need an explicit job ID instead of just 
working).



-void blk_io_limits_disable(BlockBackend *blk)
+void blk_io_limits_disable(BlockBackend *blk, Error **errp)
 {
-assert(blk->public.throttle_group_member.throttle_state);
-bdrv_drained_begin(blk_bs(blk));


Is it safe to drop drained_begin?  We must ensure that no I/O requests
run during this function.


Thanks, I will put it back in.


-throttle_group_unregister_tgm(>public.throttle_group_member);
-bdrv_drained_end(blk_bs(blk));
+BlockDriverState *bs, *throttle_node;
+
+throttle_node = blk_get_public(blk)->throttle_node;


Is blk_get_public() still necessary?  Perhaps we can do away with the
concept of the public struct now.  It doesn't need to be done in this
patch though.


I can include a patch to move throttle_node to BlockBackend and remove 
all BlockBackendPublic code, is that okay?





+
+assert(throttle_node && throttle_node->refcnt == 1);


Are you sure the throttle_node->refcnt == 1 assertion holds?  For
example, does the built-in NBD server have a reference to the throttle
node if nbd-server-add is called after throttling has been enabled?

Since we have the blk->throttle_node pointer we know we're the owner.
Others may be using the node too but we may choose to remove it at any
time.


Hm.. If that's possible I guess we want the removal to be visible to the 
nbd server too. I will use bdrv_replace_node() instead.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 3/3] block: remove legacy I/O throttling

2017-08-02 Thread Kevin Wolf
Am 02.08.2017 um 12:07 hat Stefan Hajnoczi geschrieben:
> On Tue, Aug 01, 2017 at 04:49:07PM +0300, Manos Pitsidianakis wrote:
> > diff --git a/block.c b/block.c
> > index 9ebdba28b0..c6aad25286 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1975,6 +1975,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
> > *child_bs,
> >  child = g_new(BdrvChild, 1);
> >  *child = (BdrvChild) {
> >  .bs = NULL,
> > +.parent_bs  = NULL,
> >  .name   = g_strdup(child_name),
> >  .role   = child_role,
> >  .perm   = perm,
> > @@ -2009,6 +2010,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState 
> > *parent_bs,
> >  if (child == NULL) {
> >  return NULL;
> >  }
> > +child->parent_bs = parent_bs;
> >  
> >  QLIST_INSERT_HEAD(_bs->children, child, next);
> >  return child;
> > @@ -3729,6 +3731,12 @@ const char *bdrv_get_parent_name(const 
> > BlockDriverState *bs)
> >  return name;
> >  }
> >  }
> > +if (c->parent_bs && c->parent_bs->implicit) {
> > +name = bdrv_get_parent_name(c->parent_bs);
> > +if (name && *name) {
> > +return name;
> > +}
> > +}
> >  }
> >  
> >  return NULL;
> 
> This should be a separate patch.
> 
> Who updates parent_bs if the parent is changed (e.g.
> bdrv_replace_node())?
> 
> We already have bs->parents.  Why is BdrvChild->parent_bs needed?

I haven't look at the whole patch yet, but BdrvChild->parent_bs is a
thing that intentionally doesn't exist. A node simply has no business
knowing its parents - which may or may not be BlockDriverStates (the
obvious example where they aren't BDSes are BlockBackends, but block
jobs own some BdrvChild objects, too).

Usually the replacement is a BdrvChildRole callback.

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 3/3] block: remove legacy I/O throttling

2017-08-02 Thread Stefan Hajnoczi
On Tue, Aug 01, 2017 at 04:49:07PM +0300, Manos Pitsidianakis wrote:
> diff --git a/block.c b/block.c
> index 9ebdba28b0..c6aad25286 100644
> --- a/block.c
> +++ b/block.c
> @@ -1975,6 +1975,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
> *child_bs,
>  child = g_new(BdrvChild, 1);
>  *child = (BdrvChild) {
>  .bs = NULL,
> +.parent_bs  = NULL,
>  .name   = g_strdup(child_name),
>  .role   = child_role,
>  .perm   = perm,
> @@ -2009,6 +2010,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState 
> *parent_bs,
>  if (child == NULL) {
>  return NULL;
>  }
> +child->parent_bs = parent_bs;
>  
>  QLIST_INSERT_HEAD(_bs->children, child, next);
>  return child;
> @@ -3729,6 +3731,12 @@ const char *bdrv_get_parent_name(const 
> BlockDriverState *bs)
>  return name;
>  }
>  }
> +if (c->parent_bs && c->parent_bs->implicit) {
> +name = bdrv_get_parent_name(c->parent_bs);
> +if (name && *name) {
> +return name;
> +}
> +}
>  }
>  
>  return NULL;

This should be a separate patch.

Who updates parent_bs if the parent is changed (e.g.
bdrv_replace_node())?

We already have bs->parents.  Why is BdrvChild->parent_bs needed?

> -void blk_io_limits_disable(BlockBackend *blk)
> +void blk_io_limits_disable(BlockBackend *blk, Error **errp)
>  {
> -assert(blk->public.throttle_group_member.throttle_state);
> -bdrv_drained_begin(blk_bs(blk));

Is it safe to drop drained_begin?  We must ensure that no I/O requests
run during this function.

> -throttle_group_unregister_tgm(>public.throttle_group_member);
> -bdrv_drained_end(blk_bs(blk));
> +BlockDriverState *bs, *throttle_node;
> +
> +throttle_node = blk_get_public(blk)->throttle_node;

Is blk_get_public() still necessary?  Perhaps we can do away with the
concept of the public struct now.  It doesn't need to be done in this
patch though.

> +
> +assert(throttle_node && throttle_node->refcnt == 1);

Are you sure the throttle_node->refcnt == 1 assertion holds?  For
example, does the built-in NBD server have a reference to the throttle
node if nbd-server-add is called after throttling has been enabled?

Since we have the blk->throttle_node pointer we know we're the owner.
Others may be using the node too but we may choose to remove it at any
time.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 2/3] block: skip implicit nodes in snapshots, blockjobs

2017-08-02 Thread Stefan Hajnoczi
On Tue, Aug 01, 2017 at 04:49:06PM +0300, Manos Pitsidianakis wrote:
> diff --git a/block.c b/block.c
> index 886a457ab0..9ebdba28b0 100644
> --- a/block.c
> +++ b/block.c
> @@ -4947,3 +4947,20 @@ bool bdrv_can_store_new_dirty_bitmap(BlockDriverState 
> *bs, const char *name,
>  
>  return drv->bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp);
>  }
> +
> +/* Get first non-implicit node down a bs chain. */
> +BlockDriverState *bdrv_get_first_non_implicit(BlockDriverState *bs)

Linguistically non-implicit == explicit and that would be simpler.

Alternatives to implicit/explicit are:

  transient/permanent
  internal/external
  self-created/user-created
  worker/user

Let's agree on one and use it consistently.  I like the worker nodes vs
user nodes because it reflects their function.

implicit/explicit is fine by me too.  I just think non-implicit is
clunky.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 1/3] block: add options parameter to bdrv_new_open_driver()

2017-08-02 Thread Stefan Hajnoczi
On Tue, Aug 01, 2017 at 04:49:05PM +0300, Manos Pitsidianakis wrote:
> diff --git a/block.c b/block.c
> index 37e72b7a96..886a457ab0 100644
> --- a/block.c
> +++ b/block.c
> @@ -1149,16 +1149,26 @@ free_and_fail:
>  return ret;
>  }
>  
> +/*
> + * If options is not NULL it is cloned (which adds another reference to the
> + * option entries). If the call to bdrv_new_open_driver() is successful, the
> + * caller should unref options to pass ownership.
> + * */

Please follow the ownership convention for the options argument.
Existing block.c function take ownership of options:

  * options is a QDict of options to pass to the block drivers, or NULL for an
  * empty set of options. The reference to the QDict belongs to the block layer
  * after the call (even on failure), so if the caller intends to reuse the
  * dictionary, it needs to use QINCREF() before calling bdrv_open.

>  BlockDriverState *bdrv_new_open_driver(BlockDriver *drv, const char 
> *node_name,
> -   int flags, Error **errp)
> +   int flags, QDict *options, Error 
> **errp)


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH 4/6] hw/block: Fix the return type

2017-08-02 Thread Mao Zhongyi

Hi

On 08/01/2017 10:29 PM, Markus Armbruster wrote:

Stefan Hajnoczi  writes:


On Wed, Jul 26, 2017 at 08:02:53PM +0800, Mao Zhongyi wrote:

When the function no success value to transmit, it usually make the
function return void. It has turned out not to be a success, because
it means that the extra local_err variable and error_propagate() will
be needed. It leads to cumbersome code, therefore, transmit success/
failure in the return value is worth.

So fix the return type of blkconf_apply_backend_options(),
blkconf_geometry() and virtio_blk_data_plane_create() to avoid it.

Cc: John Snow 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Stefan Hajnoczi 

Signed-off-by: Mao Zhongyi 
---
 hw/block/block.c| 21 -
 hw/block/dataplane/virtio-blk.c | 16 +---
 hw/block/dataplane/virtio-blk.h |  6 +++---
 include/hw/block/block.h| 10 +-
 4 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/hw/block/block.c b/hw/block/block.c
index 27878d0..717bd0e 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -51,8 +51,8 @@ void blkconf_blocksizes(BlockConf *conf)
 }
 }

-void blkconf_apply_backend_options(BlockConf *conf, bool readonly,
-   bool resizable, Error **errp)
+int blkconf_apply_backend_options(BlockConf *conf, bool readonly,
+  bool resizable, Error **errp)


I'm not a fan of these changes because it makes inconsistencies between
the return value and the errp argument possible (e.g. returning success
but setting errp, or returning failure without setting errp).


Opinions and practice vary on this one, and we've discussed the
tradeoffs a few times.

Having both an Error parameter and an error return value poses the
question whether the two agree.  When there's no success value to
transmit, you avoid the problem by making the function return void.  The
problem remains for all the function that return a value on success, and
therefore must return some error value on failure.  A related problem
even remains for functions returning void: consistency between side
effects and the Error parameter.

Returning void leads to awkward boilerplate:

Error err = NULL;

foo(..., err);
if (err) {
error_propagate(errp, err);
... bail out ...
}

Compare:

if (!foo(..., errp)) {
... bail out ...
}

Much easier on the eyes.

For what it's worth, GLib wants you to transmit success / failure in the
return value, too:

https://developer.gnome.org/glib/unstable/glib-Error-Reporting.html#gerror-rules

Plenty of older code returns void, some newer code doesn't, because
returning void is just too awkward.  We willfully deviated from GLib's
convention, and we're now paying the price.

See also
Subject: Re: [RFC 00/15] Error API: Flag errors in *errp even if errors are 
being ignored
Message-ID: <87o9t8qy7d@dusky.pond.sub.org>
https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg06191.html


If you really want to do this please use bool as the return type instead
of int.  int can be abused to return error information that should
really be in the Error object.


For what it's worth, GLib wants bool[*].  Let's stick to that unless we
have a compelling reason to differ.


[*] Except being GLib, it wants its very own homemade version of bool.
Which is inferior to stdbool.h's in pretty much every conceivable way.



Thanks for pointing that out!
I will use bool as the return type instead of int.

Thanks,
Mao