Great to see efforts to ensure we only send useful notifications.
On 5/23/2014 7:48 AM, Denis Makogon wrote:
Good day, Trove community.
I would like to start a thread about the Trove notification framework.
Notification design was defined as: “Trove will emit events for resources
as they are manipulated. These events can be used to meter the service and
possibly used to calculate bills.”
The actual reason for this mail is to start a discussion about
re-implementing/refactoring notifications. For now, notifications are
hard-pinned to nova provisioning.
What kinds of issues/problems do notifications have?
Let's first take a look at how they are implemented.
[1]<https://wiki.openstack.org/wiki/Trove/trove-notifications> - this is
how the notification design was defined and approved.
[2]<https://github.com/openstack/trove/blob/master/trove/taskmanager/models.py#L73-L133>
- this is how notifications are currently implemented.
[5]<https://wiki.openstack.org/wiki/Trove/trove-notifications-v2> - this is
how notifications should look.
First of all, there are a lot of issues with
[2]<https://github.com/openstack/trove/blob/master/trove/taskmanager/models.py#L73-L133>:
* pinning notifications to the nova client - this is the wrong approach,
because Trove is going to support heat for resource
management<https://blueprints.launchpad.net/trove/+spec/resource-manager-interface>;
* availability zone - should be used only in the “trove.instance.create”
notification; there is no need to send it each time “trove.instance.modify_*”
happens (* = flavor, volume);
* instance_size - this payload attribute refers to the amount of RAM
defined by the flavor;
* instance_type - this payload attribute refers to the flavor name, which
seems odd;
* instance_type_id - same thing, this payload attribute refers to the flavor
id, which seems odd;
* nova_instance_id - to be more generic, we should avoid provider-specific
names;
* state_description and state - both refer to the instance service status,
so they duplicate each other;
* nova_volume_id - same as for nova_instance_id, it should be more generic,
since an instance can have a cinder volume that has nothing in common with
nova at all.
We need more generic, more flexible notifications that can be used with any
provisioning engine, no matter what it actually is (nova/heat).
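To make the points above concrete, here is a minimal sketch of a payload
builder that avoids provider-specific names and only includes the
availability zone on create. The function name and the field names
(compute_instance_id, volume_id, flavor_id and so on) are assumptions made
for illustration; they are not taken from the wiki pages or the gists.

```python
def build_payload(instance, event_type):
    """Build a provisioning-engine-neutral payload (hypothetical field names)."""
    payload = {
        "instance_id": instance["id"],
        "tenant_id": instance["tenant_id"],
        # References only: no call to nova/heat is made to resolve them.
        "flavor_id": instance["flavor_id"],
        "compute_instance_id": instance.get("compute_instance_id"),
        "volume_id": instance.get("volume_id"),
        # A single status field instead of state + state_description.
        "state": instance["state"],
    }
    # The availability zone is only meaningful when the instance is created.
    if event_type == "trove.instance.create":
        payload["availability_zone"] = instance.get("availability_zone")
    return payload


instance = {
    "id": "db-1",
    "tenant_id": "t-1",
    "flavor_id": "7",
    "compute_instance_id": "srv-1",
    "volume_id": "vol-1",
    "state": "ACTIVE",
    "availability_zone": "az-1",
}
created = build_payload(instance, "trove.instance.create")
resized = build_payload(instance, "trove.instance.modify_flavor")
```

With this shape, the same builder serves create and modify_* events, and
nothing in the payload reveals which engine provisioned the resources.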
Also, ensuring you have .start and .end notifications really helps with a
number of problems:
1. early detection of problems, even before timeouts occur.
2. profiling a particular operation. Even amqp in-flight time can be computed
from the last .end to the .start of another service, which is cool.
3. un-reported exceptions. A .start without a corresponding ERROR or .end
notification can be troublesome.
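A rough sketch of how paired notifications could be emitted around an
operation. The emit callable, the notify_operation helper and the ".error"
suffix are assumptions for illustration, not part of Trove's current code:

```python
import contextlib


@contextlib.contextmanager
def notify_operation(emit, event_type, payload):
    """Emit <event>.start, then <event>.end on success or <event>.error on failure.

    `emit` is a hypothetical callable taking (event_type, payload).
    """
    emit(event_type + ".start", dict(payload))
    try:
        yield
    except Exception as exc:
        # Every .start gets a matching terminal event, even on failure.
        emit(event_type + ".error", dict(payload, message=str(exc)))
        raise
    else:
        emit(event_type + ".end", dict(payload))


events = []


def record(event_type, payload):
    events.append(event_type)


with notify_operation(record, "trove.instance.create", {"instance_id": "db-1"}):
    pass  # the actual provisioning work would go here
```

A consumer can then flag any .start that never receives an .end or .error,
and time an operation from the gap between the two events.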
How can we re-write notifications, taking the issues described above into
account?
1. We need to re-write the
send_usage_event<https://github.com/openstack/trove/blob/master/trove/taskmanager/models.py#L88>
method.
2. It should not ask nova for the flavor, server and AZ, because that is
redundant. So, the beginning of the method should look like
[3]<https://gist.github.com/denismakogon/9c2d802e2a61eb6164d2>.
Yeah, this is always a tricky one. You don't want to have to call out to a
million other services just to get the data for a notification. Notifications
should be light-weight (they can be large, but not computationally expensive).
The big thing is that they have the necessary information to answer the
question "what were the factors contributing to X happening" and not just "X
happened". So long as you put some references in the payload to answer those
questions, you don't really have to go back to the source to get the detailed
information. We should be able to sew those relationships up on the consuming
side.
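As an illustration of sewing things up on the consuming side, here is a
small sketch in which the consumer resolves a flavor reference against its
own catalog instead of the emitter calling out to nova. The catalog, the
enrich helper and all field names are hypothetical:

```python
# Hypothetical consumer-side catalog, kept up to date by the consumer itself.
FLAVORS = {
    "7": {"name": "m1.small", "ram_mb": 2048},
}


def enrich(payload, flavors):
    """Attach flavor details to a payload that only carries a flavor_id."""
    flavor = flavors.get(payload["flavor_id"], {})
    return dict(
        payload,
        flavor_name=flavor.get("name"),
        ram_mb=flavor.get("ram_mb"),
    )


event_payload = {"instance_id": "db-1", "flavor_id": "7"}
enriched = enrich(event_payload, FLAVORS)
unknown = enrich({"instance_id": "db-2", "flavor_id": "99"}, FLAVORS)
```

The emitter stays cheap because it only ships the reference; the cost of
resolving it is paid once, by whoever actually needs the detail.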
3. The payload should be re-written. It should have the following form
[4]<https://gist.github.com/denismakogon/c4a784d364f0af0fc543>.
What is the actual value-add of this refactoring?
Notifications would be reusable for any kind of action (create, delete,
resize), no matter which provisioning engine was used.
+1
Next steps after the suggested refactoring?
Next steps will cover the required notifications that were described as part
of the ceilometer
integration<https://blueprints.launchpad.net/trove/+spec/ceilometer-integration>.
Great to see ... go, go, go!
-Sandy
Best regards,
Denis Makogon
www.mirantis.com<http://www.mirantis.com>
[email protected]<mailto:[email protected]>
_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev