Re: [Openstack] [Metering] schema and counter definitions
On 05/09/2012 11:11 PM, Doug Hellmann wrote:

On Wed, May 9, 2012 at 3:07 PM, Tomasz Paszkowski ss7...@gmail.com wrote:

On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

Nice! For production code I think we are going to want to separate collection from storage, aren't we? We don't want each compute node to require access to the database server (that's an issue with nova that they are trying to fix during the folsom release, IIRC).

Yes. Part of the code responsible for amqp support is not functional yet :(

OK, that's what I thought. We all seem to be reinventing different parts of the services that we will eventually need, which is good for education but may be wasting a bit of energy. Is it premature to start talking a little more about architecture so we can start splitting up the implementation work and focusing that energy differently? There is a lot of work we can do independently of the remaining decisions outlined in http://wiki.openstack.org/Meetings/MeteringAgenda.

Hi,

It looks like the architecture of metering is indeed always implemented in similar ways. I had discussions with a company yesterday about their own metering implementation (which will be used in production soon) and it also has an architecture matching what has been proposed so far in ceilometer. I added a few points to the architecture chapter in the wiki: http://wiki.openstack.org/EfficientMetering#Architecture including a note summarizing the conclusions of the discussion regarding the need for an independent ceilometer agent in addition to the existing meters provided by the OpenStack components.

What do you think ?
-- Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82 ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
Here is the simplified version of my code (without AMQP support; counters are stored directly in a MySQL database): https://github.com/ss7pro/rescnt

The code is started from main.py, which continuously collects counters from libvirt and stores them in a MySQL database.

On Mon, May 7, 2012 at 9:25 PM, Tomasz Paszkowski ss7...@gmail.com wrote:

On Mon, May 7, 2012 at 5:21 PM, Loic Dachary l...@enovance.com wrote:

Hi Tomasz,

Hi

I could not agree more and this is the reason why I/O shows in the list of meters shown in http://wiki.openstack.org/EfficientMetering: (c5) disk IO in megabyte per second has a high impact on the service availability and could be billed separately.

Yes, but for disk drives the number of I/O operations (reads/writes) is the key resource usage information. It is very hard to set up a billing model for disk drive usage based on bandwidth, because low-bandwidth disk operations (small random reads/writes) can utilize a disk drive more than huge sequential reads/writes. I should also mention that AWS charges for I/O in their volume service.

It looks like you already have a codebase that could be useful for the metering implementation. Would you be willing to share it ?

Yes. Just give me a few days.

-- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299
Re: [Openstack] [Metering] schema and counter definitions
On Wed, May 9, 2012 at 12:42 PM, Tomasz Paszkowski ss7...@gmail.com wrote: Here is the simplified version of my code (without ampq support, counter stored directly to mysql db). https://github.com/ss7pro/rescnt Code is started from main.py which is constantly collecting counters from libvirt and storing them in a mysql database. Nice! For production code I think we are going to want to separate collection from storage, aren't we? We don't want each compute node to require access to the database server (that's an issue with nova that they are trying to fix during the folsom release, IIRC). On Mon, May 7, 2012 at 9:25 PM, Tomasz Paszkowski ss7...@gmail.com wrote: On Mon, May 7, 2012 at 5:21 PM, Loic Dachary l...@enovance.com wrote: Hi Tomasz, Hi I could not agree more and this is the reason why I/O shows in the list of meters shown in http://wiki.openstack.org/EfficientMetering (c5) disk IO in megabyte per second has a high impact on the service availability and could be billed separately . Yes but for disk drives I/O (number of read/write ops) are the key resource usage information. It's very hard to setup a billing model for disk drive usage on bandwidth as low bandwidth disk operations (small random read/writes) can utilize disk drive more than huge sequential reads/writes. I need also to mention that AWS is also charging for I/O in their volume service. It looks like you already have a codebase that could be useful for the metering implementation. Would you be willing to share it ? Yes. Just give me few days. 
Re: [Openstack] [Metering] schema and counter definitions
On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

Nice! For production code I think we are going to want to separate collection from storage, aren't we? We don't want each compute node to require access to the database server (that's an issue with nova that they are trying to fix during the folsom release, IIRC).

Yes. The part of the code responsible for AMQP support is not functional yet :(

-- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299
Re: [Openstack] [Metering] schema and counter definitions
On Wed, May 9, 2012 at 3:07 PM, Tomasz Paszkowski ss7...@gmail.com wrote: On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann doug.hellm...@dreamhost.com wrote: Nice! For production code I think we are going to want to separate collection from storage, aren't we? We don't want each compute node to require access to the database server (that's an issue with nova that they are trying to fix during the folsom release, IIRC). Yes. Part of the code responsible for amqp support is not functional yet :( OK, that's what I thought. We all seem to be reinventing different parts of the services that we will eventually need, which is good for education but may be wasting a bit of energy. Is it premature to start talking a little more about architecture so we can start splitting up the implementation work and focusing that energy differently? There is a lot of work we can do independently of the remaining decisions outlined in http://wiki.openstack.org/Meetings/MeteringAgenda. -- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299 ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
I agree. Do you have any plans how to coordinate our efforts ? On Wed, May 9, 2012 at 11:11 PM, Doug Hellmann doug.hellm...@dreamhost.com wrote: On Wed, May 9, 2012 at 3:07 PM, Tomasz Paszkowski ss7...@gmail.com wrote: On Wed, May 9, 2012 at 8:02 PM, Doug Hellmann doug.hellm...@dreamhost.com wrote: Nice! For production code I think we are going to want to separate collection from storage, aren't we? We don't want each compute node to require access to the database server (that's an issue with nova that they are trying to fix during the folsom release, IIRC). Yes. Part of the code responsible for amqp support is not functional yet :( OK, that's what I thought. We all seem to be reinventing different parts of the services that we will eventually need, which is good for education but may be wasting a bit of energy. Is it premature to start talking a little more about architecture so we can start splitting up the implementation work and focusing that energy differently? There is a lot of work we can do independently of the remaining decisions outlined in http://wiki.openstack.org/Meetings/MeteringAgenda. -- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299 ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299 ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
On Mon, May 7, 2012 at 5:21 PM, Loic Dachary l...@enovance.com wrote:

Hi Tomasz,

Hi

I could not agree more and this is the reason why I/O shows in the list of meters shown in http://wiki.openstack.org/EfficientMetering: (c5) disk IO in megabyte per second has a high impact on the service availability and could be billed separately.

Yes, but for disk drives the number of I/O operations (reads/writes) is the key resource usage information. It is very hard to set up a billing model for disk drive usage based on bandwidth, because low-bandwidth disk operations (small random reads/writes) can utilize a disk drive more than huge sequential reads/writes. I should also mention that AWS charges for I/O in their volume service.

It looks like you already have a codebase that could be useful for the metering implementation. Would you be willing to share it ?

Yes. Just give me a few days.

-- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299
Re: [Openstack] [Metering] schema and counter definitions
Hi,

I'd like to share my thoughts on metering OpenStack resource usage. Besides the data available from SystemUsageData and the data mentioned in the blueprints, some cloud providers charge for I/O on disk drives (to prevent users from running "dd if=/dev/zero" nonsense and to teach them to implement caching strategies properly). This data, along with network card usage, can be gathered from libvirt using domblkstat and domifstat.

My idea is to gather the data using an agent (or a modified nova-compute) and send it to a message queue using the following JSON-encoded data schema:

    {'instance': 'instance-003b', 'host':'tytan-1','zone':'r4cz1','counters': {'interface': {'vnet0': (80796L, 1212L, 0L, 0L, 53403L, 621L, 0L, 0L)}, 'disk': {'vda': (629L, 11699200L, 58L, 219136L, 0L)}}}

interface is the result of interfaceStats(); disk is the result of blockStats().

These messages are consumed from the queue and stored in MySQL tables. I assume that the instance is the parent resource of each disk (ephemeral or volume) and of each network interface. So for the message above we have three resources:

1) instance resource: instance-003b
2) child resource: vnet0 (network)
3) child resource: vda (ephemeral)

The MySQL table for resources looks like this:

    CREATE TABLE resources (
      id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      parent BIGINT UNSIGNED,
      type VARCHAR(255) NOT NULL,
      value VARCHAR(255) NOT NULL,
      zone VARCHAR(255) NOT NULL,
      added TIMESTAMP,
      -- MySQL requires an AUTO_INCREMENT column to be defined as a key
      PRIMARY KEY (id)
    ) ENGINE=INNODB;

Counters are also stored in MySQL, using the following table:

    CREATE TABLE counters (
      resource BIGINT UNSIGNED NOT NULL,
      type VARCHAR(255) NOT NULL,
      value BIGINT UNSIGNED,
      delta BIGINT UNSIGNED,
      added TIMESTAMP NOT NULL,
      prev TIMESTAMP NOT NULL
    ) ENGINE=INNODB;

Here prev is a reference to the previous counter value. The process that reads data from the queue puts the raw counter value into the table and, if possible (a reference to the previous entry is present), evaluates the delta value.
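The delta evaluation described above could look roughly like the following. This is only a sketch, not the rescnt code: the function names and the in-memory last_seen dictionary are hypothetical, and the handling of a counter that goes backwards (treated as a restart from zero) is an assumption that the message does not discuss.

```python
from typing import Optional


def counter_delta(current: int, previous: Optional[int]) -> Optional[int]:
    """Return the usage since the previous sample, or None for a first sample."""
    if previous is None:
        return None  # no previous entry: the delta cannot be evaluated yet
    if current < previous:
        # Assumption: the counter restarted from zero (e.g. the domain was
        # restarted), so everything counted since then is new usage.
        return current
    return current - previous


def deltas_for_message(message: dict, last_seen: dict) -> list:
    """Flatten one agent message into (instance, kind, device, index, value, delta) rows."""
    rows = []
    for kind, devices in message["counters"].items():
        for device, values in devices.items():
            for index, value in enumerate(values):
                key = (message["instance"], kind, device, index)
                delta = counter_delta(value, last_seen.get(key))
                rows.append(key + (value, delta))
                last_seen[key] = value  # remember the raw value for the next message
    return rows
```

Each row carries both the raw value and the delta, which matches the value and delta columns of the counters table; a real consumer would look up the previous entry in the database rather than in a dictionary.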
By using this model of storing usage counters, it is very easy for a billing system to evaluate charges: we just run SUM(delta) on each counter for a given time range. This model could very easily be adapted to other counters (IP traffic external/internal counters from iptables).

On Mon, Apr 30, 2012 at 12:15 PM, Loic Dachary l...@enovance.com wrote:

Hi,

To prepare for the next meeting ( thursday 3rd, may 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).

We could start a discussion from the content of the following sections: http://wiki.openstack.org/EfficientMetering#Counters http://wiki.openstack.org/EfficientMetering#Storage and come up with a list of the counters that should exist by default and how they should be stored.

This morning we had a discussion with Zhongyue Luo on irc.freenode.net#openstack-metering about how Dough could use the metering service. Since it already knows about instance creations, counter c1 that records how long a given instance was up is of no interest. However, other counters such as the external bandwidth used would be useful. I advocated that one of the advantages for Dough to rely on metering to collect counters is that it does not need to know about each OpenStack component and can rely on metering to figure out how to extract such counters from nova-compute, nova-network soon to be quantum, nova-volume soon to be cinder, swift, glance and free it from the burden of tracking structural changes.
Cheers

-- Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82

-- Tomasz Paszkowski SS7, Asterisk, SAN, Datacenter, Cloud Computing +48500166299
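The SUM(delta) query proposed in this message can be tried out with a small self-contained sketch. It uses Python's built-in sqlite3 module in place of MySQL so that it runs anywhere, the table is reduced to the columns the query needs, and the counter type name "disk.read_ops" and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters ("
    "resource INTEGER, type TEXT, value INTEGER, delta INTEGER, added TEXT)"
)
conn.executemany(
    "INSERT INTO counters VALUES (?, ?, ?, ?, ?)",
    [
        # The first sample has no previous entry, so its delta is NULL.
        (1, "disk.read_ops", 629, None, "2012-05-07 10:00:00"),
        (1, "disk.read_ops", 700, 71, "2012-05-07 10:01:00"),
        (1, "disk.read_ops", 760, 60, "2012-05-07 10:02:00"),
        (1, "disk.read_ops", 900, 140, "2012-05-07 11:30:00"),
    ],
)


def usage(conn, resource, counter_type, start, end):
    """Sum the deltas of one counter for start <= added < end."""
    row = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM counters "
        "WHERE resource = ? AND type = ? AND added >= ? AND added < ?",
        (resource, counter_type, start, end),
    ).fetchone()
    return row[0]


hourly = usage(conn, 1, "disk.read_ops", "2012-05-07 10:00:00", "2012-05-07 11:00:00")
```

SUM() skips NULL deltas, so the first sample of a resource does not need special handling, and COALESCE turns an empty time range into 0 instead of NULL.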
Re: [Openstack] [Metering] schema and counter definitions
Robert Collins wrote: On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services) whit.tur...@hp.com wrote: Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion. Has anyone looked at what statsd does? It has very similar requirements (simple to use, no hard a-priori definition of things to count, a few base types to track), and needs to be horizontally scalable. Also Swift has plans to use statsd for instrumentation/monitoring, so it's definitely worth a look to see if it could be used here as well. http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3 http://etherpad.openstack.org/FolsomSwiftStatsd -- Thierry Carrez (ttx) Release Manager, OpenStack ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
-Original Message- From: Thierry Carrez thie...@openstack.org To: openstack@lists.launchpad.net Sent: Fri, 04 May 2012 2:50 Subject: Re: [Openstack] [Metering] schema and counter definitions Robert Collins wrote: On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services) whit.tur...@hp.com wrote: Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion. Has anyone looked at what statsd does? It has very similar requirements (simple to use, no hard a-priori definition of things to count, a few base types to track), and needs to be horizontally scalable. Also Swift has plans to use statsd for instrumentation/monitoring, so it's definitely worth a look to see if it could be used here as well. http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3 http://etherpad.openstack.org/FolsomSwiftStatsd -- Thierry Carrez (ttx) Release Manager, OpenStack ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
On 05/04/2012 02:50 AM, Thierry Carrez wrote:

Robert Collins wrote:

On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services) whit.tur...@hp.com wrote:

Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion.

Has anyone looked at what statsd does? It has very similar requirements (simple to use, no hard a-priori definition of things to count, a few base types to track), and needs to be horizontally scalable.

Also Swift has plans to use statsd for instrumentation/monitoring, so it's definitely worth a look to see if it could be used here as well. http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3 http://etherpad.openstack.org/FolsomSwiftStatsd

I am no statsd expert, but a quick look at the project shows that it is aimed at data collection for monitoring, and uses UDP as a way to aggregate vast quantities of data at short intervals. The use of UDP implies that delivery is not guaranteed, which is fine for the objectives of monitoring, but conflicts with the requirements of metering (as a sub-component of a billing system). statsd does not seem to allow for message signature and authentication of collectors either.
Here are the requirements I think we have:

* The data is sent from agents to the storage daemon via a trusted messaging system
* The messages in queue are signed and non-repudiable
* The agents collecting data are authenticated to avoid pollution of the metering service

Nick
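As a rough illustration of the signing and agent authentication requirements listed above, an agent could attach an HMAC to each message and the storage daemon could check it against a per-agent secret. This is a sketch under assumptions, not a proposal from the thread: the function names, the envelope layout and the shared-secret scheme are all hypothetical. Note also that an HMAC with a shared secret provides integrity and authentication only; strict non-repudiation would need asymmetric signatures, since either holder of the secret can produce a valid HMAC.

```python
import hashlib
import hmac
import json


def sign_message(payload: dict, agent_id: str, secret: bytes) -> dict:
    """Wrap a metering payload in an envelope signed with the agent's secret."""
    # The agent id is part of the signed body, so it cannot be swapped afterwards.
    body = json.dumps({"agent": agent_id, "payload": payload}, sort_keys=True)
    signature = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_message(envelope: dict, secrets: dict) -> bool:
    """Accept the envelope only if it was signed by a known agent."""
    try:
        agent_id = json.loads(envelope["body"])["agent"]
    except (ValueError, KeyError, TypeError):
        return False  # malformed envelope
    if not isinstance(agent_id, str):
        return False
    secret = secrets.get(agent_id)
    if secret is None:
        return False  # unknown agent: reject to avoid polluting the metering data
    expected = hmac.new(secret, envelope["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, envelope["signature"])
```

The collector would call verify_message() on every message taken from the queue and drop those that fail, before anything is written to storage.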
Re: [Openstack] [Metering] schema and counter definitions
On 05/04/2012 11:50 AM, Thierry Carrez wrote: Robert Collins wrote: On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services) whit.tur...@hp.com wrote: Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion. Has anyone looked at what statsd does? It has very similar requirements (simple to use, no hard a-priori definition of things to count, a few base types to track), and needs to be horizontally scalable. Also Swift has plans to use statsd for instrumentation/monitoring, so it's definitely worth a look to see if it could be used here as well. http://folsomdesignsummit2012.sched.org/event/d9135eabdd775432c74c3f1d32a325d3 http://etherpad.openstack.org/FolsomSwiftStatsd Thanks :-) Just saved the etherpad as http://etherpad.openstack.org/ep/pad/view/FolsomSwiftStatsd/9cy8Uxtp2U in case it is vandalized. Cheers -- Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82 ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion. Regards Whit From: openstack-bounces+whit.turner=hp@lists.launchpad.net [mailto:openstack-bounces+whit.turner=hp@lists.launchpad.net] On Behalf Of Doug Hellmann Sent: Monday, April 30, 2012 2:03 PM To: Loic Dachary Cc: openstack@lists.launchpad.net Subject: Re: [Openstack] [Metering] schema and counter definitions On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary l...@enovance.com wrote: On 04/30/2012 03:49 PM, Doug Hellmann wrote: On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary l...@enovance.com wrote: On 04/30/2012 12:15 PM, Loic Dachary wrote: We could start a discussion from the content of the following sections: http://wiki.openstack.org/EfficientMetering#Counters I think the rationale of the counter aggregation needs to be explained. My understanding is that the metering system will be able to deliver the following information: 10 floating IPv4 addresses were allocated to the tenant during three months and were leased from provider NNN. From this, the billing system could add a line to the invoice : 10 IPv4, $N each = $10xN because it has been configured to invoice each IPv4 leased from provider NNN for $N. It is not the purpose of the metering system to display each IPv4 used, therefore it only exposes the aggregated information. The counters define how the information should be aggregated. 
If the idea was to expose each resource usage individually, defining counters would be meaningless as they would duplicate the activity log from each OpenStack component.

What do you think ?

At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves. If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.

Doug

Hi,

For the record, here is the unfinished conversation we had on IRC

(04:29:06 PM) dhellmann: dachary, did you see my reply about counter definitions on the list today?
(04:39:05 PM) dachary: It means some counters must not be aggregated. Only the amount associated with it is but there is one counter per IP.
(04:55:01 PM) dachary: dhellmann: what about this :the id of the ressource controls the agregation of all counters : if it is missing, all resources of the same kind and their measures are aggregated. Otherwise only the measures are agreggated. http://wiki.openstack.org/EfficientMetering?action=diffrev2=40rev1=39
(04:55:58 PM) dachary: it makes me a little unconfortable to define such an ad-hoc grouping
(04:56:53 PM) dachary: i.e. you actuall control the aggregation by chosing which value to put in the id column
(04:58:43 PM) dachary: s/actuall/actually/
(05:05:38 PM) ***dachary reading http://www.ogf.org/documents/GFD.98.pdf
(05:05:54 PM) dachary: I feel like we're trying to resolve a non problem here
(05:08:42 PM) dachary: values need to be aggregated. The raw input is a full description of the resource and a value ( gauge ). The question is how to control the aggregation in a reasonably flexible way.
(05:11:34 PM) dachary: The definition of a counter could probably be described as : the id of a resource and code to fill each column associated with it.

I tried to append the following, but the wiki kept failing.

Propose that the counters are defined by a function instead of being fixed. That helps addressing the issue of aggregating the bandwidth associated to a given IP into a single counter.

Alternate idea :
* a counter is defined by
  * a name ( o1, n2, etc. ) that uniquely identifies the nature of the measure ( outbound internet transit, amount of RAM, etc. )
  * the component in which it can be found ( nova, swift etc.)
* and by columns, each one is set with the result of aggregate(find(record),record) where
  * find() looks for the existing column as found by selecting with the unique key ( maybe the name and the resource id )
  * record
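The aggregate(find(record),record) idea, combined with the IRC remark that the resource id controls the aggregation, might be sketched as follows. Everything here is an assumption made for illustration: the record fields, the per_resource flag, the use of a plain dictionary as storage and the choice of summing as the aggregation are not specified in the message.

```python
def find(table: dict, record: dict, per_resource: bool):
    """Locate the existing aggregate using the unique key (name, resource id)."""
    # Leaving the resource id out of the key merges all resources of that kind.
    key = (record["counter"], record["resource_id"] if per_resource else None)
    return key, table.get(key)


def aggregate(existing, record: dict):
    """Fold a new measure into the existing aggregate (here: a running sum)."""
    if existing is None:
        return record["value"]
    return existing + record["value"]


def store(table: dict, record: dict, per_resource: bool) -> None:
    """Apply aggregate(find(record), record) and save the result."""
    key, existing = find(table, record, per_resource)
    table[key] = aggregate(existing, record)
```

With per_resource set, bandwidth measures for two IP addresses end up in two separate counters, which is the detailed view Doug asks for; without it, they collapse into a single counter per name.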
Re: [Openstack] [Metering] schema and counter definitions
On Fri, May 4, 2012 at 5:27 AM, Turner, Whit (Cloud Services) whit.tur...@hp.com wrote: Hi - I think a flexible aggregation scheme is needed; the levels of aggregation available should be definable in the meter independent of the sources of usage data themselves. If invoices need to be very granular down to the lowest possible level, then this drives higher data requirements all through the processing chain, including the rating engine. Traditional systems tend to pass less granular (more highly aggregated) data into the rating engine so that bill runs and invoices can be generated efficiently. At cloud-scale, this can be problematic. Given some “big data” approaches, though, this could be handled in a more granular and real-time fashion. Has anyone looked at what statsd does? It has very similar requirements (simple to use, no hard a-priori definition of things to count, a few base types to track), and needs to be horizontally scalable. We could, as a riff on my prior email, define the statsd (or a similar thing) as a common substrate, and then let different implementations discard detail, or preserve it as needed. The key difference I see vs defining a Python API is that if someone is writing a different language implementation of an Openstack component, they would have a common thing to target. OTOH it should be trivial to write a network component that thunks *into* the stock Python API, and from there to the configured backend, so there is no need to pick any specific network protocol up front - though bearing in mind that we want network handoffs is probably a good thing when looking at the nitty gritty. -Rob ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Metering] schema and counter definitions
On 05/02/2012 07:19 AM, Mark McLoughlin wrote:

Hey,

On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:

On 05/01/2012 06:13 PM, Mark McLoughlin wrote:

Hi Loic,

On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:

- I agree that we don't want to go too far with aggregation and lose useful data, like which instances have been running as opposed to just how many instance minutes a given tenant has consumed.

Another aspect of aggregation to think about is aggregation over time - e.g. I might like to see how my hourly network usage has varied over the last week, or how my daily usage has varied over the last month, but I probably don't care so much about my hourly usage on a specific day 3 months ago.

oVirt's equivalent of a metering service does this kind of aggregation as follows (http://www.ovirt.org/wiki/Ovirt_DWH):

* Sample data is collected at the end of every minute and is kept for up to 48 hours.
* Hourly level is aggregated every hour for the hour before last and is kept for 2 months.
* Daily level is aggregated every day for the day before last and is kept for 5 years.

Where can I read a description of the corresponding database ?

Here FWIW: http://goo.gl/3Bqct

Thanks : http://wiki.openstack.org/EfficientMetering?action=diffrev2=49rev1=48

Cheers,
Mark.

-- Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82
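The time-based aggregation quoted above (minute samples rolled up into hourly values, with the raw samples dropped after 48 hours) can be illustrated with a short sketch. It is not how oVirt implements it: the function names are hypothetical, and it assumes that each sample is a per-minute delta, so that summing is the right roll-up. Gauges such as memory in use would need an average or a maximum instead.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SAMPLE_RETENTION = timedelta(hours=48)  # minute-level samples are kept this long


def hourly_rollup(samples):
    """Sum (timestamp, value) samples into buckets keyed by the start of the hour."""
    buckets = defaultdict(int)
    for ts, value in samples:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour] += value
    return dict(buckets)


def prune(samples, now, retention=SAMPLE_RETENTION):
    """Drop samples older than the retention period."""
    return [(ts, value) for ts, value in samples if now - ts <= retention]
```

The same two functions, with a daily bucket and a longer retention, would cover the daily level; the point is that each level is computed from the one below it before the finer data expires.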
Re: [Openstack] [Metering] schema and counter definitions
On 05/02/2012 07:39 AM, Mark McLoughlin wrote:
> On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
>> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
>>> Hi Loic,
>>>
>>> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>>>> To prepare for the next meeting ( Thursday, May 3rd, 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).
>>>
>>> I'm a bit late to the discussion, but some brief comments after reading up on what you guys have done so far:
>>>
>>> - big +1 on separating billing from metering; there's no need to conflate the two problems and doing it this way will allow for a bunch of different ideas to be tried around billing
>>>
>>> - I'd prefer to avoid adding new node agents, so +1 on building on the notifications system
>>
>> I would also prefer this option. I have a few concerns though:
>>
>> a) adding too many messages to the existing message queues
>> b) not all core components provide notifications
>> c) convincing all components to agree on a unified approach to metering
>>
>> Instead it might be more practical to implement node agents when necessary to complete a first implementation. That is, taking advice from core component developers and possibly running into problems, as opposed to convincing core component developers to adopt an approach to metering that is not yet implemented anywhere.
>
> I'd start with metering using the notifications which are already there. I think that will get us a long way.

I've started a thread to check that everything we need is there and, if not, to figure out how it can be modified.

> My impression is that the notifications system is intended to cover all billable usage in at least Nova and Glance.

It's also my understanding. Regarding swift, how would you suggest we approach the problem ? I see two possible courses:

a) directly create something similar to nova http://wiki.openstack.org/SystemUsageData for swift (i.e. a swift blueprint and coding in swift ) so that there is no need to install a metering agent for swift

b) create a swift plugin for a metering agent and, when it proves useful, port it to swift so that it is integrated and there is no longer a need for a metering agent plugin

What do you think ?

--
Loïc Dachary
Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82
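For illustration only, option (b) above could look roughly like the following Python sketch. Everything named here is hypothetical: the `MeterPlugin` interface, the `SwiftStoragePlugin` class, the `storage.bytes` counter name and the `fetch_account_stats` callable are assumptions made for the example, not part of Swift or of any existing metering agent. How per-account byte counts are actually obtained from Swift is deliberately left abstract.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Measurement:
    """One metered value for one resource (hypothetical record layout)."""
    counter: str      # nature of the measure
    component: str    # component it was collected from, e.g. "swift"
    tenant_id: str
    resource_id: str
    value: float


class MeterPlugin(ABC):
    """Hypothetical interface a metering agent would call periodically."""

    @abstractmethod
    def poll(self) -> Iterable[Measurement]:
        """Return the current measurements for this component."""


class SwiftStoragePlugin(MeterPlugin):
    """Reports bytes stored per account.

    fetch_account_stats is an injected callable returning
    {account_id: bytes_used}; it stands in for whatever mechanism
    would really query Swift (an assumption of this sketch).
    """

    def __init__(self, fetch_account_stats: Callable[[], Dict[str, int]]):
        self._fetch = fetch_account_stats

    def poll(self) -> List[Measurement]:
        return [
            Measurement("storage.bytes", "swift", account, account, float(used))
            for account, used in sorted(self._fetch().items())
        ]


def run_agent_once(plugins: Iterable[MeterPlugin]) -> List[Measurement]:
    """Poll every plugin once and return all measurements in order."""
    collected: List[Measurement] = []
    for plugin in plugins:
        collected.extend(plugin.poll())
    return collected
```

If the plugin proves useful, the body of `poll()` is the part that would move into swift itself under option (a); the agent-side interface would then no longer be needed for that component.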
Re: [Openstack] [Metering] schema and counter definitions
On Wed, 2012-05-02 at 10:08 +0200, Loic Dachary wrote:
>> My impression is that the notifications system is intended to cover all billable usage in at least Nova and Glance.
>
> It's also my understanding. Regarding swift, how would you suggest we approach the problem ? I see two possible courses:
>
> a) directly create something similar to nova http://wiki.openstack.org/SystemUsageData for swift (i.e. a swift blueprint and coding in swift ) so that there is no need to install a metering agent for swift
>
> b) create a swift plugin for a metering agent and when it proves useful, port it to swift so that it is integrated and there is no longer a need for a metering agent plugin
>
> What do you think ?

I've no informed opinion on Swift, but I assume Swift is amenable to work which helps with metering its resources.

Cheers,
Mark.
Re: [Openstack] [Metering] schema and counter definitions
There was a swift talk at the design summit that is related to (a): http://etherpad.openstack.org/FolsomSwiftStatsd. There is a good summary in the referenced blog post: http://www.swiftstack.com/blog/2012/04/11/swift-monitoring-with-statsd/

-David

On 5/2/2012 4:19 AM, Mark McLoughlin wrote:
> On Wed, 2012-05-02 at 10:08 +0200, Loic Dachary wrote:
>>> My impression is that the notifications system is intended to cover all billable usage in at least Nova and Glance.
>>
>> It's also my understanding. Regarding swift, how would you suggest we approach the problem ? I see two possible courses:
>>
>> a) directly create something similar to nova http://wiki.openstack.org/SystemUsageData for swift (i.e. a swift blueprint and coding in swift ) so that there is no need to install a metering agent for swift
>>
>> b) create a swift plugin for a metering agent and when it proves useful, port it to swift so that it is integrated and there is no longer a need for a metering agent plugin
>>
>> What do you think ?
>
> I've no informed opinion on Swift, but I assume Swift is amenable to work which helps with metering its resources
>
> Cheers,
> Mark.
Re: [Openstack] [Metering] schema and counter definitions
On 04/30/2012 11:39 PM, Doug Hellmann wrote:
> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary l...@enovance.com wrote:
>> On 04/30/2012 08:03 PM, Doug Hellmann wrote:
>>> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary l...@enovance.com wrote:
>>>> On 04/30/2012 03:49 PM, Doug Hellmann wrote:
>>>>> On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary l...@enovance.com wrote:
>>>>>> On 04/30/2012 12:15 PM, Loic Dachary wrote:
>>>>>>> We could start a discussion from the content of the following sections: http://wiki.openstack.org/EfficientMetering#Counters
>>>>>>
>>>>>> I think the rationale of the counter aggregation needs to be explained. My understanding is that the metering system will be able to deliver the following information: 10 floating IPv4 addresses were allocated to the tenant during three months and were leased from provider NNN. From this, the billing system could add a line to the invoice : 10 IPv4, $N each = $(10 x N) because it has been configured to invoice each IPv4 leased from provider NNN for $N.
>>>>>>
>>>>>> It is not the purpose of the metering system to display each IPv4 used, therefore it only exposes the aggregated information. The counters define how the information should be aggregated. If the idea was to expose each resource usage individually, defining counters would be meaningless as they would duplicate the activity log from each OpenStack component.
>>>>>>
>>>>>> What do you think ?
>>>>>
>>>>> At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.
>>>>>
>>>>> If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.
>>>>>
>>>>> Doug
>>>>
>>>> Hi,
>>>>
>>>> For the record, here is the unfinished conversation we had on IRC
>>>>
>>>> (04:29:06 PM) dhellmann: dachary, did you see my reply about counter definitions on the list today?
>>>> (04:39:05 PM) dachary: It means some counters must not be aggregated. Only the amount associated with it is but there is one counter per IP.
>>>> (04:55:01 PM) dachary: dhellmann: what about this : the id of the resource controls the aggregation of all counters : if it is missing, all resources of the same kind and their measures are aggregated. Otherwise only the measures are aggregated. http://wiki.openstack.org/EfficientMetering?action=diffrev2=40rev1=39
>>>> (04:55:58 PM) dachary: it makes me a little uncomfortable to define such an ad-hoc grouping
>>>> (04:56:53 PM) dachary: i.e. you actuall control the aggregation by choosing which value to put in the id column
>>>> (04:58:43 PM) dachary: s/actuall/actually/
>>>> (05:05:38 PM) ***dachary reading http://www.ogf.org/documents/GFD.98.pdf
>>>> (05:05:54 PM) dachary: I feel like we're trying to resolve a non problem here
>>>> (05:08:42 PM) dachary: values need to be aggregated. The raw input is a full description of the resource and a value ( gauge ). The question is how to control the aggregation in a reasonably flexible way.
>>>> (05:11:34 PM) dachary: The definition of a counter could probably be described as : the id of a resource and code to fill each column associated with it.
>>>>
>>>> I tried to append the following, but the wiki kept failing.
>>>>
>>>> Propose that the counters are defined by a function instead of being fixed. That helps address the issue of aggregating the bandwidth associated to a given IP into a single counter.
>>>>
>>>> Alternate idea :
>>>>
>>>> * a counter is defined by
>>>>   * a name ( o1, n2, etc. ) that uniquely identifies the nature of the measure ( outbound internet transit, amount of RAM, etc. )
>>>>   * the component in which it can be found ( nova, swift etc.)
>>>>   * and by columns, each one is set with the result of aggregate(find(record),record) where
>>>>     * find() looks for the existing column as found by selecting with the unique key ( maybe the name and the resource id )
>>>>     * record is a detailed description of the metering event to be aggregated ( http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
>>>>     * the aggregate() function returns the updated row. By default it just adds ( += ) the counter value to the old row returned by find()
>>>
>>> Would we want aggregation to occur within the database where we are collecting events, or should that move somewhere else?
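A minimal Python sketch of the aggregate(find(record), record) idea quoted above, under stated assumptions: the table is a plain in-memory dict, the unique key is taken to be (name, component, resource id) following the "maybe the name and the resource id" remark, and the record field names (`name`, `component`, `resource_id`, `value`) are invented for the example. A missing resource id groups all resources of the same kind into one row, as suggested in the IRC log.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str, Optional[str]]   # (counter name, component, resource id)
Row = Dict[str, float]


def unique_key(record: dict) -> Key:
    # A record without a resource id maps to (name, component, None),
    # so all resources of the same kind share a single row.
    return (record["name"], record["component"], record.get("resource_id"))


def find(table: Dict[Key, Row], record: dict) -> Optional[Row]:
    """Return the existing row for this record's unique key, if any."""
    return table.get(unique_key(record))


def aggregate(old_row: Optional[Row], record: dict) -> Row:
    """Default aggregation: add the record's value to the old row."""
    row = dict(old_row) if old_row is not None else {"value": 0.0}
    row["value"] += record["value"]
    return row


def store(table: Dict[Key, Row], record: dict,
          aggregate_fn: Callable[[Optional[Row], dict], Row] = aggregate) -> Row:
    """Apply aggregate(find(record), record) and save the updated row."""
    row = aggregate_fn(find(table, record), record)
    table[unique_key(record)] = row
    return row


# Per-IP bandwidth: one row per resource id, measures are summed.
per_ip: Dict[Key, Row] = {}
store(per_ip, {"name": "o1", "component": "nova",
               "resource_id": "10.0.0.1", "value": 100.0})
store(per_ip, {"name": "o1", "component": "nova",
               "resource_id": "10.0.0.1", "value": 50.0})
store(per_ip, {"name": "o1", "component": "nova",
               "resource_id": "10.0.0.2", "value": 7.0})
```

Passing a different `aggregate_fn` is where a counter "defined by a function instead of being fixed" would plug in. Whether this runs inside the database or elsewhere is exactly the open question raised at the end of the quoted message.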
Re: [Openstack] [Metering] schema and counter definitions
On 05/01/2012 02:23 AM, Loic Dachary wrote:
> On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>> [...]
Re: [Openstack] [Metering] schema and counter definitions
On 05/01/2012 04:38 PM, Nick Barcet wrote:
> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>> On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>>> [...]
Re: [Openstack] [Metering] schema and counter definitions
On Tue, May 1, 2012 at 10:38 AM, Nick Barcet nick.bar...@canonical.com wrote:
> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>> On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>>> [...]
Re: [Openstack] [Metering] schema and counter definitions
Hi Loic,

On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
> To prepare for the next meeting ( Thursday, May 3rd, 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).

I'm a bit late to the discussion, but some brief comments after reading up on what you guys have done so far:

- big +1 on separating billing from metering; there's no need to conflate the two problems and doing it this way will allow for a bunch of different ideas to be tried around billing

- I'd prefer to avoid adding new node agents, so +1 on building on the notifications system

- I agree that we don't want to go too far with aggregation and lose useful data like which instances have been running as opposed to just how many instance minutes a given tenant has consumed

  Another aspect of aggregation to think about is aggregation over time - e.g. I might like to see how my hourly network usage has varied over the last week, or how my daily usage has varied over the last month, but I probably don't care so much about my hourly usage on a specific day 3 months ago

  oVirt's equivalent of a metering service does this kind of aggregation as follows: http://www.ovirt.org/wiki/Ovirt_DWH

    * Sample data is collected at the end of every minute and is kept for up to 48 hours.
    * Hourly level is aggregated every hour for the hour before last and is kept for 2 months.
    * Daily level is aggregated every day for the day before last and is kept for 5 years.

- Lastly, bikeshed mode - since we're calling this metering and not counting, how about just using the term meters rather than counters?

Cheers,
Mark.
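The three retention levels described above (per-minute samples, hourly, daily) can be sketched in Python as below. This is an illustration of the scheme as summarized in the message, not oVirt's implementation. Assumptions: values are summed within a bucket (suitable for usage such as bytes transferred; a gauge would more likely be averaged), "2 months" is approximated as 60 days and "5 years" as 5 x 365 days, and the function names are invented for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Point = Tuple[datetime, float]

# Retention per level, approximating the durations quoted above.
RETENTION = {
    "sample": timedelta(hours=48),
    "hourly": timedelta(days=60),        # "2 months" (approximation)
    "daily": timedelta(days=5 * 365),    # "5 years" (approximation)
}


def rollup(points: Iterable[Point], bucket: str) -> List[Point]:
    """Sum (timestamp, value) pairs into 'hour' or 'day' buckets."""
    totals = defaultdict(float)
    for ts, value in points:
        if bucket == "hour":
            start = ts.replace(minute=0, second=0, microsecond=0)
        elif bucket == "day":
            start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        else:
            raise ValueError("unknown bucket: %r" % bucket)
        totals[start] += value
    return sorted(totals.items())


def prune(points: Iterable[Point], level: str, now: datetime) -> List[Point]:
    """Drop points older than the retention period of the given level."""
    cutoff = now - RETENTION[level]
    return [(ts, value) for ts, value in points if ts >= cutoff]


# Two hours of per-minute samples, one unit of usage per minute.
base = datetime(2012, 5, 1, 10, 0)
samples = [(base + timedelta(minutes=i), 1.0) for i in range(120)]
hourly = rollup(samples, "hour")
daily = rollup(hourly, "day")
```

With this layout the fine-grained data can be discarded once it has been rolled up, which matches the observation that hourly detail from three months ago is rarely needed.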
Re: [Openstack] [Metering] schema and counter definitions
I'm glad to see people championing the effort to implement metering. Is there some way to refocus the enthusiasm for solving the metering problem into engineering a general solution in OpenStack?

I'm just going to apologize in advance, but I don't think this project is headed in the right direction. I believe metering should be a first class concern of OpenStack and the way this project is starting is almost exactly backwards from what I think a solution to metering should look like.

The last thing I want to see right now is a blessed OpenStack metering project adding more agents, coupled to a particular db and making policy decisions about what is quantifiable.

I think there are really three problems that need to be solved to do metering: what data to get, getting the data, and doing things with the data.

From my perspective, a lot if not all of the data events should be coming out of the services themselves. There is already a service that should know when an instance gets started by what tenant. A cross cutting system for publishing those events and a service definition for collecting them seems like a reasonable place to start. To me that should look an awful lot like a message queue or centralized logging. Once the first two problems are solved well, if you are so inclined to collect the data into a relational model, the schema will be obvious.

If the first two problems are solved well, then I could be persuaded that a service that provides some of the aggregation functionality is a great idea and a reference implementation on a relational database isn't the worst thing in the world.

Without a general solution for the first two problems, I believe the primary focus on a schema and db is premature and sub-optimal. I also believe the current approach likely results in a project that is generally unusable.

Does anyone else share my perspective? Maybe I'm the crazy one...

Andrew
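As a rough illustration of the split argued for above (services publish events, a separate collector consumes them, storage is decided last), here is a small Python sketch. It is only a toy: the standard library `queue.Queue` stands in for a real message queue, and the event field names and the `instance.start` / `instance.stop` event types are made up for the example rather than taken from any OpenStack service.

```python
import json
import queue
from typing import Callable


def publish(bus: "queue.Queue", event_type: str, tenant_id: str,
            resource_id: str, payload: dict) -> None:
    """Called from inside a service when something billable happens."""
    bus.put(json.dumps({
        "event_type": event_type,
        "tenant_id": tenant_id,
        "resource_id": resource_id,
        "payload": payload,
    }))


def collect(bus: "queue.Queue", sink: Callable[[dict], None]) -> int:
    """Drain the bus and hand each decoded event to sink.

    The collector knows nothing about storage: sink could append to a
    log file, write to a relational database, or feed an aggregator.
    Returns the number of events processed.
    """
    count = 0
    while True:
        try:
            raw = bus.get_nowait()
        except queue.Empty:
            return count
        sink(json.loads(raw))
        count += 1


bus: "queue.Queue" = queue.Queue()
publish(bus, "instance.start", "tenant-a", "vm-1", {"flavor": "m1.small"})
publish(bus, "instance.stop", "tenant-a", "vm-1", {})

seen = []
processed = collect(bus, seen.append)
```

Because the sink is just a callable, the choice of schema and database can be deferred until the first two problems (what to publish, how to collect it) are settled.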
Re: [Openstack] [Metering] schema and counter definitions
On 05/01/2012 05:49 PM, Doug Hellmann wrote:
> On Tue, May 1, 2012 at 10:38 AM, Nick Barcet nick.bar...@canonical.com wrote:
>> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>>> On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>>>> [...]
Re: [Openstack] [Metering] schema and counter definitions
On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
> Hi Loic,
>
> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>> To prepare for the next meeting ( Thursday, May 3rd, 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).
>
> I'm a bit late to the discussion, but some brief comments after reading up on what you guys have done so far:
>
> - big +1 on separating billing from metering; there's no need to conflate the two problems and doing it this way will allow for a bunch of different ideas to be tried around billing
>
> - I'd prefer to avoid adding new node agents, so +1 on building on the notifications system

I would also prefer this option. I have a few concerns though:

a) adding too many messages to the existing message queues
b) not all core components provide notifications
c) convincing all components to agree on a unified approach to metering

Instead it might be more practical to implement node agents when necessary to complete a first implementation. That is, taking advice from core component developers and possibly running into problems, as opposed to convincing core component developers to adopt an approach to metering that is not yet implemented anywhere.

> - I agree that we don't want to go too far with aggregation and lose useful data like which instances have been running as opposed to just how many instance minutes a given tenant has consumed
>
>   Another aspect of aggregation to think about is aggregation over time - e.g. I might like to see how my hourly network usage has varied over the last week, or how my daily usage has varied over the last month, but I probably don't care so much about my hourly usage on a specific day 3 months ago
>
>   oVirt's equivalent of a metering service does this kind of aggregation as follows: http://www.ovirt.org/wiki/Ovirt_DWH
>
>     * Sample data is collected at the end of every minute and is kept for up to 48 hours.
>     * Hourly level is aggregated every hour for the hour before last and is kept for 2 months.
>     * Daily level is aggregated every day for the day before last and is kept for 5 years.

Where can I read a description of the corresponding database ?

> - Lastly, bikeshed mode - since we're calling this metering and not counting, how about just using the term meters rather than counters?

+1 ;-)

Cheers

--
Loïc Dachary
Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82
Re: [Openstack] [Metering] schema and counter definitions
Hey,

On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
>> Hi Loic,
>>
>> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>>
>> - I agree that we don't want to go too far with aggregation and lose useful data like which instances have been running as opposed to just how many instance minutes a given tenant has consumed
>>
>>   Another aspect of aggregation to think about is aggregation over time - e.g. I might like to see how my hourly network usage has varied over the last week, or how my daily usage has varied over the last month, but I probably don't care so much about my hourly usage on a specific day 3 months ago
>>
>>   oVirt's equivalent of a metering service does this kind of aggregation as follows: http://www.ovirt.org/wiki/Ovirt_DWH
>>
>>     * Sample data is collected at the end of every minute and is kept for up to 48 hours.
>>     * Hourly level is aggregated every hour for the hour before last and is kept for 2 months.
>>     * Daily level is aggregated every day for the day before last and is kept for 5 years.
>
> Where can I read a description of the corresponding database ?

Here FWIW: http://goo.gl/3Bqct

Cheers,
Mark.
Re: [Openstack] [Metering] schema and counter definitions
On Tue, 2012-05-01 at 23:05 +0200, Loic Dachary wrote:
> On 05/01/2012 06:13 PM, Mark McLoughlin wrote:
>> Hi Loic,
>>
>> On Mon, 2012-04-30 at 12:15 +0200, Loic Dachary wrote:
>>> To prepare for the next meeting ( Thursday, May 3rd, 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).
>>
>> I'm a bit late to the discussion, but some brief comments after reading up on what you guys have done so far:
>>
>> - big +1 on separating billing from metering; there's no need to conflate the two problems and doing it this way will allow for a bunch of different ideas to be tried around billing
>>
>> - I'd prefer to avoid adding new node agents, so +1 on building on the notifications system
>
> I would also prefer this option. I have a few concerns though:
> a) adding too many messages to the existing message queues
> b) not all core components provide notifications
> c) convincing all components to agree on a unified approach to metering
>
> Instead it might be more practical to implement node agents when necessary to complete a first implementation. That is, taking advice from core component developers and possibly running into problems, as opposed to convincing core component developers to adopt an approach to metering that is not yet implemented anywhere.

I'd start with metering using the notifications which are already there. I think that will get us a long way.

My impression is that the notifications system is intended to cover all billable usage in at least Nova and Glance.

Cheers,
Mark.
[Openstack] [Metering] schema and counter definitions
Hi,

To prepare for the next meeting ( Thursday, May 3rd, 2012 http://wiki.openstack.org/Meetings/MeteringAgenda ) I cleaned up and reorganized the Metering blueprint so that it ( hopefully ) incorporates all the information temporarily stored in the etherpad ( http://etherpad.openstack.org/EfficientMetering revision 67 in case it is vandalized ).

We could start a discussion from the content of the following sections:

http://wiki.openstack.org/EfficientMetering#Counters
http://wiki.openstack.org/EfficientMetering#Storage

and come up with a list of the counters that should exist by default and how they should be stored.

This morning we had a discussion with Zhongyue Luo on irc.freenode.net#openstack-metering about how Dough could use the metering service. Since it already knows about instance creations, counter c1 that records how long a given instance was up is of no interest. However, other counters such as the external bandwidth used would be useful. I advocated that one of the advantages for Dough of relying on metering to collect counters is that it does not need to know about each OpenStack component: it can rely on metering to figure out how to extract such counters from nova-compute, nova-network (soon to be quantum), nova-volume (soon to be cinder), swift and glance, which frees it from the burden of tracking structural changes.

Cheers

--
Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82
Re: [Openstack] [Metering] schema and counter definitions
On 04/30/2012 12:15 PM, Loic Dachary wrote:
> We could start a discussion from the content of the following sections:
>
> http://wiki.openstack.org/EfficientMetering#Counters

I think the rationale of the counter aggregation needs to be explained. My understanding is that the metering system will be able to deliver the following information: 10 floating IPv4 addresses were allocated to the tenant during three months and were leased from provider NNN. From this, the billing system could add a line to the invoice : 10 IPv4, $N each = 10 x $N, because it has been configured to invoice each IPv4 leased from provider NNN for $N.

It is not the purpose of the metering system to display each IPv4 used, therefore it only exposes the aggregated information. The counters define how the information should be aggregated. If the idea was to expose each resource usage individually, defining counters would be meaningless as they would duplicate the activity log from each OpenStack component.

What do you think ?

Cheers

--
Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82
Re: [Openstack] [Metering] schema and counter definitions
On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary l...@enovance.com wrote:
> On 04/30/2012 12:15 PM, Loic Dachary wrote:
>> We could start a discussion from the content of the following sections:
>>
>> http://wiki.openstack.org/EfficientMetering#Counters
>
> I think the rationale of the counter aggregation needs to be explained. My understanding is that the metering system will be able to deliver the following information: 10 floating IPv4 addresses were allocated to the tenant during three months and were leased from provider NNN. From this, the billing system could add a line to the invoice : 10 IPv4, $N each = 10 x $N, because it has been configured to invoice each IPv4 leased from provider NNN for $N.
>
> It is not the purpose of the metering system to display each IPv4 used, therefore it only exposes the aggregated information. The counters define how the information should be aggregated. If the idea was to expose each resource usage individually, defining counters would be meaningless as they would duplicate the activity log from each OpenStack component.
>
> What do you think ?

At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.

If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.

Doug
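To make the "detail and aggregate" suggestion above concrete, here is a minimal sketch assuming the metering store keeps one usage record per resource. The record layout (`tenant`, `counter`, `resource_id`, `value`) and the function names `detail_view` and `summary_view` are hypothetical and are not part of any existing API.

```python
from collections import defaultdict


def detail_view(records, tenant):
    """Usage per (counter, resource id) for one tenant: the bill detail."""
    totals = defaultdict(int)
    for record in records:
        if record["tenant"] == tenant:
            totals[(record["counter"], record["resource_id"])] += record["value"]
    return dict(totals)


def summary_view(records, tenant):
    """Usage per counter for one tenant: the bill summary.

    It is derived from the detail view, so no information is lost by
    storing only the per-resource records.
    """
    totals = defaultdict(int)
    for (counter, _resource_id), value in detail_view(records, tenant).items():
        totals[counter] += value
    return dict(totals)
```

The point of the sketch is that the summary can always be computed from per-resource records, while the reverse is not possible once only aggregates are stored.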
Re: [Openstack] [Metering] schema and counter definitions
On 04/30/2012 03:49 PM, Doug Hellmann wrote:
> At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.
>
> If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.
>
> Doug

Hi,

For the record, here is the unfinished conversation we had on IRC

(04:29:06 PM) dhellmann: dachary, did you see my reply about counter definitions on the list today?
(04:39:05 PM) dachary: It means some counters must not be aggregated. Only the amount associated with it is but there is one counter per IP.
(04:55:01 PM) dachary: dhellmann: what about this : the id of the resource controls the aggregation of all counters : if it is missing, all resources of the same kind and their measures are aggregated. Otherwise only the measures are aggregated. http://wiki.openstack.org/EfficientMetering?action=diffrev2=40rev1=39
(04:55:58 PM) dachary: it makes me a little uncomfortable to define such an ad-hoc grouping
(04:56:53 PM) dachary: i.e. you actuall control the aggregation by choosing which value to put in the id column
(04:58:43 PM) dachary: s/actuall/actually/
(05:05:38 PM) ***dachary reading http://www.ogf.org/documents/GFD.98.pdf
(05:05:54 PM) dachary: I feel like we're trying to resolve a non problem here
(05:08:42 PM) dachary: values need to be aggregated. The raw input is a full description of the resource and a value ( gauge ). The question is how to control the aggregation in a reasonably flexible way.
(05:11:34 PM) dachary: The definition of a counter could probably be described as : the id of a resource and code to fill each column associated with it.

I tried to append the following, but the wiki kept failing.

Propose that the counters are defined by a function instead of being fixed. That helps address the issue of aggregating the bandwidth associated to a given IP into a single counter.

Alternate idea :
* a counter is defined by
  * a name ( o1, n2, etc. ) that uniquely identifies the nature of the measure ( outbound internet transit, amount of RAM, etc. )
  * the component in which it can be found ( nova, swift etc.)
  * and by columns, each one is set with the result of aggregate(find(record),record) where
    * find() looks for the existing row as found by selecting with the unique key ( maybe the name and the resource id )
    * record is a detailed description of the metering event to be aggregated ( http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
    * the aggregate() function returns the updated row. By default it just adds ( += ) the counter value to the old row returned by find()

Cheers

--
Loïc Dachary Chief Research Officer // eNovance labs http://labs.enovance.com // ✉ l...@enovance.com ☎ +33 1 49 70 99 82
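The aggregate(find(record),record) idea above could look roughly like this. It is a sketch only: the in-memory `rows` dict, the record fields (`name`, `resource_id`, `value`) and the helper `update` are assumptions made for illustration, and the default aggregate() simply adds the value as described in the proposal.

```python
# Hypothetical in-memory table: (counter name, resource id) -> row.
rows = {}


def find(record):
    """Look up the existing row by the unique key (name, resource id).

    A record without a resource id gets the key (name, None), so all
    resources of the same kind end up aggregated into a single row.
    """
    return rows.get((record["name"], record.get("resource_id")))


def aggregate(row, record):
    """Default aggregation: add the record's value to the old row."""
    if row is None:
        row = {
            "name": record["name"],
            "resource_id": record.get("resource_id"),
            "value": 0,
        }
    row["value"] += record["value"]
    return row


def update(record):
    """Store the result of aggregate(find(record), record)."""
    row = aggregate(find(record), record)
    rows[(row["name"], row["resource_id"])] = row
    return row
```

Calling update() twice with the same name and resource id accumulates into one row, while omitting the resource id folds every resource of that kind into a single counter, which matches the grouping discussed in the IRC log.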
Re: [Openstack] [Metering] schema and counter definitions
On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary l...@enovance.com wrote:
> Alternate idea :
> * a counter is defined by
>   * a name ( o1, n2, etc. ) that uniquely identifies the nature of the measure ( outbound internet transit, amount of RAM, etc. )
>   * the component in which it can be found ( nova, swift etc.)
>   * and by columns, each one is set with the result of aggregate(find(record),record) where
>     * find() looks for the existing row as found by selecting with the unique key ( maybe the name and the resource id )
>     * record is a detailed description of the metering event to be aggregated ( http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
>     * the aggregate() function returns the updated row. By default it just adds ( += ) the counter value to the old row returned by find()

Would we want aggregation to occur within the database where we are collecting events, or should that move somewhere else?
Re: [Openstack] [Metering] schema and counter definitions
On 04/30/2012 08:03 PM, Doug Hellmann wrote:
> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary l...@enovance.com wrote:
>> * the aggregate() function returns the updated row. By default it just adds ( += ) the counter value to the old row returned by find()
>
> Would we want aggregation to occur within the database where we are collecting events, or should that move somewhere else?

I assume the events collected by the metering agents will all be archived for auditing (or re-building the database) http://wiki.openstack.org/EfficientMetering?action=diffrev2=45rev1=44

Therefore the aggregation should occur when the database is updated to account for a new event. Does this make sense ? I may have misunderstood part of your question.
Re: [Openstack] [Metering] schema and counter definitions
Agreed, I would get as much low-level data as possible and let other systems combine that as they want to form whatever billing model they choose.

On 4/30/12 6:49 AM, Doug Hellmann doug.hellm...@dreamhost.com wrote:
> At DreamHost we are going to want to show each individual resource (the IPv4 address, the instance, etc.) along with the charge information. Having the metering system aggregate that data will make it difficult/impossible to present the bill summary and detail views that we want. It would be much more useful for us if it tracked the usage details for each resource, and let us aggregate the data ourselves.
>
> If other vendors want to show the data differently, perhaps we should provide separate APIs for retrieving the detailed and aggregate data.
>
> Doug
Re: [Openstack] [Metering] schema and counter definitions
On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary l...@enovance.com wrote:
> On 04/30/2012 08:03 PM, Doug Hellmann wrote:
>> Would we want aggregation to occur within the database where we are collecting events, or should that move somewhere else?
>
> I assume the events collected by the metering agents will all be archived for auditing (or re-building the database) http://wiki.openstack.org/EfficientMetering?action=diffrev2=45rev1=44
>
> Therefore the aggregation should occur when the database is updated to account for a new event. Does this make sense ? I may have misunderstood part of your question.

I guess what I don't understand is why the aggregated data is written back to the metering database at all. If it's in the same database, it seems like it should be in a different table