Re: [Openstack-operators] Scaling Ceilometer compute agent?
Thanks for all the pointers. Vahric, we're running into this in our lab on a compute host with 135 instances and 12 meters, 3 of which we developed.

/Bill

On Tue, Jun 14, 2016 at 2:54 PM, Vahric Muhtaryan <vah...@doruk.net.tr> wrote:
> Possible to share how many instance and how many meter per instance you
> collecting and getting this error ?

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
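To put the numbers in this reply in perspective, here is a rough per-sample time budget. It is only a sketch under stated assumptions: that every meter yields one sample per instance per polling cycle, and that the 293.25 sec overrun from the original post was seen on this same 135-instance host (the thread does not say so). The function name is made up for illustration.

```python
def per_sample_seconds(total_s: float, instances: int, meters: int) -> float:
    """Average time available (or spent) per sample in one polling cycle.

    Assumes one sample per meter per instance, which is an illustration,
    not something stated in the thread.
    """
    return total_s / (instances * meters)


instances = 135        # instances on the lab compute host
meters = 12            # meters collected per instance
interval_s = 600.0     # 10-minute interval from the original post
overrun_s = 293.25     # overrun from the warning; assumed to be the same host

budget = per_sample_seconds(interval_s, instances, meters)
observed = per_sample_seconds(interval_s + overrun_s, instances, meters)

print(f"samples per cycle: {instances * meters}")
print(f"budget per sample: {budget:.3f} s, observed per sample: {observed:.3f} s")
```

Under these assumptions the agent would need to average well under half a second per sample to finish inside the interval.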
Re: [Openstack-operators] Scaling Ceilometer compute agent?
Hello Bill,

Could you share how many instances you have, and how many meters per instance you are collecting, when you get this error?

I guess that for scaling purposes you are talking about this, right?
http://docs.openstack.org/ha-guide/controller-ha-telemetry.html

Regards
VM

From: Bill Jones <bill.jo...@sungardas.com>
Date: Tuesday 14 June 2016 at 18:03
To: "openstack-oper." <openstack-operators@lists.openstack.org>
Subject: [Openstack-operators] Scaling Ceilometer compute agent?

> Has anyone had any experience with scaling ceilometer compute agents?
Re: [Openstack-operators] Scaling Ceilometer compute agent?
On 14/06/16 18:00, "Matt Riedemann" <mrie...@linux.vnet.ibm.com> wrote:
>This is the specific blog post I think:
>
>http://openstack-in-production.blogspot.com/2014/03/cern-cloud-architecture-update-for.html

To be complete, we are running ceilometer but do struggle, both with the volume of data being recorded and with the time it takes to extract reports. Using a dedicated keystone helps avoid impact on other cloud activities, but I don’t think this is a good architecture to replicate.

With the rework of the metering architecture, there are developments around projects such as gnocchi and aodh, but we have not yet got these to a production state (RPM packaging, Puppet). We’ll look again with Mitaka once the upgrade is done for the CERN cloud, and will produce a blog post once we’ve got there.

Tim
Re: [Openstack-operators] Scaling Ceilometer compute agent?
On 6/14/2016 10:14 AM, Kris G. Lindgren wrote:
> CERN is running ceilometer at scale with many thousands of compute
> nodes. I think their blog goes into some detail about it [1], but I
> don’t have a direct link to it.
>
> [1] - http://openstack-in-production.blogspot.com/

This is the specific blog post I think:

http://openstack-in-production.blogspot.com/2014/03/cern-cloud-architecture-update-for.html

--

Thanks,

Matt Riedemann
Re: [Openstack-operators] Scaling Ceilometer compute agent?
CERN is running ceilometer at scale with many thousands of compute nodes. I think their blog goes into some detail about it [1], but I don’t have a direct link to it.

[1] - http://openstack-in-production.blogspot.com/

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Bill Jones <bill.jo...@sungardas.com>
Date: Tuesday, June 14, 2016 at 9:03 AM
To: "openstack-oper." <openstack-operators@lists.openstack.org>
Subject: [Openstack-operators] Scaling Ceilometer compute agent?

> Has anyone had any experience with scaling ceilometer compute agents?
[Openstack-operators] Scaling Ceilometer compute agent?
Has anyone had any experience with scaling ceilometer compute agents?

We're starting to see messages like this in the logs for some of our compute agents:

WARNING ceilometer.openstack.common.loopingcall [-] task run outlasted interval by 293.25 sec

This indicates that the compute agent failed to finish its pipeline processing within the allotted interval (in our case 10 min). As a result, fewer instance samples are generated per hour than expected, which causes billing issues for us because of the way we calculate usage.

It looks like we have three options for addressing this: make the pipeline run faster, increase the interval time, or scale the compute agents. I'm investigating the last of these.

I think I read in the ceilometer architecture docs that the agents are designed to scale, but I don't see anything in the docs on how to do that. Any pointers would be appreciated.

Thanks,
Bill
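For a sense of scale, the effect of the overrun on sample counts in the message above can be estimated with a short calculation. This is only a sketch: it assumes the looping call starts the next run immediately once a run has outlasted its interval, so the effective cycle is the longer of the interval and the run time. The function name is made up for illustration.

```python
def cycles_per_hour(interval_s: float, run_s: float) -> float:
    """Polling cycles completed per hour.

    Assumes a cycle lasts max(interval, run time), i.e. an overrunning
    task is restarted as soon as it finishes. This is an assumption
    about the looping call, not something stated in the thread.
    """
    cycle_s = max(interval_s, run_s)
    return 3600.0 / cycle_s


interval_s = 600.0               # 10-minute interval from the message
overrun_s = 293.25               # "outlasted interval by 293.25 sec"
run_s = interval_s + overrun_s   # 893.25 s per pipeline run

expected = cycles_per_hour(interval_s, interval_s)
actual = cycles_per_hour(interval_s, run_s)

print(f"expected {expected:.2f} samples per instance per hour")
print(f"actual about {actual:.2f} samples per instance per hour")
```

With these figures the agent would produce roughly four samples per instance per hour instead of six, which is the kind of shortfall that would show up in usage-based billing.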