Re: [Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Bill Jones
Thanks for all the pointers.

Vahric, we're running into this in our lab on a compute host with 135
instances and 12 meters, 3 of which we developed.
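
For a rough sense of scale, the figures above can be turned into a per-sample time budget. This is only a back-of-the-envelope sketch: it assumes all 12 meters are collected for every one of the 135 instances, and it reuses the 10-minute interval from the original post. The variable names are illustrative and not taken from Ceilometer.

```python
# Per-sample time budget for one polling cycle (illustrative figures only).
INSTANCES = 135            # instances on the compute host (from this message)
METERS_PER_INSTANCE = 12   # assumption: every meter applies to every instance
INTERVAL_S = 600.0         # 10-minute interval (from the original post)

# Number of samples the agent has to produce in one cycle.
samples_per_cycle = INSTANCES * METERS_PER_INSTANCE

# Average time available per sample if the cycle must fit in the interval.
budget_per_sample_s = INTERVAL_S / samples_per_cycle

print(f"samples per cycle: {samples_per_cycle}")
print(f"budget per sample: {budget_per_sample_s:.3f} s")
```

If the average cost of producing one sample exceeds this budget, the cycle cannot complete within the interval.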

/Bill

On Tue, Jun 14, 2016 at 2:54 PM, Vahric Muhtaryan <vah...@doruk.net.tr>
wrote:

> Hello Bill
>
> Could you share how many instances you have, and how many meters per
> instance you are collecting, when you get this error?
>
> I guess that, for scaling purposes, you are talking about this, right?
> http://docs.openstack.org/ha-guide/controller-ha-telemetry.html
>
> Regards
> VM
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Vahric Muhtaryan
Hello Bill 

Could you share how many instances you have, and how many meters per
instance you are collecting, when you get this error?

I guess that, for scaling purposes, you are talking about this, right?
http://docs.openstack.org/ha-guide/controller-ha-telemetry.html

Regards
VM



Re: [Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Tim Bell

On 14/06/16 18:00, "Matt Riedemann" <mrie...@linux.vnet.ibm.com> wrote:

>On 6/14/2016 10:14 AM, Kris G. Lindgren wrote:
>> CERN is running ceilometer at scale with many thousands of compute
>> nodes.  I think their blog goes into some detail about it [1], but I
>> don’t have a direct link to it.
>>
>>
>> [1] - http://openstack-in-production.blogspot.com/
>> ___
>> Kris Lindgren
>> Senior Linux Systems Engineer
>> GoDaddy
>>
>>
>
>This is the specific blog post I think:
>
>http://openstack-in-production.blogspot.com/2014/03/cern-cloud-architecture-update-for.html
>

To be complete, we are running ceilometer but do struggle, both with data
volumes being recorded and with the time to extract reports. Using a dedicated
keystone helps avoid impact on other cloud activities, but I don’t think this
is a good architecture to replicate.

With the rework on the metering architecture, there are developments around 
projects such as gnocchi and aodh but we have not yet got to a production state 
(RPM packaging, Puppet). We’ll look again with Mitaka once the upgrade is done 
for the CERN cloud and will produce a blog once we’ve got there.

Tim
>-- 
>
>Thanks,
>
>Matt Riedemann
>
>


Re: [Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Matt Riedemann

On 6/14/2016 10:14 AM, Kris G. Lindgren wrote:

CERN is running ceilometer at scale with many thousands of compute
nodes.  I think their blog goes into some detail about it [1], but I
don’t have a direct link to it.


[1] - http://openstack-in-production.blogspot.com/
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy




This is the specific blog post I think:

http://openstack-in-production.blogspot.com/2014/03/cern-cloud-architecture-update-for.html

--

Thanks,

Matt Riedemann




Re: [Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Kris G. Lindgren
CERN is running ceilometer at scale with many thousands of compute nodes.  I
think their blog goes into some detail about it [1], but I don’t have a direct 
link to it.


[1] - http://openstack-in-production.blogspot.com/
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy



[Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Bill Jones
Has anyone had any experience with scaling ceilometer compute agents?

We're starting to see messages like this in logs for some of our compute
agents:

WARNING ceilometer.openstack.common.loopingcall [-] task  run outlasted interval by 293.25 sec

This is an indication that the compute agent failed to execute its pipeline
processing within the allotted interval (in our case 10 min). The result of
this is that fewer instance samples are generated per hour than expected,
and this causes billing issues for us due to the way we calculate usage.
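
To make the shortfall concrete, here is a small sketch of the arithmetic. It assumes the looping call starts the next run immediately after an overrunning one, so the effective cycle time is the interval plus the reported overrun. That is an assumption about the agent's behaviour rather than something confirmed here. The function name is illustrative only.

```python
# Expected vs. actual samples per instance per hour (illustrative sketch).
INTERVAL_S = 600.0   # configured interval: 10 minutes
OVERRUN_S = 293.25   # "run outlasted interval by" value from the log line


def samples_per_hour(cycle_s: float) -> float:
    """Samples produced per hour if one sample is emitted every cycle_s seconds."""
    return 3600.0 / cycle_s


expected = samples_per_hour(INTERVAL_S)
# Assumption: an overrunning run is followed immediately by the next one,
# so the effective cycle is the interval plus the overrun.
actual = samples_per_hour(INTERVAL_S + OVERRUN_S)

print(f"expected: {expected:.2f} samples/hour")
print(f"actual:   {actual:.2f} samples/hour")
```

Under that assumption, an agent that overruns by this amount on every cycle yields roughly four samples per hour instead of six.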

It looks like we have three options for addressing this: make the pipeline
run faster, increase the interval time, or scale the compute agents. I'm
investigating the last of these.
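
As a rough way to size each of the three options, the sketch below works from the one overrun figure quoted above. It assumes the observed cycle time is the interval plus the logged overrun, and, for the scaling case, that the work divides evenly across agents. Neither assumption is verified here and real agents may not scale linearly. All names are illustrative.

```python
import math

# Sizing the three options from a single observed overrun (illustrative only).
INTERVAL_S = 600.0                 # configured interval: 10 minutes
OBSERVED_CYCLE_S = 600.0 + 293.25  # assumption: interval + logged overrun

# Option 1: make the pipeline faster. Speed-up factor needed to fit the interval.
speedup_needed = OBSERVED_CYCLE_S / INTERVAL_S

# Option 2: increase the interval. Smallest whole-minute interval that fits.
min_interval_s = math.ceil(OBSERVED_CYCLE_S / 60) * 60

# Option 3: scale the agents. Agents needed if the work splits evenly
# (an optimistic assumption).
agents_needed = math.ceil(OBSERVED_CYCLE_S / INTERVAL_S)

print(f"speed-up needed:  {speedup_needed:.2f}x")
print(f"minimum interval: {min_interval_s} s")
print(f"agents needed:    {agents_needed}")
```

These numbers leave no headroom, so in practice each option would need some margin on top.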

I think I read in the ceilometer architecture docs that the agents are
designed to scale, but I don't see anything in the docs on how to
facilitate that. Any pointers would be appreciated.

Thanks,
Bill