Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-27 Thread Doug Hellmann
On Wed, Mar 26, 2014 at 2:54 PM, Joe Gordon joe.gord...@gmail.com wrote:

 On Wed, Mar 26, 2014 at 9:51 AM, Doug Hellmann 
 doug.hellm...@dreamhost.com wrote:

 On Tue, Mar 25, 2014 at 5:34 PM, Brant Knudson b...@acm.org wrote:

 On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:

 ...

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


 Here's how it's done in Keystone:
 https://review.openstack.org/#/c/62068/10/keystone/config.py

 It's definitely awkward.


 https://bugs.launchpad.net/oslo/+bug/1297950


 Currently, when you enable debug logs in OpenStack, the root logger is set
 to DEBUG and then we have to go and blacklist specific modules that we
 don't want running at DEBUG. What about instead adding an option that sets
 just the OpenStack component at hand to DEBUG, and not the root logger?
 That way we won't have to keep maintaining a blacklist of modules that
 generate too many debug logs.


Doing that makes sense, too. Do we need a new option, or is there some
combination of existing options that we could interpret to mean "debug this
OpenStack app, but not all of the libraries it is using"?
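
For illustration, one way to read such a combination -- a hypothetical
helper, not an existing oslo option -- would be to leave the root logger at
a quiet level and raise only the service's own package logger to DEBUG:

    # Sketch only: assumes the service's code lives under a single package
    # (e.g. 'nova'); library loggers inherit the quieter root level, while
    # the service's own loggers emit DEBUG.
    import logging

    def debug_only_this_app(app_package, root_level=logging.INFO):
        logging.getLogger().setLevel(root_level)
        logging.getLogger(app_package).setLevel(logging.DEBUG)

    # debug_only_this_app('nova')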

Doug


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-27 Thread Joe Gordon
On Thu, Mar 27, 2014 at 6:53 AM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

 On Wed, Mar 26, 2014 at 2:54 PM, Joe Gordon joe.gord...@gmail.com wrote:

 On Wed, Mar 26, 2014 at 9:51 AM, Doug Hellmann 
 doug.hellm...@dreamhost.com wrote:

 On Tue, Mar 25, 2014 at 5:34 PM, Brant Knudson b...@acm.org wrote:

 On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:

 ...

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


 Here's how it's done in Keystone:
 https://review.openstack.org/#/c/62068/10/keystone/config.py

 It's definitely awkward.


 https://bugs.launchpad.net/oslo/+bug/1297950


 Currently, when you enable debug logs in OpenStack, the root logger is set
 to DEBUG and then we have to go and blacklist specific modules that we
 don't want running at DEBUG. What about instead adding an option that sets
 just the OpenStack component at hand to DEBUG, and not the root logger?
 That way we won't have to keep maintaining a blacklist of modules that
 generate too many debug logs.


 Doing that makes sense, too. Do we need a new option, or is there some
 combination of existing options that we could interpret to mean "debug this
 OpenStack app, but not all of the libraries it is using"?


I'm not sure whether we need a new option or can re-use the existing ones,
but the current config options are somewhat confusing: we have separate
debug and verbose options.
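
For reference, the usual interpretation of those two flags -- an assumption
about the oslo-incubator log setup, not a verified copy of it -- is roughly:

    # Rough sketch of the assumed debug/verbose semantics for the root logger.
    import logging

    def effective_root_level(debug, verbose):
        if debug:
            return logging.DEBUG   # everything, libraries included
        if verbose:
            return logging.INFO
        return logging.WARNING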


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-26 Thread Doug Hellmann
On Tue, Mar 25, 2014 at 5:34 PM, Brant Knudson b...@acm.org wrote:

 On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:

 ...

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


 Here's how it's done in Keystone:
 https://review.openstack.org/#/c/62068/10/keystone/config.py

 It's definitely awkward.


https://bugs.launchpad.net/oslo/+bug/1297950

Doug


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-26 Thread Vishvananda Ishaya

On Mar 24, 2014, at 2:09 PM, Joe Gordon joe.gord...@gmail.com wrote:

 On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague s...@dague.net wrote:
 Here are some preliminary views (they currently ignore the ceilometer
 logs; I haven't had a chance to dive in there yet).

 It actually looks like a huge part of the issue is oslo.messaging: the
 bulk of screen-n-cond is oslo.messaging debug messages. It seems that in
 debug mode oslo.messaging is basically a 100% trace mode, which includes
 logging every time a UUID is created, and every payload.

 I'm not convinced that's useful. We don't log every SQL statement
 we run (with full payload).
 
 
 Agreed. I turned off oslo.messaging logs [1] and the file sizes in a 
 check-tempest-dsvm-full dropped drastically to [2]. nova-conductor logs 
 dropped way down from 7.3MB to 214K.
 
 [1] https://review.openstack.org/#/c/82255/
 [2] 
 http://logs.openstack.org/55/82255/1/check/check-tempest-dsvm-full/88d1e36/logs/?C=S;O=D
 
 The recent integration of oslo.messaging would also explain the new
 growth of logs.
 
 Other issues include other oslo utils that have really verbose debug
 modes, like lockutils emitting 4 DEBUG messages for every lock acquired.

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.
 
 ++
 
 One possible solution is to start using log_config_append and load the
 config from a logging.conf file. But we don't even copy over the sample
 file in devstack. So for Icehouse we may want to do a cherry-pick from
 oslo-incubator to disable oslo.messaging debug logging.

Can’t we just specify a reasonable default_log_levels in *.conf in devstack? 
That would cut down the log chatter for integration tests, and wouldn’t be a 
breaking change.
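
For example, a [DEFAULT] section along these lines (the module names and
levels are illustrative, not the shipped defaults) would keep oslo.messaging
and the usual AMQP libraries out of the debug stream:

    [DEFAULT]
    debug = True
    # logger=LEVEL pairs; loggers not listed here inherit the root level
    default_log_levels = amqp=WARN,amqplib=WARN,qpid=WARN,sqlalchemy=WARN,oslo.messaging=INFO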

Vish


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-26 Thread Joe Gordon
On Wed, Mar 26, 2014 at 11:15 AM, Vishvananda Ishaya vishvana...@gmail.com wrote:


 On Mar 24, 2014, at 2:09 PM, Joe Gordon joe.gord...@gmail.com wrote:

 On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague s...@dague.net wrote:

 Here are some preliminary views (they currently ignore the ceilometer
 logs; I haven't had a chance to dive in there yet).

 It actually looks like a huge part of the issue is oslo.messaging: the
 bulk of screen-n-cond is oslo.messaging debug messages. It seems that in
 debug mode oslo.messaging is basically a 100% trace mode, which includes
 logging every time a UUID is created, and every payload.

 I'm not convinced that's useful. We don't log every SQL statement
 we run (with full payload).


 Agreed. I turned off oslo.messaging logs [1] and the file sizes in a
 check-tempest-dsvm-full dropped drastically to [2]. nova-conductor logs
 dropped way down from 7.3MB to 214K.

 [1] https://review.openstack.org/#/c/82255/
 [2]
 http://logs.openstack.org/55/82255/1/check/check-tempest-dsvm-full/88d1e36/logs/?C=S;O=D

 The recent integration of oslo.messaging would also explain the new
 growth of logs.

 Other issues include other oslo utils that have really verbose debug
 modes, like lockutils emitting 4 DEBUG messages for every lock acquired.

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


 ++

 One possible solution is to start using log_config_append and load the
 config from a logging.conf file. But we don't even copy over the sample
 file in devstack. So for Icehouse we may want to do a cherry-pick from
 oslo-incubator to disable oslo.messaging debug logging.


 Can't we just specify a reasonable default_log_levels in *.conf in
 devstack? That would cut down the log chatter for integration tests, and
 wouldn't be a breaking change.


If we are having problems in the gate with verbose and useless logs, others
will too... so I don't think we should sidestep the problem via devstack;
otherwise every deployer will have to do the same. This fits in with the
*sane defaults* mantra.


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-26 Thread Joe Gordon
On Wed, Mar 26, 2014 at 9:51 AM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

 On Tue, Mar 25, 2014 at 5:34 PM, Brant Knudson b...@acm.org wrote:

 On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:

 ...

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


 Here's how it's done in Keystone:
 https://review.openstack.org/#/c/62068/10/keystone/config.py

 It's definitely awkward.


 https://bugs.launchpad.net/oslo/+bug/1297950


Currently, when you enable debug logs in OpenStack, the root logger is set
to DEBUG and then we have to go and blacklist specific modules that we
don't want running at DEBUG. What about instead adding an option that sets
just the OpenStack component at hand to DEBUG, and not the root logger?
That way we won't have to keep maintaining a blacklist of modules that
generate too many debug logs.


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-25 Thread Brant Knudson
On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:

 ...
 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


Here's how it's done in Keystone:
https://review.openstack.org/#/c/62068/10/keystone/config.py

It's definitely awkward.
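
The pattern in that review is roughly the following (a paraphrase, not the
actual Keystone code): after logging is set up, explicitly override the
levels of the third-party loggers that are too chatty.

    # Paraphrased sketch of the per-module override pattern; the module
    # names below are placeholders, not Keystone's actual list.
    import logging

    NOISY_MODULES = {'routes.middleware': 'INFO', 'stevedore': 'INFO'}

    def set_external_log_levels(overrides=NOISY_MODULES):
        for mod, level_name in overrides.items():
            logging.getLogger(mod).setLevel(getattr(logging, level_name))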

- Brant


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-24 Thread Sean Dague
Here are some preliminary views (they currently ignore the ceilometer
logs; I haven't had a chance to dive in there yet).

It actually looks like a huge part of the issue is oslo.messaging: the
bulk of screen-n-cond is oslo.messaging debug messages. It seems that in
debug mode oslo.messaging is basically a 100% trace mode, which includes
logging every time a UUID is created, and every payload.

I'm not convinced that's useful. We don't log every SQL statement we run
(with full payload).

The recent integration of oslo.messaging would also explain the new
growth of logs.

Other issues include other oslo utils that have really verbose debug
modes, like lockutils emitting 4 DEBUG messages for every lock acquired.

Part of the challenge is that turning off DEBUG is currently embedded in
code in oslo log, which makes it kind of awkward to set sane log levels
for included libraries, because it requires an oslo round trip with code
changes to all the projects.
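
Concretely, the per-library levels ship as a hard-coded option default
inside the incubator's log module, something like the following (an
illustrative shape, not the actual code), which is why changing them means
patching the library and re-syncing it into every project:

    # Illustrative only -- an assumed shape of the embedded defaults.
    from oslo.config import cfg

    log_opts = [
        cfg.ListOpt('default_log_levels',
                    default=['amqp=WARN', 'amqplib=WARN',
                             'sqlalchemy=WARN', 'iso8601=WARN'],
                    help='List of logger=LEVEL pairs.'),
    ]

    cfg.CONF.register_opts(log_opts)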

-Sean

On 03/21/2014 07:23 PM, Clark Boylan wrote:
 ...


--
Sean Dague
Samsung Research America
s...@dague.net / sean.da...@samsung.com
http://dague.net


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-24 Thread Joe Gordon
On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague s...@dague.net wrote:

 Here are some preliminary views (they currently ignore the ceilometer
 logs; I haven't had a chance to dive in there yet).

 It actually looks like a huge part of the issue is oslo.messaging: the
 bulk of screen-n-cond is oslo.messaging debug messages. It seems that in
 debug mode oslo.messaging is basically a 100% trace mode, which includes
 logging every time a UUID is created, and every payload.

 I'm not convinced that's useful. We don't log every SQL statement
 we run (with full payload).


Agreed. I turned off oslo.messaging logs [1] and the file sizes in a
check-tempest-dsvm-full dropped drastically to [2]. nova-conductor logs
dropped way down from 7.3MB to 214K.

[1] https://review.openstack.org/#/c/82255/
[2]
http://logs.openstack.org/55/82255/1/check/check-tempest-dsvm-full/88d1e36/logs/?C=S;O=D

The recent integration of oslo.messaging would also explain the new
 growth of logs.

 Other issues include other oslo utils that have really verbose debug
 modes, like lockutils emitting 4 DEBUG messages for every lock acquired.

 Part of the challenge is that turning off DEBUG is currently embedded in
 code in oslo log, which makes it kind of awkward to set sane log levels
 for included libraries, because it requires an oslo round trip with code
 changes to all the projects.


++

One possible solution is to start using log_config_append and load the
config from a logging.conf file. But we don't even copy over the sample
file in devstack. So for Icehouse we may want to do a cherry-pick from
oslo-incubator to disable oslo.messaging debug logging.
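
For what it's worth, a minimal logging.conf in the standard Python
fileConfig format -- illustrative, not the devstack sample file -- that
keeps everything at DEBUG except oslo.messaging could look like:

    [loggers]
    keys = root, oslo_messaging

    [handlers]
    keys = stderr

    [formatters]
    keys = default

    [logger_root]
    level = DEBUG
    handlers = stderr

    [logger_oslo_messaging]
    level = INFO
    handlers = stderr
    qualname = oslo.messaging
    propagate = 0

    [handler_stderr]
    class = StreamHandler
    args = (sys.stderr,)
    formatter = default

    [formatter_default]
    format = %(asctime)s %(levelname)s %(name)s %(message)s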


Re: [openstack-dev] Jenkins test logs and their retention period

2014-03-24 Thread Joe Gordon
On Mon, Mar 24, 2014 at 2:09 PM, Joe Gordon joe.gord...@gmail.com wrote:

 On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague s...@dague.net wrote:

 Here are some preliminary views (they currently ignore the ceilometer
 logs; I haven't had a chance to dive in there yet).

 It actually looks like a huge part of the issue is oslo.messaging: the
 bulk of screen-n-cond is oslo.messaging debug messages. It seems that in
 debug mode oslo.messaging is basically a 100% trace mode, which includes
 logging every time a UUID is created, and every payload.

 I'm not convinced that's useful. We don't log every SQL statement
 we run (with full payload).


 Agreed. I turned off oslo.messaging logs [1] and the file sizes in a
 check-tempest-dsvm-full dropped drastically to [2]. nova-conductor logs
 dropped way down from 7.3MB to 214K.

 [1] https://review.openstack.org/#/c/82255/


The patch above is now an oslo-incubator cherry-pick (oslo-incubator patch
pending). It's a cherry-pick because we are in feature freeze and want to
be conservative with changes.


[openstack-dev] Jenkins test logs and their retention period

2014-03-21 Thread Clark Boylan
Hello everyone,

Back at the Portland summit the Infra team committed to archiving six months
of test logs for OpenStack. Since then we have managed to do just that.
However, more recently we have seen those logs grow at a rate beyond what is
currently sustainable.

For reasons, we currently store logs on a filesystem backed by cinder
volumes. Rackspace limits the size and number of volumes attached to a
single host meaning the upper bound on the log archive filesystem is ~12TB
and we are almost there. You can see real numbers and pretty graphs on our
cacti server [0].

Long term we are trying to move all of the logs into swift, but it turns out
there are some use-case issues we need to sort out around that before we can
do so (this is being worked on, so it should happen). Until that day arrives
we need to work on logging more smartly, and if we can't do that we will
have to reduce the log retention period.

So what can you do? Well, it appears that our log files may need a diet. I
have listed the worst offenders below (after a small sampling; there may be
more) and it would be great if we could go through them with a fine-toothed
comb and figure out whether we are logging actually useful data. The great
thing about doing this is that it will make lives better for deployers of
OpenStack too.

Some initial checking indicates a lot of this noise may be related to
ceilometer. It looks like it is logging AMQP stuff frequently and inflating
the logs of individual services as it polls them.

Offending files from tempest tests:
screen-n-cond.txt.gz 7.3M
screen-ceilometer-collector.txt.gz 6.0M
screen-n-api.txt.gz 3.7M
screen-n-cpu.txt.gz 3.6M
tempest.txt.gz 2.7M
screen-ceilometer-anotification.txt.gz 1.9M
subunit_log.txt.gz 1.5M
screen-g-api.txt.gz 1.4M
screen-ceilometer-acentral.txt.gz 1.4M
screen-n-net.txt.gz 1.4M
from: 
http://logs.openstack.org/52/81252/2/gate/gate-tempest-dsvm-full/488bc4e/logs/?C=S;O=D

Unittest offenders:
Nova subunit_log.txt.gz 14M
Neutron subunit_log.txt.gz 7.8M
Keystone subunit_log.txt.gz 4.8M

Note all of the above files are compressed with gzip -9 and the filesizes
above reflect compressed file sizes.

Debug logs are important to you guys when dealing with Jenkins results. We
want your feedback on how we can make this better for everyone.

[0] 
http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all

Thank you,
Clark Boylan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev