Re: [openstack-dev] Jenkins test logs and their retention period
On Wed, Mar 26, 2014 at 2:54 PM, Joe Gordon joe.gord...@gmail.com wrote:

> Currently when you enable debug logs in OpenStack, the root logger is set
> to debug and then we have to go and blacklist specific modules that we
> don't want running at debug. What about instead adding an option to set
> just the OpenStack component at hand to debug log level, and not the root
> logger? That way we won't have to keep maintaining a blacklist of modules
> that generate too many debug logs.

Doing that makes sense, too. Do we need a new option, or is there some combination of existing options that we could interpret to mean "debug this OpenStack app, but not all of the libraries it is using"?

Doug
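P.S. For concreteness, here is a rough stdlib-only sketch of the behavior I mean; the helper name is made up, this is not an existing oslo API:

    import logging

    def setup_app_logging(project, debug=False):
        # Hypothetical helper, not an existing oslo API: raise only the
        # project's own logger to DEBUG and leave the root logger (and
        # therefore every third-party library) at INFO.
        logging.basicConfig(level=logging.INFO)
        if debug:
            # Covers the whole namespace: "nova", "nova.compute", etc.
            logging.getLogger(project).setLevel(logging.DEBUG)

    # setup_app_logging('nova', debug=True) shows nova.* DEBUG messages
    # while amqp, sqlalchemy, etc. stay at INFO.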
Re: [openstack-dev] Jenkins test logs and their retention period
On Thu, Mar 27, 2014 at 6:53 AM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

> Doing that makes sense, too. Do we need a new option, or is there some
> combination of existing options that we could interpret to mean "debug
> this OpenStack app, but not all of the libraries it is using"?

I'm not sure if we need a new option or can re-use the existing ones, but the current config options are somewhat confusing: we have separate debug and verbose options.
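To illustrate the confusion, this is roughly how the incubator code collapses the two flags into a single root-logger level today (a from-memory sketch of openstack/common/log.py, not the exact code):

    import logging

    def pick_level(debug, verbose):
        # debug wins regardless of verbose; neither flag means WARNING.
        if debug:
            return logging.DEBUG
        if verbose:
            return logging.INFO
        return logging.WARNING

    # The chosen level is applied to the root logger, so debug=true
    # drags every library along with it.
    logging.getLogger().setLevel(pick_level(debug=True, verbose=False))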
Re: [openstack-dev] Jenkins test logs and their retention period
On Tue, Mar 25, 2014 at 5:34 PM, Brant Knudson b...@acm.org wrote:

> On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:
>
>> ... Part of the challenge is that turning off DEBUG is currently embedded
>> in code in oslo log, which makes it kind of awkward to set sane log
>> levels for included libraries, because it requires an oslo round trip
>> with code to all the projects to do it.
>
> Here's how it's done in Keystone:
> https://review.openstack.org/#/c/62068/10/keystone/config.py
>
> It's definitely awkward.

https://bugs.launchpad.net/oslo/+bug/1297950

Doug
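P.S. The Keystone change linked above boils down to overriding levels for specific noisy modules after the generic logging setup runs. The pattern, roughly (module list illustrative, not copied from the review):

    import logging

    # Illustrative blacklist, not copied from the review: force noisy
    # libraries down even when the service itself runs at DEBUG.
    LEVEL_OVERRIDES = {
        'amqp': logging.WARN,
        'amqplib': logging.WARN,
        'sqlalchemy': logging.WARN,
        'routes.middleware': logging.INFO,
    }

    def apply_level_overrides(overrides=LEVEL_OVERRIDES):
        # Must run after the generic oslo logging setup so these win.
        for name, level in overrides.items():
            logging.getLogger(name).setLevel(level)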
Re: [openstack-dev] Jenkins test logs and their retention period
On Mar 24, 2014, at 2:09 PM, Joe Gordon joe.gord...@gmail.com wrote:

> On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague s...@dague.net wrote:
>
>> Part of the challenge is that turning off DEBUG is currently embedded in
>> code in oslo log, which makes it kind of awkward to set sane log levels
>> for included libraries, because it requires an oslo round trip with code
>> to all the projects to do it.
>
> ++
>
> One possible solution is to start using log_config_append and load the
> config from a logging.conf file. But we don't even copy over the sample
> file in devstack. So for icehouse we may want to do a cherry-pick from
> oslo-incubator to disable the oslo.messaging logs.

Can't we just specify a reasonable default_log_levels in *.conf in devstack? That would cut down the log chatter for integration tests, and wouldn't be a breaking change.

Vish
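P.S. I'm thinking of something like the following in the generated *.conf files; default_log_levels is the real oslo option, the module list here is just illustrative:

    [DEFAULT]
    # Keep library chatter out of DEBUG-level service logs; the module
    # list below is illustrative, not a recommendation.
    default_log_levels = amqp=WARN,amqplib=WARN,qpid=WARN,sqlalchemy=WARN,iso8601=WARN,oslo.messaging=INFO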
Re: [openstack-dev] Jenkins test logs and their retention period
On Wed, Mar 26, 2014 at 11:15 AM, Vishvananda Ishaya vishvana...@gmail.com wrote:

> Can't we just specify a reasonable default_log_levels in *.conf in
> devstack? That would cut down the log chatter for integration tests, and
> wouldn't be a breaking change.

If we are having problems in the gate with verbose and useless logs, others will too, so I don't think we should sidestep the problem via devstack; otherwise every deployer will have to do the same. This fits in with the *sane defaults* mantra.
Re: [openstack-dev] Jenkins test logs and their retention period
On Wed, Mar 26, 2014 at 9:51 AM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

> On Tue, Mar 25, 2014 at 5:34 PM, Brant Knudson b...@acm.org wrote:
>
>> Here's how it's done in Keystone:
>> https://review.openstack.org/#/c/62068/10/keystone/config.py
>>
>> It's definitely awkward.
>
> https://bugs.launchpad.net/oslo/+bug/1297950

Currently when you enable debug logs in OpenStack, the root logger is set to debug and then we have to go and blacklist specific modules that we don't want running at debug. What about instead adding an option to set just the OpenStack component at hand to debug log level, and not the root logger? That way we won't have to keep maintaining a blacklist of modules that generate too many debug logs.
Re: [openstack-dev] Jenkins test logs and their retention period
On Mon, Mar 24, 2014 at 5:49 AM, Sean Dague s...@dague.net wrote:

> ... Part of the challenge is that turning off DEBUG is currently embedded
> in code in oslo log, which makes it kind of awkward to set sane log levels
> for included libraries, because it requires an oslo round trip with code
> to all the projects to do it.

Here's how it's done in Keystone: https://review.openstack.org/#/c/62068/10/keystone/config.py

It's definitely awkward.

- Brant
Re: [openstack-dev] Jenkins test logs and their retention period
Here are some preliminary views (they currently ignore the ceilometer logs; I haven't had a chance to dive in there yet).

It actually looks like a huge part of the issue is oslo.messaging: the bulk of screen-n-cond is oslo.messaging debug messages. It seems that in debug mode oslo.messaging is basically a 100% trace mode, which includes logging every time a UUID is created, and every payload. I'm not convinced that's useful. We don't log every sql statement we run (with full payload). The recent integration of oslo.messaging would also explain the new growth of the logs.

Other issues include other oslo utils that have really verbose debug modes, like lockutils emitting 4 DEBUG messages for every lock acquired.

Part of the challenge is that turning off DEBUG is currently embedded in code in oslo log, which makes it kind of awkward to set sane log levels for included libraries, because it requires an oslo round trip with code to all the projects to do it.

-Sean
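P.S. Until the defaults change, the only knob is the stdlib one: clamp the offending loggers after setup, something like the sketch below. The logger names are assumptions, since the incubator modules land under each project's own namespace:

    import logging

    # Assumed logger names: oslo.messaging logs under its package name,
    # while incubator modules like lockutils log under each project's
    # copy (e.g. nova.openstack.common.lockutils).
    for noisy in ('oslo.messaging', 'nova.openstack.common.lockutils'):
        logging.getLogger(noisy).setLevel(logging.INFO)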
Re: [openstack-dev] Jenkins test logs and their retention period
On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague s...@dague.net wrote:

> It actually looks like a huge part of the issue is oslo.messaging: the
> bulk of screen-n-cond is oslo.messaging debug messages. It seems that in
> debug mode oslo.messaging is basically a 100% trace mode, which includes
> logging every time a UUID is created, and every payload. I'm not
> convinced that's useful. We don't log every sql statement we run (with
> full payload).

Agreed. I turned off the oslo.messaging logs [1] and the file sizes in a check-tempest-dsvm-full run dropped drastically [2]; the nova-conductor log went way down, from 7.3MB to 214K.

[1] https://review.openstack.org/#/c/82255/
[2] http://logs.openstack.org/55/82255/1/check/check-tempest-dsvm-full/88d1e36/logs/?C=S;O=D

> Part of the challenge is that turning off DEBUG is currently embedded in
> code in oslo log, which makes it kind of awkward to set sane log levels
> for included libraries, because it requires an oslo round trip with code
> to all the projects to do it.

++

One possible solution is to start using log_config_append and load the logging config from a logging.conf file. But we don't even copy over the sample file in devstack. So for icehouse we may want to do a cherry-pick from oslo-incubator to disable the oslo.messaging logs.
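To make that concrete, here is a minimal, illustrative logging.conf in the stdlib fileConfig format that log_config_append could point at; the oslo.messaging qualname is an assumption:

    [loggers]
    keys = root, oslo_messaging

    [handlers]
    keys = stderr

    [formatters]
    keys = default

    [logger_root]
    level = DEBUG
    handlers = stderr

    [logger_oslo_messaging]
    # Assumed qualname; keeps oslo.messaging out of DEBUG-level output.
    level = INFO
    handlers = stderr
    qualname = oslo.messaging
    propagate = 0

    [handler_stderr]
    class = StreamHandler
    args = (sys.stderr,)
    formatter = default

    [formatter_default]
    format = %(asctime)s %(levelname)s %(name)s %(message)s

Pointing log_config_append at a file like this replaces the in-code defaults entirely, which is both the power and the risk of that approach.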
Re: [openstack-dev] Jenkins test logs and their retention period
On Mon, Mar 24, 2014 at 2:09 PM, Joe Gordon joe.gord...@gmail.com wrote:

> Agreed. I turned off the oslo.messaging logs [1] and the file sizes in a
> check-tempest-dsvm-full run dropped drastically [2]; the nova-conductor
> log went way down, from 7.3MB to 214K.
>
> [1] https://review.openstack.org/#/c/82255/
> [2] http://logs.openstack.org/55/82255/1/check/check-tempest-dsvm-full/88d1e36/logs/?C=S;O=D

The patch above is now an oslo-incubator cherry-pick (the oslo-incubator patch is pending). We went with a cherry-pick because we are in feature freeze and want to be conservative with changes.
[openstack-dev] Jenkins test logs and their retention period
Hello everyone,

Back at the Portland summit the Infra team committed to archiving six months of test logs for OpenStack. Since then we have managed to do just that. More recently, however, the growth rate of those logs has climbed beyond what is currently sustainable. For reasons, we currently store logs on a filesystem backed by cinder volumes; Rackspace limits the size and number of volumes attached to a single host, which puts the upper bound on the log archive filesystem at ~12TB, and we are almost there. You can see real numbers and pretty graphs on our cacti server [0].

Long term we are trying to move all of the logs into swift, but it turns out there are some use case issues we need to sort out before we can do so (this is being worked on, so it should happen). Until that day arrives we need to work on logging more smartly, and if we can't do that we will have to reduce the log retention period.

So what can you do? Well, it appears our log files may need a diet. I have listed the worst offenders below (after a small sampling; there may be more), and it would be great if we could go through them with a fine-toothed comb and figure out whether we are logging actually useful data. The great thing about doing this is that it will make life better for deployers of OpenStack too.

Some initial checking indicates a lot of this noise may be related to ceilometer. It looks like it is logging AMQP stuff frequently and inflating the logs of individual services as it polls them.

Offending files from tempest tests:

  screen-n-cond.txt.gz                    7.3M
  screen-ceilometer-collector.txt.gz      6.0M
  screen-n-api.txt.gz                     3.7M
  screen-n-cpu.txt.gz                     3.6M
  tempest.txt.gz                          2.7M
  screen-ceilometer-anotification.txt.gz  1.9M
  subunit_log.txt.gz                      1.5M
  screen-g-api.txt.gz                     1.4M
  screen-ceilometer-acentral.txt.gz       1.4M
  screen-n-net.txt.gz                     1.4M

from: http://logs.openstack.org/52/81252/2/gate/gate-tempest-dsvm-full/488bc4e/logs/?C=S;O=D

Unittest offenders:

  Nova      subunit_log.txt.gz  14M
  Neutron   subunit_log.txt.gz  7.8M
  Keystone  subunit_log.txt.gz  4.8M

Note that all of the above files are compressed with gzip -9, and the sizes above are the compressed sizes. Debug logs are important to you guys when dealing with Jenkins results, so we want your feedback on how we can make this better for everyone.

[0] http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all

Thank you,
Clark Boylan