Re: [openstack-dev] Jenkins test logs and their retention period

Joe Gordon Mon, 24 Mar 2014 14:16:30 -0700

On Mon, Mar 24, 2014 at 3:49 AM, Sean Dague <[email protected]> wrote:

> Here are some preliminary views (it currently ignores the ceilometer
> logs, I haven't had a chance to dive in there yet).
>
> It actually looks like a huge part of the issue is oslo.messaging; the
> bulk of screen-n-cond is oslo.messaging debug errors. It seems that in
> debug mode oslo.messaging is basically a 100% trace mode, which includes
> logging every time a UUID is created and every payload.
>
> I'm not convinced that's useful. We don't log every SQL statement
> we run (with full payload).
>
>
Agreed. I turned off oslo.messaging logs [1] and the file sizes in a
check-tempest-dsvm-full run dropped drastically [2]. The nova-conductor
logs dropped from 7.3MB to 214K.


[1] https://review.openstack.org/#/c/82255/
[2] http://logs.openstack.org/55/82255/1/check/check-tempest-dsvm-full/88d1e36/logs/?C=S;O=D
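
As a rough check on the size of the nova-conductor drop, here is a small
sketch of the arithmetic. It assumes "7.3MB" and "214K" are binary units, as
shown in the log index listing; the helper name is made up for illustration.

```python
def size_reduction(before_kib, after_kib):
    """Return (fractional reduction, shrink factor) for two sizes in KiB."""
    return 1 - after_kib / before_kib, before_kib / after_kib


# Assumption: 7.3MB means 7.3 * 1024 KiB and 214K means 214 KiB.
before = 7.3 * 1024
after = 214.0

reduction, factor = size_reduction(before, after)
print(f"{reduction:.1%} smaller, about {factor:.0f}x")
```

Under those assumptions this works out to roughly a 97% reduction.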

> The recent integration of oslo.messaging would also explain the new
> growth of logs.
>
> Other issues include other oslo utils that have really verbose debug
> modes, like lockutils emitting 4 DEBUG messages for every lock acquired.
>
> Part of the challenge is that turning off DEBUG is currently embedded in
> code in oslo log, which makes it kind of awkward to set sane log levels for
> included libraries because it requires an oslo round trip with code to
> all the projects to do it.
>

++

One possible solution is to start using log_config_append and load the
config from a logging.conf file. But we don't even copy over the sample
file in devstack. So for icehouse we may want to do a cherry-pick from
oslo-incubator to disable oslo.messaging logs.
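
To illustrate the logging.conf idea, here is a minimal sketch using only the
Python standard library. It assumes the file is in the ini format understood
by logging.config.fileConfig; the config text, the function name and the
logger names in the usage lines are made up for illustration and are not the
sample file shipped with any project. The config is kept inline so the
example is self-contained; in practice it would live in the file that
log_config_append points at.

```python
import io
import logging
import logging.config
import textwrap

# Hypothetical logging.conf content: root stays at DEBUG, while everything
# under the "oslo.messaging" logger is raised to INFO.
LOGGING_CONF = textwrap.dedent("""\
    [loggers]
    keys = root, oslo_messaging

    [handlers]
    keys = stderr

    [formatters]
    keys = default

    [logger_root]
    level = DEBUG
    handlers = stderr

    [logger_oslo_messaging]
    level = INFO
    handlers =
    qualname = oslo.messaging

    [handler_stderr]
    class = StreamHandler
    args = (sys.stderr,)
    formatter = default

    [formatter_default]
    format = %(asctime)s %(levelname)s %(name)s %(message)s
    """)


def apply_logging_config(conf_text=LOGGING_CONF):
    """Load an ini-style logging config from text instead of a file path."""
    logging.config.fileConfig(io.StringIO(conf_text),
                              disable_existing_loggers=False)


if __name__ == "__main__":
    apply_logging_config()
    # Child loggers inherit the INFO level, so this DEBUG record is dropped.
    logging.getLogger("oslo.messaging._drivers.amqp").debug("dropped")
    # Other loggers still follow the root DEBUG level.
    logging.getLogger("nova.conductor").debug("still emitted")
```

The point of doing it this way is that the per-library level lives in a
config file rather than in code, so changing it would not need a sync of
oslo code into every project.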


>
>         -Sean
>
> On 03/21/2014 07:23 PM, Clark Boylan wrote:
> > Hello everyone,
> >
> > Back at the Portland summit the Infra team committed to archiving six months
> > of test logs for OpenStack. Since then we have managed to do just that.
> > However, more recently we have seen the growth rate of those logs climb
> > beyond what is currently a sustainable level.
> >
> > For reasons, we currently store logs on a filesystem backed by cinder
> > volumes. Rackspace limits the size and number of volumes attached to a
> > single host, meaning the upper bound on the log archive filesystem is ~12TB
> > and we are almost there. You can see real numbers and pretty graphs on our
> > cacti server [0].
> >
> > Long term we are trying to move to putting all of the logs in swift, but it
> > turns out there are some use case issues we need to sort out around that
> > before we can do so (but this is being worked on so should happen). Until
> > that day arrives we need to work on logging more smartly, and if we can't do
> > that we will have to reduce the log retention period.
> >
> > So what can you do? Well it appears that our log files may need a diet. I
> > have listed the worst offenders below (after a small sampling, there may be
> > more) and it would be great if we could go through those with a comb and
> > figure out if we are actually logging useful data. The great thing about
> > doing this is it will make lives better for deployers of OpenStack too.
> >
> > Some initial checking indicates a lot of this noise may be related to
> > ceilometer. It looks like it is logging AMQP stuff frequently and inflating
> > the logs of individual services as it polls them.
> >
> > Offending files from tempest tests:
> > screen-n-cond.txt.gz 7.3M
> > screen-ceilometer-collector.txt.gz 6.0M
> > screen-n-api.txt.gz 3.7M
> > screen-n-cpu.txt.gz 3.6M
> > tempest.txt.gz 2.7M
> > screen-ceilometer-anotification.txt.gz 1.9M
> > subunit_log.txt.gz 1.5M
> > screen-g-api.txt.gz 1.4M
> > screen-ceilometer-acentral.txt.gz 1.4M
> > screen-n-net.txt.gz 1.4M
> > from:
> > http://logs.openstack.org/52/81252/2/gate/gate-tempest-dsvm-full/488bc4e/logs/?C=S;O=D
> >
> > Unit test offenders:
> > Nova subunit_log.txt.gz 14M
> > Neutron subunit_log.txt.gz 7.8M
> > Keystone subunit_log.txt.gz 4.8M
> >
> > Note all of the above files are compressed with gzip -9 and the file sizes
> > above reflect compressed file sizes.
> >
> > Debug logs are important to you guys when dealing with Jenkins results. We
> > want your feedback on how we can make this better for everyone.
> >
> > [0]
> > http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=717&rra_id=all
> >
> > Thank you,
> > Clark Boylan
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > [email protected]
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
>
> --
> Sean Dague
> Samsung Research America
> [email protected] / [email protected]
> http://dague.net
>
>
