Are we sure that performance concerns aren't manifested in 'real' environments?
I think we should make some more effort to validate that assumption.
I agree with all the points below, but the worst part of the logging is the
readability. Right now the log is fine for developers and support, but from a
user's point of view we may as well write it out in hex - it's very hard to
decipher and makes self-support for users very difficult.
We should default to INFO level, which should be understandable, and switching
to debug should be simple and ideally possible from ovirt-engine.
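
As a rough illustration of "INFO by default, debug on demand", here is a
minimal sketch using Python's standard logging module. The logger name and
the make_logger/set_debug helpers are hypothetical and are not VDSM's actual
code; how ovirt-engine would trigger the switch is left out.

```python
import io
import logging


def make_logger(stream, name="vdsm.example"):
    """Build a logger that defaults to INFO and writes to `stream`.

    The name "vdsm.example" is a placeholder, not a real VDSM logger.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False      # keep the demo output isolated
    logger.handlers.clear()       # avoid duplicate handlers on re-creation
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def set_debug(logger, enabled):
    """Switch between DEBUG and INFO at runtime, without a restart."""
    logger.setLevel(logging.DEBUG if enabled else logging.INFO)


def demo():
    buf = io.StringIO()
    log = make_logger(buf)
    log.debug("hidden by default")   # dropped: level is INFO
    log.info("VM started")
    set_debug(log, True)
    log.debug("now visible")         # emitted: level is DEBUG
    return buf.getvalue().splitlines()
```

The point of the sketch is only that the level is a runtime property of the
logger, so changing it need not involve editing a start-up script.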
Saggi Mizrahi smizr...@redhat.com wrote:
Of course you are right, and just having everything log at debug level is less
than optimal.
We do only save 100 logs back so this will not fill up a decently sized hard
drive.
The performance implications, though they exist, don't really pop up in normal use.
It's true we usually don't need logs that far back but as I said it's really
not an issue.
Even when you try to run 1000 VMs, you will not be blocked on logging waiting
to get written to disk, I can assure you of that.
There are a lot of things we can do to improve the size footprint and
readability of our logs.
The move to XZ instead of GZ was a simple way to save on space and there are
plans to use an even more compressed representation for old logs.
We are also working on tools to help manage the log files.
You are correct though that setting the logging level back or changing the
amount of logs kept shouldn't be such a bothersome process and you should open
a bug on that for VDSM and it will be fixed.
- Original Message -
From: Peter Portante pport...@redhat.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: vdsm-devel@lists.fedorahosted.org
Sent: Thursday, April 5, 2012 3:57:08 PM
Subject: Re: [vdsm] /etc/rc.c/init.d/vdsmd set filters for libvirt at DEBUG
level
Hi Saggi,
- Original Message -
From: Saggi Mizrahi smizr...@redhat.com
To: Peter Portante pport...@redhat.com
Cc: vdsm-devel@lists.fedorahosted.org
Sent: Thursday, April 5, 2012 3:02:23 PM
Subject: Re: [vdsm] /etc/rc.c/init.d/vdsmd set filters for libvirt
at DEBUG level
It's required for support purposes. When something goes wrong we
collect all the logs from the hosts; this way we can figure out
what's wrong without requiring someone to reproduce the problem with
the logging turned on.
We are working on making the logs easier to filter when inspecting
them.
But the general idea is: if the information is in, we can easily get
it out, but doing it the other way around is impossible.
While one can understand the pain of debugging complex systems,
respectfully, this approach seems more problematic than helpful.
First, it was filling up the system disk. In 10 minutes there were
four compressed log files for libvirt alone, not to mention the
vdsmd.log files. Talking with the rest of the performance team, we
always turn all this off so that we don't lose our systems while
testing.
Additionally, to get the logging to stop, one has to modify the vdsmd
start up script to get libvirt to stop logging so much. Each time a
modification was made to libvirt's configuration file to make it
stop, libvirt kept up all its debug logging. All the documentation
on the libvirt web page tells one what to do to affect its behavior,
but in the presence of vdsm that is not the case. It seems like a
problem to have one subsystem completely override another without
leaving any indication that it is doing so.
Second, it is expensive to have such overhead. Compressing and
maintaining arbitrarily sized log files of text takes processing time
away from the VMs. It was amazing to see how often xz would run on a
box, not knowing it was related to maintaining these log files.
There must be a better way than collecting all the data we could
possibly need ahead of time just in case a problem comes up.
Have you considered asserting the expected state of the system before
embarking on changing that state?
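
One way to read that suggestion, as a minimal sketch: check the expected
state up front and record detail only when the check fails. The state
dictionary, the UnexpectedState exception and the transition helper are all
hypothetical names chosen for illustration.

```python
class UnexpectedState(Exception):
    """Raised when the system is not in the state a change assumes."""


def transition(state, expected, new_state, context):
    """Move `state` from `expected` to `new_state`.

    On a mismatch, fail loudly and attach the context needed to diagnose
    it, instead of logging every step at debug level just in case.
    """
    if state["status"] != expected:
        raise UnexpectedState(
            "expected %r, found %r; context=%r"
            % (expected, state["status"], context)
        )
    state["status"] = new_state
    return state
```

With this shape, the normal path writes nothing, and the failure path
carries the information that would otherwise be dug out of debug logs.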
In a nutshell, in order to filter out debug messages, grep -v is your
friend.
grep -v is not useful when your system disk fills up. :)
And why wouldn't an attacker use that fact in some sort of denial of
service? And if the counter is to configure the log files so that
they are processed more often and kept to a small number, then as
the amount of data grows (like when multiple VMs are created) the
original problem will get lost as it is truncated.
So if we already have a finite data set of sorts, why not drop log
files in favor of a dedicated ring buffer that stores
ultra-compressed binary data (enough to track the problem), with a
tool that can format that ring buffer into usable output?
When we were writing the thread library for Tru64 Unix and OpenVMS,
such ring buffers were invaluable to help find complicated timing
problems across multiple processors.
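
A minimal sketch of the ring buffer idea, assuming fixed-size binary
records and a separate formatting step. The record layout (timestamp, event
id, one integer argument), the event names and the RingLog class are
hypothetical; this is not how the Tru64 or OpenVMS buffers were built.

```python
import struct
from collections import deque

# Hypothetical record layout: float64 timestamp, uint16 event id,
# uint32 argument. "<" means little-endian with no padding (14 bytes).
RECORD = struct.Struct("<dHI")

# Hypothetical event table used only when formatting for humans.
EVENT_NAMES = {1: "vm_start", 2: "vm_stop"}


class RingLog:
    """Keep only the most recent `capacity` binary records in memory."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self._records = deque(maxlen=capacity)

    def record(self, timestamp, event_id, arg):
        """Store one event as packed bytes; no text formatting happens here."""
        self._records.append(RECORD.pack(timestamp, event_id, arg))

    def dump(self):
        """Format the buffered records into readable lines, oldest first."""
        lines = []
        for raw in self._records:
            ts, event_id, arg = RECORD.unpack(raw)
            name = EVENT_NAMES.get(event_id, "event_%d" % event_id)
            lines.append("%.3f %s arg=%d" % (ts, name, arg))
        return lines
```

The storage is bounded by construction, so it cannot fill the disk, and the
cost of turning records into text is paid only when someone asks for a dump.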
Respectfully