On Sun, Apr 14, 2013 at 04:28:02AM -0400, Yaniv Bronheim wrote:
> If only you would accept http://gerrit.ovirt.org/#/c/10313, Tony could manage
> to check the syslog for reports and fix it much faster.. :)
> Both patches should be backported IMHO
Well, please start by backporting the undoubted log separation. Then we can
continue the blame game :-)

>
> Thanks,
> Yaniv.
>
>
> ----- Original Message -----
> > From: "Dan Kenigsberg" <[email protected]>
> > To: "Tony Feldmann" <[email protected]>, "Yaniv Bronheim"
> > <[email protected]>
> > Cc: "Joop" <[email protected]>, [email protected], [email protected]
> > Sent: Friday, April 12, 2013 12:33:07 AM
> > Subject: Re: [Users] vdsm unresponsive with python exception
> >
> > On Thu, Apr 11, 2013 at 03:51:07PM -0500, Tony Feldmann wrote:
> > > That was the issue. Found out yesterday that vdsm.log was somehow changed
> > > to root:root. Just now got a chance to put it back on the mailing list.
> > > How does the ownership of that file get changed? When the issue occurred
> > > I am certain there was no one on the system.
> >
> > http://gerrit.ovirt.org/#/c/12940/ (Separating supervdsm log to
> > supervdsm.log file) solves the issue. Unfortunately, only on the master
> > branch of vdsm.
> >
> > I think that this is a nasty issue that has to be backported to the
> > ovirt-3.2 branch as well, and merits to be part of ovirt-3.2.2.
> >
> > Regards,
> > Dan.
> >
> >
> > >
> > >
> > > On Thu, Apr 11, 2013 at 2:15 PM, Joop <[email protected]> wrote:
> > >
> > > > Dan Kenigsberg wrote:
> > > >
> > > >> On Wed, Apr 10, 2013 at 08:59:01AM -0500, Tony Feldmann wrote:
> > > >>
> > > >>
> > > >>> I am having a strange issue in my ovirt cluster. I have 2 hosts, 1
> > > >>> running engine and added as a host and one other system added as a
> > > >>> host. Both systems are running gluster across local disks for shared
> > > >>> storage. Everything was working fine until last night, when my system
> > > >>> that is also running the engine went unresponsive in the admin page.
> > > >>> All vms were still running that were on the host.
> > > >>> I shut down the vms that were on the host from within the guest os
> > > >>> as I was not able to do anything to the vm with the host in
> > > >>> unresponsive state. After getting the vms off and rebooting the
> > > >>> host, the vdsmd service says that it is running, but it continually
> > > >>> restarts the vdsm process and dumps out these messages: detected
> > > >>> unhandled Python exception in '/usr/share/vdsm/vdsm'. All services
> > > >>> say they are up and running but the host stays in unresponsive
> > > >>> state and the vdsm process keeps respawning. There is also no data
> > > >>> in the vdsm.log. Can anyone shed any light on this for me?
> > > >>>
> > > >>
> > > >> [email protected] may be a better place to ask vdsm-specific
> > > >> questions.
> > > >>
> > > >> Could you log into the non-operational host as root, and stop the vdsm
> > > >> service?
> > > >>
> > > >> Then become the vdsm user with
> > > >>
> > > >>     su -s /bin/bash - vdsm
> > > >>
> > > >> and run /usr/share/vdsm/vdsm manually. Do you see anything in
> > > >> particular?
> > > >>
> > > > Please have a look at the permissions/owner of /var/log/vdsm/vdsm.log.
> > > > Should be vdsm:kvm and not root:root
> > > >
> > > > Joop
> > > >

_______________________________________________
vdsm-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
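
The ownership check Joop suggests in the thread can be scripted. What follows is only a minimal sketch, not part of vdsm: the helper names `owner_and_group` and `ownership_matches` are made up for illustration, and the only facts taken from the thread are the log path /var/log/vdsm/vdsm.log and the expected owner vdsm:kvm.

```python
import grp
import os
import pwd


def owner_and_group(path):
    """Return the (user, group) names that own path.

    Falls back to the numeric id as a string when the id has no
    entry in the passwd or group database.
    """
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def ownership_matches(path, expected_user="vdsm", expected_group="kvm"):
    """True if path is owned by expected_user:expected_group.

    The defaults are the vdsm:kvm ownership mentioned in the thread.
    """
    return owner_and_group(path) == (expected_user, expected_group)


if __name__ == "__main__":
    # Log path as given in the thread.
    log_path = "/var/log/vdsm/vdsm.log"
    try:
        user, group = owner_and_group(log_path)
    except OSError as exc:
        print(f"cannot stat {log_path}: {exc}")
    else:
        ok = (user, group) == ("vdsm", "kvm")
        status = "ok" if ok else "expected vdsm:kvm"
        print(f"{log_path}: {user}:{group} ({status})")
```

If the script reports root:root, as it would have in Tony's case, the ownership can presumably be restored as root with `chown vdsm:kvm /var/log/vdsm/vdsm.log` before restarting the vdsm service.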
