If only you would accept http://gerrit.ovirt.org/#/c/10313, Tony could have checked the syslog for reports and fixed it much faster. :) Both patches should be backported, IMHO.
Thanks,
Yaniv.

----- Original Message -----
> From: "Dan Kenigsberg" <[email protected]>
> To: "Tony Feldmann" <[email protected]>, "Yaniv Bronheim"
> <[email protected]>
> Cc: "Joop" <[email protected]>, [email protected], [email protected]
> Sent: Friday, April 12, 2013 12:33:07 AM
> Subject: Re: [Users] vdsm unresponsive with python exception
>
> On Thu, Apr 11, 2013 at 03:51:07PM -0500, Tony Feldmann wrote:
> > That was the issue. Found out yesterday that vdsm.log was somehow changed
> > to root:root. Just now got a chance to put it back on the mailing list.
> > How does the ownership of that file get changed? When the issue occurred I
> > am certain there was no one on the system.
>
> http://gerrit.ovirt.org/#/c/12940/ (Separating supervdsm log to
> supervdsm.log file) solves the issue. Unfortunately, only on the master
> branch of vdsm.
>
> I think that this is a nasty issue that has to be backported to the
> ovirt-3.2 branch as well, and merits to be part of ovirt-3.2.2.
>
> Regards,
> Dan.
>
> >
> > On Thu, Apr 11, 2013 at 2:15 PM, Joop <[email protected]> wrote:
> >
> > > Dan Kenigsberg wrote:
> > >
> > >> On Wed, Apr 10, 2013 at 08:59:01AM -0500, Tony Feldmann wrote:
> > >>
> > >>> I am having a strange issue in my ovirt cluster. I have 2 hosts, 1 running
> > >>> engine and added as a host and one other system added as a host. Both
> > >>> systems are running gluster across local disks for shared storage.
> > >>> Everything was working fine until last night, where my system that is also
> > >>> running the engine went unresponsive in the admin page. All vms were still
> > >>> running that were on the host. I shut down the vms that were on the host
> > >>> from within the guest os as I was not able to do anything to the vm with
> > >>> the host in unresponsive state. After getting the vms off and rebooting
> > >>> the host, the vdsmd service says that it is running, but it continually
> > >>> restarts the vdsm process and dumps out these messages: detected unhandled
> > >>> Python exception in '/usr/share/vdsm/vdsm'. All services say they are up
> > >>> and running but the host stays in unresponsive state and the vdsm process
> > >>> keeps respawning. There is also no data in the vdsm.log. Can anyone shed
> > >>> any light on this for me?
> > >>>
> > >>
> > >> [email protected] may be a better place to ask vdsm-specific
> > >> questions.
> > >>
> > >> Could you log into the non-operational host as root, and stop the vdsm
> > >> service.
> > >>
> > >> Then become the vdsm user with
> > >>
> > >>     su -s /bin/bash - vdsm
> > >>
> > >> and run /usr/share/vdsm/vdsm manually. Do you see anything in particular?
> > >>
> > >
> > > Please have a look at the permissions/owner of /var/log/vdsm/vdsm.log.
> > > Should be vdsm:kvm and not root:root
> > >
> > > Joop
> > >
> > > _______________________________________________
> > Users mailing list
> > [email protected]
> > http://lists.ovirt.org/mailman/listinfo/users
> >
_______________________________________________
vdsm-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
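
For anyone hitting the same symptom, the ownership check Joop suggested can be sketched in Python. This is only a minimal sketch and not part of vdsm: the function names describe_owner and has_expected_owner are hypothetical, and the log path and the expected vdsm:kvm owner are simply the values mentioned in the thread.

```python
import grp
import os
import pwd


def describe_owner(path):
    """Return (user, group) names owning path; fall back to numeric IDs."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this UID: report the number instead.
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        # No group entry for this GID: report the number instead.
        group = str(st.st_gid)
    return user, group


def has_expected_owner(path, expected_user="vdsm", expected_group="kvm"):
    """True if path is owned by expected_user:expected_group."""
    return describe_owner(path) == (expected_user, expected_group)


if __name__ == "__main__":
    # Path taken from the thread; adjust if your layout differs.
    log_path = "/var/log/vdsm/vdsm.log"
    if os.path.exists(log_path):
        user, group = describe_owner(log_path)
        status = "ok" if has_expected_owner(log_path) else "unexpected owner"
        print("%s is owned by %s:%s (%s)" % (log_path, user, group, status))
    else:
        print("%s not found or not accessible" % log_path)
```

If the check reports an unexpected owner such as root:root, restoring it as root (for example with chown vdsm:kvm on the log file) corresponds to what Tony describes as putting it back.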
