----- Original Message -----
> From: "Nir Soffer" <[email protected]>
> To: "Simone Tiraboschi" <[email protected]>
> Cc: [email protected], "Fabian Deutsch" <[email protected]>
> Sent: Friday, May 29, 2015 5:26:52 PM
> Subject: Re: [ovirt-devel] oVirt node 3.6 and CPU load indefinitely stuck on 100% while vdsmd indefinitely tries to restart
>
> ----- Original Message -----
> > From: "Simone Tiraboschi" <[email protected]>
> > To: [email protected]
> > Cc: "Fabian Deutsch" <[email protected]>
> > Sent: Friday, May 29, 2015 1:44:02 PM
> > Subject: [ovirt-devel] oVirt node 3.6 and CPU load indefinitely stuck on 100% while vdsmd indefinitely tries to restart
> >
> > Hi,
> > I tried to have hosted-engine deploy the engine appliance over oVirt node.
> > I think it will be quite a common scenario.
> > I tried with an oVirt node build from yesterday.
> >
> > Unfortunately I'm not able to complete the setup because oVirt node got the
> > CPU load indefinitely stuck at 100%, so it's almost unresponsive.
> >
> > The issue seems to be related to the vdsmd daemon, which can't really start
> > and so retries indefinitely, using all the available CPU power (it also
> > runs with niceness -20...).
> >
> > [root@node36 admin]# grep "Unit vdsmd.service entered failed state." /var/log/messages | wc -l
> > 368
> > It tried 368 times in a row in a few minutes.
> >
> > With journalctl I can read:
> > May 29 10:06:45 node36 systemd[1]: Unit vdsmd.service entered failed state.
> > May 29 10:06:45 node36 systemd[1]: vdsmd.service holdoff time over, scheduling restart.
> > May 29 10:06:45 node36 systemd[1]: Stopping Virtual Desktop Server Manager...
> > May 29 10:06:45 node36 systemd[1]: Starting Virtual Desktop Server Manager...
> > May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running mkdirs
> > May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running configure_coredump
> > May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running configure_vdsm_logs
> > May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running wait_for_network
> > May 29 10:06:45 node36 vdsmd_init_common.sh[13697]: vdsm: Running run_init_hooks
> > May 29 10:06:46 node36 vdsmd_init_common.sh[13697]: vdsm: Running upgraded_version_check
> > May 29 10:06:46 node36 vdsmd_init_common.sh[13697]: vdsm: Running check_is_configured
> > May 29 10:06:46 node36 vdsmd_init_common.sh[13697]: vdsm: Running validate_configuration
> > May 29 10:06:47 node36 vdsmd_init_common.sh[13697]: vdsm: Running prepare_transient_repository
> > May 29 10:06:49 node36 vdsmd_init_common.sh[13697]: vdsm: Running syslog_available
> > May 29 10:06:49 node36 vdsmd_init_common.sh[13697]: vdsm: Running nwfilter
> > May 29 10:06:50 node36 vdsmd_init_common.sh[13697]: vdsm: Running dummybr
> > May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running load_needed_modules
> > May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running tune_system
> > May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running test_space
> > May 29 10:06:51 node36 vdsmd_init_common.sh[13697]: vdsm: Running test_lo
> > May 29 10:06:51 node36 systemd[1]: Started Virtual Desktop Server Manager.
> > May 29 10:06:51 node36 systemd[1]: vdsmd.service: main process exited, code=exited, status=1/FAILURE
> > May 29 10:06:51 node36 vdsmd_init_common.sh[13821]: vdsm: Running run_final_hooks
> > May 29 10:06:52 node36 systemd[1]: Unit vdsmd.service entered failed state.
> > May 29 10:06:52 node36 systemd[1]: vdsmd.service holdoff time over, scheduling restart.
> > May 29 10:06:52 node36 systemd[1]: Stopping Virtual Desktop Server Manager...
> > May 29 10:06:52 node36 systemd[1]: Starting Virtual Desktop Server Manager...
> > repeated a lot of times
> >
> > /var/log/vdsm/vdsm.log is empty.
> >
> > while
> > [root@node36 admin]# /usr/share/vdsm/daemonAdapter -0 /dev/null -1 /dev/null -2 /dev/null /usr/share/vdsm/vdsm; echo $?
> > 1
>
> Can you try to run vdsm manually from the shell?
>
> # /usr/share/vdsm/vdsm
>
> Typically you would see a python traceback explaining the failure.
I tried and it just fails. Exit code is 1
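To put a number on how fast the restart loop quoted above is spinning, the failed-state entries can be grouped per minute instead of only counted in total. This is a rough sketch, not something shipped with vdsm or systemd: `restarts_per_minute` is a hypothetical helper name, and it assumes syslog-style lines of the form "May 29 10:06:45 node36 systemd[1]: ..." as in the journalctl excerpt.

```shell
#!/bin/sh
# Hypothetical helper (not part of vdsm or systemd): reads syslog-style
# lines on stdin and prints "<month> <day> <HH:MM> <count>" for every
# minute in which the given unit entered the failed state.
# Assumes the timestamp is the first three whitespace-separated fields.
restarts_per_minute() {
    unit="${1:-vdsmd.service}"
    # Keep only the "entered failed state" lines for this unit.
    grep -F "Unit ${unit} entered failed state." |
        awk '{
                 # Group by month, day and HH:MM (seconds are dropped).
                 key = $1 " " $2 " " substr($3, 1, 5)
                 if (!(key in n)) order[++k] = key
                 n[key]++
             }
             END {
                 # Print the minutes in the order they first appeared.
                 for (i = 1; i <= k; i++) print order[i], n[order[i]]
             }'
}
```

It could be fed from the same file as the grep above, for example `restarts_per_minute < /var/log/messages`, or from `journalctl -u vdsmd.service | restarts_per_minute`, assuming journalctl's default short output format.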
