On Thu, Dec 28, 2017 at 6:02 PM Dan Kenigsberg <[email protected]> wrote:
> On Thu, Dec 28, 2017 at 5:36 PM, Yaniv Kaul <[email protected]> wrote: > > Just saw this on[1]: > > 2017-12-28 10:09:24,980-0500 INFO (MainThread) [vdsm.api] FINISH > > prepareForShutdown return=None from=internal, > > task_id=1a83a4d3-43e3-4833-883b-dfd2fe4d76be (api:52) > > 2017-12-28 10:09:24,980-0500 INFO (MainThread) [vds] Stopping threads > > (vdsmd:159) > > 2017-12-28 10:09:24,980-0500 INFO (MainThread) [vds] Exiting (vdsmd:170) > > 2017-12-28 10:09:25,011-0500 INFO (mailbox-hsm) > > [storage.MailBox.HsmMailMonitor] HSM_MailboxMonitor - Incoming mail > > monitoring thread stopped, clearing outgoing mail (mailbox:511) > This seems to be the issue, the monitor thread is trying to access storage *after* it was stopped. Relying on being able to access storage to clean up things during shutdown is never going to work. > > 2017-12-28 10:09:25,011-0500 INFO (mailbox-hsm) > > [storage.MailBox.HsmMailMonitor] HSM_MailMonitor sending mail to SPM - > > ['/usr/bin/dd', > > > 'of=/rhev/data-center/9b97451a-0312-4c69-85e2-f86d6d273636/mastersd/dom_md/inbox', > > 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=4096', 'count=1', > > 'seek=1'] (mailbox:387) > > 2017-12-28 10:09:25,012-0500 ERROR (mailbox-hsm) > > [storage.MailBox.HsmMailMonitor] FINISH thread <Thread(mailbox-hsm, > started > > daemon 139968009594624)> failed (concurrent:201) > > Traceback (most recent call last): > > File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line > > 194, in run > > ret = func(*args, **kwargs) > > File "/usr/lib/python2.7/site-packages/vdsm/storage/mailbox.py", line > 514, > > in _run > > self._sendMail() # Clear outgoing mailbox > > File "/usr/lib/python2.7/site-packages/vdsm/storage/mailbox.py", line > 394, > > in _sendMail > > _mboxExecCmd(self._outCmd, data=self._outgoingMail) > > File "/usr/lib/python2.7/site-packages/vdsm/storage/mailbox.py", line > 84, > > in _mboxExecCmd > > return misc.execCmd(*args, **kwargs) > > File "/usr/lib/python2.7/site-packages/vdsm/common/commands.py", line > 53, > > in execCmd > > command = cmdutils.wrap_command(command, with_ioclass=ioclass, > > AttributeError: 'NoneType' object has no attribute 'wrap_command' > > 2017-12-28 10:09:28,965-0500 INFO (MainThread) [vds] (PID: 23232) I am > the > > actual vdsm 4.20.9-84.git615770f.el7.centos > lago-basic-suite-master-host-0 > > (3.10.0-693.2.2.el7.x86_64) (vdsmd:148) > > 2017-12-28 10:09:28,966-0500 INFO (MainThread) [vds] VDSM will run with > cpu > > affinity: frozenset([1]) (vdsmd:254) > > 2017-12-28 10:09:28,970-0500 INFO (MainThread) [storage.HSM] START HSM > init > > (hsm:366) > > > > > > [1] > > > http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/3113/artifact/exported-artifacts/basic-suite-master__logs/test_logs/basic-suite-master/post-005_network_by_label.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log > > > It's quite ugly, but I think that it is harmless. > prepareForShutdown calls self._pool.hsmMailer.stop() but never waits > for it to actually finish. It then exists and Python starts freeing > its modules, such as cmdutils. When hsmMailer wakes up, cmdutils is > already None. > > The fix could be as simple as https://gerrit.ovirt.org/85810 but I'm > not sure if it is worth the slowdown in vdsm shutdown. > We can wait for the mailer, but only after we stopped all other threads. Nir
_______________________________________________ Devel mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/devel
