So here is the engine log from when the engine lost network
connectivity (before the server actually crashed hard):
2018-01-23 13:45:33,666 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-87) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VDSM d0lppn067 command failed:
Heartbeat
exeeded
2018-01-23 13:45:33,666 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-10) [21574461] Correlation ID: null, Call
Stack: null, Custom Event ID: -1, Message: VDSM d0lppn072 command
failed:
Heartbeat exeeded
2018-01-23 13:45:33,666 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-37) [4e8ec41d] Correlation ID: null, Call
Stack: null, Custom Event ID: -1, Message: VDSM d0lppn066 command
failed:
Heartbeat exeeded
2018-01-23 13:45:33,667 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetStatsVDSCommand]
(DefaultQuartzScheduler_Worker-87) [] Command
'GetStatsVDSCommand(HostName =
d0lppn067, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='f99c68c8-b0e8-437b-8cd9-ebaddaaede96',
vds='Host[d0lppn067,f99c68c8-b0e8-437b-8cd9-ebaddaaede96]'})' execution
failed: VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,667 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetStatsVDSCommand]
(DefaultQuartzScheduler_Worker-10) [21574461] Command
'GetStatsVDSCommand(HostName = d0lppn072,
VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='fdc00296-973d-4268-bd79-6dac535974e0',
vds='Host[d0lppn072,fdc00296-973d-4268-bd79-6dac535974e0]'})' execution
failed: VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,667 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetStatsVDSCommand]
(DefaultQuartzScheduler_Worker-37) [4e8ec41d] Command
'GetStatsVDSCommand(HostName = d0lppn066,
VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='14abf559-4b62-4ebd-a345-77fa9e1fa3ae',
vds='Host[d0lppn066,14abf559-4b62-4ebd-a345-77fa9e1fa3ae]'})' execution
failed: VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,669 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-87) [] Failed getting vds stats,
vds='d0lppn067'(f99c68c8-b0e8-437b-8cd9-ebaddaaede96):
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,669 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-10) [21574461] Failed getting vds stats,
vds='d0lppn072'(fdc00296-973d-4268-bd79-6dac535974e0):
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,669 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-37) [4e8ec41d] Failed getting vds stats,
vds='d0lppn066'(14abf559-4b62-4ebd-a345-77fa9e1fa3ae):
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,671 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-10) [21574461] Failure to refresh Vds
runtime
info: VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,671 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-37) [4e8ec41d] Failure to refresh Vds
runtime
info: VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,671 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-87) [] Failure to refresh Vds runtime
info:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
2018-01-23 13:45:33,671 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-37) [4e8ec41d] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
at
org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.GetStatsVDSCommand.executeVdsBrokerCommand(GetStatsVDSCommand.java:21)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65)
[vdsbroker.jar:]
at
org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
[dal.jar:]
at
org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsStats(HostMonitoring.java:472)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:114)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:227)
[vdsbroker.jar:]
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
[:1.8.0_102]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[rt.jar:1.8.0_102]
at java.lang.reflect.Method.invoke(Method.java:498)
[rt.jar:1.8.0_102]
at
org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81)
[scheduler.jar:]
at
org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52)
[scheduler.jar:]
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
[quartz.jar:]
at
org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
[quartz.jar:]
2018-01-23 13:45:33,671 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-10) [21574461] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
at
org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.GetStatsVDSCommand.executeVdsBrokerCommand(GetStatsVDSCommand.java:21)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65)
[vdsbroker.jar:]
at
org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
[dal.jar:]
at
org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsStats(HostMonitoring.java:472)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:114)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:227)
[vdsbroker.jar:]
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
[:1.8.0_102]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[rt.jar:1.8.0_102]
at java.lang.reflect.Method.invoke(Method.java:498)
[rt.jar:1.8.0_102]
at
org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81)
[scheduler.jar:]
at
org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52)
[scheduler.jar:]
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
[quartz.jar:]
at
org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
[quartz.jar:]
2018-01-23 13:45:33,671 ERROR
[org.ovirt.engine.core.vdsbroker.HostMonitoring]
(DefaultQuartzScheduler_Worker-87) [] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Heartbeat exeeded
at
org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.GetStatsVDSCommand.executeVdsBrokerCommand(GetStatsVDSCommand.java:21)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65)
[vdsbroker.jar:]
at
org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
[dal.jar:]
at
org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467)
[vdsbroker.jar:]
at
org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsStats(HostMonitoring.java:472)
[vdsbroker.jar:]
Here the engine logs show the problem with node d0lppn065. The VMs
first go to "Unknown", then to "Unknown" plus "not responding":
2018-01-23 14:48:00,712 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-28) [] Correlation ID: null, Call Stack:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157)
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120)
at
org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65)
at
org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
at
org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467)
at
org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher.fetch(VmsStatisticsFetcher.java:27)
at
org.ovirt.engine.core.vdsbroker.PollVmStatsRefresher.poll(PollVmStatsRefresher.java:35)
at sun.reflect.GeneratedMethodAccessor80.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81)
at
org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at
org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException:
Connection failed
at
org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.connect(ReactorClient.java:155)
at
org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.getClient(JsonRpcClient.java:134)
at
org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.call(JsonRpcClient.java:81)
at
org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.<init>(FutureMap.java:70)
at
org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer.getAllVmStats(JsonRpcVdsServer.java:331)
at
org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand.executeVdsBrokerCommand(GetAllVmStatsVDSCommand.java:20)
at
org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110)
... 12 more
, Custom Event ID: -1, Message: Host d0lppn065 is non responsive.
2018-01-23 14:48:00,713 INFO
[org.ovirt.engine.core.bll.VdsEventListener]
(org.ovirt.thread.pool-8-thread-1) [] ResourceManager::vdsNotResponding
entered for Host '2797cae7-6886-4898-a5e4-23361ce03a90', '10.32.0.65'
2018-01-23 14:48:00,713 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-36) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM vtop3 was set to the Unknown
status.
...etc... (sorry about the wraps below)
2018-01-23 14:59:07,817 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'30f7af86-c2b9-41c3-b2c5-49f5bbdd0e27'(d0lpvd070) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:07,819 INFO
[org.ovirt.engine.core.vdsbroker.VmsStatisticsFetcher]
(DefaultQuartzScheduler_Worker-74) [] Fetched 15 VMs from VDS
'8cb119c5-b7f0-48a3-970a-205d96b2e940'
2018-01-23 14:59:07,936 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpvd070 is not responding.
2018-01-23 14:59:07,939 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'ebc5bb82-b985-451b-8313-827b5f40eaf3'(d0lpvd039) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,032 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpvd039 is not responding.
2018-01-23 14:59:08,038 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'494c4f9e-1616-476a-8f66-a26a96b76e56'(vtop3) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,134 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM vtop3 is not responding.
2018-01-23 14:59:08,136 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'eaeaf73c-d9e2-426e-a2f2-7fcf085137b0'(d0lpvw059) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,237 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpvw059 is not responding.
2018-01-23 14:59:08,239 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'8308a547-37a1-4163-8170-f89b6dc85ba8'(d0lpvm058) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,326 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpvm058 is not responding.
2018-01-23 14:59:08,328 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'3d544926-3326-44e1-8b2a-ec632f51112a'(d0lqva056) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,400 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lqva056 is not responding.
2018-01-23 14:59:08,402 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'989e5a17-789d-4eba-8a5e-f74846128842'(d0lpva078) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,472 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpva078 is not responding.
2018-01-23 14:59:08,474 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'050a71c1-9e65-43c6-bdb2-18eba571e2eb'(d0lpvw077) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,545 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpvw077 is not responding.
2018-01-23 14:59:08,547 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'c3b497fd-6181-4dd1-9acf-8e32f981f769'(d0lpva079) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,621 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpva079 is not responding.
2018-01-23 14:59:08,623 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'7cd22b39-feb1-4c6e-8643-ac8fb0578842'(d0lqva034) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,690 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lqva034 is not responding.
2018-01-23 14:59:08,692 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'2ab9b1d8-d1e8-4071-a47c-294e586d2fb6'(d0lpvd038) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,763 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lpvd038 is not responding.
2018-01-23 14:59:08,768 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'ecb4e795-9eeb-4cdc-a356-c1b9b32af5aa'(d0lqva031) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,836 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lqva031 is not responding.
2018-01-23 14:59:08,838 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'1a361727-1607-43d9-bd22-34d45b386d3e'(d0lqva033) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,911 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM d0lqva033 is not responding.
2018-01-23 14:59:08,913 INFO
[org.ovirt.engine.core.vdsbroker.VmAnalyzer]
(DefaultQuartzScheduler_Worker-75) [] VM
'0cd65f90-719e-429e-a845-f425612d7b14'(vtop4) moved from 'Up' -->
'NotResponding'
2018-01-23 14:59:08,984 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-75) [] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM vtop4 is not responding.
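Since the excerpts above are hard-wrapped, here is a rough Python sketch for re-joining the wrapped lines into records and pulling out the VM state changes and the per-host heartbeat failures. It is not part of oVirt; the function names (join_records, summarize) and the regular expressions are my own assumptions, based only on the message shapes visible in these excerpts.

```python
import re
from collections import Counter

# A record starts with a timestamp such as "2018-01-23 14:59:07,817".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# e.g. "VM '<uuid>'(d0lpvd070) moved from 'Up' --> 'NotResponding'"
VM_MOVE = re.compile(r"VM '([0-9a-f-]+)'\((\S+?)\) moved from '(\w+)' --> '(\w+)'")
# e.g. "VDSM d0lppn067 command failed: Heartbeat exeeded" (spelled as logged)
HEARTBEAT = re.compile(r"VDSM (\S+) command failed: Heartbeat exeeded")


def join_records(lines):
    """Re-join hard-wrapped lines into one string per log record."""
    records, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # A new timestamp closes the record collected so far.
        if RECORD_START.match(line) and current:
            records.append(" ".join(current))
            current = []
        current.append(line)
    if current:
        records.append(" ".join(current))
    return records


def summarize(lines):
    """Return (list of VM transitions, heartbeat failure count per host)."""
    moves, heartbeats = [], Counter()
    for record in join_records(lines):
        move = VM_MOVE.search(record)
        if move:
            # (vm name, old state, new state)
            moves.append((move.group(2), move.group(3), move.group(4)))
        beat = HEARTBEAT.search(record)
        if beat:
            heartbeats[beat.group(1)] += 1
    return moves, heartbeats
```

Feeding it the lines of a saved excerpt, e.g. summarize(open(path).read().splitlines()) with path pointing at wherever the excerpt was saved, gives the transitions in log order plus a Counter keyed by host name. Only the audit-log "command failed" lines are counted, so each heartbeat failure is counted once per host rather than once per stack trace.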
Probably it's time to think about upgrading your environment from 3.6.
I know. But from a production standpoint, mid-2016 wasn't that long ago,
and 4 was just coming out of beta at the time. We were upgrading from
3.4 to 3.6, and it took a long time (again, because it's all "live").
Trust me, the move to 4.0 was discussed; it was just a timing thing.
With that said, I do "hear you"... and it's certainly being discussed.
We just don't see a "good" migration path. We see a slow path (moving
nodes out, upgrading, etc.), and as with all things, nobody can
guarantee "success"; a failure would be a very bad thing. So going from
a working 3.6 to a (potentially) totally broken 4.2 isn't going to
impress anyone here, you know? If all goes according to our best
guesses, then great, but when things go bad, and the chance is not
insignificant, well... I'm just not quite prepared with my résumé, if
you know what I mean.
Don't get me wrong, our move from 3.4 to 3.6 had some similar risks, but
we also migrated to a whole new infrastructure, a luxury we will not
have this time. And somehow 3.4 to 3.6 doesn't sound as risky as 3.6 to
4.2.