Hello Remi,
First of all I don't have /var/log/xha.log file. I have examined logs
in detail and haven't found any trace that heartbeat has failed. The
only serious problem I have found in management logs before restart is
repeating many times error:
-------------------------------------------------
2015-09-06 00:47:21,591 DEBUG [c.c.a.m.AgentManagerImpl]
(RouterMonitor-1:ctx-2d67d422) Details from executing class
com.cloud.agent.api.NetworkUsageCommand: Exception: java.lang.Exception
Message: vpc network usage plugin call failed
Stack: java.lang.Exception: vpc network usage plugin call failed
at
com.cloud.hypervisor.xenserver.resource.XenServer56Resource.VPCNetworkUsage(XenServer56Resource.java:172)
at
com.cloud.hypervisor.xenserver.resource.XenServer56Resource.execute(XenServer56Resource.java:195)
at
com.cloud.hypervisor.xenserver.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:62)
at
com.cloud.hypervisor.xenserver.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:87)
at
com.cloud.hypervisor.xenserver.resource.XenServer620SP1Resource.executeRequest(XenServer620SP1Resource.java:65)
at
com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:302)
...
-------------------------------------------------
Just couple of seconds before XCP2 host restart:
-------------------------------------------------
2015-09-06 00:48:27,884 DEBUG [c.c.a.m.DirectAgentAttache]
(DirectAgentCronJob-83:ctx-ff822baf) Ping from 2(xcp1)
2015-09-06 00:48:27,884 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) Process host VM state report from
ping process. host: 2
2015-09-06 00:48:27,904 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) Process VM state report. host: 2,
number of records in report: 6
2015-09-06 00:48:27,904 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM state report. host: 2, vm id:
85, power state: PowerOn
2015-09-06 00:48:27,907 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM power state does not change,
skip DB writing. vm id: 85
2015-09-06 00:48:27,907 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM state report. host: 2, vm id: 1,
power state: PowerOn
2015-09-06 00:48:27,910 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM power state does not change,
skip DB writing. vm id: 1
2015-09-06 00:48:27,910 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM state report. host: 2, vm id: 2,
power state: PowerOn
2015-09-06 00:48:27,913 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM power state does not change,
skip DB writing. vm id: 2
2015-09-06 00:48:27,913 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM state report. host: 2, vm id:
82, power state: PowerOn
2015-09-06 00:48:27,916 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM power state does not change,
skip DB writing. vm id: 82
2015-09-06 00:48:27,916 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM state report. host: 2, vm id:
94, power state: PowerOn
2015-09-06 00:48:27,919 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM power state does not change,
skip DB writing. vm id: 94
2015-09-06 00:48:27,919 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM state report. host: 2, vm id:
90, power state: PowerOn
2015-09-06 00:48:27,922 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) VM power state does not change,
skip DB writing. vm id: 90
2015-09-06 00:48:27,928 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-83:ctx-ff822baf) Done with process of VM state
report. host: 2
2015-09-06 00:48:27,940 DEBUG [c.c.a.m.DirectAgentAttache]
(DirectAgentCronJob-154:ctx-2e8a5911) Ping from 1(xcp2)
2015-09-06 00:48:27,940 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) Process host VM state report from
ping process. host: 1
2015-09-06 00:48:27,951 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) Process VM state report. host: 1,
number of records in report: 4
2015-09-06 00:48:27,951 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM state report. host: 1, vm id:
100, power state: PowerOn
2015-09-06 00:48:27,954 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM power state does not change,
skip DB writing. vm id: 100
2015-09-06 00:48:27,954 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM state report. host: 1, vm id:
33, power state: PowerOn
2015-09-06 00:48:27,957 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM power state does not change,
skip DB writing. vm id: 33
2015-09-06 00:48:27,957 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM state report. host: 1, vm id:
89, power state: PowerOn
2015-09-06 00:48:27,960 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM power state does not change,
skip DB writing. vm id: 89
2015-09-06 00:48:27,961 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM state report. host: 1, vm id:
88, power state: PowerOn
2015-09-06 00:48:27,963 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) VM power state does not change,
skip DB writing. vm id: 88
2015-09-06 00:48:27,968 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
(DirectAgentCronJob-154:ctx-2e8a5911) Done with process of VM state
report. host: 1
----------------------------------------------------
Vadim.
On 2015-09-07 23:18, Remi Bergsma wrote:
Hi Vadim,
What kind of storage do you use? Can you show /var/log/xha.log (I think
that is the name) please? It could be xen-ha that fences the box if the
heartbeat cannot be written.
You suggest it is CloudStack. Did you see anything in the mgt logs?
Regards, Remi
Sent from my iPhone
On 07 Sep 2015, at 08:26, Vadim Kimlaychuk <va...@kickcloud.net>
wrote:
Hello all,
I have experienced accidental cluster reboot 3 days after update to CS
4.5.2. Cluster is XenServer 6.5 with SP1. Reboot has been started from
slave node and then - master.
Syslog on slave shows only this:
Sep 6 00:47:05 xcp2 last message repeated 3 times
Sep 6 00:47:15 xcp2 xenstored: D12 write data/meminfo_free 713732
Sep 6 00:47:15 xcp2 xenstored: A1564203 w event
/local/domain/12/data/meminfo_free /local/domain/12/data/meminfo_free
Sep 6 00:47:15 xcp2 xenstored: D12 write data/updated Sun Sep 6
00:48:55 EEST 2015
Sep 6 00:47:15 xcp2 xenstored: A6 w event
/local/domain/12/data/updated /local/domain/12/data/updated
Sep 6 00:47:15 xcp2 xenstored: A10 w event
/local/domain/12/data/updated /local/domain/12/data/updated
Sep 6 00:47:26 xcp2 dhclient: DHCPREQUEST on xenbr0 to 172.17.0.1 port
67 (xid=0x304ae9dc)
Sep 6 00:47:27 xcp2 xapi: [ info|xcp2|462044 INET
0.0.0.0:80|dispatch:host.call_plugin D:7593b578fada|taskhelper] task
host.call_plugin R:ddd3cc399f86 forwarded
(trackid=407f6adaa118a34f19eb1e29cd68a0e8)
Sep 6 00:47:36 xcp2 dhclient: DHCPREQUEST on xenbr0 to 172.17.0.1 port
67 (xid=0x304ae9dc)
Sep 6 00:48:18 xcp2 last message repeated 4 times
Sep 6 00:48:25 xcp2 xenstored: D1 write data/meminfo_free 1740496
Sep 6 00:48:25 xcp2 xenstored: A1564203 w event
/local/domain/1/data/meminfo_free /local/domain/1/data/meminfo_free
Sep 6 00:48:25 xcp2 xenstored: D1 write data/updated Sat Sep 5
21:50:07 EEST 2015
Sep 6 00:48:25 xcp2 xenstored: A6 w event /local/domain/1/data/updated
/local/domain/1/data/updated
Sep 6 00:48:25 xcp2 xenstored: A10 w event
/local/domain/1/data/updated /local/domain/1/data/updated
Sep 6 00:48:26 xcp2 dhclient: DHCPREQUEST on xenbr0 to 172.17.0.1 port
67 (xid=0x304ae9dc)
Sep 6 00:48:27 xcp2 xapi: [ info|xcp2|462044 INET
0.0.0.0:80|dispatch:host.call_plugin D:f2c8987bc0ff|taskhelper] task
host.call_plugin R:b62d2d4f58eb forwarded
(trackid=e3d4ea00c96194830a7dbbfc35563a3c)
Sep 6 00:48:38 xcp2 dhclient: DHCPREQUEST on xenbr0 to 172.17.0.1 port
67 (xid=0x304ae9dc)
Sep 06 00:48:48 xcp2 syslogd 1.4.1: restart.
Sep 6 00:48:48 xcp2 kernel: klogd 1.4.1, log source = /proc/kmsg
started.
Sep 6 00:48:48 xcp2 kernel: [ 0.000000] Initializing cgroup subsys
cpuset
Sep 6 00:48:48 xcp2 kernel: [ 0.000000] Initializing cgroup subsys cpu
Sep 6 00:48:48 xcp2 kernel: [ 0.000000] Initializing cgroup subsys
cpuacct
Sep 6 00:48:48 xcp2 kernel: [ 0.000000] Linux version 3.10.0+2
Can anyone help with diagnostics ?
Thank you,
Vadim.