Re: [ovirt-users] Hosted engine not starting up after system reboot

2017-11-16 Thread Simone Tiraboschi
On Thu, Nov 16, 2017 at 7:09 AM, Rudi Ahlers wrote:

> I forgot to add:
>
> The file /var/run/ovirt-hosted-engine-ha/vm.conf doesn't exist:
>
> [root@virt1 ~]# ll /var/run/ovirt-hosted-engine-ha/vm.conf
> ls: cannot access /var/run/ovirt-hosted-engine-ha/vm.conf: No such file
> or directory
> [root@virt1 ~]# ll /var/run/ovirt-hosted-engine-ha/
> total 8
> -rw-r--r--. 1 root root 5 Nov 16 08:05 agent.pid
> -rw-r--r--. 1 root root 5 Nov 16 06:45 broker.pid
> srwxr-xr-x. 1 vdsm kvm  0 Nov 16 06:45 broker.socket
>
> I am not sure how to get it (back?) or how to generate it?
>

Hi Rudi,
we need to understand what happened at setup time.
Do you still have the hosted-engine-setup log files under
/var/log/ovirt-hosted-engine-setup on your first host?
Could you please share them?
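
A minimal sketch for locating the most recent setup logs, assuming they sit directly under the directory named above and end in ".log" (the suffix and the helper name newest_setup_logs are assumptions for illustration, not part of oVirt):

```python
import glob
import os


def newest_setup_logs(log_dir="/var/log/ovirt-hosted-engine-setup", limit=3):
    """Return up to `limit` *.log paths under log_dir, newest first."""
    # Assumption: setup logs are regular files whose names end in ".log".
    pattern = os.path.join(log_dir, "*.log")
    paths = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    # Sort by modification time so the latest setup run comes first.
    paths.sort(key=os.path.getmtime, reverse=True)
    return paths[:limit]


if __name__ == "__main__":
    for path in newest_setup_logs():
        print(path)
```

Run as root on the first host, this prints the candidate files to attach; an empty result would suggest the logs were rotated away or live elsewhere.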



Re: [ovirt-users] Hosted engine not starting up after system reboot

2017-11-15 Thread Rudi Ahlers
I forgot to add:

The file /var/run/ovirt-hosted-engine-ha/vm.conf doesn't exist:

[root@virt1 ~]# ll /var/run/ovirt-hosted-engine-ha/vm.conf
ls: cannot access /var/run/ovirt-hosted-engine-ha/vm.conf: No such file or directory
[root@virt1 ~]# ll /var/run/ovirt-hosted-engine-ha/
total 8
-rw-r--r--. 1 root root 5 Nov 16 08:05 agent.pid
-rw-r--r--. 1 root root 5 Nov 16 06:45 broker.pid
srwxr-xr-x. 1 vdsm kvm  0 Nov 16 06:45 broker.socket

I am not sure how to get it back, or how to generate it.



[ovirt-users] Hosted engine not starting up after system reboot

2017-11-15 Thread Rudi Ahlers
Hi,

I wonder if someone can help. After a system reboot, the Hosted-Agent isn't
running. This is on a fresh installation of CentOS Linux release 7.4.1708
running ovirt-release41-4.1.7-1.el7.centos.noarch. Gluster is set up on 3
nodes, but hosted-engine is only set up on the 1st node for now.

[root@virt1 ~]# hosted-engine --console
Virtual machine does not exist
The engine VM is not on this host

[root@virt1 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.


[root@virt1 ~]# ps ax | grep ovirt-ha-agent
41309 ?        Rsl    0:14 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
42818 pts/0    S+     0:00 grep --color=auto ovirt-ha-agent


[root@virt1 ~]# mount | grep engine
/dev/mapper/storage-engine on /storage/engine type xfs (rw,relatime,seclabel,attr2,inode64,sunit=1024,swidth=2048,noquota)
virt1:/engine on /mnt/engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
virt1:/engine on /rhev/data-center/mnt/glusterSD/virt1:_engine type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


And then I see this error:

[root@virt1 ~]# systemctl status ovirt-ha-agent -l
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-11-16 07:44:43 SAST; 4min 23s ago
 Main PID: 41309 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─41309 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon

Nov 16 07:48:30 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:48:30 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored in the HE configuration image
Nov 16 07:48:39 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:48:39 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored in the HE configuration image
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR 'version' is not stored in the HE configuration image
Nov 16 07:48:41 virt ovirt-ha-agent[41309]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 423, in start_monitoring
    for old_state, state, delay in self.fsm:
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
    new_data = self.refresh(self._state.data)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 123, in refresh
    ] = self.hosted_engine.min_memory_threshold
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 183, in min_memory_threshold
    return int(self._config.get(config.VM, config.MEM_SIZE))
  File