Re: [ovirt-users] VM failover with ovirt3.5

2015-01-11 Thread Artyom Lukianov
1) If you want to test a VM crash, you can emulate a kernel panic inside the VM, 
or simply kill the VM's qemu process on the host. So if you have an HA VM that 
runs on the first host and you kill its process, the VM should be restarted on 
the second host.

2) The reason we drop the VM to an unknown status and do not start it 
automatically on the second host (in case of some host problem) is that the 
engine cannot really tell whether the host lost power or merely lost 
connectivity; the VM may still be running on the first host and writing to 
storage, so starting the same VM on a second host that writes to the same 
storage could lead to data corruption. I think you can write your own script: 
for example, check connectivity to the VM or to the power management interface 
of the host (if you have one), and if it really is a power outage, Confirm Host 
has been Rebooted via REST or the SDK (manual fence); see the sketches below.
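
A minimal sketch of the crash test from point 1, assuming it is run as root 
inside the HA guest and that SysRq is enabled; writing 'c' to 
/proc/sysrq-trigger panics the guest kernel, after which the engine should 
restart the HA VM on another host:

# Hypothetical crash test for an HA VM (run inside the guest, as root).
# Assumption: kernel.sysrq is enabled; this panics the guest kernel immediately.
with open("/proc/sysrq-trigger", "w") as f:
    f.write("c")

And a rough sketch of the script idea from point 2, using the oVirt REST API 
fence action with fence_type "manual" (the REST counterpart of "Confirm Host 
has been Rebooted"); the engine URL, credentials and host ID below are 
placeholders, and the exact endpoint should be checked against your oVirt 
version:

# Hypothetical helper: confirm a dead host has been rebooted via the REST API
# so that the engine restarts its HA VMs on another host. Only call this after
# you have verified (for example through the host's power management interface)
# that the host really lost power and is not still writing to storage.
import requests

ENGINE_API = "https://engine.example.com/api"      # placeholder engine URL
AUTH = ("admin@internal", "password")              # placeholder credentials
HOST_ID = "00000000-0000-0000-0000-000000000000"   # placeholder host id

def confirm_host_rebooted(host_id):
    body = "<action><fence_type>manual</fence_type></action>"
    resp = requests.post("%s/hosts/%s/fence" % (ENGINE_API, host_id),
                         data=body,
                         auth=AUTH,
                         verify=False,  # lab setting; verify the CA in production
                         headers={"Content-Type": "application/xml"})
    resp.raise_for_status()

if __name__ == "__main__":
    confirm_host_rebooted(HOST_ID)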

I hope it will help you.  

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org, 
Jiri Moskovcak jmosk...@redhat.com, Yedidyah Bar David d...@redhat.com, 
Sandro Bonazzola sbona...@redhat.com
Sent: Thursday, January 8, 2015 7:03:55 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice.

For
2) If something happens to the host where the HA VM runs (network problem, power 
outage), the VM is dropped to an unknown state, and if you want the engine to 
start this VM on another host, you need to click "Confirm Host has been 
Rebooted" in the problematic host's menu; when you confirm this, the engine will 
start the VM on another host and also release the SPM role from the problematic 
host (if it is the SPM, of course).

Is there any way to make the VM failover (moving to another host in the same 
cluster) happen automatically, since the administrator may not be able to 
recognize a sudden power outage immediately?

Also, how can I test the following in my environment? Please kindly advise.
1) If the VM crashes (for whatever reason), it is restarted automatically on 
another host in the same cluster.


Thanks,
Cong

-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Thursday, January 08, 2015 8:46 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org; Jiri Moskovcak; Yedidyah 
Bar David; Sandro Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

So, the behavior for a non-HE HA VM is:
1) If the VM crashes (for whatever reason), it is restarted automatically on 
another host in the same cluster.
2) If something happens to the host where the HA VM runs (network problem, power 
outage), the VM is dropped to an unknown state, and if you want the engine to 
start this VM on another host, you need to click "Confirm Host has been 
Rebooted" in the problematic host's menu; when you confirm this, the engine will 
start the VM on another host and also release the SPM role from the problematic 
host (if it is the SPM, of course).


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org, 
Jiri Moskovcak jmosk...@redhat.com, Yedidyah Bar David d...@redhat.com, 
Sandro Bonazzola sbona...@redhat.com
Sent: Wednesday, January 7, 2015 3:00:26 AM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

For case 1, I got the advice that I need to change the 
'migration_max_time_per_gib_mem' value inside vdsm.conf. I am trying it, and 
when I get the result, I will also share it with you. Thanks.

For case 2, do you mean I tested normal VM failover in the wrong way? Right now, 
although I shut down host 3 forcibly, the VM on top of it does not fail over.
What is your advice for this?

Thanks,
Cong



-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Tuesday, January 06, 2015 12:34 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org; Jiri Moskovcak; Yedidyah 
Bar David; Sandro Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
  File /usr/share/vdsm/virt/migration.py, line 245, in run
self._startUnderlyingMigration(time.time())
  File /usr/share/vdsm/virt/migration.py, line 324, in 
_startUnderlyingMigration
None, maxBandwidth)
  File /usr/share/vdsm/virt/vm.py, line 670, in f
ret = attr(*args, **kwargs)
  File /usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py, line 111, 
in wrapper
ret = f(*args, **kwargs)
  File /usr/lib64/python2.7/site-packages/libvirt.py, line 1264, in 
migrateToURI2
if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this kind of thing can happen because the migration time exceeded the 
configured maximum time for migrations, but anyway we need help from the devs; I 
added some people to CC.

Case 2:
HA vm

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-08 Thread Yue, Cong
The patch works for my case1. Thanks!

Thanks,
Cong


-Original Message-
From: Jiri Moskovcak [mailto:jmosk...@redhat.com]
Sent: Tuesday, January 06, 2015 1:44 AM
To: Artyom Lukianov; Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org; Yedidyah Bar David; Sandro 
Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

On 01/06/2015 09:34 AM, Artyom Lukianov wrote:
 Case 1:
 In vdsm.log I can see this one:
 Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
 Traceback (most recent call last):
File /usr/share/vdsm/virt/migration.py, line 245, in run
  self._startUnderlyingMigration(time.time())
File /usr/share/vdsm/virt/migration.py, line 324, in 
 _startUnderlyingMigration
  None, maxBandwidth)
File /usr/share/vdsm/virt/vm.py, line 670, in f
  ret = attr(*args, **kwargs)
File /usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py, line 
 111, in wrapper
  ret = f(*args, **kwargs)
File /usr/lib64/python2.7/site-packages/libvirt.py, line 1264, in 
 migrateToURI2
  if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
 dom=self)
 libvirtError: operation aborted: migration job: canceled by client
 I see that this kind of thing can happen because the migration time exceeded 
 the configured maximum time for migrations, but anyway we need help from the 
 devs; I added some people to CC.


- agent did everything correctly and as Artyom says, the migration is
aborted by vdsm:

The migration took 260 seconds which is exceeding the configured maximum
time for migrations of 256 seconds. The migration will be aborted.

- there is a configuration option in vdsm conf you can tweak to increase
the timeout:

snip
'migration_max_time_per_gib_mem', '64',
 'The maximum time in seconds per GiB memory a migration may
take '
 'before the migration will be aborted by the source host. '
 'Setting this value to 0 will disable this feature.'
/snip

So as you can see, in your case it is 4 * 64 seconds = 256 seconds.


--Jirka
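
To make the arithmetic above concrete, a small sketch assuming (as the log 
suggests) an HE VM with 4 GiB of memory; the [vars] section name and the vdsmd 
restart are assumptions to double-check against your vdsm version:

# Sketch: how vdsm derives the migration timeout described above.
migration_max_time_per_gib_mem = 64   # seconds per GiB, the vdsm.conf default
vm_memory_gib = 4                     # assumption: the VM has 4 GiB of RAM
timeout = migration_max_time_per_gib_mem * vm_memory_gib
print(timeout)                        # 256 seconds, matching the aborted migration
# To allow slower migrations, raise the value in /etc/vdsm/vdsm.conf, e.g.
# migration_max_time_per_gib_mem = 128 (assumed to live in the [vars] section),
# or set it to 0 to disable the limit entirely, then restart vdsmd on the host.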

 Case 2:
 An HA VM must migrate only in case of some failure on host3, so if your host_3 
 is OK, the VM will continue to run on it.


 - Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.com
 Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
 Sent: Monday, January 5, 2015 7:38:08 PM
 Subject: RE: [ovirt-users] VM failover with ovirt3.5

 I collected the agent.log and vdsm.log in 2 cases.

 Case 1: HE VM failover trial
 What I did:
 1. Make all hosts up in the engine.
 2. Set host1 to local maintenance mode. The HE VM is on host1.
 3. The HE VM then tries to migrate, but it finally fails. This can be seen in
 agent.log_hosted_engine_1.
 As the logs are very large, I uploaded them to Google Drive. The link is:
 https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
 The logs are for 3 hosts in my environment.

 Case 2: non-HE VM failover trial
 1. Make all hosts up in the engine.
 2. Set host2 to local maintenance mode. On host3 there is one VM with HA
 enabled. Also, for the cluster, "Enable HA reservation" is on and the
 "Resilience policy" is set to "migrating virtual machines".
 3. But the VM on top of host3 does not migrate at all.
 The logs are uploaded to Google Drive as:
 https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


 Thanks,
 Cong




 -Original Message-
 From: Artyom Lukianov [mailto:aluki...@redhat.com]
 Sent: Sunday, January 04, 2015 3:22 AM
 To: Yue, Cong
 Cc: cong yue; stira...@redhat.com; users@ovirt.org
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 Can you provide vdsm logs:
 1) for HE vm case
 2) for not HE vm case
 Thanks

 - Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.com
 Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
 Sent: Thursday, January 1, 2015 2:32:18 AM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 Thanks for the advice. I applied the patch for clientIF.py as
 - port = config.getint('addresses', 'management_port')
 + port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be live-migrated in my environment.

 MainThread::INFO::2014-12-31
 19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Continuing to monitor migration
 MainThread::INFO::2014-12-31
 19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineMigratingAway (score: 2000)
 MainThread::INFO::2014-12-31
 19:08:06,430::hosted_engine::332

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-08 Thread Artyom Lukianov
So, the behavior for a non-HE HA VM is:
1) If the VM crashes (for whatever reason), it is restarted automatically on 
another host in the same cluster.
2) If something happens to the host where the HA VM runs (network problem, power 
outage), the VM is dropped to an unknown state, and if you want the engine to 
start this VM on another host, you need to click "Confirm Host has been 
Rebooted" in the problematic host's menu; when you confirm this, the engine will 
start the VM on another host and also release the SPM role from the problematic 
host (if it is the SPM, of course).
 

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org, 
Jiri Moskovcak jmosk...@redhat.com, Yedidyah Bar David d...@redhat.com, 
Sandro Bonazzola sbona...@redhat.com
Sent: Wednesday, January 7, 2015 3:00:26 AM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

For case 1, I got the advice that I need to change the 
'migration_max_time_per_gib_mem' value inside vdsm.conf. I am trying it, and 
when I get the result, I will also share it with you. Thanks.

For case 2, do you mean I tested normal VM failover in the wrong way? Right now, 
although I shut down host 3 forcibly, the VM on top of it does not fail over.
What is your advice for this?

Thanks,
Cong



-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Tuesday, January 06, 2015 12:34 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org; Jiri Moskovcak; Yedidyah 
Bar David; Sandro Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
  File /usr/share/vdsm/virt/migration.py, line 245, in run
self._startUnderlyingMigration(time.time())
  File /usr/share/vdsm/virt/migration.py, line 324, in 
_startUnderlyingMigration
None, maxBandwidth)
  File /usr/share/vdsm/virt/vm.py, line 670, in f
ret = attr(*args, **kwargs)
  File /usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py, line 111, 
in wrapper
ret = f(*args, **kwargs)
  File /usr/lib64/python2.7/site-packages/libvirt.py, line 1264, in 
migrateToURI2
if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this kind of thing can happen because the migration time exceeded the 
configured maximum time for migrations, but anyway we need help from the devs; I 
added some people to CC.

Case 2:
An HA VM must migrate only in case of some failure on host3, so if your host_3 
is OK, the VM will continue to run on it.


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Monday, January 5, 2015 7:38:08 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts up in the engine.
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. The HE VM then tries to migrate, but it finally fails. This can be seen in
agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts up in the engine.
2. Set host2 to local maintenance mode. On host3 there is one VM with HA
enabled. Also, for the cluster, "Enable HA reservation" is on and the
"Resilience policy" is set to "migrating virtual machines".
3. But the VM on top of host3 does not migrate at all.
The logs are uploaded to Google Drive as:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


Thanks,
Cong




-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Sunday, January 04, 2015 3:22 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Can you provide vdsm logs:
1) for HE vm case
2) for not HE vm case
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be done with live

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-08 Thread Yue, Cong
Thanks for the advice.

For
2) If something happens to the host where the HA VM runs (network problem, power 
outage), the VM is dropped to an unknown state, and if you want the engine to 
start this VM on another host, you need to click "Confirm Host has been 
Rebooted" in the problematic host's menu; when you confirm this, the engine will 
start the VM on another host and also release the SPM role from the problematic 
host (if it is the SPM, of course).

Is there any way to make the VM failover (moving to another host in the same 
cluster) happen automatically, since the administrator may not be able to 
recognize a sudden power outage immediately?

Also, how can I test the following in my environment? Please kindly advise.
1) If the VM crashes (for whatever reason), it is restarted automatically on 
another host in the same cluster.


Thanks,
Cong

-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Thursday, January 08, 2015 8:46 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org; Jiri Moskovcak; Yedidyah 
Bar David; Sandro Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

So, the behavior for a non-HE HA VM is:
1) If the VM crashes (for whatever reason), it is restarted automatically on 
another host in the same cluster.
2) If something happens to the host where the HA VM runs (network problem, power 
outage), the VM is dropped to an unknown state, and if you want the engine to 
start this VM on another host, you need to click "Confirm Host has been 
Rebooted" in the problematic host's menu; when you confirm this, the engine will 
start the VM on another host and also release the SPM role from the problematic 
host (if it is the SPM, of course).


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org, 
Jiri Moskovcak jmosk...@redhat.com, Yedidyah Bar David d...@redhat.com, 
Sandro Bonazzola sbona...@redhat.com
Sent: Wednesday, January 7, 2015 3:00:26 AM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

For case 1, I got the advice that I need to change the 
'migration_max_time_per_gib_mem' value inside vdsm.conf. I am trying it, and 
when I get the result, I will also share it with you. Thanks.

For case 2, do you mean I tested normal VM failover in the wrong way? Right now, 
although I shut down host 3 forcibly, the VM on top of it does not fail over.
What is your advice for this?

Thanks,
Cong



-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Tuesday, January 06, 2015 12:34 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org; Jiri Moskovcak; Yedidyah 
Bar David; Sandro Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
  File /usr/share/vdsm/virt/migration.py, line 245, in run
self._startUnderlyingMigration(time.time())
  File /usr/share/vdsm/virt/migration.py, line 324, in 
_startUnderlyingMigration
None, maxBandwidth)
  File /usr/share/vdsm/virt/vm.py, line 670, in f
ret = attr(*args, **kwargs)
  File /usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py, line 111, 
in wrapper
ret = f(*args, **kwargs)
  File /usr/lib64/python2.7/site-packages/libvirt.py, line 1264, in 
migrateToURI2
if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this kind of thing can happen because the migration time exceeded the 
configured maximum time for migrations, but anyway we need help from the devs; I 
added some people to CC.

Case 2:
An HA VM must migrate only in case of some failure on host3, so if your host_3 
is OK, the VM will continue to run on it.


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Monday, January 5, 2015 7:38:08 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts up in the engine.
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. The HE VM then tries to migrate, but it finally fails. This can be seen in
agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts up in the engine.
2. Set host2 to local maintenance mode. On host3 there is one VM with HA
enabled. Also, for the cluster, "Enable HA reservation" is on and the
"Resilience policy" is set to "migrating virtual machines".
3. But the VM on top of host3 does not migrate at all.
The logs are uploaded to Google Drive as:
https://drive.google.com

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-06 Thread Jiri Moskovcak

On 01/06/2015 09:34 AM, Artyom Lukianov wrote:

Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
   File /usr/share/vdsm/virt/migration.py, line 245, in run
 self._startUnderlyingMigration(time.time())
   File /usr/share/vdsm/virt/migration.py, line 324, in 
_startUnderlyingMigration
 None, maxBandwidth)
   File /usr/share/vdsm/virt/vm.py, line 670, in f
 ret = attr(*args, **kwargs)
   File /usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py, line 111, 
in wrapper
 ret = f(*args, **kwargs)
   File /usr/lib64/python2.7/site-packages/libvirt.py, line 1264, in 
migrateToURI2
 if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this kind of thing can happen because the migration time exceeded the 
configured maximum time for migrations, but anyway we need help from the devs; I 
added some people to CC.



- agent did everything correctly and as Artyom says, the migration is 
aborted by vdsm:


The migration took 260 seconds which is exceeding the configured maximum 
time for migrations of 256 seconds. The migration will be aborted.


- there is a configuration option in vdsm conf you can tweak to increase 
the timeout:


snip
'migration_max_time_per_gib_mem', '64',
'The maximum time in seconds per GiB memory a migration may 
take '

'before the migration will be aborted by the source host. '
'Setting this value to 0 will disable this feature.'
/snip

So as you can see, in your case it is 4 * 64 seconds = 256 seconds.


--Jirka


Case 2:
An HA VM must migrate only in case of some failure on host3, so if your host_3 
is OK, the VM will continue to run on it.


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Monday, January 5, 2015 7:38:08 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts up in the engine.
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. The HE VM then tries to migrate, but it finally fails. This can be seen in
agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts up in the engine.
2. Set host2 to local maintenance mode. On host3 there is one VM with HA
enabled. Also, for the cluster, "Enable HA reservation" is on and the
"Resilience policy" is set to "migrating virtual machines".
3. But the VM on top of host3 does not migrate at all.
The logs are uploaded to Google Drive as:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


Thanks,
Cong




-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Sunday, January 04, 2015 3:22 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Can you provide vdsm logs:
1) for HE vm case
2) for not HE vm case
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be live-migrated in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
  File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 863

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-06 Thread Artyom Lukianov
Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
  File /usr/share/vdsm/virt/migration.py, line 245, in run
self._startUnderlyingMigration(time.time())
  File /usr/share/vdsm/virt/migration.py, line 324, in 
_startUnderlyingMigration
None, maxBandwidth)
  File /usr/share/vdsm/virt/vm.py, line 670, in f
ret = attr(*args, **kwargs)
  File /usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py, line 111, 
in wrapper
ret = f(*args, **kwargs)
  File /usr/lib64/python2.7/site-packages/libvirt.py, line 1264, in 
migrateToURI2
if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', 
dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this kind of thing can happen because the migration time exceeded the 
configured maximum time for migrations, but anyway we need help from the devs; I 
added some people to CC.

Case 2:
An HA VM must migrate only in case of some failure on host3, so if your host_3 
is OK, the VM will continue to run on it.


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Monday, January 5, 2015 7:38:08 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts up in the engine.
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. The HE VM then tries to migrate, but it finally fails. This can be seen in
agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts up in the engine.
2. Set host2 to local maintenance mode. On host3 there is one VM with HA
enabled. Also, for the cluster, "Enable HA reservation" is on and the
"Resilience policy" is set to "migrating virtual machines".
3. But the VM on top of host3 does not migrate at all.
The logs are uploaded to Google Drive as:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


Thanks,
Cong




-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Sunday, January 04, 2015 3:22 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Can you provide vdsm logs:
1) for HE vm case
2) for not HE vm case
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be live-migrated in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 863, in _monitor_migration
   vm_id,
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py,
line 85, in run_vds_client_cmd
   response['status']['message'])
DetailedError: Error 47 from migrateStatus: Migration canceled
MainThread::INFO::2014-12-31
19:08:16,501::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1420070896.5 type=state_transition
detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
MainThread::INFO::2014-12-31
19:08:16,502::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineMigratingAway-ReinitializeFSM) sent? ignored
MainThread::INFO::2014-12-31
19:08:16,805

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-05 Thread Yue, Cong
I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts up in the engine.
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. The HE VM then tries to migrate, but it finally fails. This can be seen in
agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts up in the engine.
2. Set host2 to local maintenance mode. On host3 there is one VM with HA
enabled. Also, for the cluster, "Enable HA reservation" is on and the
"Resilience policy" is set to "migrating virtual machines".
3. But the VM on top of host3 does not migrate at all.
The logs are uploaded to Google Drive as:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


Thanks,
Cong




-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Sunday, January 04, 2015 3:22 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Can you provide vdsm logs:
1) for HE vm case
2) for not HE vm case
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be live-migrated in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 863, in _monitor_migration
   vm_id,
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py,
line 85, in run_vds_client_cmd
   response['status']['message'])
DetailedError: Error 47 from migrateStatus: Migration canceled
MainThread::INFO::2014-12-31
19:08:16,501::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1420070896.5 type=state_transition
detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
MainThread::INFO::2014-12-31
19:08:16,502::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineMigratingAway-ReinitializeFSM) sent? ignored
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state ReinitializeFSM (score: 0)
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)

Besides, I also tried other VMs instead of the HE VM, but failover does not 
happen (migration does not even start to be attempted). I set HA for those VMs. 
Is there some log I can check for this?

Please kindly advise.

Thanks,
Cong


 On 2014/12/31, at 0:14, Artyom Lukianov aluki...@redhat.com wrote:

 Ok I found this one:
 Thread-1807180::ERROR::2014-12-30 
 13:02:52,164::migration::165::vm.Vm::(_recover) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to destroy remote VM
 Traceback (most recent call last):
 File /usr/share/vdsm/virt/migration.py, line 163, in _recover
  self.destServer.destroy(self._vm.id)
 AttributeError: 'SourceThread' object has no attribute 'destServer'
 Thread-1807180::ERROR::2014-12-30 13:02:52,165::migration::259::vm.Vm::(run) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
 Traceback (most recent call last):
 File /usr/share/vdsm/virt/migration.py, line 229, in run
  self._setupVdsConnection()
 File /usr/share/vdsm/virt/migration.py, line 92, in _setupVdsConnection
  self._dst, self._vm.cif.bindings['xmlrpc'].serverPort)
 File /usr/lib/python2.7/site-packages

Re: [ovirt-users] VM failover with ovirt3.5

2015-01-04 Thread Artyom Lukianov
Can you provide vdsm logs:
1) for HE vm case
2) for not HE vm case
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be live-migrated in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 863, in _monitor_migration
   vm_id,
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py,
line 85, in run_vds_client_cmd
   response['status']['message'])
DetailedError: Error 47 from migrateStatus: Migration canceled
MainThread::INFO::2014-12-31
19:08:16,501::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1420070896.5 type=state_transition
detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
MainThread::INFO::2014-12-31
19:08:16,502::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineMigratingAway-ReinitializeFSM) sent? ignored
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state ReinitializeFSM (score: 0)
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)

Besides, I also tried other VMs instead of the HE VM, but failover does not 
happen (migration does not even start to be attempted). I set HA for those VMs. 
Is there some log I can check for this?

Please kindly advise.

Thanks,
Cong


 On 2014/12/31, at 0:14, Artyom Lukianov aluki...@redhat.com wrote:

 Ok I found this one:
 Thread-1807180::ERROR::2014-12-30 
 13:02:52,164::migration::165::vm.Vm::(_recover) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to destroy remote VM
 Traceback (most recent call last):
 File /usr/share/vdsm/virt/migration.py, line 163, in _recover
  self.destServer.destroy(self._vm.id)
 AttributeError: 'SourceThread' object has no attribute 'destServer'
 Thread-1807180::ERROR::2014-12-30 13:02:52,165::migration::259::vm.Vm::(run) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
 Traceback (most recent call last):
 File /usr/share/vdsm/virt/migration.py, line 229, in run
  self._setupVdsConnection()
 File /usr/share/vdsm/virt/migration.py, line 92, in _setupVdsConnection
  self._dst, self._vm.cif.bindings['xmlrpc'].serverPort)
 File /usr/lib/python2.7/site-packages/vdsm/vdscli.py, line 91, in 
 cannonizeHostPort
  return addr + ':' + port
 TypeError: cannot concatenate 'str' and 'int' objects

 We have a bug that is already verified for this one, 
 https://bugzilla.redhat.com/show_bug.cgi?id=1163771, so the patch should be 
 included in the latest builds, but you can also take a look at the patch, edit 
 the files yourself on all your machines, and restart vdsm.

 - Original Message -
 From: cong yue yuecong1...@gmail.com
 To: aluki...@redhat.com, stira...@redhat.com, users@ovirt.org
 Cc: Cong Yue cong_...@alliedtelesis.com
 Sent: Tuesday, December 30, 2014 8:22:47 PM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 This is the vdsm.log from just after I switched the host where the HE VM runs 
 to local maintenance.

 In the log, there is a part like this:

 ---
 GuestMonitor-HostedEngine::DEBUG::2014-12-30
 13:01:03,988::vm::486::vm.Vm::(_getUserCpuTuneInfo)
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
 set
 GuestMonitor-HostedEngine::DEBUG::2014-12-30
 13:01:03,989::vm::486::vm.Vm::(_getUserCpuTuneInfo)
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
 set
 GuestMonitor-HostedEngine::DEBUG::2014-12-30
 13:01:03,990::vm::486::vm.Vm::(_getUserCpuTuneInfo)
 vmId

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-31 Thread Artyom Lukianov
Ok I found this one:
Thread-1807180::ERROR::2014-12-30 
13:02:52,164::migration::165::vm.Vm::(_recover) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to destroy remote VM
Traceback (most recent call last):
  File /usr/share/vdsm/virt/migration.py, line 163, in _recover
self.destServer.destroy(self._vm.id)
AttributeError: 'SourceThread' object has no attribute 'destServer'
Thread-1807180::ERROR::2014-12-30 13:02:52,165::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
  File /usr/share/vdsm/virt/migration.py, line 229, in run
self._setupVdsConnection()
  File /usr/share/vdsm/virt/migration.py, line 92, in _setupVdsConnection
self._dst, self._vm.cif.bindings['xmlrpc'].serverPort)
  File /usr/lib/python2.7/site-packages/vdsm/vdscli.py, line 91, in 
cannonizeHostPort
return addr + ':' + port
TypeError: cannot concatenate 'str' and 'int' objects

We have a bug that is already verified for this one, 
https://bugzilla.redhat.com/show_bug.cgi?id=1163771, so the patch should be 
included in the latest builds, but you can also take a look at the patch, edit 
the files yourself on all your machines, and restart vdsm.
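
A minimal illustration of why the one-line patch (getint -> get) avoids that 
TypeError; the address and port values here are made up:

# vdscli.cannonizeHostPort builds "host:port" by string concatenation, so the
# port taken from the config must already be a string.
addr = "10.0.0.93"
port = 54321                  # config.getint() returns an int
# addr + ':' + port           # would raise: cannot concatenate 'str' and 'int' objects
port = "54321"                # config.get() returns the raw string instead
print(addr + ':' + port)      # "10.0.0.93:54321", as cannonizeHostPort expects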

- Original Message -
From: cong yue yuecong1...@gmail.com
To: aluki...@redhat.com, stira...@redhat.com, users@ovirt.org
Cc: Cong Yue cong_...@alliedtelesis.com
Sent: Tuesday, December 30, 2014 8:22:47 PM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

This is the vdsm.log from just after I switched the host where the HE VM runs to 
local maintenance.

In the log, there is a part like this:

---
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,988::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,989::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,990::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
JsonRpc (StompReactor)::DEBUG::2014-12-30
13:01:04,675::stompReactor::98::Broker.StompAdapter::(handle_frame)
Handling message StompFrame command='SEND'
JsonRpcServer::DEBUG::2014-12-30
13:01:04,676::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
Waiting for request
Thread-1806995::DEBUG::2014-12-30
13:01:04,677::stompReactor::163::yajsonrpc.StompServer::(send) Sending
response
JsonRpc (StompReactor)::DEBUG::2014-12-30
13:01:04,678::stompReactor::98::Broker.StompAdapter::(handle_frame)
Handling message StompFrame command='SEND'
JsonRpcServer::DEBUG::2014-12-30
13:01:04,679::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
Waiting for request
Thread-1806996::DEBUG::2014-12-30
13:01:04,681::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
---

Is there something wrong in this?

Thanks,
Cong


 From: Artyom Lukianov aluki...@redhat.com
 Date: December 29, 2014, 23:13:45 GMT-8
 To: Yue, Cong cong_...@alliedtelesis.com
 Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 The HE VM is migrated only by ovirt-ha-agent and not by the engine, but the 
 FatalError is more interesting; can you provide the vdsm.log for this one 
 please?

 - Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.com
 Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
 Sent: Monday, December 29, 2014 8:29:04 PM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 I disabled local maintenance mode for all hosts, and then set only the host
 where the HE VM runs to local maintenance mode. The logs are as follows.
 During the migration of the HE VM, a fatal error is shown. By the way, the HE
 VM also cannot do live migration, whereas other VMs can.

 ---
 [root@compute2-3 ~]# hosted-engine --set-maintenance --mode=local
 You have new mail in /var/spool/mail/root
 [root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
 MainThread::INFO::2014-12-29
 13:16:12,435::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.92 (id: 3, score: 2400)
 MainThread::INFO::2014-12-29
 13:16:22,711::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-29
 13:16:22,711::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.92 (id: 3, score: 2400)
 MainThread::INFO::2014-12-29
 13:16:32,978::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-29
 13:16:32,978::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-31 Thread Yue, Cong
Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')

Now there is no fatal error in beam.log, and migration does start when I set the 
host where the HE VM runs to local maintenance mode. But it finally fails with 
the following log. Also, the HE VM cannot be live-migrated in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 863, in _monitor_migration
   vm_id,
 File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py,
line 85, in run_vds_client_cmd
   response['status']['message'])
DetailedError: Error 47 from migrateStatus: Migration canceled
MainThread::INFO::2014-12-31
19:08:16,501::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1420070896.5 type=state_transition
detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
MainThread::INFO::2014-12-31
19:08:16,502::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineMigratingAway-ReinitializeFSM) sent? ignored
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state ReinitializeFSM (score: 0)
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)

Besides, I also tried other VMs instead of the HE VM, but failover does not 
happen (migration does not even start to be attempted). I set HA for those VMs. 
Is there some log I can check for this?

Please kindly advise.

Thanks,
Cong


 On 2014/12/31, at 0:14, Artyom Lukianov aluki...@redhat.com wrote:

 Ok I found this one:
 Thread-1807180::ERROR::2014-12-30 
 13:02:52,164::migration::165::vm.Vm::(_recover) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to destroy remote VM
 Traceback (most recent call last):
 File /usr/share/vdsm/virt/migration.py, line 163, in _recover
  self.destServer.destroy(self._vm.id)
 AttributeError: 'SourceThread' object has no attribute 'destServer'
 Thread-1807180::ERROR::2014-12-30 13:02:52,165::migration::259::vm.Vm::(run) 
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
 Traceback (most recent call last):
 File /usr/share/vdsm/virt/migration.py, line 229, in run
  self._setupVdsConnection()
 File /usr/share/vdsm/virt/migration.py, line 92, in _setupVdsConnection
  self._dst, self._vm.cif.bindings['xmlrpc'].serverPort)
 File /usr/lib/python2.7/site-packages/vdsm/vdscli.py, line 91, in 
 cannonizeHostPort
  return addr + ':' + port
 TypeError: cannot concatenate 'str' and 'int' objects

 We have a bug that is already verified for this one, 
 https://bugzilla.redhat.com/show_bug.cgi?id=1163771, so the patch should be 
 included in the latest builds, but you can also take a look at the patch, edit 
 the files yourself on all your machines, and restart vdsm.

 - Original Message -
 From: cong yue yuecong1...@gmail.com
 To: aluki...@redhat.com, stira...@redhat.com, users@ovirt.org
 Cc: Cong Yue cong_...@alliedtelesis.com
 Sent: Tuesday, December 30, 2014 8:22:47 PM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 This is the vdsm.log from just after I switched the host where the HE VM runs 
 to local maintenance.

 In the log, there is a part like this:

 ---
 GuestMonitor-HostedEngine::DEBUG::2014-12-30
 13:01:03,988::vm::486::vm.Vm::(_getUserCpuTuneInfo)
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
 set
 GuestMonitor-HostedEngine::DEBUG::2014-12-30
 13:01:03,989::vm::486::vm.Vm::(_getUserCpuTuneInfo)
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
 set
 GuestMonitor-HostedEngine::DEBUG::2014-12-30
 13:01:03,990::vm::486::vm.Vm::(_getUserCpuTuneInfo)
 vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
 set
 JsonRpc (StompReactor)::DEBUG::2014-12-30
 13:01:04,675::stompReactor::98::Broker.StompAdapter::(handle_frame)
 Handling message StompFrame command='SEND'
 JsonRpcServer::DEBUG::2014-12-30
 13:01:04,676::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
 Waiting for request

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Artyom Lukianov
Can you also provide the output of hosted-engine --vm-status please? It was 
useful the previous time, because otherwise I do not see anything unusual.
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
Sent: Monday, December 29, 2014 7:15:24 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

I also changed the maintenance mode to local on another host, but the VM on this 
host cannot be migrated either. The logs are as follows.

[root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
[root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-28
21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:24,903::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:24,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:35,026::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine vm is running on host 10.0.0.94 (id 1)
MainThread::INFO::2014-12-28
21:09:35,236::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:35,236::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:45,604::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:45,604::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:55,691::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-28
21:09:55,701::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1419829795.7 type=state_transition
detail=EngineDown-LocalMaintenance hostname='compute2-2'
MainThread::INFO::2014-12-28
21:09:55,761::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineDown-LocalMaintenance) sent? sent
MainThread::INFO::2014-12-28
21:09:55,990::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
Score is 0 due to local maintenance mode
MainThread::INFO::2014-12-28
21:09:55,990::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-28
21:09:55,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
^C
You have new mail in /var/spool/mail/root
[root@compute2-2 ~]# ps -ef | grep qemu
root 18420  2777  0 21:10 pts/0    00:00:00 grep --color=auto qemu
qemu 29809 1  0 Dec19 ?01:17:20 /usr/libexec/qemu-kvm
-name testvm2-2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem
-m 500 -realtime mlock=off -smp
1,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
c31e97d0-135e-42da-9954-162b5228dce3 -smbios
type=1,manufacturer=oVirt,product=oVirt
Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0059-3610-8033-B4C04F395931,uuid=c31e97d0-135e-42da-9954-162b5228dce3
-no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2-2.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=2014-12-19T20:17:17,driftfix=slew 
-no-kvm-pit-reinjection
-no-hpet -no-shutdown -boot strict=on -device
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
-drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
-device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
-drive 
file=/rhev/data-center/0002-0002-0002-0002-01e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/5cbeb8c9

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Yue, Cong
Thanks and the --vm-status log is as follows:
[root@compute2-2 ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.94
Host ID: 1
Engine status  : {health: good, vm: up,
detail: up}
Score  : 2400
Local maintenance  : False
Host timestamp : 1008087
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1008087 (Mon Dec 29 11:25:51 2014)
host-id=1
score=2400
maintenance=False
state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.93
Host ID: 2
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 0
Local maintenance  : True
Host timestamp : 859142
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=859142 (Mon Dec 29 08:25:08 2014)
host-id=2
score=0
maintenance=True
state=LocalMaintenance


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.92
Host ID: 3
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 0
Local maintenance  : True
Host timestamp : 853615
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=853615 (Mon Dec 29 08:25:57 2014)
host-id=3
score=0
maintenance=True
state=LocalMaintenance
You have new mail in /var/spool/mail/root
[root@compute2-2 ~]#

Could you please explain how VM failover works inside ovirt? Is there any other 
debug option I can enable to check the problem?

Thanks,
Cong


On 2014/12/29, at 1:39, Artyom Lukianov 
aluki...@redhat.commailto:aluki...@redhat.com wrote:

Can you also provide the output of hosted-engine --vm-status please? Previous time 
it was useful, because I do not see anything unusual.
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.commailto:cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.commailto:aluki...@redhat.com
Cc: Simone Tiraboschi stira...@redhat.commailto:stira...@redhat.com, 
users@ovirt.orgmailto:users@ovirt.org
Sent: Monday, December 29, 2014 7:15:24 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Also I changed the maintenance mode to local on another host. But the VM on 
this host also cannot be migrated. The logs are as follows.

[root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
[root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-28
21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:24,903::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:24,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:35,026::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine vm is running on host 10.0.0.94 (id 1)
MainThread::INFO::2014-12-28
21:09:35,236::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:35,236::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:45,604::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:45,604::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:55,691::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-28
21:09:55,701::brokerlink::111

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Artyom Lukianov
I see that the HE vm runs on the host with IP 10.0.0.94, and the two other hosts are in Local 
Maintenance state, so the vm will not migrate to either of them. Can you try to disable 
local maintenance on all hosts in the HE environment, then enable local 
maintenance on the host where the HE vm runs, and also provide the output of hosted-engine 
--vm-status.
Failover works in the following way:
1) if the host running the HE vm has a score lower by 800 than some other host in the HE 
environment, the HE vm will migrate to the host with the best score
2) if something happens to the vm (kernel panic, crash of a service, ...), the agent will 
restart the HE vm on another host in the HE environment with a positive score
3) if the host running the HE vm is put into local maintenance, the vm will migrate to another host 
with a positive score
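For example, case 3) can be exercised with a sequence like this (only a sketch; which host 
currently runs the HE vm is an assumption, here 10.0.0.94, and each command has to be run on 
the host indicated in the comment):

# on every host (10.0.0.92, 10.0.0.93, 10.0.0.94): clear any local maintenance
hosted-engine --set-maintenance --mode=none

# check that all hosts now report score 2400 and maintenance=False
hosted-engine --vm-status

# on the host that currently runs the HE vm (assumed 10.0.0.94 here)
hosted-engine --set-maintenance --mode=local

# on the same host, watch the agent drive the migration away
tail -f /var/log/ovirt-hosted-engine-ha/agent.log

# once the vm is up elsewhere, re-activate the host
hosted-engine --set-maintenance --mode=none

The other hosts have to keep a positive score, otherwise there is no valid migration target.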
Thanks.

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
Sent: Monday, December 29, 2014 6:30:42 PM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks and the --vm-status log is as follows:
[root@compute2-2 ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.94
Host ID: 1
Engine status  : {health: good, vm: up,
detail: up}
Score  : 2400
Local maintenance  : False
Host timestamp : 1008087
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1008087 (Mon Dec 29 11:25:51 2014)
host-id=1
score=2400
maintenance=False
state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.93
Host ID: 2
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 0
Local maintenance  : True
Host timestamp : 859142
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=859142 (Mon Dec 29 08:25:08 2014)
host-id=2
score=0
maintenance=True
state=LocalMaintenance


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.92
Host ID: 3
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 0
Local maintenance  : True
Host timestamp : 853615
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=853615 (Mon Dec 29 08:25:57 2014)
host-id=3
score=0
maintenance=True
state=LocalMaintenance
You have new mail in /var/spool/mail/root
[root@compute2-2 ~]#

Could you please explain how VM failover works inside ovirt? Is there any other 
debug option I can enable to check the problem?

Thanks,
Cong


On 2014/12/29, at 1:39, Artyom Lukianov 
aluki...@redhat.commailto:aluki...@redhat.com wrote:

Can you also provide output of hosted-engine --vm-status please, previous time 
it was useful, because I do not see something unusual.
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.commailto:cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.commailto:aluki...@redhat.com
Cc: Simone Tiraboschi stira...@redhat.commailto:stira...@redhat.com, 
users@ovirt.orgmailto:users@ovirt.org
Sent: Monday, December 29, 2014 7:15:24 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Also I change the maintenance mode to local in another host. But also the VM in 
this host can not be migrated. The logs are as follows.

[root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
[root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-28
21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:24,903::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-28
21:09:24,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-28
21:09:35,026::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine vm is running on host

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Yue, Cong
Thanks for the detailed explanation. Do you mean only the HE vm can fail over? I want 
to try with a vm on any host to check whether a vm can fail over to another host 
automatically, as in VMware or XenServer.
I will try as you advised and provide the logs for your further advice.

Thanks,
Cong



 On 2014/12/29, at 8:43, Artyom Lukianov aluki...@redhat.com wrote:

 I see that HE vm run on host with ip 10.0.0.94, and two another hosts in 
 Local Maintenance state, so vm will not migrate to any of them, can you try 
 disable local maintenance on all hosts in HE environment and after enable 
 local maintenance on host where HE vm run, and provide also output of 
 hosted-engine --vm-status.
 Failover works in next way:
 1) if host where run HE vm have score less by 800 that some other host in HE 
 environment, HE vm will migrate on host with best score
 2) if something happen to vm(kernel panic, crash of service...), agent will 
 restart HE vm on another host in HE environment with positive score
 3) if put to local maintenance host with HE vm, vm will migrate to another 
 host with positive score
 Thanks.

 - Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.com
 Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
 Sent: Monday, December 29, 2014 6:30:42 PM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 Thanks and the --vm-status log is as follows:
 [root@compute2-2 ~]# hosted-engine --vm-status


 --== Host 1 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.94
 Host ID: 1
 Engine status  : {health: good, vm: up,
 detail: up}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 1008087
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1008087 (Mon Dec 29 11:25:51 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp


 --== Host 2 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.93
 Host ID: 2
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 0
 Local maintenance  : True
 Host timestamp : 859142
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=859142 (Mon Dec 29 08:25:08 2014)
 host-id=2
 score=0
 maintenance=True
 state=LocalMaintenance


 --== Host 3 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.92
 Host ID: 3
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 0
 Local maintenance  : True
 Host timestamp : 853615
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=853615 (Mon Dec 29 08:25:57 2014)
 host-id=3
 score=0
 maintenance=True
 state=LocalMaintenance
 You have new mail in /var/spool/mail/root
 [root@compute2-2 ~]#

 Could you please explain how VM failover works inside ovirt? Is there any 
 other debug option I can enable to check the problem?

 Thanks,
 Cong


 On 2014/12/29, at 1:39, Artyom Lukianov 
 aluki...@redhat.commailto:aluki...@redhat.com wrote:

 Can you also provide output of hosted-engine --vm-status please, previous 
 time it was useful, because I do not see something unusual.
 Thanks

 - Original Message -
 From: Cong Yue 
 cong_...@alliedtelesis.commailto:cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.commailto:aluki...@redhat.com
 Cc: Simone Tiraboschi stira...@redhat.commailto:stira...@redhat.com, 
 users@ovirt.orgmailto:users@ovirt.org
 Sent: Monday, December 29, 2014 7:15:24 AM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 Also I change the maintenance mode to local in another host. But also the VM 
 in this host can not be migrated. The logs are as follows.

 [root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
 [root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
 MainThread::INFO::2014-12-28
 21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.94 (id: 1, score: 2400)
 MainThread::INFO::2014-12-28
 21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineDown (score: 2400)
 MainThread::INFO::2014-12-28
 21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.94 (id: 1, score: 2400)
 MainThread::INFO::2014-12-28
 21:09

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Nikolai Sednev
Hi, 
Your guest vm has to be defined as Highly Available: 
Highly Available 

Select this check box if the virtual machine is to be highly available. For 
example, in cases of host maintenance or failure, the virtual machine is 
automatically moved to or re-launched on another host. If the host is manually 
shut down by the system administrator, the virtual machine is not automatically 
moved to another host. 
Note that this option is unavailable if the Migration Options setting in the 
Hosts tab is set to either Allow manual migration only or No migration . For a 
virtual machine to be highly available, it must be possible for the Manager to 
migrate the virtual machine to other available hosts as necessary. 
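
If you prefer to set this outside the web UI, the same flag can be set through the REST API. 
This is only a rough sketch: the engine address, credentials and vm name/id are placeholders, 
and the API base path may be /api instead of /ovirt-engine/api depending on the setup.

# find the vm id by name (vm name is a placeholder)
curl -s -k -u admin@internal:password \
  -H "Accept: application/xml" \
  "https://engine.example.com/ovirt-engine/api/vms?search=name%3Dtestvm2-2"

# enable the Highly Available flag on that vm (VM_ID taken from the call above)
curl -s -k -u admin@internal:password \
  -X PUT -H "Content-Type: application/xml" \
  -d '<vm><high_availability><enabled>true</enabled><priority>1</priority></high_availability></vm>' \
  "https://engine.example.com/ovirt-engine/api/vms/VM_ID"

The priority only matters when several highly available vms have to be restarted at once.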


Thanks in advance. 

Best regards, 
Nikolai 
 
Nikolai Sednev 
Senior Quality Engineer at Compute team 
Red Hat Israel 
34 Jerusalem Road, 
Ra'anana, Israel 43501 

Tel: +972 9 7692043 
Mobile: +972 52 7342734 
Email: nsed...@redhat.com 
IRC: nsednev 

- Original Message -

From: users-requ...@ovirt.org 
To: users@ovirt.org 
Sent: Monday, December 29, 2014 7:50:07 PM 
Subject: Users Digest, Vol 39, Issue 169 

Send Users mailing list submissions to 
users@ovirt.org 

To subscribe or unsubscribe via the World Wide Web, visit 
http://lists.ovirt.org/mailman/listinfo/users 
or, via email, send a message with subject or body 'help' to 
users-requ...@ovirt.org 

You can reach the person managing the list at 
users-ow...@ovirt.org 

When replying, please edit your Subject line so it is more specific 
than Re: Contents of Users digest... 


Today's Topics: 

1. Re: VM failover with ovirt3.5 (Yue, Cong) 


-- 

Message: 1 
Date: Mon, 29 Dec 2014 09:49:58 -0800 
From: Yue, Cong cong_...@alliedtelesis.com 
To: Artyom Lukianov aluki...@redhat.com 
Cc: users@ovirt.org users@ovirt.org 
Subject: Re: [ovirt-users] VM failover with ovirt3.5 
Message-ID: 11a51118-8b03-41fe-8fd0-c81ac8897...@alliedtelesis.com 
Content-Type: text/plain; charset=us-ascii 

Thanks for detailed explanation. Do you mean only HE VM can be failover? I want 
to have a try with the VM on any host to check whether VM can be failover to 
other host automatically like VMware or Xenserver? 
I will have a try as you advised and provide the log for your further advice. 

Thanks, 
Cong 



 On 2014/12/29, at 8:43, Artyom Lukianov aluki...@redhat.com wrote: 
 
 I see that HE vm run on host with ip 10.0.0.94, and two another hosts in 
 Local Maintenance state, so vm will not migrate to any of them, can you try 
 disable local maintenance on all hosts in HE environment and after enable 
 local maintenance on host where HE vm run, and provide also output of 
 hosted-engine --vm-status. 
 Failover works in next way: 
 1) if host where run HE vm have score less by 800 that some other host in HE 
 environment, HE vm will migrate on host with best score 
 2) if something happen to vm(kernel panic, crash of service...), agent will 
 restart HE vm on another host in HE environment with positive score 
 3) if put to local maintenance host with HE vm, vm will migrate to another 
 host with positive score 
 Thanks. 
 
 - Original Message - 
 From: Cong Yue cong_...@alliedtelesis.com 
 To: Artyom Lukianov aluki...@redhat.com 
 Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org 
 Sent: Monday, December 29, 2014 6:30:42 PM 
 Subject: Re: [ovirt-users] VM failover with ovirt3.5 
 
 Thanks and the --vm-status log is as follows: 
 [root@compute2-2 ~]# hosted-engine --vm-status 
 
 
 --== Host 1 status ==-- 
 
 Status up-to-date : True 
 Hostname : 10.0.0.94 
 Host ID : 1 
 Engine status : {health: good, vm: up, 
 detail: up} 
 Score : 2400 
 Local maintenance : False 
 Host timestamp : 1008087 
 Extra metadata (valid at timestamp): 
 metadata_parse_version=1 
 metadata_feature_version=1 
 timestamp=1008087 (Mon Dec 29 11:25:51 2014) 
 host-id=1 
 score=2400 
 maintenance=False 
 state=EngineUp 
 
 
 --== Host 2 status ==-- 
 
 Status up-to-date : True 
 Hostname : 10.0.0.93 
 Host ID : 2 
 Engine status : {reason: vm not running on 
 this host, health: bad, vm: down, detail: unknown} 
 Score : 0 
 Local maintenance : True 
 Host timestamp : 859142 
 Extra metadata (valid at timestamp): 
 metadata_parse_version=1 
 metadata_feature_version=1 
 timestamp=859142 (Mon Dec 29 08:25:08 2014) 
 host-id=2 
 score=0 
 maintenance=True 
 state=LocalMaintenance 
 
 
 --== Host 3 status ==-- 
 
 Status up-to-date : True 
 Hostname : 10.0.0.92 
 Host ID : 3 
 Engine status : {reason: vm not running on 
 this host, health: bad, vm: down, detail: unknown} 
 Score : 0 
 Local maintenance : True 
 Host timestamp : 853615 
 Extra metadata (valid at timestamp): 
 metadata_parse_version=1 
 metadata_feature_version=1 
 timestamp=853615 (Mon Dec 29 08:25:57 2014) 
 host-id=3 
 score=0 
 maintenance=True 
 state=LocalMaintenance 
 You have new mail in /var

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Artyom Lukianov
The HE vm is migrated only by ovirt-ha-agent and not by the engine, but the FatalError is 
more interesting; can you provide vdsm.log for this one please.
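
In case it helps, the interesting part is usually in vdsm.log on the host the vm was 
migrating away from, around the time of the failure; something like this (default log path, 
and the grep pattern is only a guess at what to look for):

# on the host the HE vm was migrating away from
grep -iE 'migrat|Traceback|ERROR' /var/log/vdsm/vdsm.log | tail -n 100

# keep a copy of the full file to attach to the list
cp /var/log/vdsm/vdsm.log /tmp/vdsm-$(hostname)-$(date +%Y%m%d).log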

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
Sent: Monday, December 29, 2014 8:29:04 PM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

I disabled local maintenance mode on all hosts, and then set only the host where the 
HE VM runs to local maintenance mode. The logs are as follows. During the migration of 
the HE VM, a fatal error is shown. By the way, the HE VM also cannot be live migrated, 
while other VMs can be.

---
[root@compute2-3 ~]# hosted-engine --set-maintenance --mode=local
You have new mail in /var/spool/mail/root
[root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-29
13:16:12,435::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.92 (id: 3, score: 2400)
MainThread::INFO::2014-12-29
13:16:22,711::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-29
13:16:22,711::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.92 (id: 3, score: 2400)
MainThread::INFO::2014-12-29
13:16:32,978::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-29
13:16:32,978::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-29
13:16:43,272::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-29
13:16:43,272::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-29
13:16:53,316::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine vm running on localhost
MainThread::INFO::2014-12-29
13:16:53,562::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-29
13:16:53,562::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-29
13:17:03,600::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-29
13:17:03,611::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1419877023.61 type=state_transition
detail=EngineUp-LocalMaintenanceMigrateVm hostname='compute2-3'
MainThread::INFO::2014-12-29
13:17:03,672::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineUp-LocalMaintenanceMigrateVm) sent? sent
MainThread::INFO::2014-12-29
13:17:03,911::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
Score is 0 due to local maintenance mode
MainThread::INFO::2014-12-29
13:17:03,912::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenanceMigrateVm (score: 0)
MainThread::INFO::2014-12-29
13:17:03,912::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-29
13:17:03,960::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1419877023.96 type=state_transition
detail=LocalMaintenanceMigrateVm-EngineMigratingAway
hostname='compute2-3'
MainThread::INFO::2014-12-29
13:17:03,980::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(LocalMaintenanceMigrateVm-EngineMigratingAway) sent? sent
MainThread::INFO::2014-12-29
13:17:04,218::states::66::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_penalize_memory)
Penalizing score by 400 due to low free memory
MainThread::INFO::2014-12-29
13:17:04,218::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-29
13:17:04,219::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-29
13:17:14,251::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-29 Thread Artyom Lukianov
If you want to enable failover for some vm, you can go to vm 
properties -> High Availability and enable the Highly Available checkbox. The HE vm 
is already Highly Available automatically.
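
To double check what the engine currently has for a given vm, a read-only REST call is 
enough; again only a sketch (engine address, credentials and the vm id are placeholders):

# show the high_availability element of one vm
curl -s -k -u admin@internal:password \
  -H "Accept: application/xml" \
  "https://engine.example.com/ovirt-engine/api/vms/VM_ID" | grep -A 3 '<high_availability>'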

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
Sent: Monday, December 29, 2014 7:49:58 PM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for detailed explanation. Do you mean only HE VM can be failover? I want 
to have a try with the VM on any host to check whether VM can be failover to 
other host automatically like VMware or Xenserver?
I will have a try as you advised and provide the log for your further advice.

Thanks,
Cong



 On 2014/12/29, at 8:43, Artyom Lukianov aluki...@redhat.com wrote:

 I see that HE vm run on host with ip 10.0.0.94, and two another hosts in 
 Local Maintenance state, so vm will not migrate to any of them, can you try 
 disable local maintenance on all hosts in HE environment and after enable 
 local maintenance on host where HE vm run, and provide also output of 
 hosted-engine --vm-status.
 Failover works in next way:
 1) if host where run HE vm have score less by 800 that some other host in HE 
 environment, HE vm will migrate on host with best score
 2) if something happen to vm(kernel panic, crash of service...), agent will 
 restart HE vm on another host in HE environment with positive score
 3) if put to local maintenance host with HE vm, vm will migrate to another 
 host with positive score
 Thanks.

 - Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.com
 Cc: Simone Tiraboschi stira...@redhat.com, users@ovirt.org
 Sent: Monday, December 29, 2014 6:30:42 PM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 Thanks and the --vm-status log is as follows:
 [root@compute2-2 ~]# hosted-engine --vm-status


 --== Host 1 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.94
 Host ID: 1
 Engine status  : {health: good, vm: up,
 detail: up}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 1008087
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1008087 (Mon Dec 29 11:25:51 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp


 --== Host 2 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.93
 Host ID: 2
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 0
 Local maintenance  : True
 Host timestamp : 859142
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=859142 (Mon Dec 29 08:25:08 2014)
 host-id=2
 score=0
 maintenance=True
 state=LocalMaintenance


 --== Host 3 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.92
 Host ID: 3
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 0
 Local maintenance  : True
 Host timestamp : 853615
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=853615 (Mon Dec 29 08:25:57 2014)
 host-id=3
 score=0
 maintenance=True
 state=LocalMaintenance
 You have new mail in /var/spool/mail/root
 [root@compute2-2 ~]#

 Could you please explain how VM failover works inside ovirt? Is there any 
 other debug option I can enable to check the problem?

 Thanks,
 Cong


 On 2014/12/29, at 1:39, Artyom Lukianov 
 aluki...@redhat.commailto:aluki...@redhat.com wrote:

 Can you also provide output of hosted-engine --vm-status please, previous 
 time it was useful, because I do not see something unusual.
 Thanks

 - Original Message -
 From: Cong Yue 
 cong_...@alliedtelesis.commailto:cong_...@alliedtelesis.com
 To: Artyom Lukianov aluki...@redhat.commailto:aluki...@redhat.com
 Cc: Simone Tiraboschi stira...@redhat.commailto:stira...@redhat.com, 
 users@ovirt.orgmailto:users@ovirt.org
 Sent: Monday, December 29, 2014 7:15:24 AM
 Subject: Re: [ovirt-users] VM failover with ovirt3.5

 Also I change the maintenance mode to local in another host. But also the VM 
 in this host can not be migrated. The logs are as follows.

 [root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
 [root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
 MainThread::INFO::2014-12-28
 21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-28 Thread Artyom Lukianov
I see that you set local maintenance on host3, which does not have the engine vm on it, 
so there is nothing to migrate from this host.
If you set local maintenance on host1, the vm must migrate to another host with 
a positive score.
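To be sure the right host goes into maintenance, check which host currently reports the 
engine vm as up, for example (the grep context is approximate and the quoting of the status 
fields may differ slightly between versions):

# run on any HA host; the status block that reports the engine vm as up is the host to drain
hosted-engine --vm-status | grep -B 3 -E '"?vm"?: "?up"?'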
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Simone Tiraboschi stira...@redhat.com
Cc: users@ovirt.org
Sent: Saturday, December 27, 2014 6:58:32 PM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Hi

I had a try with hosted-engine --set-maintenance --mode=local on
compute2-1, which is host 3 in my cluster. From the log, it shows
maintenance mode is detected, but migration does not happen.

The logs are as follows. Is there any other config I need to check?

[root@compute2-1 vdsm]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.94
Host ID: 1
Engine status  : {health: good, vm: up,
detail: up}
Score  : 2400
Local maintenance  : False
Host timestamp : 836296
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=836296 (Sat Dec 27 11:42:39 2014)
host-id=1
score=2400
maintenance=False
state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.93
Host ID: 2
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 2400
Local maintenance  : False
Host timestamp : 687358
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=687358 (Sat Dec 27 08:42:04 2014)
host-id=2
score=2400
maintenance=False
state=EngineDown


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.92
Host ID: 3
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 0
Local maintenance  : True
Host timestamp : 681827
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=681827 (Sat Dec 27 08:42:40 2014)
host-id=3
score=0
maintenance=True
state=LocalMaintenance
[root@compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-27
08:42:41,109::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-27
08:42:51,198::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-27
08:42:51,420::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-27
08:42:51,420::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-27
08:43:01,507::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-27
08:43:01,773::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-27
08:43:01,773::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-27
08:43:11,859::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-27
08:43:12,072::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-27
08:43:12,072::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)



[root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-27
11:36:28,855::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-27
11:36:39,130::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-27
11:36:39,130::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-27
11:36:49,449

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-28 Thread Yue, Cong
...@redhat.com wrote:

I see that you set local maintenance on host3 that do not have engine vm on it, 
so it nothing to migrate from this host.
If you set local maintenance on host1, vm must migrate to another host with 
positive score.
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.commailto:cong_...@alliedtelesis.com
To: Simone Tiraboschi stira...@redhat.commailto:stira...@redhat.com
Cc: users@ovirt.orgmailto:users@ovirt.org
Sent: Saturday, December 27, 2014 6:58:32 PM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Hi

I had a try with hosted-engine --set-maintenance --mode=local on
compute2-1, which is host 3 in my cluster. From the log, it shows
maintenance mode is detected, but migration does not happen.

The logs are as follows. Is there any other config I need to check?

[root@compute2-1 vdsm]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.94
Host ID: 1
Engine status  : {health: good, vm: up,
detail: up}
Score  : 2400
Local maintenance  : False
Host timestamp : 836296
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=836296 (Sat Dec 27 11:42:39 2014)
host-id=1
score=2400
maintenance=False
state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.93
Host ID: 2
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 2400
Local maintenance  : False
Host timestamp : 687358
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=687358 (Sat Dec 27 08:42:04 2014)
host-id=2
score=2400
maintenance=False
state=EngineDown


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.92
Host ID: 3
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 0
Local maintenance  : True
Host timestamp : 681827
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=681827 (Sat Dec 27 08:42:40 2014)
host-id=3
score=0
maintenance=True
state=LocalMaintenance
[root@compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-27
08:42:41,109::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-27
08:42:51,198::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-27
08:42:51,420::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-27
08:42:51,420::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-27
08:43:01,507::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-27
08:43:01,773::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-27
08:43:01,773::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-27
08:43:11,859::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Local maintenance detected
MainThread::INFO::2014-12-27
08:43:12,072::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state LocalMaintenance (score: 0)
MainThread::INFO::2014-12-27
08:43:12,072::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)



[root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
MainThread::INFO::2014-12-27
11:36:28,855::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-27
11:36:39,130::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-27
11:36:39,130::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-27 Thread Yue, Cong
 wrote:



 - Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users@ovirt.org
 Sent: Friday, December 19, 2014 7:22:10 PM
 Subject: RE: [ovirt-users] VM failover with ovirt3.5

 Thanks for the information. This is the log for my three ovirt nodes.
 From the output of hosted-engine --vm-status, it shows the engine state for
 my 2nd and 3rd ovirt node is DOWN.
 Is this the reason why VM failover not work in my environment?

 No, they looks ok: you can run the engine VM on single host at a time.

 How can I make
 also engine works for my 2nd and 3rd ovit nodes?

 If you put the host 1 in local maintenance mode ( hosted-engine 
 --set-maintenance --mode=local ) the VM should migrate to host 2; if you 
 reactivate host 1 ( hosted-engine --set-maintenance --mode=none ) and put 
 host 2 in local maintenance mode the VM should migrate again.

 Can you please try that and post the logs if something is going bad?


 --
 --== Host 1 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.94
 Host ID: 1
 Engine status  : {health: good, vm: up,
 detail: up}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 150475
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=150475 (Fri Dec 19 13:12:18 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp


 --== Host 2 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.0.93
 Host ID: 2
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 1572
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1572 (Fri Dec 19 10:12:18 2014)
 host-id=2
 score=2400
 maintenance=False
 state=EngineDown


 --== Host 3 status ==--

 Status up-to-date  : False
 Hostname   : 10.0.0.92
 Host ID: 3
 Engine status  : unknown stale-data
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 987
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=987 (Fri Dec 19 10:09:58 2014)
 host-id=3
 score=2400
 maintenance=False
 state=EngineDown

 --
 And the /var/log/ovirt-hosted-engine-ha/agent.log for three ovirt nodes are
 as follows:
 --
 10.0.0.94(hosted-engine-1)
 ---
 MainThread::INFO::2014-12-19
 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine vm running on localhost
 MainThread::INFO::2014-12-19
 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
 Global metadata: {'maintenance': False}
 MainThread::INFO::2014-12-19
 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
 Host 10.0.0.93 (id 2): {'extra':
 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448
 (Fri Dec 19 10:10:14
 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
 'hostname': '10.0.0.93', 'alive': True

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-22 Thread Simone Tiraboschi


- Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users@ovirt.org
 Sent: Friday, December 19, 2014 7:22:10 PM
 Subject: RE: [ovirt-users] VM failover with ovirt3.5
 
 Thanks for the information. This is the log for my three ovirt nodes.
 From the output of hosted-engine --vm-status, it shows the engine state for
 my 2nd and 3rd ovirt node is DOWN.
 Is this the reason why VM failover not work in my environment? 

No, they look OK: you can run the engine VM on a single host at a time.

 How can I make
 also engine works for my 2nd and 3rd ovit nodes?

If you put host 1 in local maintenance mode ( hosted-engine 
--set-maintenance --mode=local ) the VM should migrate to host 2; if you 
reactivate host 1 ( hosted-engine --set-maintenance --mode=none ) and put host 
2 in local maintenance mode the VM should migrate again.

Can you please try that and post the logs if something goes wrong?
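
While doing that, it is worth watching both the shared state and the local agents in 
parallel, e.g. (broker.log is the companion log of the notification broker, if present on 
your install):

# on any host: refresh the shared view every 10 seconds
watch -n 10 hosted-engine --vm-status

# on the host being drained and on the expected target
tail -f /var/log/ovirt-hosted-engine-ha/agent.log /var/log/ovirt-hosted-engine-ha/broker.log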


 --
 --== Host 1 status ==--
 
 Status up-to-date  : True
 Hostname   : 10.0.0.94
 Host ID: 1
 Engine status  : {health: good, vm: up,
 detail: up}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 150475
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=150475 (Fri Dec 19 13:12:18 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp
 
 
 --== Host 2 status ==--
 
 Status up-to-date  : True
 Hostname   : 10.0.0.93
 Host ID: 2
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 1572
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1572 (Fri Dec 19 10:12:18 2014)
 host-id=2
 score=2400
 maintenance=False
 state=EngineDown
 
 
 --== Host 3 status ==--
 
 Status up-to-date  : False
 Hostname   : 10.0.0.92
 Host ID: 3
 Engine status  : unknown stale-data
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 987
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=987 (Fri Dec 19 10:09:58 2014)
 host-id=3
 score=2400
 maintenance=False
 state=EngineDown
 
 --
 And the /var/log/ovirt-hosted-engine-ha/agent.log for three ovirt nodes are
 as follows:
 --
 10.0.0.94(hosted-engine-1)
 ---
 MainThread::INFO::2014-12-19
 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine vm running on localhost
 MainThread::INFO::2014-12-19
 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
 Global metadata: {'maintenance': False}
 MainThread::INFO::2014-12-19
 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
 Host 10.0.0.93 (id 2): {'extra':
 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448
 (Fri Dec 19 10:10:14
 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
 'hostname': '10.0.0.93', 'alive': True

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-22 Thread Simone Tiraboschi


- Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users@ovirt.org
 Sent: Friday, December 19, 2014 9:25:32 PM
 Subject: RE: [ovirt-users] VM failover with ovirt3.5
 
 In the documentation of
 http://www.ovirt.org/OVirt_Administration_Guide#.E2.81.A0Improving_Uptime_with_Virtual_Machine_High_Availability
  it says
 To enable the migration of highly available virtual machines:
 Power management must be configured for the hosts running the highly
 available virtual machines.

hosted-engine and VM HA are not really the same feature, because other VMs can be 
managed by the engine while the engine VM itself cannot (chicken-and-egg problem)

 Does this mean I need to configure power management for all ovirt nodes?

No, it's not mandatory for hosted engine but it would be better to do so just 
for power management itself.
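
If you do configure power management, a quick sanity check of the fence device from one of 
the hosts saves time before entering it in the engine; for example, assuming IPMI-based 
management (address and credentials are placeholders):

# from any host with the fence-agents package installed
fence_ipmilan -a 10.0.0.200 -l ipmiuser -p ipmipass -o status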

 Thanks,
 Cong
 
 -Original Message-
 From: Yue, Cong
 Sent: Friday, December 19, 2014 10:22 AM
 To: 'Simone Tiraboschi'
 Cc: users@ovirt.org
 Subject: RE: [ovirt-users] VM failover with ovirt3.5
 
 Thanks for the information. This is the log for my three ovirt nodes.
 From the output of hosted-engine --vm-status, it shows the engine state for
 my 2nd and 3rd ovirt node is DOWN.
 Is this the reason why VM failover not work in my environment? How can I make
 also engine works for my 2nd and 3rd ovit nodes?
 --
 --== Host 1 status ==--
 
 Status up-to-date  : True
 Hostname   : 10.0.0.94
 Host ID: 1
 Engine status  : {health: good, vm: up,
 detail: up}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 150475
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=150475 (Fri Dec 19 13:12:18 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp
 
 
 --== Host 2 status ==--
 
 Status up-to-date  : True
 Hostname   : 10.0.0.93
 Host ID: 2
 Engine status  : {reason: vm not running on
 this host, health: bad, vm: down, detail: unknown}
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 1572
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1572 (Fri Dec 19 10:12:18 2014)
 host-id=2
 score=2400
 maintenance=False
 state=EngineDown
 
 
 --== Host 3 status ==--
 
 Status up-to-date  : False
 Hostname   : 10.0.0.92
 Host ID: 3
 Engine status  : unknown stale-data
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 987
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=987 (Fri Dec 19 10:09:58 2014)
 host-id=3
 score=2400
 maintenance=False
 state=EngineDown
 
 --
 And the /var/log/ovirt-hosted-engine-ha/agent.log for three ovirt nodes are
 as follows:
 --
 10.0.0.94(hosted-engine-1)
 ---
 MainThread::INFO::2014-12-19
 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine vm running on localhost
 MainThread::INFO::2014-12-19
 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Current state EngineUp (score: 2400)
 MainThread::INFO::2014-12-19
 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
 Best remote host 10.0.0.93 (id: 2, score: 2400)
 MainThread::INFO::2014-12-19
 13:10:14,657::state_machine::160

Re: [ovirt-users] VM failover with ovirt3.5

2014-12-19 Thread Simone Tiraboschi


- Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: users@ovirt.org
 Sent: Friday, December 19, 2014 2:14:33 AM
 Subject: [ovirt-users] VM failover with ovirt3.5
 
 
 
 Hi
 
 
 
 In my environment, I have 3 ovirt nodes as one cluster. And on top of host-1,
 there is one vm to host ovirt engine.
 
 Also I have one external storage for the cluster to use as data domain of
 engine and data.
 
 I confirmed live migration works well in my environment.
 
 But it seems very buggy for VM failover if I try to force to shut down one
 ovirt node. Sometimes the VM in the node which is shutdown can migrate to
 other host, but it take more than several minutes.
 
 Sometimes, it can not migrate at all. Sometimes, only when the host is back,
 the VM is beginning to move.

Can you please check or share the logs under /var/log/ovirt-hosted-engine-ha/ ?
 
 Is there some documentation to explain how VM failover is working? And is
 there some bugs reported related with this?

http://www.ovirt.org/Features/Self_Hosted_Engine#Agent_State_Diagram

 Thanks in advance,
 
 Cong
 
 
 
 
 This e-mail message is for the sole use of the intended recipient(s) and may
 contain confidential and privileged information. Any unauthorized review,
 use, disclosure or distribution is prohibited. If you are not the intended
 recipient, please contact the sender by reply e-mail and destroy all copies
 of the original message. If you are the intended recipient, please be
 advised that the content of this message is subject to access, review and
 disclosure by the sender's e-mail System Administrator.
 
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users
 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VM failover with ovirt3.5

2014-12-19 Thread Yue, Cong
::INFO::2014-12-19
10:12:49,338::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-19
10:12:59,642::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-19
10:12:59,642::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)
MainThread::INFO::2014-12-19
10:13:10,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-12-19
10:13:10,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.94 (id: 1, score: 2400)


10.0.0.92(hosted-engine-3)
same as 10.0.0.93
--

-Original Message-
From: Simone Tiraboschi [mailto:stira...@redhat.com]
Sent: Friday, December 19, 2014 12:28 AM
To: Yue, Cong
Cc: users@ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5



- Original Message -
 From: Cong Yue cong_...@alliedtelesis.com
 To: users@ovirt.org
 Sent: Friday, December 19, 2014 2:14:33 AM
 Subject: [ovirt-users] VM failover with ovirt3.5



 Hi



 In my environment, I have 3 ovirt nodes as one cluster. And on top of
 host-1, there is one vm to host ovirt engine.

 Also I have one external storage for the cluster to use as data domain
 of engine and data.

 I confirmed live migration works well in my environment.

 But it seems very buggy for VM failover if I try to force to shut down
 one ovirt node. Sometimes the VM in the node which is shutdown can
 migrate to other host, but it take more than several minutes.

 Sometimes, it can not migrate at all. Sometimes, only when the host is
 back, the VM is beginning to move.

Can you please check or share the logs under /var/log/ovirt-hosted-engine-ha/ ?

 Is there some documentation to explain how VM failover is working? And
 is there some bugs reported related with this?

http://www.ovirt.org/Features/Self_Hosted_Engine#Agent_State_Diagram

 Thanks in advance,

 Cong




 This e-mail message is for the sole use of the intended recipient(s)
 and may contain confidential and privileged information. Any
 unauthorized review, use, disclosure or distribution is prohibited. If
 you are not the intended recipient, please contact the sender by reply
 e-mail and destroy all copies of the original message. If you are the
 intended recipient, please be advised that the content of this message
 is subject to access, review and disclosure by the sender's e-mail System 
 Administrator.

 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users


This e-mail message is for the sole use of the intended recipient(s) and may 
contain confidential and privileged information. Any unauthorized review, use, 
disclosure or distribution is prohibited. If you are not the intended 
recipient, please contact the sender by reply e-mail and destroy all copies of 
the original message. If you are the intended recipient, please be advised that 
the content of this message is subject to access, review and disclosure by the 
sender's e-mail System Administrator.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VM failover with ovirt3.5

2014-12-19 Thread Yue, Cong
In the documentation of
http://www.ovirt.org/OVirt_Administration_Guide#.E2.81.A0Improving_Uptime_with_Virtual_Machine_High_Availability
 it says
To enable the migration of highly available virtual machines:
Power management must be configured for the hosts running the highly available 
virtual machines.

Does this mean I need to configure power management for all ovirt nodes?

Thanks,
Cong

-Original Message-
From: Yue, Cong
Sent: Friday, December 19, 2014 10:22 AM
To: 'Simone Tiraboschi'
Cc: users@ovirt.org
Subject: RE: [ovirt-users] VM failover with ovirt3.5

Thanks for the information. This is the log for my three ovirt nodes.
From the output of hosted-engine --vm-status, it shows the engine state for my 
2nd and 3rd ovirt node is DOWN.
Is this the reason why VM failover not work in my environment? How can I make 
also engine works for my 2nd and 3rd ovit nodes?
--
--== Host 1 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.94
Host ID: 1
Engine status  : {health: good, vm: up,
detail: up}
Score  : 2400
Local maintenance  : False
Host timestamp : 150475
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=150475 (Fri Dec 19 13:12:18 2014)
host-id=1
score=2400
maintenance=False
state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.0.93
Host ID: 2
Engine status  : {reason: vm not running on
this host, health: bad, vm: down, detail: unknown}
Score  : 2400
Local maintenance  : False
Host timestamp : 1572
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1572 (Fri Dec 19 10:12:18 2014)
host-id=2
score=2400
maintenance=False
state=EngineDown


--== Host 3 status ==--

Status up-to-date  : False
Hostname   : 10.0.0.92
Host ID: 3
Engine status  : unknown stale-data
Score  : 2400
Local maintenance  : False
Host timestamp : 987
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=987 (Fri Dec 19 10:09:58 2014)
host-id=3
score=2400
maintenance=False
state=EngineDown

--
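
A minimal sketch, assuming the plain-text hosted-engine --vm-status layout stays as shown above (field names can differ between releases), that flags hosts reporting stale data or an engine status other than good:

#!/usr/bin/env python
# Minimal sketch: run hosted-engine --vm-status and flag hosts whose data is
# stale or whose engine status is not "good".
# Assumes the plain-text layout shown above; treat the parsing as a starting
# point only.
import re
import subprocess

output = subprocess.check_output(["hosted-engine", "--vm-status"])
output = output.decode("utf-8", "replace")

current_host = None
for line in output.splitlines():
    header = re.search(r"--== Host (\d+) status ==--", line)
    if header:
        current_host = header.group(1)
        continue
    if current_host is None:
        continue
    if "Status up-to-date" in line and "False" in line:
        print("host %s: stale data (HA agent not reporting)" % current_host)
    elif "Engine status" in line and "good" not in line:
        print("host %s: engine not healthy here (%s)"
              % (current_host, line.split(":", 1)[1].strip()))
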
The /var/log/ovirt-hosted-engine-ha/agent.log output for the three oVirt nodes
is as follows:
--
10.0.0.94 (hosted-engine-1)
---
MainThread::INFO::2014-12-19
13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-19
13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-19
13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-19
13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-19
13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-19
13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-19
13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine vm running on localhost
MainThread::INFO::2014-12-19
13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-19
13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-19
13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Global metadata: {'maintenance': False}
MainThread::INFO::2014-12-19
13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host 10.0.0.93 (id 2): {'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448
(Fri Dec 19 10:10:14
2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
'hostname': '10.0.0.93', 'alive': True, 'host-id': 2, 'engine-status':
{'reason': 'vm not running on this host', 'health': 'bad', 'vm':
'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
'host-ts': 1448

[ovirt-users] VM failover with ovirt3.5

2014-12-18 Thread Yue, Cong
Hi

In my environment, I have 3 oVirt nodes in one cluster, and on top of host-1
there is one VM that hosts the oVirt engine.
I also have one external storage server that the cluster uses for the engine
data domain and the regular data domain.
I confirmed that live migration works well in my environment.
But VM failover seems very unreliable when I force one oVirt node to shut down.
Sometimes the VM on the node that was shut down can migrate to another host,
but it takes several minutes or more.
Sometimes it cannot migrate at all. Sometimes the VM only starts to move once
the host comes back.

Is there any documentation that explains how VM failover works? And are there
any bugs reported related to this?

Thanks in advance,
Cong


