Re: [ovirt-users] VM failover with ovirt3.5

2015-01-06 Thread Jiri Moskovcak

On 01/06/2015 09:34 AM, Artyom Lukianov wrote:

Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) 
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
   File "/usr/share/vdsm/virt/migration.py", line 245, in run
     self._startUnderlyingMigration(time.time())
   File "/usr/share/vdsm/virt/migration.py", line 324, in
_startUnderlyingMigration
     None, maxBandwidth)
   File "/usr/share/vdsm/virt/vm.py", line 670, in f
     ret = attr(*args, **kwargs)
   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111,
in wrapper
     ret = f(*args, **kwargs)
   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1264, in
migrateToURI2
     if ret == -1: raise libvirtError('virDomainMigrateToURI2() failed',
dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this can happen because the migration time exceeded the
configured maximum time for migrations, but anyway we need help from the
devs, so I added some to CC.



- the agent did everything correctly and, as Artyom says, the migration is
aborted by vdsm:


The migration took 260 seconds which is exceeding the configured maximum 
time for migrations of 256 seconds. The migration will be aborted.


- there is a configuration option in vdsm conf you can tweak to increase 
the timeout:


<snip>
'migration_max_time_per_gib_mem', '64',
    'The maximum time in seconds per GiB memory a migration may take '
    'before the migration will be aborted by the source host. '
    'Setting this value to 0 will disable this feature.'
</snip>

So as you can see, in your case it's 4 GiB * 64 seconds = 256 seconds.
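The abort threshold works out as a simple product: VM memory in GiB times migration_max_time_per_gib_mem. A small illustrative calculation (a sketch of the arithmetic only, not vdsm's actual code; the function name is made up):

```python
def migration_abort_timeout(mem_gib, max_time_per_gib=64):
    """Seconds a migration may run before the source host aborts it.

    max_time_per_gib mirrors vdsm's migration_max_time_per_gib_mem
    option; per the snippet above, 0 disables the timeout.
    """
    if max_time_per_gib == 0:
        return None  # feature disabled, never abort
    return mem_gib * max_time_per_gib

# A 4 GiB engine VM with the default 64 s/GiB:
print(migration_abort_timeout(4))  # 256
```

So raising migration_max_time_per_gib_mem (or setting it to 0) in vdsm's configuration lengthens or disables this window.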


--Jirka


Case 2:
An HA VM must migrate only in case of some failure on the host, so if your
host_3 is OK, the VM will continue to run on it.


- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Monday, January 5, 2015 7:38:08 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts be "engine up".
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. Then the HE VM tries to migrate, but finally it fails. This can be found
in agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for the 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts be "engine up".
2. Set host2 to local maintenance mode. On host3 there is one VM with HA
enabled. Also for the cluster, "Enable HA reservation" is on and the
resilience policy is set to "migrating virtual machines".
3. But the VM on top of host3 does not migrate at all.
The logs are uploaded to Google Drive as:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


Thanks,
Cong




-Original Message-
From: Artyom Lukianov [mailto:aluki...@redhat.com]
Sent: Sunday, January 04, 2015 3:22 AM
To: Yue, Cong
Cc: cong yue; stira...@redhat.com; users@ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Can you provide vdsm logs:
1) for HE vm case
2) for not HE vm case
Thanks

- Original Message -
From: Cong Yue cong_...@alliedtelesis.com
To: Artyom Lukianov aluki...@redhat.com
Cc: cong yue yuecong1...@gmail.com, stira...@redhat.com, users@ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')
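For illustration, the practical difference between the two accessors: getint() parses the stored value as an integer (raising an error when it cannot), while get() returns the raw string. A minimal sketch with Python 3's configparser (the vdsm code above ran on Python 2's ConfigParser, but the behavior is the same):

```python
from configparser import ConfigParser

cfg = ConfigParser()
cfg.read_string("[addresses]\nmanagement_port = 54321\n")

as_int = cfg.getint('addresses', 'management_port')  # parsed to int
as_str = cfg.get('addresses', 'management_port')     # raw string kept

print(repr(as_int), repr(as_str))  # 54321 '54321'
```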

Now there is no fatal error in beam.log, and migration starts to happen
when I set the host where the HE VM is to local maintenance mode. But it
finally fails with the following log. Also, live migration of the HE VM
cannot be done in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 863, in 

Re: [ovirt-users] HostedEngine Deployment Woes

2015-01-05 Thread Jiri Moskovcak

On 01/02/2015 11:27 PM, Mikola Rose wrote:

Hi Didi,

Thank you for the response.

I have tried to do a fresh install (RH 6.6) and still ran into the same
problem.

[root@pws-hv15 rhiso]# hosted-engine --deploy
[ INFO  ] Stage: Initializing
   Continuing will configure this host for serving as hypervisor
and create a VM where you have to install oVirt Engine afterwards.
   Are you sure you want to continue? (Yes, No)[Yes]:
   It has been detected that this program is executed through an
SSH connection without using screen.
   Continuing with the installation may lead to broken
installation if the network connection fails.
   It is highly recommended to abort the installation and run it
inside a screen session using command screen.
   Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
   Configuration files: []
   Log file:
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20150102134318-7ougjc.log
   Version: otopi-1.2.3 (otopi-1.2.3-1.el6ev)
[ INFO  ] Hardware supports virtualization
[ INFO  ] Bridge rhevm already created
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization


   --== STORAGE CONFIGURATION ==--


   During customization use CTRL-D to abort.
   Please specify the storage you would like to use (nfs3,
nfs4)[nfs3]:
   Please specify the full shared storage connection path to use
(example: host:/path): 192.168.1.32:/Volumes/Raid1
[ INFO  ] Installing on first host
   Please provide storage domain name. [hosted_storage]:
   Local storage datacenter name is an internal name and
currently will not be shown in engine's admin UI. Please enter local
datacenter name [hosted_datacenter]:


   --== SYSTEM CONFIGURATION ==--



   --== NETWORK CONFIGURATION ==--


   iptables was detected on your computer, do you wish setup to
configure it? (Yes, No)[Yes]:
   Please indicate a pingable gateway IP address [192.168.0.3]:


   --== VM CONFIGURATION ==--


   Please specify the device to boot the VM from (cdrom, disk,
pxe) [cdrom]:
   The following CPU types are supported by this host:
- model_Westmere: Intel Westmere Family
- model_Nehalem: Intel Nehalem Family
- model_Penryn: Intel Penryn Family
- model_Conroe: Intel Conroe Family
   Please specify the CPU type to be used by the VM
[model_Westmere]:
   Please specify path to installation media you would like to
use [None]: /mnt/rhiso
   Please specify the number of virtual CPUs for the VM
[Defaults to minimum requirement: 2]:
   Please specify the disk size of the VM in GB [Defaults to
minimum requirement: 25]:
   You may specify a MAC address for the VM or accept a randomly
generated default [00:16:3e:02:7f:c4]:
   Please specify the memory size of the VM in MB [Defaults to
minimum requirement: 4096]:
   Please specify the console type you would like to use to
connect to the VM (vnc, spice) [vnc]:


   --== HOSTED ENGINE CONFIGURATION ==--


   Enter the name which will be used to identify this host
inside the Administrator Portal [hosted_engine_1]:
   Enter 'admin@internal' user password that will be used for
accessing the Administrator Portal:
   Confirm 'admin@internal' user password:
   Please provide the FQDN for the engine you would like to use.
   This needs to match the FQDN that you will use for the engine
installation within the VM.
   Note: This will be the FQDN of the VM you are now going to
create,
   it should not point to the base host or to any other existing
machine.
   Engine FQDN: powerhost1.power-soft.net
   Please provide the name of the SMTP server through which we
will send notifications [localhost]:
   Please provide the TCP port number of the SMTP server [25]:
   Please provide the email address from which notifications
will be sent [root@localhost]:
   Please provide a comma-separated list of email addresses
which will get notifications [root@localhost]:
[ INFO  ] Stage: Setup validation


   --== CONFIGURATION PREVIEW ==--


   Engine FQDN: powerhost1.power-soft.net
   Bridge name: rhevm
   SSH daemon port: 22
   Firewall manager   : iptables
   Gateway address: 192.168.0.3
   Host name for web application  : hosted_engine_1
   Host ID: 1
   Image size GB  : 25
   Storage connection : 

Re: [ovirt-users] Hosted engine: sending ioctl 5401 to a partition!

2014-11-27 Thread Jiri Moskovcak

On 11/21/2014 10:28 PM, Chris Adams wrote:

I have set up oVirt with hosted engine, on an iSCSI volume.  On both
nodes, the kernel logs the following about every 10 seconds:

Nov 21 15:27:49 node8 kernel: ovirt-ha-broker: sending ioctl 5401 to a 
partition!

Is this a known bug, something that I need to address, etc.?



I was looking into our code and there is nothing explicitly sending
this; it's too low-level. I believe it's caused by attempts to read from
those partitions (the agent reads the data every 10 seconds, so it matches
the pattern). To be honest, I don't know whether that's a problem or not;
I'm not a storage expert, so I'm CCing Federico to shed some light on this.


Regards,
--Jirka
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted-Engine

2014-11-24 Thread Jiri Moskovcak

Thanks for the logs,
seems like your yum repositories are corrupted:
engine.log:2014-11-21 10:58:09,555 ERROR 
[org.ovirt.engine.core.bll.InstallerMessages] (VdsDeploy) Installation 
192.168.21.239: Yum Cannot queue package iproute: Cannot find a valid 
baseurl for repo: base
engine.log:2014-11-21 10:58:09,559 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(VdsDeploy) Correlation ID: 335c3f42, Call Stack: null, Custom Event ID: 
-1, Message: Failed to install Host hosted_engine_1. Yum Cannot queue 
package iproute: Cannot find a valid baseurl for repo: base.


So the engine fails to install the host. You can try to manually
add the host to the engine after you fix the problem with the repository.


--Jirka

On 11/23/2014 05:53 AM, Sandvik Agustin wrote:

Hi Jirka,

Good day, here's my vdsm.log, ovirt-hosted-engine-setup-DATE.log, and
engine.log. Please see attachment.


Thanks,
Sandvik

On Fri, Nov 21, 2014 at 4:33 PM, Jiri Moskovcak jmosk...@redhat.com wrote:

On 11/21/2014 03:54 AM, Sandvik Agustin wrote:

Hi users,


I'm installing ovirt with hosted engine following this documentation

http://community.redhat.com/blog/2014/03/up-and-running-with-ovirt-3-4/

I've successfully created the vm that will host the engine, but
at the
end of the installer, I have this kind of error message


Hi Sandvik,
as the message says, you should check engine.log and vdsm.log;
without them we're not able to debug your issue. So please send me
these files:
/var/log/vdsm/vdsm.log
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-DATE.log
/var/log/ovirt-engine/engine.log

And btw, HE in 3.5 might be a better choice ;)

Regards,
Jirka


(1, 2, 3)[1]:
[ INFO  ] Engine replied: DB Up!Welcome to Health Status!
Enter the name of the cluster to which you want to
add the
host (Default) [Default]:
[ INFO  ] Waiting for the host to become operational in the
engine. This
may take several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please
check engine
and bootstrap installation logs.
[ ERROR ] Unable to add hosted_engine_1 to the manager
Please shutdown the VM allowing the system to launch
it as a
monitored service.
The system will wait until the VM is down.

then I managed to shut down the VM that is hosting the engine, and then
I got this message


   Enabling and starting HA services
Hosted Engine successfully set up
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file
'/etc/ovirt-hosted-engine/answers.conf'
[ INFO  ] Answer file '/etc/ovirt-hosted-engine/answers.conf'
has been
updated
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination



Hope you can help me,
Thanks in advance,
sandvik


_
Users mailing list
Users@ovirt.org mailto:Users@ovirt.org
http://lists.ovirt.org/__mailman/listinfo/users
http://lists.ovirt.org/mailman/listinfo/users







Re: [ovirt-users] Failure on last step of ovirt hosted engine

2014-11-24 Thread Jiri Moskovcak

On 11/21/2014 07:39 PM, Mario Giammarco wrote:

Hello,
I am following the guide up and running with ovirt 3.5 but with iscsi
and not glusterfs.
On the last step (after I have already installed ovirt-engine in the vm)
the script has stopped due to an error.
So the ovirt hosted vm is not registered in the datacenter and iscsi
storage is not seen by datacenter.

I am trying to use ovirt gui to register it manually without luck.

Can someone tell me how to repeat the last step of the script?



The HE storage shouldn't be visible in the engine, so don't add it. If
the VM and engine are running, you should be fine just adding the host
manually to the engine and starting the ovirt-ha-agent and
ovirt-ha-broker services on the host (don't forget to enable them to
start on boot).


--Jirka


Thanks,
Mario







Re: [ovirt-users] Hosted-Engine

2014-11-21 Thread Jiri Moskovcak

On 11/21/2014 03:54 AM, Sandvik Agustin wrote:

Hi users,


I'm installing ovirt with hosted engine following this documentation
http://community.redhat.com/blog/2014/03/up-and-running-with-ovirt-3-4/

I've successfully created the vm that will host the engine, but at the
end of the installer, I have this kind of error message



Hi Sandvik,
as the message says, you should check engine.log and vdsm.log; without
them we're not able to debug your issue. So please send me these files:

/var/log/vdsm/vdsm.log
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-DATE.log
/var/log/ovirt-engine/engine.log

And btw, HE in 3.5 might be a better choice ;)

Regards,
Jirka



(1, 2, 3)[1]:
[ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the
host (Default) [Default]:
[ INFO  ] Waiting for the host to become operational in the engine. This
may take several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please check engine
and bootstrap installation logs.
[ ERROR ] Unable to add hosted_engine_1 to the manager
   Please shutdown the VM allowing the system to launch it as a
monitored service.
   The system will wait until the VM is down.

then I managed to shut down the VM that is hosting the engine, and then
I got this message


  Enabling and starting HA services
   Hosted Engine successfully set up
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO  ] Answer file '/etc/ovirt-hosted-engine/answers.conf' has been
updated
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination



Hope you can help me,
Thanks in advance,
sandvik







Re: [ovirt-users] Error trying to add new hosted-engine host to upgraded oVirt cluster

2014-11-13 Thread Jiri Moskovcak

On 11/12/2014 04:20 PM, Sandro Bonazzola wrote:

Il 12/11/2014 16:10, David King ha scritto:

Hi everyone,

I have upgraded my oVirt 3.4 hosted engine cluster to oVirt 3.5 using the
upgrade instructions on the Wiki.  Everything appears to be working fine
after the upgrade.

However, I am now trying to add a new host to the hosted engine
configuration but the hosted-engine --deploy fails after sshing the answers
file from the upgraded primary configuration.  The following errors can be
found in the setup log:


Answer file lacks lockspace UUIDs, please use an answer file generated from

the same version you are using on this additional host


Can you please open a BZ about this issue so we can track it?
Jiri, Martin, is the file backend able to handle this kind of upgrade?


Hi Sandro,
yes, it is able to handle it. The lockspace UUID is needed only for
iscsi (LVM based) storage, which is not the case when upgrading from
3.4, so we should be safe skipping the check for the lockspace UUID in
the setup if the storage is on NFS.
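The suggested fix amounts to making the validation conditional on the storage type. A rough sketch of that logic (hypothetical key names and function, not the actual otopi plugin code):

```python
def validate_lockspace(answers):
    """Skip the lockspace-UUID check for nfs storage domains.

    'OVEHOSTED_SANLOCK/lockspaceImageUUID' is a hypothetical key name
    used only for illustration.
    """
    domain_type = answers.get('OVEHOSTED_STORAGE/domainType', '')
    if domain_type.startswith('nfs'):
        return True  # file-based storage: lockspace UUID not required
    if 'OVEHOSTED_SANLOCK/lockspaceImageUUID' not in answers:
        raise RuntimeError('Answer file lacks lockspace UUIDs')
    return True
```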


--Jirka

@David, I'm afraid the setup is not able to add a host to a cluster
created in 3.4; the workaround might be to deploy the host with the
setup from 3.4 and then update it. Sorry for the inconvenience :-/


--Jirka







I confirmed that the answers file on the upgraded host does not have any
lockspace UUIDs:

OVEHOSTED_STORAGE/storageDatacenterName=str:hosted_datacenter

OVEHOSTED_STORAGE/storageDomainName=str:hosted_storage
OVEHOSTED_STORAGE/storageType=none:None
OVEHOSTED_STORAGE/volUUID=str:da160775-07fe-4569-b45f-03be0c5896a5
OVEHOSTED_STORAGE/domainType=str:nfs3
OVEHOSTED_STORAGE/imgSizeGB=str:25
OVEHOSTED_STORAGE/storageDomainConnection=str:192.168.8.12:
/mnt/data2/vm/engine
OVEHOSTED_STORAGE/connectionUUID=str:880093ea-b0c1-448d-ac55-cde99feebc23
OVEHOSTED_STORAGE/spUUID=str:5e7ff7c2-6e75-4ba8-a5cc-e8dc5d37e478
OVEHOSTED_STORAGE/imgUUID=str:c9466bb6-a78c-4caa-bce3-22c87a5f3f1a
OVEHOSTED_STORAGE/sdUUID=str:b12fd59c-380a-40b3-b7f2-02d455de1d3b
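As an aside, the answer-file entries above follow a simple KEY=type:value line encoding. A rough parser sketch inferred from the lines shown (an assumption about the format, not the actual otopi implementation):

```python
def parse_answer_line(line):
    """Split 'SECTION/key=type:value' into (key, typed value)."""
    key, _, rest = line.partition('=')
    typ, _, raw = rest.partition(':')
    if typ == 'none':
        return key, None          # e.g. storageType=none:None
    if typ == 'int':
        return key, int(raw)
    return key, raw               # 'str' and anything else verbatim

key, val = parse_answer_line('OVEHOSTED_STORAGE/imgSizeGB=str:25')
print(key, val)  # OVEHOSTED_STORAGE/imgSizeGB 25
```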



Is there something I can do to update the answers file on the updated 3.5
working host so this will work?

Thanks,
David

PS: Here is the relevant section of the hosted-engine setup log file:

2014-11-11 22:57:04 DEBUG otopi.context context._executeMethod:138 Stage

validation METHOD
otopi.plugins.ovirt_hosted_engine_setup.sanlock.lockspace.Plugin._validation
2014-11-11 22:57:04 DEBUG otopi.context context._executeMethod:152 method
exception
Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in
_executeMethod
     method['method']()
   File
"/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/sanlock/lockspace.py",
line 102, in _validation
 'Answer file lacks lockspace UUIDs, please use an '
RuntimeError: Answer file lacks lockspace UUIDs, please use an answer file
generated from the same version you are using on this additional host
2014-11-11 22:57:04 ERROR otopi.context context._executeMethod:161 Failed
to execute stage 'Setup validation': Answer file lacks lockspace UUIDs,
please use an answer file generated from the same version you are using on
this additional host
2014-11-11 22:57:04 DEBUG otopi.context context.dumpEnvironment:490
ENVIRONMENT DUMP - BEGIN
2014-11-11 22:57:04 DEBUG otopi.context context.dumpEnvironment:500 ENV
BASE/error=bool:'True'
2014-11-11 22:57:04 DEBUG otopi.context context.dumpEnvironment:500 ENV
BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>,
RuntimeError('Answer file lacks lockspace UUIDs, please use an answer file
generated from the same version you are using on this additional host',),
<traceback object at 0x34c85a8>)]'
2014-11-11 22:57:04 DEBUG otopi.context context.dumpEnvironment:504
ENVIRONMENT DUMP - END
2014-11-11 22:57:04 INFO otopi.context context.runSequence:417 Stage:
Clean up
2014-11-11 22:57:04 DEBUG otopi.context context.runSequence:421 STAGE
cleanup
2014-11-11 22:57:04 DEBUG otopi.context context._executeMethod:138 Stage
cleanup METHOD
otopi.plugins.ovirt_hosted_engine_setup.core.remote_answerfile.Plugin._cleanup
2014-11-11 22:57:04 DEBUG otopi.context context._executeMethod:138 Stage
cleanup METHOD
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host.Plugin._cleanup
2014-11-11 22:57:04 DEBUG otopi.context context._executeMethod:138 Stage
cleanup METHOD
otopi.plugins.ovirt_hosted_engine_setup.pki.vdsmpki.Plugin._cleanup
2014-11-11 22:57:04 DEBUG otopi.context context._executeMethod:138 Stage
cleanup METHOD
otopi.plugins.ovirt_hosted_engine_setup.storage.storage.Plugin._cleanup
2014-11-11 22:57:04 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.storage.storage
storage._spmStop:609 spmStop
2014-11-11 22:57:04 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.storage.storage
storage._cleanup:970 Not SPM?
Traceback (most recent call last):
   File
"/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py",
line 968, in _cleanup
 self._spmStop()
   

Re: [ovirt-users] hosted engine on iSCSI deep dive

2014-11-13 Thread Jiri Moskovcak

On 11/12/2014 06:06 PM, jplor...@gmail.com wrote:

Hi,

First, thanks for the video. Though I have a question about the setup: I
don't see any options regarding multipath. You are connecting directly
to the iSCSI LUN, but most of the time the connection is redundant and
multipathd needs to be configured in order to have redundancy.
Is this a future enhancement, or is multipath done behind the scenes? If
it's taken care of, what about special configs for the device in
multipath.conf?
Regards



Hi,
unfortunately the current setup doesn't support multipath. It's planned 
as a future enhancement.


--Jirka









[ovirt-users] hosted engine on iSCSI deep dive

2014-11-12 Thread Jiri Moskovcak

Hi,
for those of you who would like to know more about this 3.5 feature and 
who didn't get a chance to watch it live:


https://www.youtube.com/watch?v=PU99qnKHvbk

--Jirka


Re: [ovirt-users] Hosted-Engine HA problem

2014-11-10 Thread Jiri Moskovcak

On 11/11/2014 05:56 AM, Jaicel wrote:

Hi Jirka,

the patch works. it stabilized the status of my two hosts. the engine
migration during failover also works fine. thanks guys!


Hi Jaicel,
I'm glad it works for you! Enjoy the hosted engine ;)

--Jirka



Jaicel


From: Jiri Moskovcak jmosk...@redhat.com
To: Jaicel jai...@asti.dost.gov.ph
Cc: Niels de Vos nde...@redhat.com, Vijay Bellur
vbel...@redhat.com, users@ovirt.org, Gluster Devel
gluster-de...@gluster.org
Sent: Monday, November 3, 2014 3:33:16 PM
Subject: Re: [ovirt-users] Hosted-Engine HA problem
*Subject: *Re: [ovirt-users] Hosted-Engine HA problem

On 11/01/2014 07:43 AM, Jaicel wrote:
  Hi,
 
  my engine runs on Host1. current status and agent logs below.
 
  Host 1

Hi,
it seems like you ran into [1]; you can either zero out the metadata
file or apply the patch from [1] manually.

--Jirka

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1158925
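"Zeroing out" the metadata file means overwriting it with zero bytes while keeping its size. A hedged sketch of that operation (the real file lives under ha_agent/ on the hosted-engine storage, e.g. hosted-engine.metadata; back it up first and only do this while in global maintenance):

```python
import os

def zero_out(path):
    """Overwrite a file with zero bytes in place, preserving its size."""
    size = os.path.getsize(path)
    with open(path, 'r+b') as f:
        f.write(b'\x00' * size)

# Hypothetical path, for illustration only:
# zero_out('/rhev/data-center/mnt/.../ha_agent/hosted-engine.metadata')
```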

 
  MainThread::INFO::2014-10-31
16:55:39,918::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
ovirt-hosted-engi
  ne-ha agent 1.1.6 started
  MainThread::INFO::2014-10-31
16:55:39,985::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:
  :(_get_hostname) Found certificate common name: 192.168.12.11
  MainThread::INFO::2014-10-31
16:55:40,228::hosted_engine::367::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:
  :(_initialize_broker) Initializing ha-broker connection
  MainThread::INFO::2014-10-31
16:55:40,228::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Starting monitor ping, options {'addr': '192.168.12.254'}
  MainThread::INFO::2014-10-31
16:55:40,231::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Success, id 140634215107920
  MainThread::INFO::2014-10-31
16:55:40,231::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true',
'bridge_name': 'ovirtmgmt', 'address': '0'}
  MainThread::INFO::2014-10-31
16:55:40,237::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Success, id 140634215108432
  MainThread::INFO::2014-10-31
16:55:40,237::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Starting monitor mem-free, options {'use_ssl': 'true',
'address': '0'}
  MainThread::INFO::2014-10-31
16:55:40,240::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Success, id 39956688
  MainThread::INFO::2014-10-31
16:55:40,240::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Starting monitor cpu-load-no-engine, options {'use_ssl':
'true', 'vm_uuid': '41d4aff1-54e1-4946-a812-2e656bb7d3f
  9', 'address': '0'}
  MainThread::INFO::2014-10-31
16:55:40,243::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Success, id 140634215107664
  MainThread::INFO::2014-10-31
16:55:40,244::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Starting monitor engine-health, options {'use_ssl': 'true',
'vm_uuid': '41d4aff1-54e1-4946-a812-2e656bb7d3f9', '
  address': '0'}
  MainThread::INFO::2014-10-31
16:55:40,249::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo
  nitor) Success, id 140634006879632
  MainThread::INFO::2014-10-31
16:55:40,249::hosted_engine::391::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:
  :(_initialize_broker) Broker initialized, all submonitors started
  MainThread::INFO::2014-10-31
16:55:40,298::hosted_engine::476::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:
  :(_initialize_sanlock) Ensuring lease for lockspace hosted-engine,
host id 1 is acquired (file: /rhev/data-center/mnt/g
 
luster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.lockspace)
  MainThread::INFO::2014-10-31
16:55:40,322::state_machine::153::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:
  :(refresh) Global metadata: {'maintenance': False}
  MainThread::INFO::2014-10-31
16:55:40,322::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:
  :(refresh) Host 192.168.12.12 (id 2): {'live-data': False, 'extra':
'metadata_parse_version=1\nmetadata_feature_version
  =1\ntimestamp=1413882675 (Tue Oct 21 17:11:15
2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
'hostname': '192.168.12.12', 'host-id': 2, 'engine-status': {'reason':
'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail':
'unknown'}, 'score': 2400, 'maintenance': False, 'host-ts': 1413882675}
  MainThread::INFO::2014-10-31
16:55:40,322::state_machine::161::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Local (id 1): {'engine-health': None, 'bridge': True, 'mem-free': None,
'maintenance': False, 'cpu-load': None, 'gateway': True}
  MainThread::INFO::2014-10-31
16:55:40,323::brokerlink

Re: [ovirt-users] 3.5 hosted engine: 2nd host Cannot acquire bridge address

2014-11-05 Thread Jiri Moskovcak

On 11/06/2014 02:19 AM, Robert Story wrote:

On Wed, 5 Nov 2014 19:57:07 -0500 Robert wrote:
RS I've got a hosted engine up and running on a freshly installed 3.5 host
RS (CentOS 6.6), and I'm trying to add a second host. The install fails
RS trying to configure the ovirtmgmt bridge:
RS
RS [ INFO  ] Updating hosted-engine configuration
RS [ INFO  ] Stage: Transaction commit
RS [ INFO  ] Stage: Closing up
RS [ ERROR ] Failed to execute stage 'Closing up': Cannot acquire bridge
RS address
RS
RS From the setup log:
RS [snip]

complete logs from the second host at http://futz.org/users/tmp/ovirt7/




Hi,
this seems like a question for our network gurus, Antoni, can you please 
take a look?


Thanks,
Jirka



Robert








Re: [ovirt-users] hosted engine , how to make changes to the VM post deploy

2014-11-04 Thread Jiri Moskovcak

On 11/04/2014 03:52 PM, Alastair Neil wrote:

Thanks Jirka

So is this the workflow?

set the hosted-engine maintenance to global
shutdown the engine VM
make changes via virsh or editing vm.conf
sync changes to the other ha nodes
restart the VM
set hosted-engine maintenance to none


- well, it's not official, because it can cause a lot of trouble, so I would
not recommend it unless you have a really good reason to do it.






Also, the patch for the broker will be in 3.5.1?


- yes

--Jirka




-Alastair


On 4 November 2014 02:50, Jiri Moskovcak jmosk...@redhat.com wrote:

On 11/04/2014 04:38 AM, Alastair Neil wrote:

I have successfully migrated my standalone ovirt into a hosted
instance.  I have some questions and comments.


Is there a mechanism to make changes to the hosted engine VM post
deploy.  I would like to change some of the VM configuration
choices.

I tried setting the maintenance mode to global and then editing:

   /etc/ovirt-hosted-engine/vm.conf

I changed something simple: the VM name.  Then I copied the file
to the
two other HA hosts and set the maintenance mode to none.


- you need to kill the VM and restart it after you edit the configuration

hosted-engine --vm-shutdown
hosted-engine --vm-start

- also you can use virsh [1] to edit the vm

[1]
http://wiki.libvirt.org/page/FAQ#Where_are_VM_config_files_stored.3F_How_do_I_edit_a_VM.27s_XML_config.3F


No joy; the VM hostname remains unchanged in the portal.


Also, the notification system is horrendously spammy, I get email
notification of each state change from each of the HA hosts,
surely this
is not intentional.  Is there some way to control this?


- that's a bug which should be fixed by this patch:
http://gerrit.ovirt.org/#/c/33518/

--Jirka

-Thanks,  Alastair












Re: [ovirt-users] hosted engine , how to make changes to the VM post deploy

2014-11-03 Thread Jiri Moskovcak

On 11/04/2014 04:38 AM, Alastair Neil wrote:

I have successfully migrated my standalone ovirt into a hosted
instance.  I have some questions and comments.


Is there a mechanism to make changes to the hosted engine VM post
deploy.  I would like to change some of the VM configuration choices.

I tried setting the maintenance mode to global and then editing:

  /etc/ovirt-hosted-engine/vm.conf

I changed something simple: the VM name.  Then I copied the file to the
two other HA hosts and set the maintenance mode to none.



- you need to kill the VM and restart it after you edit the configuration

hosted-engine --vm-shutdown
hosted-engine --vm-start

- also you can use virsh [1] to edit the vm

[1] 
http://wiki.libvirt.org/page/FAQ#Where_are_VM_config_files_stored.3F_How_do_I_edit_a_VM.27s_XML_config.3F




No joy; the VM hostname remains unchanged in the portal.


Also, the notification system is horrendously spammy; I get an email
notification of each state change from each of the HA hosts. Surely this
is not intentional.  Is there some way to control this?



- that's a bug which should be fixed by this patch 
http://gerrit.ovirt.org/#/c/33518/


--Jirka


-Thanks,  Alastair





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users





Re: [ovirt-users] Hosted-Engine HA problem

2014-11-02 Thread Jiri Moskovcak
 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 171, in get_stats_from_storage
 result = self._checked_communicate(request)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 199, in _checked_communicate
 .format(message or response))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: <type 
'exceptions.OSError'>
[root@ovirt2 ~]# service ovirt-ha-agent status
ovirt-ha-agent dead but subsys locked


Thanks,
Jaicel

- Original Message -
From: Jiri Moskovcak jmosk...@redhat.com
To: Jaicel jai...@asti.dost.gov.ph
Cc: Niels de Vos nde...@redhat.com, Vijay Bellur vbel...@redhat.com, users@ovirt.org, 
Gluster Devel gluster-de...@gluster.org
Sent: Friday, October 31, 2014 11:05:32 PM
Subject: Re: [ovirt-users] Hosted-Engine HA problem

On 10/31/2014 10:26 AM, Jaicel wrote:

I've increased the limit and then restarted the agent and broker. The status normalized, but right now 
it went to the False state again, still with both hosts having a 2400 score. The agent logs remain the 
same, with the "ovirt-ha-agent dead but subsys locked" status. ha-broker logs below:

Thread-138::INFO::2014-10-31 
17:24:22,981::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-138::INFO::2014-10-31 
17:24:22,991::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-139::INFO::2014-10-31 
17:24:38,385::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-139::INFO::2014-10-31 
17:24:38,395::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-140::INFO::2014-10-31 
17:24:53,816::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-140::INFO::2014-10-31 
17:24:53,827::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-141::INFO::2014-10-31 
17:25:09,172::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-141::INFO::2014-10-31 
17:25:09,182::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-142::INFO::2014-10-31 
17:25:24,551::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-142::INFO::2014-10-31 
17:25:24,562::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed

Thanks,
Jaicel


OK, now it seems that the broker runs fine, so I need the recent agent.log
to debug it further.

--Jirka



- Original Message -
From: Jiri Moskovcak jmosk...@redhat.com
To: Jaicel R. Sabonsolin jai...@asti.dost.gov.ph, Niels de Vos 
nde...@redhat.com
Cc: Vijay Bellur vbel...@redhat.com, users@ovirt.org, Gluster Devel 
gluster-de...@gluster.org
Sent: Friday, October 31, 2014 4:32:02 PM
Subject: Re: [ovirt-users] Hosted-Engine HA problem

On 10/31/2014 03:53 AM, Jaicel R. Sabonsolin wrote:

Hi guys,

These logs appear on both hosts, just like the result of --vm-status. I tried 
tcpdump on the oVirt hosts and Gluster nodes, but only packet exchanges with my 
monitoring VM (Zabbix) appeared.

agent.log
   new_data = self.refresh(self._state.data)
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,
 line 77, in refresh
   stats.update(self.hosted_engine.collect_stats())
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
 line 662, in collect_stats
   constants.SERVICE_TYPE)
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 171, in get_stats_from_storage
   result = self._checked_communicate(request)
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 199, in _checked_communicate
   .format(message or response))
RequestError: Request failed: <type 'exceptions.OSError'>

broker.log
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 165, in handle
    response = "success " + self._dispatch(data)
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 261, in _dispatch
   .get_all_stats_for_service_type(**options)
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 41, in get_all_stats_for_service_type
   d = self.get_raw_stats_for_service_type(storage_dir, service_type)
 File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 74, in get_raw_stats_for_service_type
   f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 24] Too many open files: 
'/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.metadata'


- ah, there we go

Re: [ovirt-users] Hosted-Engine HA problem

2014-10-31 Thread Jiri Moskovcak

On 10/31/2014 03:53 AM, Jaicel R. Sabonsolin wrote:

Hi guys,

These logs appear on both hosts, just like the result of --vm-status. I tried 
tcpdump on the oVirt hosts and Gluster nodes, but only packet exchanges with my 
monitoring VM (Zabbix) appeared.

agent.log
 new_data = self.refresh(self._state.data)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,
 line 77, in refresh
 stats.update(self.hosted_engine.collect_stats())
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
 line 662, in collect_stats
 constants.SERVICE_TYPE)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 171, in get_stats_from_storage
 result = self._checked_communicate(request)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 199, in _checked_communicate
 .format(message or response))
RequestError: Request failed: <type 'exceptions.OSError'>

broker.log
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 165, in handle
  response = "success " + self._dispatch(data)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 261, in _dispatch
 .get_all_stats_for_service_type(**options)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 41, in get_all_stats_for_service_type
 d = self.get_raw_stats_for_service_type(storage_dir, service_type)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 74, in get_raw_stats_for_service_type
 f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 24] Too many open files: 
'/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.metadata'


- ah, there we go ^^ you might need to tweak the limit of allowed 
open files as described here [1], or find the app that keeps so many files open



--Jirka

[1] 
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
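Beyond raising the limit, it helps to confirm which process is actually exhausting descriptors. A rough sketch using only /proc follows; the ovirt-ha-broker name passed to pgrep is an assumption (substitute whatever process you suspect), and the fallback to the current shell keeps the sketch runnable anywhere:

```shell
# Compare a process's open-fd count against its soft limit, then list the
# biggest fd consumers on the box. The ovirt-ha-broker process name is an
# assumption -- substitute the process you suspect of leaking.
pid=$(pgrep -o -f ovirt-ha-broker 2>/dev/null || echo $$)  # fall back to this shell

soft=$(awk '/Max open files/ {print $4}' "/proc/$pid/limits")
nfds=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
echo "pid $pid: $nfds open fds (soft limit $soft)"

# Top 5 fd consumers system-wide, to spot the leaking app:
for p in /proc/[0-9]*; do
    echo "$(ls "$p/fd" 2>/dev/null | wc -l) ${p#/proc/}"
done | sort -rn | head -5
```

If the open-fd count sits at or near the soft limit, the EMFILE (Errno 24) above is expected; raising the limit only buys time if the descriptors are leaking.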



Thread-38160::INFO::2014-10-31 
10:28:37,989::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-38161::INFO::2014-10-31 
10:28:53,656::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-38161::ERROR::2014-10-31 
10:28:53,657::listener::190::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Error handling request, data: 'get-stats 
storage_dir=/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent
 service_type=hosted-engine'
Traceback (most recent call last):
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 165, in handle
  response = "success " + self._dispatch(data)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 261, in _dispatch
 .get_all_stats_for_service_type(**options)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 41, in get_all_stats_for_service_type
 d = self.get_raw_stats_for_service_type(storage_dir, service_type)
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 74, in get_raw_stats_for_service_type
 f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 24] Too many open files: 
'/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.metadata'
Thread-38161::INFO::2014-10-31 
10:28:53,658::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed

Thanks,
Jaicel

- Original Message -
From: Niels de Vos nde...@redhat.com
To: Vijay Bellur vbel...@redhat.com
Cc: Jiri Moskovcak jmosk...@redhat.com, Jaicel R. Sabonsolin jai...@asti.dost.gov.ph, 
users@ovirt.org, Gluster Devel gluster-de...@gluster.org
Sent: Friday, October 31, 2014 4:11:25 AM
Subject: Re: [ovirt-users] Hosted-Engine HA problem

On Thu, Oct 30, 2014 at 09:07:24PM +0530, Vijay Bellur wrote:

On 10/30/2014 06:45 PM, Jiri Moskovcak wrote:

On 10/30/2014 09:22 AM, Jaicel R. Sabonsolin wrote:

Hi Guys,

I need help with my oVirt Hosted-Engine HA setup. I am running on 2
ovirt hosts and 2 gluster nodes with replicated volumes. I already have
VMs running on my hosts, and they can migrate normally when I, for example,
power off the host that they are running on. The problem is that the
engine can't migrate once I switch off the host that hosts the engine.

oVirt3.4.3-1.el6
KVM 0.12.1.2 - 2.415.el6_5.10
LIBVIRT   libvirt-0.10.2-29.el6_5.9
VDSM  vdsm-4.14.17-0.el6


right now, i have this result from hosted-engine --vm-status.

   File /usr/lib64/python2.6/runpy.py, line 122, in
_run_module_as_main
 __main__, fname, loader, pkg_name)
   File /usr/lib64/python2.6

Re: [ovirt-users] hosted engine deploy failed 3.5 centos 6.5 host FC20 vm

2014-10-31 Thread Jiri Moskovcak

Hi Alastair,
I need the engine.log to debug it, because the actual problem is logged 
there.


Thanks,
Jirka

On 10/29/2014 08:58 PM, Alastair Neil wrote:

OK I seem to be having some fundamental confusion about this migration.


I have an existing ovirt 3.5 (upgraded from 3.4) setup with a Data
Center containing four clusters: 3 VM clusters for 3 different classes
of CPU hosts (Penryn, Nehalem, and SandyBridge).  I also have a gluster
storage cluster.

There are 4 storage domains: an Export domain (Export-Dom1) NFS v1, an
ISO domain (Gluster-ISOs) POSIX FS v1, a Data domain (Gluster Data)
GlusterFS v3, and a Data (Master) (Gluster-VM-Store) GlusterFS v3.

As Gluster replica 2 is not considered adequate for the hosted-engine
storage I created a volume in the gluster store and exported it as NFS.
This is what I planned to use as the storage pool for the hosted
engine.  So far so good.

I have tried the deployment several times now,  and it fails with the
following:

[ ERROR ] Cannot automatically add the host to cluster None: HTTP
Status 401
[ ERROR ] Failed to execute stage 'Closing up': Cannot add the host
to cluster None


2014-10-29 15:26:11 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._closeup:502 Cannot add the host to cluster None
Traceback (most recent call last):
   File

/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/engine/add_host.py,
line 426, in _closeup
 ca_file=self.cert,
   File /usr/lib/python2.6/site-packages/ovirtsdk/api.py, line
154, in __init__
 url=''
   File
/usr/lib/python2.6/site-packages/ovirtsdk/infrastructure/proxy.py,
line 118, in request
 persistent_auth=self._persistent_auth)
   File
/usr/lib/python2.6/site-packages/ovirtsdk/infrastructure/proxy.py,
line 146, in __doRequest
 persistent_auth=persistent_auth
   File
/usr/lib/python2.6/site-packages/ovirtsdk/web/connection.py, line
134, in doRequest
 raise RequestError, response
RequestError:
status: 401
reason: Unauthorized
detail: HTTP Status 401
2014-10-29 15:26:11 ERROR
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._closeup:510 Cannot automatically add the host to
cluster None:
HTTP Status 401
2014-10-29 15:26:11 DEBUG otopi.context context._executeMethod:152
method exception
Traceback (most recent call last):
   File /usr/lib/python2.6/site-packages/otopi/context.py, line
142, in _executeMethod
 method['method']()
   File

/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/engine/add_host.py,
line 517, in _closeup
 cluster=cluster_name,
RuntimeError: Cannot add the host to cluster None



The hosted-engine host cluster name, it seems, is set to None, and adding
the host then fails as there is no cluster None in the restored
engine.  Presumably the storage domain would need to be added too;
however, I don't ever seem to see any message about this.

I recall being prompted for a data-center name and even a storage-domain
name, but not a cluster name, so am I missing a step?  I could use some
guidance as I am stumped.  Are there some pre-migration tasks I am
failing to do in the original engine?




On 29 October 2014 03:10, Jiri Moskovcak jmosk...@redhat.com wrote:

On 10/27/2014 06:22 PM, Alastair Neil wrote:

After belatedly realising that no engine for EL7 is planned for
3.5 I
tried using FC20:

I used a database called engine with user engine on the VM to
restore to.
The engine-backup restore appeared to complete with no errors
save the
canonical complaint about less than 16GB of memory being available.
However, on completion on the host, hosted-engine-deploy threw
this error:

 Failed to execute stage 'Closing up': The host name
 ovirt-admin-hosted.x.xxx.edu
 http://ovirt-admin-hosted.vsnet.gmu.edu contained in the URL
 doesn't match any of the names in the server certificate.


from the setup log

 2014-10-27 12:55:49 DEBUG
 otopi.ovirt_hosted_engine_setup.check_liveliness
 check_liveliness.isEngineUp:46 Checking for Engine health status
 2014-10-27 12:55:50 INFO
 otopi.ovirt_hosted_engine_setup.check_liveliness
 check_liveliness.isEngineUp:64 Engine replied: DB Up!Welcome to
 Health Status!
 2014-10-27 12:55:50 DEBUG otopi.context context._executeMethod:138
 Stage closeup METHOD
 otopi.plugins.ovirt_hosted_engine_setup.engine.add_host.Plugin._closeup
 2014-10-27 12:55:50 DEBUG

Re: [ovirt-users] Hosted-Engine HA problem

2014-10-31 Thread Jiri Moskovcak

On 10/31/2014 10:26 AM, Jaicel wrote:

I've increased the limit and then restarted the agent and broker. The status normalized, but right now 
it went to the False state again, still with both hosts having a 2400 score. The agent logs remain the 
same, with the "ovirt-ha-agent dead but subsys locked" status. ha-broker logs below:

Thread-138::INFO::2014-10-31 
17:24:22,981::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-138::INFO::2014-10-31 
17:24:22,991::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-139::INFO::2014-10-31 
17:24:38,385::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-139::INFO::2014-10-31 
17:24:38,395::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-140::INFO::2014-10-31 
17:24:53,816::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-140::INFO::2014-10-31 
17:24:53,827::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-141::INFO::2014-10-31 
17:25:09,172::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-141::INFO::2014-10-31 
17:25:09,182::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-142::INFO::2014-10-31 
17:25:24,551::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-142::INFO::2014-10-31 
17:25:24,562::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed

Thanks,
Jaicel


OK, now it seems that the broker runs fine, so I need the recent agent.log 
to debug it further.


--Jirka



- Original Message -
From: Jiri Moskovcak jmosk...@redhat.com
To: Jaicel R. Sabonsolin jai...@asti.dost.gov.ph, Niels de Vos 
nde...@redhat.com
Cc: Vijay Bellur vbel...@redhat.com, users@ovirt.org, Gluster Devel 
gluster-de...@gluster.org
Sent: Friday, October 31, 2014 4:32:02 PM
Subject: Re: [ovirt-users] Hosted-Engine HA problem

On 10/31/2014 03:53 AM, Jaicel R. Sabonsolin wrote:

Hi guys,

These logs appear on both hosts, just like the result of --vm-status. I tried 
tcpdump on the oVirt hosts and Gluster nodes, but only packet exchanges with my 
monitoring VM (Zabbix) appeared.

agent.log
  new_data = self.refresh(self._state.data)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,
 line 77, in refresh
  stats.update(self.hosted_engine.collect_stats())
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
 line 662, in collect_stats
  constants.SERVICE_TYPE)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 171, in get_stats_from_storage
  result = self._checked_communicate(request)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 199, in _checked_communicate
  .format(message or response))
RequestError: Request failed: <type 'exceptions.OSError'>

broker.log
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 165, in handle
  response = "success " + self._dispatch(data)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, 
line 261, in _dispatch
  .get_all_stats_for_service_type(**options)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 41, in get_all_stats_for_service_type
  d = self.get_raw_stats_for_service_type(storage_dir, service_type)
File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py,
 line 74, in get_raw_stats_for_service_type
  f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 24] Too many open files: 
'/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.metadata'


- ah, there we go ^^ you might need to tweak the limit of allowed
open files as described here [1], or find the app that keeps so many files open


--Jirka

[1]
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/


Thread-38160::INFO::2014-10-31 
10:28:37,989::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Connection closed
Thread-38161::INFO::2014-10-31 
10:28:53,656::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
 Connection established
Thread-38161::ERROR::2014-10-31 
10:28:53,657::listener::190::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
 Error handling request, data: 'get-stats 
storage_dir=/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent
 service_type=hosted-engine'
Traceback (most recent call last

Re: [ovirt-users] Hosted Engine fail after upgrading to 3.5

2014-10-30 Thread Jiri Moskovcak

On 10/29/2014 11:32 AM, Stefano Stagnaro wrote:

Hi,

please find the new logs at the same place: 
https://www.dropbox.com/sh/qh2rbews45ky2g8/AAC4_4_j94cw6sI_hfaSFg-Fa?dl=0

Thank you,



Hi,
I can't see any exception related to the migration process, but it's 
full of exceptions about getting storage data from the database. Nir, 
can you please take a look to see if it's something critical which might 
influence the migration process?


Thanks,
Jirka
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted Engine fail after upgrading to 3.5

2014-10-30 Thread Jiri Moskovcak

On 10/24/2014 02:12 PM, Stefano Stagnaro wrote:

Hi Jirka,

thank you for the reply. I've uploaded all the relevant logs in here: 
https://www.dropbox.com/sh/qh2rbews45ky2g8/AAC4_4_j94cw6sI_hfaSFg-Fa?dl=0

Thank you,



After a closer look I noticed that there is 'null' as the engine state in 
the metadata. This shouldn't happen, but we should also be able to 
recover when it does.


--Jirka
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted engine deploy failed 3.5 centos 6.5 host FC20 vm

2014-10-29 Thread Jiri Moskovcak

On 10/27/2014 06:22 PM, Alastair Neil wrote:

After belatedly realising that no engine for EL7 is planned for 3.5 I
tried using FC20:

I used a database called engine with user engine on the VM to restore to.
The engine-backup restore appeared to complete with no errors save the
canonical complaint about less than 16GB of memory being available.
However, on completion on the host, hosted-engine-deploy threw this error:

Failed to execute stage 'Closing up': The host name
ovirt-admin-hosted.x.xxx.edu
http://ovirt-admin-hosted.vsnet.gmu.edu contained in the URL
doesn't match any of the names in the server certificate.


from the setup log

2014-10-27 12:55:49 DEBUG
otopi.ovirt_hosted_engine_setup.check_liveliness
check_liveliness.isEngineUp:46 Checking for Engine health status
2014-10-27 12:55:50 INFO
otopi.ovirt_hosted_engine_setup.check_liveliness
check_liveliness.isEngineUp:64 Engine replied: DB Up!Welcome to
Health Status!
2014-10-27 12:55:50 DEBUG otopi.context context._executeMethod:138
Stage closeup METHOD
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host.Plugin._closeup
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._getPKICert:89 Acquiring ca.crt from the engine
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._getPKICert:101 -BEGIN CERTIFICATE-
MIID3DCCAsSgAwIBAgICEAAwDQYJKoZIhvcNAQEFBQAwTzELMAkGA1UEBhMCVVMxFjAUBgNVBAoT
DXZzbmV0LmdtdS5lZHUxKDAmBgNVBAMTH292aXJ0LWFkbWluLnZzbmV0LmdtdS5lZHUuNzIyNDcw
IhcRMTMxMTExMTk1NTQ1KzAwMDAXDTIzMTExMDE5NTU0NVowTzELMAkGA1UEBhMCVVMxFjAUBgNV
BAoTDXZzbmV0LmdtdS5lZHUxKDAmBgNVBAMTH292aXJ0LWFkbWluLnZzbmV0LmdtdS5lZHUuNzIy
NDcwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDAzjsdTOPIhruA/TvupQ+syMdVu8GT

VJ9IlFdqc/RhiV9YB6snYAF6MIeWKnW0eOL9jY/5TmfIqY/+rvYvLhPui1/612KoW9kEcZXUw0k-2
ntz1i+wHv5PEq1Cvn/G8mI9b56EFiiYPfAzcdKGbJ8iqafFPW71/612KoW9kEcZXUwyUXLHF01Yo
nQGAtjL+VGgY6jWaaFD4j/5XTkzfcybI8jAW8o97vfTrnmqe+2cvIUyip9l5KQJjblO6FDjpJJUC
MhyDEjJPCKAT1kW1f3E/t8lHD4UUsMpX4rB142oGwBo5st3sGlUks5fFLHtYjFTUYSSmTwOlnq+t
D8HFr01lAgMBAAGjgb0wgbowHQYDVR0OBBYEFFpdSy5ACG6PC8YtE8vGRYvSYyI6MHgGA1UdIwRx
MG+AFFpdSy5ACG6PC8YtE8vGRYvSYyI6oVOkUTBPMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNdnNu
ZXQuZ211LmVkdTEoMCYGA1UEAxMfb3ZpcnQtYWRtaW4udnNuZXQuZ211LmVkdS43MjI0N4ICEAAw
DwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMCAQYwDQYJKoZIhvcNAQEFBQADggEBAKqhXoL/
jlVhw9qasoqMnJw6ypHjJQCVAukCHvwioHVz+XwvIcIGuod+rHOcvexPZyCkacU2sOaIPjnyv8mJ
sNQ4nKW/oGwUfiKBgsvjv+cHAaqcQNn7MI0VDL71ulYq8UpW0bX3n5fafbstbdN1K2uad3UZH0ae
pv+gLiCXIKTmTtRtHCiKAxVw7Nx48rN8jJyzbP0FoK0+uddrI4TSJDfa5F3USdiYCk/bPCLThDPe
UgpyVDXH11c+j+Bp8IKUvNLLw6gjBkDkPa6oS7qKIP9DaVuroJyUO7OQOes3Uz54+QGc1A+Zewv+
2mgdbFVYcsm1qpxBYL6R5fK2ThMz4r8=
-END CERTIFICATE-
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._getSSHkey:111 Acquiring SSH key from the engine
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._getSSHkey:123 ssh-rsa

B3NzaC1yc2EDAQABAAABAQCpmyaDlP8Kt/yDb/kB4OaIdPx2sgH8T5Ra6hBRGHMxnTtykajnDj9WMannNc0F3d0htvVQXPKZYxxsXxNeHq00Ga/agnCjsYM9EjzujdsBqvyOTjlVX3BVWhWGZu5yNxYwpvdQBRCzhHibgqaafWNRvaixUeO1VAlU+q5W4bZDxJwKui+Bf1dLuZw94zHKs3jiGFcQOegJUVYmWuLVh5GH6SNLMLdbJdr4B5MwlK8ItiOC9XgUdH0RxN56Y1PEUkLserNOW/FxsXuf+cbWRsMtVa5xj82AlDWQUjyQleC91Nl7FT3OHGU1nJf289EjzujdsBqvyOTjlVX3BV5
ovirt-engine
2014-10-27 12:55:50 DEBUG otopi.transaction transaction._prepare:77
preparing 'File transaction for '/root/.ssh/authorized_keys''
2014-10-27 12:55:50 DEBUG otopi.filetransaction
filetransaction.prepare:194 file '/root/.ssh/authorized_keys' missing
2014-10-27 12:55:50 DEBUG otopi.transaction transaction.commit:159
committing 'File transaction for '/root/.ssh/authorized_keys''
2014-10-27 12:55:50 DEBUG otopi.filetransaction
filetransaction.commit:327 Executing restorecon for /root/.ssh
2014-10-27 12:55:50 DEBUG otopi.filetransaction
filetransaction.commit:341 restorecon result rc=0, stdout=, stderr=
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
plugin.executeRaw:785 execute: ('/sbin/restorecon', '-r',
'/root/.ssh'), executable='None', cwd='None', env=None
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
plugin.executeRaw:803 execute-result: ('/sbin/restorecon', '-r',
'/root/.ssh'), rc=0
2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
plugin.execute:861 execute-output: ('/sbin/restorecon', '-r',
'/root/.ssh') stdout:

2014-10-27 12:55:50 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
plugin.execute:866 execute-output: ('/sbin/restorecon', '-r',
'/root/.ssh') stderr:

2014-10-27 

Re: [ovirt-users] migrate from stand alone on FC19 to hosted OK to use CentOS 7?

2014-10-29 Thread Jiri Moskovcak

On 10/27/2014 06:30 PM, Alastair Neil wrote:

Thanks Jirka

I assume EL7 will at some point be supported by the engine?



- at some point yes, but unfortunately I can't give you any specific date

--Jirka


-Alastair


On 27 October 2014 02:54, Jiri Moskovcak jmosk...@redhat.com wrote:

On 10/25/2014 12:11 AM, Alastair Neil wrote:

I am trying to migrate my old ovirt install, which started out as a 3.3
standalone engine on FC19, to a hosted engine.  I want to use
CentOS;
however, the postgresql version on 6.5 is old (8.4.20) and I am
unable
to get a clean restore.  The version on FC 19 is 9.2.8; it looks
like EL
7 has 9.2.7 (I am hoping the difference in the minor rev will
not bite me).

I was wondering if there are any issues using EL 7 to host the
engine?
I know I had seen some reports of issues with 3.5 on EL7 as
hosts but
was not sure if the engine had any gotchas.


using EL7 on the host for hosted engine is OK (I just tried that a few times
last week), but the engine is not supported on EL7, so use a different OS
when installing the engine vm.

--Jirka


Thanks, Alastair








___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted Engine fail after upgrading to 3.5

2014-10-29 Thread Jiri Moskovcak

On 10/27/2014 06:07 PM, Stefano Stagnaro wrote:

Hi Jirka,

after truncating the metadata file the Engine is running again.

Unfortunately now the HA migration is not working anymore. If I put the host with the 
running Engine in local maintenance, the HA agent goes through EngineMigratingAway - 
ReinitializeFSM - LocalMaintenance, but the VM never migrates.

I can read this error in the agent.log:

MainThread::ERROR::2014-10-27 
18:02:51,053::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
 Failed to migrate
Traceback (most recent call last):
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
 line 863, in _monitor_migration
 vm_id,
   File 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py, 
line 85, in run_vds_client_cmd
 response['status']['message'])
DetailedError: Error 12 from migrateStatus: Fatal error during migration

Thank you,



Hi,
to debug the migration failure I am going to need the engine.log.

Thanks,
Jirka
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.5 install issues: failed to setup ovirtmgmt

2014-10-27 Thread Jiri Moskovcak

On 10/26/2014 11:13 PM, Dan Kenigsberg wrote:

On Sat, Oct 25, 2014 at 09:18:18PM -0400, Robert Story wrote:

On Sat, 25 Oct 2014 23:30:41 +0100 Dan wrote:
DK On Sat, Oct 25, 2014 at 03:18:32PM -0400, Robert Story wrote:
DK  line 225, in _setupNetworks 'message: %s' % (networks, code,
DK  message)) RuntimeError: Failed to setup networks {'ovirtmgmt':
DK  {'nic': 'eth0', 'netmask': '255.255.255.0', 'bootproto': 'none',
DK  'ipaddr': '10.69.79.31'}}. Error code: 16 message: Unexpected
DK  exception
DK
DK This means that something nasty happened inside Vdsm while attempting to
DK create the bridge.
DK
DK Can you attach vdsm.log and supervdsmd.log?

Sure...

http://futz.org/users/tmp/ovirt5/supervdsm.log
http://futz.org/users/tmp/ovirt5/vdsm.log


supervdsm.log has

MainProcess|Thread-16::DEBUG::2014-10-25 
12:37:25,045::supervdsmServer::101::SuperVdsm.ServerCallback::(wrapper) call 
setupNetworks with ({'ovirtmgmt': {'nic': 'eth0', 'netmask': '255.255.255.0', 
'bootproto': 'none', 'ipaddr': '10.69.79.31'}}, {}, {'connectivityCheck': 
False}) {}
MainProcess|Thread-16::WARNING::2014-10-25 
12:37:25,046::libvirtconnection::135::root::(wrapper) connection to libvirt 
broken. ecode: 1 edom: 7
MainProcess|Thread-16::CRITICAL::2014-10-25 
12:37:25,046::libvirtconnection::137::root::(wrapper) taking calling process 
down.
MainThread::DEBUG::2014-10-25 
12:37:25,046::supervdsmServer::451::SuperVdsm.Server::(main) Terminated normally
MainProcess|Thread-16::DEBUG::2014-10-25 
12:37:25,046::libvirtconnection::143::root::(wrapper) Unknown libvirterror: 
ecode: 1 edom: 7 level: 2 message: internal error client socket is closed

which means that libvirtd has died (or was restarted).

Jiri, can you tell why this has happened?


Hi Dan,
I'm not a libvirt expert, so I don't have any idea why it died, and I don't 
think the setup restarts it during the installation.


--Jirka



Dan.



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] migrate from stand alone on FC19 to hosted OK to use CentOS 7?

2014-10-27 Thread Jiri Moskovcak

On 10/25/2014 12:11 AM, Alastair Neil wrote:

I am trying to migrate my old ovirt install, which started out as a 3.3
standalone engine on FC19, to a hosted engine.  I want to use CentOS;
however, the postgresql version on 6.5 is old (8.4.20) and I am unable
to get a clean restore.  The version on FC 19 is 9.2.8; it looks like EL
7 has 9.2.7 (I am hoping the difference in the minor rev will not bite me).

I was wondering if there are any issues using EL 7 to host the engine?
I know I had seen some reports of issues with 3.5 on EL7 as hosts but
was not sure if the engine had any gotchas.


using EL7 on the host for hosted engine is OK (I just tried that a few times 
last week), but the engine is not supported on EL7, so use a different OS when 
installing the engine vm.


--Jirka



Thanks, Alastair



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users





Re: [ovirt-users] Self hosted engine and storage domain limitations

2014-10-24 Thread Jiri Moskovcak

On 10/23/2014 04:00 PM, Gianluca Cecchi wrote:

On Thu, Oct 23, 2014 at 9:04 AM, Jiri Moskovcak jmosk...@redhat.com
mailto:jmosk...@redhat.com wrote:

On 10/22/2014 03:52 PM, Gianluca Cecchi wrote:

Hello,
now with oVirt 3.5 there is also iSCSI with NFS as backed
storage for
Hosted Engine.
What is the limitation about the storage domain? The integration in
setup scripts or due to storage type itself in engine mgmt?


Both ;) The storage is managed via vdsm, so whatever vdsm supports, HE
should support, but it requires some work in the setup and the agent code.


What I mean is: if in some way one initially configures the
engine on an
external server (physical or virtual), could then be feasible
migrating
it manually in the oVirt environment (with backup/restore steps)?


- yes


Sorry,
do you mean that I can have a Gluster storage domain, with an external
engine and then backup/restore the engine inside a VM on this Gluster
environment as a self hosted engine?



- yes, moving the engine to the vm is not different from moving it to 
another physical machine, is it?


--Jirka


Thanks
Gianluca




Re: [ovirt-users] Self hosted engine and storage domain limitations

2014-10-24 Thread Jiri Moskovcak

On 10/24/2014 08:31 AM, Jiri Moskovcak wrote:

On 10/23/2014 04:00 PM, Gianluca Cecchi wrote:

On Thu, Oct 23, 2014 at 9:04 AM, Jiri Moskovcak jmosk...@redhat.com
mailto:jmosk...@redhat.com wrote:

On 10/22/2014 03:52 PM, Gianluca Cecchi wrote:

Hello,
now with oVirt 3.5 there is also iSCSI with NFS as backed
storage for
Hosted Engine.
What is the limitation about the storage domain? The
integration in
setup scripts or due to storage type itself in engine mgmt?


Both ;) The storage is managed via vdsm, so whatever vdsm supports, HE
should support, but it requires some work in the setup and the agent
code.


What I mean is: if in some way one initially configures the
engine on an
external server (physical or virtual), could then be feasible
migrating
it manually in the oVirt environment (with backup/restore
steps)?


- yes


Sorry,
do you mean that I can have a Gluster storage domain, with an external
engine and then backup/restore the engine inside a VM on this Gluster
environment as a self hosted engine?



- yes, moving the engine to the vm is not different from moving it to
another physical machine, is it?



- although Gluster doesn't work well with the hosted engine, the 
migration itself is possible; the actual usage of the hosted engine on top of 
Gluster is known to cause problems.


--Jirka


--Jirka


Thanks
Gianluca






Re: [ovirt-users] Hosted Engine fail after upgrading to 3.5

2014-10-24 Thread Jiri Moskovcak

On 10/24/2014 01:11 PM, Stefano Stagnaro wrote:

Hi. After upgrading from 3.4 to 3.5 (I've followed the official RHEV 
documentation) the Hosted Engine VM cannot boot up anymore.

hosted-engine --vm-status says Engine status: unknown stale-data
agent.log says: Error: ''NoneType' object has no attribute 'iteritems''

Some logs:
- agent on node ov0h21: http://fpaste.org/144822/
- agent on node ov0h21: http://fpaste.org/144824/
- hosted-engine --vm-status: http://fpaste.org/144825/


Hi Stefano,
can you please provide also the broker log?

Thank you,
Jirka



Thank you,





Re: [ovirt-users] Hosted Engine fail after upgrading to 3.5

2014-10-24 Thread Jiri Moskovcak

On 10/24/2014 01:30 PM, Jiri Moskovcak wrote:

On 10/24/2014 01:11 PM, Stefano Stagnaro wrote:

Hi. After upgrading from 3.4 to 3.5 (I've followed the official RHEV
documentation) the Hosted Engine VM cannot boot up anymore.

hosted-engine --vm-status says Engine status: unknown stale-data
agent.log says: Error: ''NoneType' object has no attribute 'iteritems''

Some logs:
- agent on node ov0h21: http://fpaste.org/144822/
- agent on node ov0h21: http://fpaste.org/144824/
- hosted-engine --vm-status: http://fpaste.org/144825/


Hi Stefano,
can you please provide also the broker log?


- and also this file:
/rhev/data-center/mnt/ov0nfs:_engine/e4e8282e-6bde-4332-ad68-313287b4fc65/ha_agent/hosted-engine.metadata

Thanks,
Jirka



Thank you,
Jirka



Thank you,







Re: [ovirt-users] Hosted Engine fail after upgrading to 3.5

2014-10-24 Thread Jiri Moskovcak

On 10/24/2014 02:12 PM, Stefano Stagnaro wrote:

Hi Jirka,

thank you for the reply. I've uploaded all the relevant logs in here: 
https://www.dropbox.com/sh/qh2rbews45ky2g8/AAC4_4_j94cw6sI_hfaSFg-Fa?dl=0

Thank you,



Hi Stefano,
I'd say, that agent is not able to parse the metadata from the previous 
version, so as a workaround before I fix it you can try to zero out the 
metadata file (backup the original just in case..)


1. stop agent and broker on all hosts
2. truncate the file

this should do the trick:

$ service ovirt-ha-agent stop; service ovirt-ha-broker stop;
$ truncate --size 0 
/rhev/data-center/mnt/ov0nfs:_engine/e4e8282e-6bde-4332-ad68-313287b4fc65/ha_agent/hosted-engine.metadata 

$ truncate --size 1M 
/rhev/data-center/mnt/ov0nfs:_engine/e4e8282e-6bde-4332-ad68-313287b4fc65/ha_agent/hosted-engine.metadata

$ service ovirt-ha-broker start; service ovirt-ha-agent start
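The effect of the two truncate calls can be checked safely on a scratch file before touching the real metadata path (a sketch: size 0 wipes the old contents, size 1M pads the file back with zeros so the agent finds a correctly-sized, empty metadata file):

```shell
# Demonstrate the zero-out trick on a throwaway file instead of the
# real metadata file referenced above.
f=$(mktemp)
echo "stale metadata" > "$f"
truncate --size 0 "$f"     # wipe the contents
truncate --size 1M "$f"    # pad back to the expected size with zeros
stat -c %s "$f"            # prints 1048576
rm -f "$f"
```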


--Jirka


Re: [ovirt-users] Hosted engine on CentOS 6.5

2014-10-23 Thread Jiri Moskovcak

On 10/23/2014 02:51 AM, Robert Story wrote:

I'm trying a new oVirt 3.5 hosted engine on a fresh CentOS 6.5 install. The
engine VM install goes fine, but when I get back to the host to continue
the install, I get this error:

[ INFO  ] Engine replied: DB Up!Welcome to Health Status!
[ ERROR ] Cannot automatically add the host to cluster None: HTTP Status 401
[ ERROR ] Failed to execute stage 'Closing up': Cannot add the host to cluster 
None

This is the same error reported on a CentOS 7 install [1], but that was
apparently the result of some jpackage confusion, and I'm working with a
fresh install.

During the engine install I had specified the Application mode as oVirt only
(no gluster). So I tried again on another host (again, fresh install), and
this time specified Both for Application mode, but I got the same error.

I've attached the host and engine install logs from the second install...
Any suggestions appreciated.


Robert



Hi Robert,
to be able to investigate this I'm gonna need the engine.log.

Thanks,
Jirka









Re: [ovirt-users] Self hosted engine and storage domain limitations

2014-10-23 Thread Jiri Moskovcak

On 10/22/2014 03:52 PM, Gianluca Cecchi wrote:

Hello,
now with oVirt 3.5 there is also iSCSI with NFS as backed storage for
Hosted Engine.
What is the limitation about the storage domain? The integration in
setup scripts or due to storage type itself in engine mgmt?


Both ;) The storage is managed via vdsm, so whatever vdsm supports, HE 
should support, but it requires some work in the setup and the agent code.




What I mean is: if in some way one initially configures the engine on an
external server (physical or virtual), could then be feasible migrating
it manually in the oVirt environment (with backup/restore steps)?



- yes


Is there any plan to add for example in 3.6 another storage domain type
such as local posixfs, or some other type to be able to realize sort of
vSphere vSAN infrastructure for small/labs environments?



Yes, so far we have a plan to add Fibre Channel support in 3.6, and maybe 
we'll throw in other popular storage backends as well, but nothing specific is 
decided yet. You can file a feature request in our bugzilla [1] if you 
have a good use case for some storage backend; we will definitely 
consider it.


Regards,
Jirka


[1] https://bugzilla.redhat.com/enter_bug.cgi?product=oVirt

Gianluca







Re: [ovirt-users] Hosted-engine on CentOS7

2014-10-20 Thread Jiri Moskovcak

On 10/20/2014 04:02 PM, Finstrle, Ludek wrote:


Sandro Bonazzola píše v Po 20. 10. 2014 v 09:14 +0200:
  Il 17/10/2014 23:57, Finstrle, Ludek ha scritto:
  
   Hi,
  
   I'm trying to install hosted-engine (ovirt 3.5 rc5)
   on CentOS 7 without success.
   I have host with name lvm017.lab.ovirt with CentOS 7.
   I install there ovirt-hosted-engine-setup-1.2.1-1.el7.noarch.
  
   I run hosted-engine --deploy and answer all questions.
   I install CentOS 6.5 as OS for engine and install
   ovirt 3.5 rc5 engine (+ engine-setup).
  
   When I continue (2nd time: 1st reboot, 2nd engine done)
   with hosted-engine installation I always get:
  
   [ INFO ] Engine replied: DB Up!Welcome to Health Status!
   [ ERROR ] Cannot automatically add the host to cluster None: HTTP
Status 500
   [ ERROR ] Failed to execute stage 'Closing up': Cannot add the host
to cluster None
 
 
  You should have got a question about which cluster should be used for
adding the host.
  The cluster name must exist in the engine for allowing setup to add
the host to the cluster.
  Can you retry using oVirt 3.5 we released last Friday ensuring that
the cluster name (usually Default) is the same of the cluster you're
adding the
  host?

No question for cluster name (I also tried installation with edited
answer.conf: OVEHOSTED_ENGINE/clusterName=str:Default without
success). I get OVEHOSTED_ENGINE/clusterName=none:None from
installation.
You can see that Kapetanakis is asking about problem with API (the same
error on engine side -
https://www.mail-archive.com/users@ovirt.org/msg21780.html) after
upgrade to ovirt 3.5. Could it be related?

I installed ovirt-release35 on the physical host and yum update said No
packages marked for update. I'm trying to go through installation once
again but I'm a little bit skeptical.

I see a lot of upgrade success but no new install success. Is somebody
able to install hosted-engine on rhel/centos 7 with ovirt 3.5?



I was successful installing the HE on Fedora 20, which should be very 
similar to rhel7. I'm going to try it on the rhel7 host just to be sure. 
Will update you with the results asap.


--Jirka


Cheers,

Luf

 
 
   [ INFO ] Stage: Clean up
  
   ovirt-hosted-engine-setup log:
   2014-10-17 23:26:26 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._closeup:415 Connecting to the Engine
   2014-10-17 23:26:28 DEBUG
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._closeup:502 Cannot add the host to cluster None
   Traceback (most recent call last):
   File
/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/engine/add_host.py,
line 426, in _closeup
   ca_file=self.cert,
   File /usr/lib/python2.7/site-packages/ovirtsdk/api.py, line 154,
in __init__
   url=''
   File
/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py,
line 118, in request
   persistent_auth=self._persistent_auth)
   File
/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py,
line 146, in __doRequest
   persistent_auth=persistent_auth
   File /usr/lib/python2.7/site-packages/ovirtsdk/web/connection.py,
line 134, in doRequest
   raise RequestError, response
   RequestError:
   status: 500
   reason: Internal Server Error
   detail: HTTP Status 500
   2014-10-17 23:26:28 ERROR
otopi.plugins.ovirt_hosted_engine_setup.engine.add_host
add_host._closeup:510 Cannot automatically add the host to cluster None:
   HTTP Status 500
  
   2014-10-17 23:26:28 DEBUG otopi.context context._executeMethod:152
method exception
   Traceback (most recent call last):
   File /usr/lib/python2.7/site-packages/otopi/context.py, line 142,
in _executeMethod
   method['method']()
   File
/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/engine/add_host.py,
line 517, in _closeup
   cluster=cluster_name,
   RuntimeError: Cannot add the host to cluster None
   2014-10-17 23:26:28 ERROR otopi.context context._executeMethod:161
Failed to execute stage 'Closing up': Cannot add the host to cluster None
  
   There is nothing except DEBUG/INFO in vdsm.log
  
   server.log from the engine:
   2014-10-17 23:26:26,143 INFO [org.hibernate.validator.util.Version]
(ajp--127.0.0.1-8702-1) Hibernate Validator 4.2.0.Final
   2014-10-17 23:26:28,606 WARN [org.jboss.modules]
(ajp--127.0.0.1-8702-4) Failed to define class
org.apache.fop.apps.FopFactory in Module
   org.apache.xmlgraphics.fop:main from local module loader @51a30bb
(roots:
  
/var/lib/ovirt-engine/jboss_runtime/modules/00-ovirt-engine-modules,/var/lib/ovirt-engine/jboss_runtime/modules/01-ovirt-engine-jboss-as-modules):
   org.jboss.modules.ModuleLoadError: Error loading module from
  
/var/lib/ovirt-engine/jboss_runtime/modules/00-ovirt-engine-modules/org/apache/xmlgraphics/batik/main/module.xml
   at
org.jboss.modules.ModuleLoadException.toError(ModuleLoadException.java:78)
   at org.jboss.modules.Module.getPathsUnchecked(Module.java:1166)
   at 

Re: [ovirt-users] [hosted-engine-ha] restart-loop

2014-10-02 Thread Jiri Moskovcak

On 10/01/2014 02:39 PM, Daniel Helgenberger wrote:


On 01.10.2014 13:33, Jiri Moskovcak wrote:

On 10/01/2014 01:17 PM, Daniel Helgenberger wrote:

Hello Jirka,
On 01.10.2014 09:10, Jiri Moskovcak wrote:

Hi Daniel,
from the logs it seems like you ran into [1]. It should be fixed in
ovirt-hosted-engine-ha-1.1.5 (part of oVirt 3.4.2).

I am running 3.4.4 - and from hosted-engine --vm-status both hosts had a
score of 2400...

- doesn't seem like it from the logs, I can see the transition from
EngineStart to EngineUp and directly to EngineUpBadHealth, if you have
the latest version it should go to the EngineStarting before it's
EngineUp, are you sure you've restarted the services (broker and agent)
after update? Please provide output of rpm -q ovirt-hosted-engine-ha.

here you go:
rpm -q ovirt-hosted-engine-ha
ovirt-hosted-engine-ha-1.1.5-1.el6.noarch


also, I upgraded to 3.4.3 prior to 3.4.4. I cannot recall whether I
restarted ovirt-ha-agent, but it is highly likely. Here are the system reboots
after kernel updates:
reboot   system boot  2.6.32-431.29.2. Tue Sep 30 21:46 - 14:36  (16:50)
reboot   system boot  2.6.32-431.29.2. Mon Sep 29 12:19 - 21:44 (1+09:24)
reboot   system boot  2.6.32-431.29.2. Fri Sep 12 08:47 - 12:17 (17+03:30)
reboot   system boot  2.6.32-431.20.3. Mon Sep  1 17:48 - 08:44 (10+14:56)


ok, so please, just to be 100% sure, check the version on both hosts (it 
should be >= 1.1.5), restart the broker and agent, and then try to 
reproduce the problem. I went thru the code in 1.1.5 and I don't see any 
code path which could take the agent from EngineStart to EngineUp 
without going thru the EngineStarting state - this was the behavior 
prior to 1.1.5.
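The ">= 1.1.5" check can be scripted on each host (a sketch: `sort -V` orders version strings component by component, and the installed version would come from `rpm -q --queryformat '%{VERSION}' ovirt-hosted-engine-ha` — the placeholder value below is an assumption):

```shell
# Compare an installed version against the required minimum. sort -V
# compares versions numerically per component, so 1.1.10 > 1.1.9
# (plain string comparison would get that wrong).
installed=1.1.5   # placeholder: substitute the real rpm -q output here
minimum=1.1.5
lowest=$(printf '%s\n%s\n' "$minimum" "$installed" | sort -V | head -n1)
if [ "$lowest" = "$minimum" ]; then
    echo "version ok - restart ovirt-ha-broker and ovirt-ha-agent"
else
    echo "too old - upgrade first"
fi
```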


Regards,
Jirka



Thanks,
Jirka


--Jirka

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1093366

On 09/27/2014 12:40 PM, Daniel Helgenberger wrote:

Hello,

before filing a BZ against 3.4 branch I wanted to get some input on the
following issue:

Steps, root shell on one engine-ha hosts, using hosted-engine cmd:
1. set global maintenance
2. shutdown hosted-engine vm
(do some work)
3. disable global maintenance

Result: My engine was started and immediately powered down again, in a loop.
I could only manually break this with:
1. enable global mt. again
2. start engine
3. disable global mt.

I attached the hosts' engine-ha broker logs as well as agent logs, from
today 12:00  to 12:27, right after I 'fixed' this.
Note, the engine was started on nodehv02 automatically after i disabled
global mt. @ about 12:05

Thanks












Re: [ovirt-users] [hosted-engine-ha] restart-loop

2014-10-01 Thread Jiri Moskovcak

On 10/01/2014 01:17 PM, Daniel Helgenberger wrote:

Hello Jirka,
On 01.10.2014 09:10, Jiri Moskovcak wrote:

Hi Daniel,
from the logs it seems like you ran into [1]. It should be fixed in
ovirt-hosted-engine-ha-1.1.5 (part of oVirt 3.4.2).

I am running 3.4.4 - and from hosted-engine --vm-status both hosts had a
score of 2400...


- doesn't seem like it from the logs, I can see the transition from 
EngineStart to EngineUp and directly to EngineUpBadHealth, if you have 
the latest version it should go to the EngineStarting before it's 
EngineUp, are you sure you've restarted the services (broker and agent) 
after update? Please provide output of rpm -q ovirt-hosted-engine-ha.


Thanks,
Jirka





--Jirka

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1093366

On 09/27/2014 12:40 PM, Daniel Helgenberger wrote:

Hello,

before filing a BZ against 3.4 branch I wanted to get some input on the
following issue:

Steps, root shell on one engine-ha hosts, using hosted-engine cmd:
1. set global maintenance
2. shutdown hosted-engine vm
(do some work)
3. disable global maintenance

Result: My engine was started and immediately powered down again, in a loop.
I could only manually break this with:
1. enable global mt. again
2. start engine
3. disable global mt.

I attached the hosts' engine-ha broker logs as well as agent logs, from
today 12:00  to 12:27, right after I 'fixed' this.
Note, the engine was started on nodehv02 automatically after i disabled
global mt. @ about 12:05

Thanks












Re: [ovirt-users] hosted engine setup on second host fails

2014-09-24 Thread Jiri Moskovcak

Hi,
it's getting a little too long, so please forgive the top post. The 
engine emits the message "Host with the same address already exists." 
only if you're trying to add a host with the same hostname; it has no 
connection to its ID, so please check that your hosts have unique 
hostnames (e.g. I ran into this when I didn't get a hostname from DHCP 
and both of my hosts were localhost.localdomain).
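The uniqueness check itself is a one-line pipeline over the hostnames reported by `hostname -f` on each host (the duplicated localhost.localdomain below is the example case from this thread):

```shell
# Any name printed by `sort | uniq -d` is reported by more than one
# host and will trigger "Host with the same address already exists".
printf 'localhost.localdomain\nlocalhost.localdomain\n' | sort | uniq -d
# prints: localhost.localdomain
```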


Regards,
Jirka

On 09/24/2014 07:59 AM, Yedidyah Bar David wrote:

- Original Message -

From: Yedidyah Bar David d...@redhat.com
To: Itamar Heim ih...@redhat.com
Cc: Stefan Wendler stefan.wend...@tngtech.com, users@ovirt.org
Sent: Wednesday, September 24, 2014 8:40:58 AM
Subject: Re: [ovirt-users] hosted engine setup on second host fails

- Original Message -

From: Itamar Heim ih...@redhat.com
To: Stefan Wendler stefan.wend...@tngtech.com
Cc: Yedidyah Bar David ybard...@redhat.com, users@ovirt.org
Sent: Tuesday, September 23, 2014 7:07:12 PM
Subject: Re: [ovirt-users] hosted engine setup on second host fails


On Sep 23, 2014 7:03 PM, Stefan Wendler stefan.wend...@tngtech.com wrote:


On 09/23/2014 17:01, Itamar Heim wrote:

On 09/23/2014 05:17 PM, Stefan Wendler wrote:

On 09/22/2014 10:52, Stefan Wendler wrote:

On 09/19/2014 15:58, Itamar Heim wrote:

On 09/19/2014 03:32 PM, Stefan Wendler wrote:

Hi there.

I'm trying to install a hosted-engine on our second node (the first
engine runs on node1).

But I always get the message:

[ ERROR ] Cannot automatically add the host to the Default cluster:
Cannot add Host. Host with the same address already exists.

I'm not entirely sure what I have to do when this message comes, so
I
just press ENTER:

###
To continue make a selection from the options below:
  (1) Continue setup - engine installation is complete
  (2) Power off and restart the VM
  (3) Abort setup

  (1, 2, 3)[1]:


Is there any other interaction required prior to selecting 1?

In the Web Gui I get the following message:

X Adding new Host hosted_engine_2 to Cluster Default

Here is the console output:

# hosted-engine --deploy
[ INFO  ] Stage: Initializing
  Continuing will configure this host for serving as
hypervisor
and create a VM where you have to install oVirt Engine afterwards.
  Are you sure you want to continue? (Yes, No)[Yes]:
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
  Configuration files: []
  Log file:
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20140919141012-k2lag6.log


  Version: otopi-1.2.3 (otopi-1.2.3-1.el6)
[ INFO  ] Hardware supports virtualization
[ INFO  ] Bridge ovirtmgmt already created
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization

  --== STORAGE CONFIGURATION ==--

  During customization use CTRL-D to abort.
  Please specify the storage you would like to use (nfs3,
nfs4)[nfs3]:
  Please specify the full shared storage connection path
to use
(example: host:/path): some address:/volume1
  The specified storage location already contains a data
domain.
Is this an additional host setup (Yes, No)[Yes]?
[ INFO  ] Installing on additional host
  Please specify the Host ID [Must be integer, default:
  2]:
  The Host ID is already known. Is this a re-deployment
on an
additional host that was previously set up (Yes, No)[Yes]?


I admit I never tried that. Not sure how exactly it's supposed to work.


A bit more details:

Normally, a host is registered only in the engine's database. A hosted
engine is additionally registered in a special hosted-engine metadata
file managed by the ha daemon [1]. The question above appears if the host id
is found in this metadata file. It seems we never check if it's already
in the engine database - the assumption is that if an existing host is
re-purposed as a hosted-engine, it should first be uninstalled - at least
not be in use (no VMs) and removed from its cluster/dc/the engine.

[1] http://www.ovirt.org/images/d/d5/Fosdem-hosted-engine.pdf pages 17-18





  --== SYSTEM CONFIGURATION ==--

[WARNING] A configuration file must be supplied to deploy Hosted
Engine
on an additional host.
  The answer file may be fetched from the first host
using scp.
  If you do not want to download it automatically you can
abort
the setup answering no to the following question.
  Do you want to scp the answer file from the first host?
(Yes,
No)[Yes]:
  Please provide the FQDN or IP of the first host:
node1.domain
  Enter 'root' user password for host node1.domain:
[ INFO  ] Answer file successfully downloaded

  --== NETWORK CONFIGURATION ==--

  The following CPU types are 

Re: [ovirt-users] hosted engine - can't contact destroyed storage

2014-09-03 Thread Jiri Moskovcak
I'd just like to add a note that this problem is not directly related 
to the fact that it's a hosted engine.


--Jirka

On 09/02/2014 01:36 PM, James Clarke wrote:

Might as well admit to fixing it myself.  SSH'd to the host and saw that
this share was mounted.  Forced an umount and now the host is up.


Weird, but working.
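The manual fix above (finding the stale mount and forcing it off) can be sketched as follows; the export path is taken from the vdsm traceback quoted in this thread, and the grep is demonstrated against a sample /proc/mounts line rather than the live system:

```shell
# Locate the leftover mount point for the destroyed export domain in a
# /proc/mounts-style line, then (on the affected host) lazy/force-unmount
# it so vdsm stops probing the dead storage.
mounts='10.0.0.30:/mnt/kvm1/export /rhev/data-center/mnt/10.0.0.30:_mnt_kvm1_export nfs rw 0 0'
printf '%s\n' "$mounts" | grep -o '/rhev/[^ ]*_mnt_kvm1_export'
# then, on the host:
#   umount -f -l /rhev/data-center/mnt/10.0.0.30:_mnt_kvm1_export
```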


Thanks,

James

---

*
*

*From:*James Clarke

*Sent:* 02 September 2014 12:25
*To:* users@ovirt.org
*Subject:* FW: hosted engine - can't contact destroyed storage

Hi All!


I decommissioned an NFS export domain this morning, ended up 'destroying'
it through the web interface as detaching kept failing. Now one of my
hosts keeps flipping between 'Non Operational' and 'Unassigned'.  All of
the VMs on this host are still running.  I am in global-maintenance to
prevent migrations etc.


vdsm.log seems to indicate that it is related to connecting to the
destroyed storage domain:


Thread-206::WARNING::2014-09-02
12:19:05,692::fileSD::673::scanDomains::(collectMetaFiles) Metadata
collection for domain path
/rhev/data-center/mnt/10.0.0.30:_mnt_kvm1_export timedout
Traceback (most recent call last):
   File /usr/share/vdsm/storage/fileSD.py, line 662, in collectMetaFiles
 sd.DOMAIN_META_DATA))
   File /usr/share/vdsm/storage/remoteFileHandler.py, line 297, in
callCrabRPCFunction
 *args, **kwargs)
   File /usr/share/vdsm/storage/remoteFileHandler.py, line 184, in
callCrabRPCFunction
 rawLength = self._recvAll(LENGTH_STRUCT_LENGTH, timeout)
   File /usr/share/vdsm/storage/remoteFileHandler.py, line 150, in
_recvAll
 raise Timeout()
Timeout
Thread-206::DEBUG::2014-09-02
12:19:05,695::remoteFileHandler::260::RepoFileHelper.PoolHandler::(stop)
Pool handler existed, OUT: '' ERR: ''
Thread-210::WARNING::2014-09-02
12:19:05,745::fileSD::673::scanDomains::(collectMetaFiles) Metadata
collection for domain path
/rhev/data-center/mnt/10.0.0.30:_mnt_kvm1_export timedout
Traceback (most recent call last):
   File /usr/share/vdsm/storage/fileSD.py, line 662, in collectMetaFiles
 sd.DOMAIN_META_DATA))
   File /usr/share/vdsm/storage/remoteFileHandler.py, line 297, in
callCrabRPCFunction
 *args, **kwargs)
   File /usr/share/vdsm/storage/remoteFileHandler.py, line 184, in
callCrabRPCFunction
 rawLength = self._recvAll(LENGTH_STRUCT_LENGTH, timeout)
   File /usr/share/vdsm/storage/remoteFileHandler.py, line 150, in
_recvAll
 raise Timeout()
Timeout


Given that this storage domain doesn't exist any more and is not visible
in the web interface, how can I get this host to stop trying to connect
to it, initialize and become online?



James









Re: [ovirt-users] iSCSI self hosted engine

2014-08-28 Thread Jiri Moskovcak

On 08/27/2014 02:29 PM, Markus Mathes wrote:

Hi,

I thought about a self hosted engine setup using 2 hosts connected to
a iSCSI storage.
A few questions arose and I didn't succeed finding the answers on the
web. I somehow don't get how the iSCSI integration in ovirt works.

- As far as I understand from the information I found, all iSCSI
access for the virtual machines is done through one host. Does this
mean, that all disk related traffic of the other host is going to the
first host and then to the iSCSI storage?


- no, every host has separate access to the storage; only the storage 
setup is done by a single host, and then it's available to everyone.


In this case. How are the

virtual disk images then made available for the VMs on the other host?
- If the first host fails, the engine will get started on the other
host. Will the engine deal with the fact, that now the iSCSI storage
domain has to be accessed using the other host. Or is iSCSI just not
suitable for this kind of usage.
- Has the self hosted engine to use a different iSCSI lun than the
storage domain?

Thanks
Markus





Re: [ovirt-users] Self-hosted engine won't start

2014-08-18 Thread Jiri Moskovcak

Hi John,
this is the patch fixing your problem [1]. It can be found at the top of 
that bz page. It's really a simple change, so if you want you can just 
change it manually on your system without waiting for a patched version.


--Jirka

[1] 
http://gerrit.ovirt.org/#/c/31510/2/ovirt_hosted_engine_ha/agent/states.py


On 08/18/2014 12:17 AM, John Gardeniers wrote:

Hi Jirka,

Thanks for the update. It sounds like the same bug but with a few extra
issues thrown in. e.g. Comment 9 seems to me to be a completely separate
bug, although it may affect the issue I reported.

I can't see any mention of how the problem is being resolved, which I am
interested in, but will keep an eye on it.

I'll try the patched version when I get the time and enthusiasm to give
it another crack.

regards,
John


On 14/08/14 22:57, Jiri Moskovcak wrote:

Hi John,
after a deeper look I realized that you're probably facing [1]. The
patch is ready and I will also backport it to 3.4 branch.

--Jirka

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1093638

On 07/29/2014 11:41 PM, John Gardeniers wrote:

Hi Jiri,

Sorry, I can't supply the log because the hosts have been recycled but
I'm sure it would have contained exactly the same information that you
already have from host2. It's a classic deadlock situation that should
never be allowed to happen. A simple and time proven solution was in my
original post.

The reason for recycling the hosts is that I discovered yesterday that
although the engine was still running it could not be accessed in any
way. Upon further finding that there was no way to get it restarted I
decided to abandon the whole idea of self-hosting until such time as I
see an indication that it's production ready.

regards,
John


On 29/07/14 22:52, Jiri Moskovcak wrote:

Hi John,
thanks for the logs. Seems like the engine is running on host2 and it
decides that it doesn't have the best score and shuts the engine down
and then neither of them want's to start the vm until you restart the
host2. Unfortunately the logs doesn't contain the part from host1 from
2014-07-24 09:XX which I'd like to investigate because it might
contain the information why host1 refused to start the vm when host2
killed it.

Regards,
Jirka

On 07/28/2014 02:57 AM, John Gardeniers wrote:

Hi Jira,

Version: ovirt-hosted-engine-ha-1.1.5-1.el6.noarch

Attached are the logs. Thanks for looking.

Regards,
John


On 25/07/14 17:47, Jiri Moskovcak wrote:

On 07/24/2014 11:37 PM, John Gardeniers wrote:

Hi Jiri,

Perhaps you can tell me how to determine the exact version of
ovirt-hosted-engine-ha.


Centos/RHEL/Fedora: rpm -q ovirt-hosted-engine-ha


As for the logs, I am not going to attach 60MB
of logs to an email,


- there are other ways to share the logs


nor can I see any imaginable reason for you wanting
to see them all, as the bulk is historical. I have already included
the
*relevant* sections. However, if you think there may be some other
section that may help you feel free to be more explicit about
what you
are looking for. Right now I fail to understand what you might
hope to
see in logs from several weeks ago that you can't get from the last
day
or so.



It's a standard way, people tend to think that they know what is a
relevant part of a log, but in many cases they fail. Asking for the
whole logs has proven to be faster than trying to find the relevant
part through the user. And you're right, I don't need the logs from
last week, just logs since the last start of the services when you
observed the problem.

Regards,
Jirka


regards,
John


On 24/07/14 19:10, Jiri Moskovcak wrote:

Hi, please provide the the exact versions of ovirt-hosted-engine-ha
and all logs from /var/log/ovirt-hosted-engine-ha/

Thank you,
Jirka

On 07/24/2014 01:29 AM, John Gardeniers wrote:

Hi All,

I have created a lab with 2 hypervisors and a self-hosted engine.
Today
I followed the upgrade instructions as described in
http://www.ovirt.org/Hosted_Engine_Howto and rebooted the
engine. I
didn't really do an upgrade but simply wanted to test what would
happen
when the engine was rebooted.

When the engine didn't restart I re-ran hosted-engine
--set-maintenance=none and restarted the vdsm, ovirt-ha-agent and
ovirt-ha-broker services on both nodes. 15 minutes later it still
hadn't
restarted, so I then tried rebooting both hypervisors. After an
hour
there was still no sign of the engine starting. The agent logs
don't
help me much. The following bits are repeated over and over.

ovirt1 (192.168.19.20):

MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)




Trying: notify time=1406157520.27 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt1.om.net'
MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)




Success, was notification of state_transition
(EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18

Re: [ovirt-users] Self-hosted engine won't start

2014-08-14 Thread Jiri Moskovcak

Hi John,
after a deeper look I realized that you're probably facing [1]. The 
patch is ready and I will also backport it to the 3.4 branch.


--Jirka

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1093638

On 07/29/2014 11:41 PM, John Gardeniers wrote:

Hi Jiri,

Sorry, I can't supply the log because the hosts have been recycled but
I'm sure it would have contained exactly the same information that you
already have from host2. It's a classic deadlock situation that should
never be allowed to happen. A simple and time proven solution was in my
original post.

The reason for recycling the hosts is that I discovered yesterday that
although the engine was still running it could not be accessed in any
way. Upon further finding that there was no way to get it restarted I
decided to abandon the whole idea of self-hosting until such time as I
see an indication that it's production ready.

regards,
John


On 29/07/14 22:52, Jiri Moskovcak wrote:

Hi John,
thanks for the logs. Seems like the engine is running on host2 and it
decides that it doesn't have the best score and shuts the engine down
and then neither of them wants to start the vm until you restart the
host2. Unfortunately the logs don't contain the part from host1 from
2014-07-24 09:XX which I'd like to investigate because it might
contain the information why host1 refused to start the vm when host2
killed it.

Regards,
Jirka

On 07/28/2014 02:57 AM, John Gardeniers wrote:

Hi Jirka,

Version: ovirt-hosted-engine-ha-1.1.5-1.el6.noarch

Attached are the logs. Thanks for looking.

Regards,
John


On 25/07/14 17:47, Jiri Moskovcak wrote:

On 07/24/2014 11:37 PM, John Gardeniers wrote:

Hi Jiri,

Perhaps you can tell me how to determine the exact version of
ovirt-hosted-engine-ha.


Centos/RHEL/Fedora: rpm -q ovirt-hosted-engine-ha


As for the logs, I am not going to attach 60MB
of logs to an email,


- there are other ways to share the logs


nor can I see any imaginable reason for you wanting
to see them all, as the bulk is historical. I have already included
the
*relevant* sections. However, if you think there may be some other
section that may help you feel free to be more explicit about what you
are looking for. Right now I fail to understand what you might hope to
see in logs from several weeks ago that you can't get from the last
day
or so.



It's standard practice: people tend to think they know which part of a
log is relevant, but in many cases they don't. Asking for the whole
logs has proven to be faster than trying to find the relevant part
through the user. And you're right, I don't need the logs from last
week, just the logs since the last start of the services when you
observed the problem.

Regards,
Jirka


regards,
John


On 24/07/14 19:10, Jiri Moskovcak wrote:

Hi, please provide the exact versions of ovirt-hosted-engine-ha
and all logs from /var/log/ovirt-hosted-engine-ha/

Thank you,
Jirka

On 07/24/2014 01:29 AM, John Gardeniers wrote:

Hi All,

I have created a lab with 2 hypervisors and a self-hosted engine.
Today
I followed the upgrade instructions as described in
http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
didn't really do an upgrade but simply wanted to test what would
happen
when the engine was rebooted.

When the engine didn't restart I re-ran hosted-engine
--set-maintenance=none and restarted the vdsm, ovirt-ha-agent and
ovirt-ha-broker services on both nodes. 15 minutes later it still
hadn't
restarted, so I then tried rebooting both hypervisors. After an hour
there was still no sign of the engine starting. The agent logs don't
help me much. The following bits are repeated over and over.

ovirt1 (192.168.19.20):

MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)



Trying: notify time=1406157520.27 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt1.om.net'
MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)



Success, was notification of state_transition
(EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)



Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)



Best remote host 192.168.19.21 (id: 2, score: 2400)

ovirt2 (192.168.19.21):

MainThread::INFO::2014-07-24
09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)



Trying: notify time=1406157484.01 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt2.om.net'
MainThread::INFO::2014-07-24
09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)



Success, was notification of state_transition
(EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18

Re: [ovirt-users] ovirt-ha notifications flood of ovirt-hosted-engine state transition GlobalMaintenance-GlobalMaintenance messages?

2014-08-11 Thread Jiri Moskovcak

On 08/08/2014 05:29 PM, Darrell Budic wrote:


On Aug 8, 2014, at 1:22 AM, Jiri Moskovcak jmosk...@redhat.com wrote:


On 08/07/2014 07:08 PM, Darrell Budic wrote:

Why do the ha brokers send this message every 15 seconds? It isn’t really a 
state transition, and it’s a little excessive for a reminder that it’s in 
Global Maintenance. This is with centos 6.5 hosts.
Anything I can do on my side to get it to send just one for the initial 
transition into maintenance mode?


- unfortunately with the current code it's either a message every 15secs or 
never
- if you want to silence it, you can edit 
/etc/ovirt-hosted-engine-ha/agent-log.conf and change

[logger_root]
level=INFO

to

[logger_root]
level=ERROR

--Jirka
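
The edit Jirka describes is a one-line change to the root logger level in the agent's
ini-style log config. A minimal sketch of the same edit done programmatically (on a
scratch file so it is safe to run anywhere; the real path
/etc/ovirt-hosted-engine-ha/agent-log.conf and the `handlers=` line are assumptions
about the host, and after editing the real file you would restart ovirt-ha-agent):

```python
# Sketch only, not the actual agent code: flip the root logger in an
# ini-style log config from INFO to ERROR, which silences the 15-second
# state-transition notifications at the cost of all INFO-level messages.
import configparser
import tempfile

# Hypothetical sample of the relevant section of agent-log.conf.
sample = """\
[logger_root]
level=INFO
handlers=syslog,logfile
"""

with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as f:
    f.write(sample)
    path = f.name  # stand-in for /etc/ovirt-hosted-engine-ha/agent-log.conf

config = configparser.ConfigParser()
config.read(path)
config["logger_root"]["level"] = "ERROR"  # the edit Jirka suggests

with open(path, "w") as f:
    config.write(f)

# Re-read to confirm the change took effect.
check = configparser.ConfigParser()
check.read(path)
print(check["logger_root"]["level"])  # ERROR
```

In practice most admins would just open the file in an editor, as suggested above; the
point is only that the change is a single key in one section.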


Thanks Jirka, I may silence it :)

Every so often, one of my groups decides to send me a bunch of erroneous messages. 
Just figured out that if I started the ovirt-engine VM by hand, the ha-agent 
doesn’t seem to pick up that it’s running, but sends me “EngineDown-EngineStart” 
and “EngineStart-EngineUp” messages every 10 mins or so. Doesn’t affect the 
running engine, but more spam :) When I shut the running engine down and let the 
agent start it up, it shuts up about it, so something happens during automatic 
launch that isn’t happening if it’s launched with --vm-start maybe?

I’ve had it send me similar messages seemingly randomly as well, no apparent 
cause and the engine vm shows no interruptions in uptime. Then it gets quiet 
again an hour or 3 later. No idea what caused it, just thought I’d mention it 
in context.

  -Darrell



Hi Darrell,
that's interesting would you mind sharing the logs and the exact version 
of ovirt-hosted-engine-ha package if you run into this problem again?


Thanks,
Jirka
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] ovirt-ha notifications flood of ovirt-hosted-engine state transition GlobalMaintenance-GlobalMaintenance messages?

2014-08-08 Thread Jiri Moskovcak

On 08/07/2014 07:08 PM, Darrell Budic wrote:

Why do the ha brokers send this message every 15 seconds? It isn’t really a 
state transition, and it’s a little excessive for a reminder that it’s in 
Global Maintenance. This is with centos 6.5 hosts.
Anything I can do on my side to get it to send just one for the initial 
transition into maintenance mode?


- unfortunately with the current code it's either a message every 15secs 
or never
- if you want to silence it, you can edit 
/etc/ovirt-hosted-engine-ha/agent-log.conf and change


[logger_root]
level=INFO

to

[logger_root]
level=ERROR

--Jirka




   -Darrell
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 3.5 2nd test day report - hosted engine

2014-07-30 Thread Jiri Moskovcak

On 07/30/2014 02:02 AM, Greg Padgett wrote:

Hi all,

Today I tested Hosted Engine on iSCSI (setup and operation) on Fedora
20.  The first part of setup went smoothly, but there were some hiccups
I eventually ran into:

  - HA services didn't start after setup [1]
  - HA agent failed without reporting an error [2][3]

I also noticed that when an iSCSI target has multiple LUNs, a random (?)
one would be chosen by setup.  I ended up running setup again with only
one LUN available to make sure this didn't cause further errors.

After setup of the first host completed, things seemed to be working
well.  However, after completing 2nd host setup and rebooting the first
host, I have errors in both agent.log files. [4]

I'll follow up with msivak and jmoskovc to see if these are errors on my
part or something I can help troubleshoot.

Thanks,
Greg


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1123285
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1124624
[3] http://gerrit.ovirt.org/#/c/30814/
[4] First host:
   Error: 'path to storage domain 35ff13aa-7ff1-4add-9869-651267e36921 not
   found in /rhev/data-center/mnt' - trying to restart agent
   [followed by agent exiting after too many retries]


- also seems like the vdsm failed to connect the domain


Second host:
   Exception: Failed to start monitoring domain (sd_uuid=35ff13aa-
   7ff1-4add-9869-651267e36921, host_id=2): timeout during domain
   acquisition


- I'm not sure what we can do about this, if vdsm is not able to connect 
the domain after 90 seconds. We can increase the timeout, but how much 
is enough and yet not too much?


--Jirka


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Self-hosted engine won't start

2014-07-29 Thread Jiri Moskovcak

Hi John,
thanks for the logs. Seems like the engine is running on host2 and it 
decides that it doesn't have the best score and shuts the engine down 
and then neither of them wants to start the vm until you restart the 
host2. Unfortunately the logs don't contain the part from host1 from 
2014-07-24 09:XX which I'd like to investigate because it might contain 
the information why host1 refused to start the vm when host2 killed it.


Regards,
Jirka

On 07/28/2014 02:57 AM, John Gardeniers wrote:

Hi Jirka,

Version: ovirt-hosted-engine-ha-1.1.5-1.el6.noarch

Attached are the logs. Thanks for looking.

Regards,
John


On 25/07/14 17:47, Jiri Moskovcak wrote:

On 07/24/2014 11:37 PM, John Gardeniers wrote:

Hi Jiri,

Perhaps you can tell me how to determine the exact version of
ovirt-hosted-engine-ha.


Centos/RHEL/Fedora: rpm -q ovirt-hosted-engine-ha


As for the logs, I am not going to attach 60MB
of logs to an email,


- there are other ways to share the logs


nor can I see any imaginable reason for you wanting
to see them all, as the bulk is historical. I have already included the
*relevant* sections. However, if you think there may be some other
section that may help you feel free to be more explicit about what you
are looking for. Right now I fail to understand what you might hope to
see in logs from several weeks ago that you can't get from the last day
or so.



It's standard practice: people tend to think they know which part of a
log is relevant, but in many cases they don't. Asking for the whole
logs has proven to be faster than trying to find the relevant part
through the user. And you're right, I don't need the logs from last
week, just the logs since the last start of the services when you
observed the problem.

Regards,
Jirka


regards,
John


On 24/07/14 19:10, Jiri Moskovcak wrote:

Hi, please provide the exact versions of ovirt-hosted-engine-ha
and all logs from /var/log/ovirt-hosted-engine-ha/

Thank you,
Jirka

On 07/24/2014 01:29 AM, John Gardeniers wrote:

Hi All,

I have created a lab with 2 hypervisors and a self-hosted engine.
Today
I followed the upgrade instructions as described in
http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
didn't really do an upgrade but simply wanted to test what would
happen
when the engine was rebooted.

When the engine didn't restart I re-ran hosted-engine
--set-maintenance=none and restarted the vdsm, ovirt-ha-agent and
ovirt-ha-broker services on both nodes. 15 minutes later it still
hadn't
restarted, so I then tried rebooting both hypervisors. After an hour
there was still no sign of the engine starting. The agent logs don't
help me much. The following bits are repeated over and over.

ovirt1 (192.168.19.20):

MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)


Trying: notify time=1406157520.27 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt1.om.net'
MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)


Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)


Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)


Best remote host 192.168.19.21 (id: 2, score: 2400)

ovirt2 (192.168.19.21):

MainThread::INFO::2014-07-24
09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)


Trying: notify time=1406157484.01 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt2.om.net'
MainThread::INFO::2014-07-24
09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)


Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)


Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)


Best remote host 192.168.19.20 (id: 1, score: 2400)

   From the above information I decided to simply shut down one
hypervisor
and see what happens. The engine did start back up again a few minutes
later.

The interesting part is that each hypervisor seems to think the
other is
a better host. The two machines are identical, so there's no reason I
can see for this odd behaviour. In a lab environment this is little
more
than an annoying inconvenience. In a production environment it
would be
completely unacceptable.

May I suggest that this issue be looked into and some means found to
eliminate this kind of mutual exclusion? e.g. After a few minutes

Re: [ovirt-users] Self-hosted engine won't start

2014-07-25 Thread Jiri Moskovcak

On 07/24/2014 11:37 PM, John Gardeniers wrote:

Hi Jiri,

Perhaps you can tell me how to determine the exact version of
ovirt-hosted-engine-ha.


Centos/RHEL/Fedora: rpm -q ovirt-hosted-engine-ha


As for the logs, I am not going to attach 60MB
of logs to an email,


- there are other ways to share the logs


nor can I see any imaginable reason for you wanting
to see them all, as the bulk is historical. I have already included the
*relevant* sections. However, if you think there may be some other
section that may help you feel free to be more explicit about what you
are looking for. Right now I fail to understand what you might hope to
see in logs from several weeks ago that you can't get from the last day
or so.



It's standard practice: people tend to think they know which part of a
log is relevant, but in many cases they don't. Asking for the whole
logs has proven to be faster than trying to find the relevant part
through the user. And you're right, I don't need the logs from last
week, just the logs since the last start of the services when you
observed the problem.


Regards,
Jirka


regards,
John


On 24/07/14 19:10, Jiri Moskovcak wrote:

Hi, please provide the exact versions of ovirt-hosted-engine-ha
and all logs from /var/log/ovirt-hosted-engine-ha/

Thank you,
Jirka

On 07/24/2014 01:29 AM, John Gardeniers wrote:

Hi All,

I have created a lab with 2 hypervisors and a self-hosted engine. Today
I followed the upgrade instructions as described in
http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
didn't really do an upgrade but simply wanted to test what would happen
when the engine was rebooted.

When the engine didn't restart I re-ran hosted-engine
--set-maintenance=none and restarted the vdsm, ovirt-ha-agent and
ovirt-ha-broker services on both nodes. 15 minutes later it still hadn't
restarted, so I then tried rebooting both hypervisors. After an hour
there was still no sign of the engine starting. The agent logs don't
help me much. The following bits are repeated over and over.

ovirt1 (192.168.19.20):

MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)

Trying: notify time=1406157520.27 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt1.om.net'
MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)

Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)

Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)

Best remote host 192.168.19.21 (id: 2, score: 2400)

ovirt2 (192.168.19.21):

MainThread::INFO::2014-07-24
09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)

Trying: notify time=1406157484.01 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt2.om.net'
MainThread::INFO::2014-07-24
09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)

Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)

Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)

Best remote host 192.168.19.20 (id: 1, score: 2400)

  From the above information I decided to simply shut down one hypervisor
and see what happens. The engine did start back up again a few minutes
later.

The interesting part is that each hypervisor seems to think the other is
a better host. The two machines are identical, so there's no reason I
can see for this odd behaviour. In a lab environment this is little more
than an annoying inconvenience. In a production environment it would be
completely unacceptable.

May I suggest that this issue be looked into and some means found to
eliminate this kind of mutual exclusion? e.g. After a few minutes of
such an issue one hypervisor could be randomly given a slightly higher
weighting, which should result in it being chosen to start the engine.

regards,
John
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users








___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org

Re: [ovirt-users] Self-hosted engine won't start

2014-07-24 Thread Jiri Moskovcak
Hi, please provide the exact versions of ovirt-hosted-engine-ha and 
all logs from /var/log/ovirt-hosted-engine-ha/


Thank you,
Jirka

On 07/24/2014 01:29 AM, John Gardeniers wrote:

Hi All,

I have created a lab with 2 hypervisors and a self-hosted engine. Today
I followed the upgrade instructions as described in
http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
didn't really do an upgrade but simply wanted to test what would happen
when the engine was rebooted.

When the engine didn't restart I re-ran hosted-engine
--set-maintenance=none and restarted the vdsm, ovirt-ha-agent and
ovirt-ha-broker services on both nodes. 15 minutes later it still hadn't
restarted, so I then tried rebooting both hypervisors. After an hour
there was still no sign of the engine starting. The agent logs don't
help me much. The following bits are repeated over and over.

ovirt1 (192.168.19.20):

MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1406157520.27 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt1.om.net'
MainThread::INFO::2014-07-24
09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 192.168.19.21 (id: 2, score: 2400)

ovirt2 (192.168.19.21):

MainThread::INFO::2014-07-24
09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1406157484.01 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt2.om.net'
MainThread::INFO::2014-07-24
09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-07-24
09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24
09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 192.168.19.20 (id: 1, score: 2400)

 From the above information I decided to simply shut down one hypervisor
and see what happens. The engine did start back up again a few minutes
later.

The interesting part is that each hypervisor seems to think the other is
a better host. The two machines are identical, so there's no reason I
can see for this odd behaviour. In a lab environment this is little more
than an annoying inconvenience. In a production environment it would be
completely unacceptable.

May I suggest that this issue be looked into and some means found to
eliminate this kind of mutual exclusion? e.g. After a few minutes of
such an issue one hypervisor could be randomly given a slightly higher
weighting, which should result in it being chosen to start the engine.
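
John's suggestion above is essentially a randomized tie-break. A minimal sketch of the
idea, with all names hypothetical (this is not the actual ovirt-ha-agent scoring code):
once two equally scored hosts have been deadlocked past a grace period, each adds a
small random bonus to its advertised score, so with high probability one host ends up
strictly higher and starts the engine.

```python
# Hedged sketch of a randomized tie-break for a symmetric HA deadlock.
# BASE_SCORE mirrors the "score: 2400" seen in the agent logs; everything
# else (grace period, bonus range, function names) is invented for
# illustration.
import random

BASE_SCORE = 2400
DEADLOCK_GRACE = 180  # seconds of mutual "best remote host" before tie-breaking


def effective_score(base_score, deadlock_seconds, rng):
    """Score a host advertises: unchanged normally, plus a small random
    bonus once the symmetric deadlock has persisted past the grace period."""
    if deadlock_seconds < DEADLOCK_GRACE:
        return base_score
    return base_score + rng.randint(1, 50)


def pick_starter(scores):
    """Pick the host with the highest effective score; lowest host id on a
    remaining tie, so the choice is still deterministic."""
    return max(sorted(scores), key=lambda host_id: scores[host_id])


rng = random.Random()
scores = {
    1: effective_score(BASE_SCORE, deadlock_seconds=200, rng=rng),
    2: effective_score(BASE_SCORE, deadlock_seconds=200, rng=rng),
}
print("host %d should start the engine" % pick_starter(scores))
```

Even when both random bonuses happen to collide, the lowest-id fallback guarantees
exactly one host proceeds, which is the property the original setup lacked.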

regards,
John
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Setup of hosted Engine Fails

2014-07-21 Thread Jiri Moskovcak

Hi Andrew,
thanks for debugging this, please create a bug against vdsm to make sure 
it gets proper attention.


Thanks,
Jirka

On 07/19/2014 12:36 PM, Andrew Lau wrote:

Quick update, it seems to be related to the latest vdsm package,

service vdsmd start
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is not configured for vdsm yet
Modules libvirt are not configured
  Traceback (most recent call last):
   File /usr/bin/vdsm-tool, line 145, in module
 sys.exit(main())
   File /usr/bin/vdsm-tool, line 142, in main
 return tool_command[cmd][command](*args[1:])
   File /usr/lib64/python2.6/site-packages/vdsm/tool/configurator.py,
line 282, in isconfigured
 raise RuntimeError(msg)
RuntimeError:

One of the modules is not configured to work with VDSM.
To configure the module use the following:
'vdsm-tool configure [module_name]'.

If all modules are not configured try to use:
'vdsm-tool configure --force'
(The force flag will stop the module's service and start it
afterwards automatically to load the new configuration.)

vdsm: stopped during execute check_is_configured task (task returned
with error code 1).
vdsm start [FAILED]

yum downgrade vdsm*

​Here's the package changes for reference,

-- Running transaction check
--- Package vdsm.x86_64 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm.x86_64 0:4.14.11-0.el6 will be erased
--- Package vdsm-cli.noarch 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm-cli.noarch 0:4.14.11-0.el6 will be erased
--- Package vdsm-python.x86_64 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm-python.x86_64 0:4.14.11-0.el6 will be erased
--- Package vdsm-python-zombiereaper.noarch 0:4.14.9-0.el6 will be a
downgrade
--- Package vdsm-python-zombiereaper.noarch 0:4.14.11-0.el6 will be erased
--- Package vdsm-xmlrpc.noarch 0:4.14.9-0.el6 will be a downgrade
--- Package vdsm-xmlrpc.noarch 0:4.14.11-0.el6 will be erased

service vdsmd start
initctl: Job is already running: libvirtd
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is already configured for vdsm
sanlock service is already configured
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running unified_network_persistence_upgrade
vdsm: Running restore_nets
vdsm: Running upgrade_300_nets
Starting up vdsm daemon:
vdsm start [  OK  ]
[root@ov-hv1-2a-08-23 ~]# service vdsmd status
VDS daemon server is running


On Sat, Jul 19, 2014 at 6:58 PM, Andrew Lau and...@andrewklau.com
mailto:and...@andrewklau.com wrote:

It seems vdsm is not running,

service vdsmd status
VDS daemon is not running, and its watchdog is running

The only logs in /var/log/vdsm/ that appear to have any content is
/var/log/vdsm/supervdsm.log - everything else is blank

MainThread::DEBUG::2014-07-19
18:55:34,793::supervdsmServer::424::SuperVdsm.Server::(main)
Terminated normally
MainThread::DEBUG::2014-07-19
18:55:38,033::netconfpersistence::134::root::(_getConfigs)
Non-existing config set.
MainThread::DEBUG::2014-07-19
18:55:38,034::netconfpersistence::134::root::(_getConfigs)
Non-existing config set.
MainThread::DEBUG::2014-07-19
18:55:38,058::supervdsmServer::384::SuperVdsm.Server::(main) Making
sure I'm root - SuperVdsm
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::393::SuperVdsm.Server::(main) Parsing
cmd args
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::396::SuperVdsm.Server::(main)
Cleaning old socket /var/run/vdsm/svdsm.sock
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::400::SuperVdsm.Server::(main) Setting
up keep alive thread
MainThread::DEBUG::2014-07-19
18:55:38,059::supervdsmServer::406::SuperVdsm.Server::(main)
Creating remote object manager
MainThread::DEBUG::2014-07-19
18:55:38,061::supervdsmServer::417::SuperVdsm.Server::(main) Started
serving super vdsm object
sourceRoute::DEBUG::2014-07-19
18:55:38,062::sourceRouteThread::56::root::(_subscribeToInotifyLoop)
sourceRouteThread.subscribeToInotifyLoop started


On Sat, Jul 19, 2014 at 6:48 PM, Andrew Lau and...@andrewklau.com
mailto:and...@andrewklau.com wrote:

Here's a snippet from my hosted-engine-setup log

2014-07-19 18:45:14 DEBUG otopi.context
context._executeMethod:138 Stage late_setup METHOD


Re: [ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

2014-07-21 Thread Jiri Moskovcak

On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:


On 07/19/2014 11:25 AM, Andrew Lau wrote:



On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri
pkara...@redhat.com mailto:pkara...@redhat.com wrote:


On 07/18/2014 05:43 PM, Andrew Lau wrote:


On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
vbel...@redhat.com mailto:vbel...@redhat.com wrote:

[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have got hints from previous messages,
hosted engine
won't work on gluster . A quote from BZ1097639

Using hosted engine with Gluster backed storage is
currently something
we really warn against.


I think this bug should be closed or re-targeted at
documentation, because there is nothing we can do here.
Hosted engine assumes that all writes are atomic and
(immediately) available for all hosts in the cluster.
Gluster violates those assumptions.

I tried going through BZ1097639 but could not find much
detail with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that
causes the problem?

2. Is hosted engine performing synchronous writes to ensure
that writes are durable?

Also, if there is any documentation that details the hosted
engine architecture that would help in enhancing our
understanding of its interactions with gluster.



Now my question, does this theory prevent a scenario of
perhaps
something like a gluster replicated volume being mounted
as a glusterfs
filesystem and then re-exported as the native kernel NFS
share for the
hosted-engine to consume? It could then be possible to
chuck ctdb in
there to provide a last resort failover solution. I have
tried myself
and suggested it to two people who are running a similar
setup. Now
using the native kernel NFS server for hosted-engine and
they haven't
reported as many issues. Curious, could anyone validate
my theory on this?


If we obtain more details on the use case and obtain gluster
logs from the failed scenarios, we should be able to
understand the problem better. That could be the first step
in validating your theory or evolving further recommendations :).


I'm not sure how useful this is, but Jiri Moskovcak tracked
this down in an off-list message.

Message Quote:

==

We were able to track it down to this (thanks Andrew for
providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, line 165, in handle
    response = "success " + self._dispatch(data)
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, line 261, in _dispatch
    .get_all_stats_for_service_type(**options)
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, line 41, in get_all_stats_for_service_type
    d = self.get_raw_stats_for_service_type(storage_dir, service_type)
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, line 74, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle:
'/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

Andrew/Jiri,
Would it be possible to post gluster logs of both the
mount and bricks on the bz? I can take a look at it once. If I
gather nothing then probably I will ask for your help in
re-creating the issue.

Pranith


Unfortunately, I don't have the logs for that setup any more. I'll
try to replicate it when I get a chance. If I understand the comment from
the BZ, I don't think it's a gluster bug per se, more just how
gluster does its replication.

Hi Andrew,
  Thanks for that. I couldn't come to any conclusions because no
logs were available. It is unlikely that self-heal is involved because
there were no bricks going down/up according to the bug description.



Hi,
I've never had such a setup; I guessed the problem was gluster-related based on 
the "OSError: [Errno 116] Stale file handle", which happens when the file 
opened by the application on the client gets removed on the server. I'm pretty 
sure we (hosted-engine) don't remove that file, so I think it's some 
gluster magic moving the data around...


--Jirka
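The Errno 116 failure in the traceback above can be reasoned about in isolation. Below is a minimal sketch, not the actual broker code: it mirrors the raw-stats read from storage_broker.py and adds a hypothetical single retry on ESTALE; the function name, the `use_direct` parameter, and the retry itself are assumptions for illustration.

```python
import errno
import os

def read_stats(path, blocksize=4096, use_direct=False):
    # Sketch of the broker's raw-stats read (storage_broker.py opens the
    # metadata file with os.O_DIRECT | os.O_RDONLY), plus a hypothetical
    # single retry on ESTALE (errno 116) -- the error an NFS/gluster
    # client sees when the server-side file is replaced under its handle.
    # O_DIRECT requires aligned buffers, so this sketch keeps it optional.
    flags = os.O_RDONLY | (getattr(os, "O_DIRECT", 0) if use_direct else 0)
    for attempt in (1, 2):
        try:
            fd = os.open(path, flags)
            try:
                return os.read(fd, blocksize)
            finally:
                os.close(fd)
        except OSError as e:
            if e.errno == errno.ESTALE and attempt == 1:
                continue  # re-open once to pick up a fresh file handle
            raise
```

A retry like this would only paper over the symptom; the thread's conclusion is that the underlying replication behavior is the real issue.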


Pranith

Re: [ovirt-users] Setup of hosted Engine Fails

2014-07-16 Thread Jiri Moskovcak

On 07/16/2014 12:47 AM, Christopher Jaggon wrote:

Here is a list of packages :

rpm -qa | grep -i vdsm gives :

vdsm-python-4.14.9-0.el6.x86_64
vdsm-python-zombiereaper-4.14.9-0.el6.noarch
vdsm-xmlrpc-4.14.9-0.el6.noarch
vdsm-4.14.9-0.el6.x86_64
vdsm-cli-4.14.9-0.el6.noarch

When I try to run the hosted engine setup I get this error in the log :

[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ ERROR ] Failed to execute stage 'Environment setup': [Errno 111]
Connection refused
[ INFO  ] Stage: Clean up

Any advice on why this may be so?


Hi, my first advice would be to check out the logs
/var/log/vdsm/vdsm.log
/var/log/ovirt-hosted-engine-setup/*.log

--Jirka





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Spam Cluster Polcies

2014-06-30 Thread Jiri Moskovcak

On 06/30/2014 01:53 PM, Gilad Chaplik wrote:

- Original Message -

From: Maurice James mja...@media-node.com
To: users users@ovirt.org
Sent: Friday, June 27, 2014 1:20:23 AM
Subject: [ovirt-users] Spam  Cluster Polcies

Can someone explain the cluster policies to me? The explanation at
http://www.ovirt.org/Features/Even_VM_Count_Distribution is not quite
clicking for me.



 * HighVMSCount - Maximum VM limit. Exceeding it qualifies the host as
 overloaded. ( Understood )


the minimal number of VMs on a host to enable the module (i.e. the worst host should have more than 
HighVMSCount VMs).


 * MigrationThreshold - defines a buffer before we start migrating VMs
 from the host ( Not quite grasping. If my High limit is 5 does this mean
 that it will only begin to move VMs when the count reaches 10? )


I don't think there's any relation with HighLimit.
It means that migration will happen only if there is a difference of X VMs or 
more between the worst host (source host) and the destination host.

Let's say you have 2 hosts: one has 10 VMs and the second 8. If the threshold 
is 3, no migration will happen (10 - 8 < 3).


- exactly
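Gilad's example can be checked with a few lines. This is a simplified sketch of the rule as described in the thread (migrate only on a difference of X VMs or more between source and destination), with an invented function name, not the actual engine scheduler code:

```python
def should_migrate(source_vms, dest_vms, migration_threshold):
    # Migrate only when the worst (source) host runs at least
    # `migration_threshold` more VMs than the destination host.
    return source_vms - dest_vms >= migration_threshold

# The thread's example: hosts with 10 and 8 VMs, threshold 3.
print(should_migrate(10, 8, 3))  # False: 10 - 8 = 2, which is < 3
```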




 * SPMVMCountGrace - defines how many slots for VMs should be reserved on
 SPM hosts (the SPM host should have less load than others, so this
 variable defines how many fewer VMs it should have) ( Does this mean that
 if I have a host with only 3 VMs the SPM will run 3 VMs minus the
 SpmVmGrace? Or is this HighVmCount minus SpmVmGrace count? )


you're right, the SPM host will have current + grace VMs in terms of module 
calculations.
I think that this is problematic: it is likely to be selected as the worst-host 
candidate and that's not the wanted behavior, IIUC.
IMO, we should subtract (-) for overloaded hosts and add (+) for underutilized ones.


- I think it's correct as it is: if SpmVmGrace + currentVms > 
HighVmCount, it is overutilized and should be freed if possible.


--J
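The SPMVMCountGrace bookkeeping discussed above can be sketched in a few lines. This is only an illustration of the thread's description (the SPM counts as current + grace VMs for module calculations), with invented function names, not the engine's implementation:

```python
def effective_vm_count(vm_count, is_spm, spm_vm_count_grace):
    # For load calculations the SPM host is treated as if it were
    # running `current + grace` VMs, so it crosses the overload
    # threshold sooner and ends up hosting fewer VMs.
    return vm_count + (spm_vm_count_grace if is_spm else 0)

def is_overutilized(vm_count, is_spm, spm_vm_count_grace, high_vm_count):
    # Jirka's reading: overutilized when currentVms + SpmVmGrace
    # exceeds HighVmCount, so the host should be freed if possible.
    return effective_vm_count(vm_count, is_spm, spm_vm_count_grace) > high_vm_count

# Gilad's later example: SPM and HSM both run 8 VMs, grace = 2 -> the
# SPM is scored as if it ran 10 VMs and becomes the first candidate
# to shed VMs.
print(effective_vm_count(8, True, 2))  # 10
```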



@Jirka?





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Spam Cluster Polcies

2014-06-30 Thread Jiri Moskovcak

On 06/30/2014 03:11 PM, Gilad Chaplik wrote:

- Original Message -

From: Jiri Moskovcak jmosk...@redhat.com
To: Gilad Chaplik gchap...@redhat.com, Maurice James 
mja...@media-node.com
Cc: users users@ovirt.org
Sent: Monday, June 30, 2014 3:57:29 PM
Subject: Re: [ovirt-users] Spam  Cluster Polcies

On 06/30/2014 01:53 PM, Gilad Chaplik wrote:

- Original Message -

From: Maurice James mja...@media-node.com
To: users users@ovirt.org
Sent: Friday, June 27, 2014 1:20:23 AM
Subject: [ovirt-users] Spam  Cluster Polcies

Can someone explain the cluster policies to me? The explanation at
http://www.ovirt.org/Features/Even_VM_Count_Distribution is not quite
clicking for me.



  * HighVMSCount - Maximum VM limit. Exceeding it qualifies the host as
  overloaded. ( Understood )


the minimal number of VMs on a host to enable the module (i.e. the worst host should have more
than HighVMSCount VMs).


  * MigrationThreshold - defines a buffer before we start migrating VMs
  from the host ( Not quite grasping. If my High limit is 5 does this
  mean
  that it will only begin to move VMs when the count reaches 10? )


I don't think there's any relation with HighLimit.
It means that migration will happen only if there is a difference of X VMs
or more between the worst host (source host) and the destination host.

Let's say you have 2 hosts: one has 10 VMs and the second 8. If the
threshold is 3, no migration will happen (10 - 8 < 3).


- exactly




  * SPMVMCountGrace - defines how many slots for VMs should be reserved
  on SPM hosts (the SPM host should have less load than others, so this
  variable defines how many fewer VMs it should have) ( Does this mean
  that if I have a host with only 3 VMs the SPM will run 3 VMs minus the
  SpmVmGrace? Or is this HighVmCount minus SpmVmGrace count? )


you're right, the SPM host will have current + grace VMs in terms of module
calculations.
I think that this is problematic: it is likely to be selected as the worst-host
candidate and that's not the wanted behavior, IIUC.
IMO, we should subtract (-) for overloaded hosts and add (+) for
underutilized ones.


- I think it's correct as it is: if SpmVmGrace + currentVms >
HighVmCount, it is overutilized and should be freed if possible.


not sure I understand, let's take an example:
2 hosts, SPM with 8 VMs, HSM with 8 VMs,
SPMVMCountGrace = 2.
in this case the 'worst' host will be the SPM... shouldn't we deduct 2 from the SPM 
and not add? we want less noise there.



- that's correct, the SPM will be marked as overutilized and the engine 
will try to find a host to migrate some VMs from the SPM to. The SpmGrace 
is not about lowering the noise; it's about keeping the host less 
utilized by running fewer VMs on it (which will also lower the noise, 
because it gives the SPM host a worse score when looking for a target to 
migrate to).


--J



--J



@Jirka?





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users






___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] VM HostedEngie is down. Exist message: internal error Failed to acquire lock error -243

2014-06-06 Thread Jiri Moskovcak
I've seen that problem in other threads; the common denominator was NFS 
on top of gluster. So if you have this setup, then it's a known 
problem. Otherwise you should double-check that your hosts have different ids, 
because with the same id they would be trying to acquire the same lock.


--Jirka

On 06/06/2014 08:03 AM, Andrew Lau wrote:

Hi Ivan,

Thanks for the in depth reply.

I've only seen this happen twice, and only after I added a third host
to the HA cluster. I wonder if that's the root problem.

Have you seen this happen on all your installs or only just after your
manual migration? It's a little frustrating this is happening as I was
hoping to get this into a production environment. It was all working
except that log message :(

Thanks,
Andrew


On Fri, Jun 6, 2014 at 3:20 PM, combuster combus...@archlinux.us wrote:

Hi Andrew,

this is something that I saw in my logs too, first on one node and then on
the other three. When that happened on all four of them, the engine was corrupted
beyond repair.

First of all, I think that message is saying that sanlock can't get a lock
on the shared storage that you defined for the hosted engine during
installation. I got this error when I tried to manually migrate the
hosted engine. There is an unresolved bug there and I think it's related to
this one:

[Bug 1093366 - Migration of hosted-engine vm put target host score to zero]
https://bugzilla.redhat.com/show_bug.cgi?id=1093366

This is a blocker bug (or should be) for the self-hosted engine and, from my
own experience with it, it shouldn't be used in a production environment (not
until it's fixed).

Nothing that I did could fix the fact that the score for the target
node was zero: I tried to reinstall the node, rebooted the node, restarted
several services, tailed tons of logs etc., but to no avail. When only one
node was left (the one actually running the hosted engine), I brought the
engine's VM down gracefully (hosted-engine --vm-shutdown, I believe) and after
that, when I tried to start the VM, it wouldn't load. Running VNC showed
that the filesystem inside the VM was corrupted, and when I ran fsck and
finally started it up, it was too badly damaged. I succeeded in starting the
engine itself (after repairing the postgresql service that wouldn't
start) but the database was damaged enough that it acted pretty weird (showed
that storage domains were down while the VMs were running fine etc.). Lucky
me, I had already exported all of the VMs at the first sign of trouble and
then installed ovirt-engine on a dedicated server and attached the export
domain.

So while it's really a useful feature, and it's working (for the most part, i.e.
automatic migration works), manually migrating the VM with the hosted-engine
will lead to trouble.

I hope that my experience with it, will be of use to you. It happened to me
two weeks ago, ovirt-engine was current (3.4.1) and there was no fix
available.

Regards,

Ivan

On 06/06/2014 05:12 AM, Andrew Lau wrote:

Hi,

I'm seeing this weird message in my engine log

2014-06-06 03:06:09,380 INFO
[org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-79) RefreshVmList vm id
85d4cfb9-f063-4c7c-a9f8-2b74f5f7afa5 status = WaitForLaunch on vds
ov-hv2-2a-08-23 ignoring it in the refresh until migration is done
2014-06-06 03:06:12,494 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
(DefaultQuartzScheduler_Worker-89) START, DestroyVDSCommand(HostName =
ov-hv2-2a-08-23, HostId = c04c62be-5d34-4e73-bd26-26f805b2dc60,
vmId=85d4cfb9-f063-4c7c-a9f8-2b74f5f7afa5, force=false,
secondsToWait=0, gracefully=false), log id: 62a9d4c1
2014-06-06 03:06:12,561 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
(DefaultQuartzScheduler_Worker-89) FINISH, DestroyVDSCommand, log id:
62a9d4c1
2014-06-06 03:06:12,652 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-89) Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM HostedEngine is down. Exit
message: internal error Failed to acquire lock: error -243.

It also appears to occur on the other hosts in the cluster, except the
host which is running the hosted-engine. So right now 3 servers, it
shows up twice in the engine UI.

The engine VM continues to run peacefully, without any issues on the
host which doesn't have that error.

Any ideas?
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted-engine configure startup time allowance?

2014-05-28 Thread Jiri Moskovcak

On 05/28/2014 08:52 AM, Andrew Lau wrote:

Hi,

I was just wondering if it's possible to configure the startup-time
allowance of the hosted-engine? I seem to sometime have this issue
where my hosted-engine would start automatically but it would be sent
a reboot signal 30 seconds before the engine has time to startup. This
is because it fails the 'liveliness check', just before it reboots the
engine status would be set to up but as the reboot signal was already
sent the VM will reboot and then startup on another host.

This then goes into a loop, until I do a global maintenance, manual
bootup and then maintenance mode none.



Hi Andrew,
try to look into [1] and tweak the timeouts there. Don't forget to 
restart the ovirt-ha-agent service when you change it.



--Jirka

[1] 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/constants.py



Thanks,
Andrew.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted-engine configure startup time allowance?

2014-05-28 Thread Jiri Moskovcak

On 05/28/2014 10:08 AM, Andrew Lau wrote:

Hi Jiri,

On Wed, May 28, 2014 at 5:10 PM, Jiri Moskovcak jmosk...@redhat.com wrote:

On 05/28/2014 08:52 AM, Andrew Lau wrote:


Hi,

I was just wondering if it's possible to configure the startup-time
allowance of the hosted-engine? I seem to sometime have this issue
where my hosted-engine would start automatically but it would be sent
a reboot signal 30 seconds before the engine has time to startup. This
is because it fails the 'liveliness check', just before it reboots the
engine status would be set to up but as the reboot signal was already
sent the VM will reboot and then startup on another host.

This then goes into a loop, until I do a global maintenance, manual
bootup and then maintenance mode none.



Hi Andrew,
try to look into [1] and tweak the timeouts there. Don't forget to restart
the ovirt-ha-agent service when you change it.



Thanks, I'll try that out.

Are all those values there available in /etc/ovirt-hosted-engine-ha/agent.conf ?



- no, they're not



--Jirka

[1]
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/constants.py


Thanks,
Andrew.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users





___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-22 Thread Jiri Moskovcak

On 05/22/2014 09:06 AM, Sven Kieske wrote:

Imho your assumptions are wrong, but _I_ also may be wrong
so please correct me, if you can.

AFAIK:
first:
A Datacenter without a master storage domain can't
become operational.


- that's correct, first you need to attach a master storage domain (which must be 
different from what you used for the hosted engine setup) and then the data 
center will become operational and you can attach the ISO domain.




second: for hosted engine, the only supported setup type
is on an nfs master storage domain?


<advertisement>we're adding iSCSI support in 3.5</advertisement> ;)



So I think you really need the nfs.

Sorry if I'm wrong, I didn't install this setup yet, my
knowledge is purely based of reading the wiki and the ML
so there might be some errors in my statements, maybe some
dev could correct me.


- you got it right, I think the confusing part here is the fact that you 
need additional storage and can't use the storage used for the HE. (at 
least that's what got me)


--Jirka



Am 21.05.2014 16:17, schrieb Bob Doolittle:

I'm afraid NFS was a red herring. NFS shares from the host to the engine
are not required for basic oVirt operation, correct?
If I understand Sandro correctly, that should not be affecting my
storage connections to engine. I'm sorry I brought it up - it was my
misunderstanding.

I believe the first thing to look at is why the ISO domain is unattached
to my Default Datacenter, is that correct? Then my Datacenter should
become operational, and I can add a Data Domain.

I can manually mount the ISO_DOMAIN directory on both my host and my
engine without issues (it resides on my engine).

So why is my Datacenter not visible when I go to attach my ISO domain?

-Bob




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-21 Thread Jiri Moskovcak

On 05/21/2014 02:49 PM, Bob Doolittle wrote:

On 05/21/2014 03:09 AM, Sven Kieske wrote:

I'd want to add that these rules are for NFSv3
asking Bob if he is maybe useing NFSv4 ?


At the moment I don't need either one. I need to solve my major issues
first. Then when things are working I'll worry about setting up NFS to
export new domains from my host.

Like - why didn't my default domains get configured properly?

Where is my Data Domain, and why is my ISO Domain unattached?
Why didn't hosted-engine --deploy set this up properly? I took the
defaults during deployment for domain setup.

When I first login to webadmin #vms, it shows HostedEngine as green/up.
At #storage it shows my ISO Domain and ovirt-image-repository as
unattached. No Data Domain.
At #dataCenters it shows my Default datacenter as down/uninitialized
If I go to #storage and select ISO_DOMAIN and select its Data Center
tab (#storage-data_center), it doesn't show any Data Centers to attach to.

-Bob


- can you log in to the VM running the engine and try to mount the NFS 
share manually to some directory, just to see if it works? Neither the 
engine nor the setup is responsible for setting up the NFS share (and 
configuring iptables for the NFS server), so it's up to you to set it up 
properly and make sure it's mountable from the engine.


--Jirka





Am 21.05.2014 08:43, schrieb Sandro Bonazzola:

I'm not saying NFS issue is irrelevant :-)
I'm saying that if you're adding NFS service on the node running
hosted engine you'll need to configure iptables for allowing to mount
the shares.
This means at least opening rpc-bind port 111 and NFS port 2049 and
ports 662 875 892 32769 32803 assuming you've configured NFS with:

RPCRQUOTADOPTS="-p 875"
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
RPCMOUNTDOPTS="-p 892"
STATDARG="-p 662 -o 2020"

Alternative is to use NFS storage on a different host.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-20 Thread Jiri Moskovcak

On 05/20/2014 03:33 AM, Bob Doolittle wrote:

I have successfully completed hosted-engine --deploy with 3.4.1, and set
up my Engine. This is with F19 for both the host and engine
(with the sos package workarounds stated).

However, it does not seem to have initialized any Storage Domains, and
the Datacenter cannot initialize.

I've attached my setup and vdsm logs.

Any guidance appreciated.

Thanks,
 Bob



Hi Bob,
can you try to run:

$ hosted-engine --connect-storage
$ service ovirt-ha-broker restart
$ service ovirt-ha-agent restart

and see if it helps?

Thanks,
Jirka




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-20 Thread Jiri Moskovcak

On 05/20/2014 02:57 PM, Bob Doolittle wrote:

Well that was interesting.
When I ran hosted-engine --connect-storage, the Data Center went green,
and I could see an unattached ISO domain and ovirt-image-repository (but
no Data domain).
But after restarting ovirt-ha-broker and ovirt-ha-agent, the storage
disappeared again and the Data Center went red.

In retrospect, there appears to be a problem with iptables/firewalld
that could be related.
I noticed two things:
- firewalld is stopped and disabled on the host
- I could not manually NFS mount (v3 or v4) from the host to the engine,
unless I did service iptables stop

So it doesn't appear to me that hosted-engine did the right things with
firewalld/iptables. If these problems occurred during the --deploy,
could that result in this situation?
I have temporarily disabled iptables until I get things working, but
clearly that's insufficient to resolve the problem at this point.



- iptables/firewalld is configured during the setup, which is Sandro's 
domain. Sandro, could you please take a look at this?


Thanks,
--Jirka


-Bob

On 05/20/2014 05:18 AM, Jiri Moskovcak wrote:

On 05/20/2014 03:33 AM, Bob Doolittle wrote:

I have successfully completed hosted-engine --deploy with 3.4.1, and set
up my Engine. This is with F19 for both the host and engine
(with the sos package workarounds stated).

However, it does not seem to have initialized any Storage Domains, and
the Datacenter cannot initialize.

I've attached my setup and vdsm logs.

Any guidance appreciated.

Thanks,
  Bob


Hi Bob,
can you try to run:

$ hosted-engine --connect-storage
$ service ovirt-ha-broker restart
$ service ovirt-ha-agent restart

and see if it helps?

Thanks,
Jirka



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users







___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted engine health check issues

2014-05-06 Thread Jiri Moskovcak
 has some logic based on the reporting.

 Regards

 --
 Martin Sivák
 msi...@redhat.com
 Red Hat Czech
 RHEV-M SLA / Brno, CZ

 - Original Message -
 On 04/15/2014 04:53 PM, Jiri Moskovcak wrote:
 On 04/14/2014 10:50 AM, René Koch wrote:
 Hi,

 I have some issues with hosted engine status.

 oVirt hosts think that hosted engine is down because it seems that
 hosts can't write to hosted-engine.lockspace due to glusterfs issues
 (or at least I think so).

 Here's the output of vm-status:

 # hosted-engine --vm-status


 --== Host 1 status ==--

 Status up-to-date  : False
 Hostname   : 10.0.200.102
 Host ID: 1
 Engine status  : unknown stale-data
 Score  : 2400
 Local maintenance  : False
 Host timestamp : 1397035677
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1397035677 (Wed Apr  9 11:27:57 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp


 --== Host 2 status ==--

 Status up-to-date  : True
 Hostname   : 10.0.200.101
 Host ID: 2
 Engine status  : {'reason': 'vm not running on this host',
 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
 Score  : 0
 Local maintenance  : False
 Host timestamp : 1397464031
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
 host-id=2
 score=0
 maintenance=False
 state=EngineUnexpectedlyDown
 timeout=Mon Apr 14 10:35:05 2014

 oVirt engine is sending me 2 emails every 10 minutes with the
 following subjects:
 - ovirt-hosted-engine state transition EngineDown-EngineStart
 - ovirt-hosted-engine state transition EngineStart-EngineUp

 In oVirt webadmin I can see the following message:
 VM HostedEngine is down. Exit message: internal error
 Failed to acquire lock: error -243.

 These messages are really annoying as oVirt isn't doing anything
 with hosted engine - I have an uptime of 9 days in my engine vm.

 So my questions are now:
 Is it intended to send out these messages and detect that ovirt
 engine is down (which is false anyway), but not to restart the vm?

 How can I disable notifications? I'm planning to write a Nagios
 plugin which parses the output of hosted-engine --vm-status and
 only Nagios should notify me, not the hosted-engine script.

 Is it possible or planned to make the whole ha feature optional? I
 really really really hate cluster software as it causes more

[ovirt-users] Ceph

2014-05-01 Thread Jiri Moskovcak

Hello oVirt users!

When I read [1] I got curious how much demand for supporting Ceph we can 
expect in the near future. So if you use it or plan to use it, please let us 
know.


Have a nice day,
Jirka

[1] 
http://www.redhat.com/about/news/press-archive/2014/4/red-hat-to-acquire-inktank-provider-of-ceph

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine datastore crashed on adding a new datastore

2014-04-25 Thread Jiri Moskovcak

On 04/25/2014 11:43 AM, Klas Mattsson wrote:

Hello all,

For some reason, when I try to add a second datastore (separate IP and
empty share with 36:36 user rights), the share which has got the
hosted-engine on it crashes.

This of course crashes the hosted engine.

When the share is up again, it's not sufficient to create a directory under
/rhev/data-center/mnt/ with an identical name as before and mount the
share there.
The hosted-engine still refuses to boot up.

The load (pure I/O) on the NFS server goes up to about 11 as well.

Any ideas on where I should check?
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Hi Klas,
can you please attach the engine logs located in /var/log/ovirt-engine/ 
? And what exactly crashes? Just the engine, or the VM running the 
engine? If the whole VM with the engine, then please attach the logs from 
the host: /var/log/vdsm and /var/log/libvirt.


Thank you,
Jirka
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted Engine error -243

2014-04-23 Thread Jiri Moskovcak

Hi,
I'm not sure yet what causes the problem, but the workaround should be:

open file 
/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py 
in your favorite editor, go to line 52 and change it:


from: except ValueError:
to: except (ValueError, TypeError):

--Jirka

On 04/23/2014 12:43 PM, Kevin Tibi wrote:

Hi,

/var/log/ovirt-hosted-engine-ha/broker.log

Host1:
Thread-118327::INFO::2014-04-23
12:34:59,360::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
Connection established
Thread-118327::INFO::2014-04-23
12:34:59,375::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
Connection closed
Thread-118328::INFO::2014-04-23
12:35:14,546::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup)
Connection established
Thread-118328::INFO::2014-04-23
12:35:14,549::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
Connection closed

Host2:
Thread-4::INFO::2014-04-23 12:36:08,020::mem_free::53::mem_free.MemFree::(action) memFree: 9816
Thread-3::INFO::2014-04-23 12:36:08,240::mgmt_bridge::59::mgmt_bridge.MgmtBridge::(action) Found bridge ovirtmgmt
Thread-296455::INFO::2014-04-23 12:36:08,678::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
Thread-296455::INFO::2014-04-23 12:36:08,684::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed



/var/log/ovirt-hosted-engine-ha/agent.log

host1:

MainThread::INFO::2014-04-02 17:46:14,856::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Unknown local engine vm status, no actions taken
MainThread::INFO::2014-04-02 17:46:14,857::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1396453574.86 type=state_transition detail=UnknownLocalVmState-UnknownLocalVmState hostname='host01.ovirt.lan'
MainThread::INFO::2014-04-02 17:46:14,858::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (UnknownLocalVmState-UnknownLocalVmState) sent? ignored
MainThread::WARNING::2014-04-02 17:46:15,463::hosted_engine::334::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: float() argument must be a string or a number
MainThread::WARNING::2014-04-02 17:46:15,464::hosted_engine::337::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
Traceback (most recent call last):
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py, line 323, in start_monitoring
    state.score(self._log))
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py, line 160, in score
    lm, logger, score, score_cfg)
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py, line 61, in _penalize_memory
    if self._float_or_default(lm['mem-free'], 0) < vm_mem:
  File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py, line 51, in _float_or_default
    return float(value)
TypeError: float() argument must be a string or a number
MainThread::ERROR::2014-04-02 17:46:15,464::hosted_engine::350::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
MainThread::INFO::2014-04-02 17:46:15,466::agent::116::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down


host2:

MainThread::INFO::2014-04-23
12:36:44,800::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUnexpectedlyDown (score: 0)
MainThread::INFO::2014-04-23
12:36:54,844::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1398249414.84 type=state_transition
detail=EngineUnexpectedlyDown-EngineUnexpectedlyDown
hostname='host02.ovirt.lan'
MainThread::INFO::2014-04-23
12:36:54,846::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineUnexpectedlyDown-EngineUnexpectedlyDown) sent? ignored

/var/log/vdsm/vdsm.log

host1 :

Thread-116::DEBUG::2014-04-23
12:40:17,060::fileSD::225::Storage.Misc.excCmd::(getReadDelay) '/bin/dd
iflag=direct
if=/rhev/data-center/mnt/host01.ovirt.lan:_home_iso/cc51143e-8ad7-4b0b-a4d2-9024dffc1188/dom_md/metadata
bs=4096 count=1' (cwd None)
Thread-116::DEBUG::2014-04-23
12:40:17,070::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS:
err = '0+1 

Re: [ovirt-users] hosted engine health check issues

2014-04-17 Thread Jiri Moskovcak

On 04/17/2014 09:34 AM, René Koch wrote:

On 04/15/2014 04:53 PM, Jiri Moskovcak wrote:

On 04/14/2014 10:50 AM, René Koch wrote:

Hi,

I have some issues with hosted engine status.

oVirt hosts think that hosted engine is down because it seems that hosts
can't write to hosted-engine.lockspace due to glusterfs issues (or at
least I think so).

Here's the output of vm-status:

# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : False
Hostname   : 10.0.200.102
Host ID: 1
Engine status  : unknown stale-data
Score  : 2400
Local maintenance  : False
Host timestamp : 1397035677
Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1397035677 (Wed Apr  9 11:27:57 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.200.101
Host ID: 2
Engine status  : {'reason': 'vm not running on this
host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score  : 0
Local maintenance  : False
Host timestamp : 1397464031
Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
 host-id=2
 score=0
 maintenance=False
 state=EngineUnexpectedlyDown
 timeout=Mon Apr 14 10:35:05 2014

oVirt engine is sending me 2 emails every 10 minutes with the following
subjects:
- ovirt-hosted-engine state transition EngineDown-EngineStart
- ovirt-hosted-engine state transition EngineStart-EngineUp

In oVirt webadmin I can see the following message:
VM HostedEngine is down. Exit message: internal error Failed to acquire
lock: error -243.

These messages are really annoying as oVirt isn't doing anything with
hosted engine - I have an uptime of 9 days in my engine vm.

So my questions are now:
Is it intended to send out these messages and detect that ovirt engine
is down (which is false anyway), but not to restart the vm?

How can I disable notifications? I'm planning to write a Nagios plugin
which parses the output of hosted-engine --vm-status, and only Nagios
should notify me, not the hosted-engine script.
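A minimal sketch of such a check, assuming the field layout shown in the
--vm-status output above (the SAMPLE text and the parse_vm_status /
nagios_state helpers are illustrative, not part of any shipped tool):

```python
import re

# Trimmed sample of `hosted-engine --vm-status` output, copied from above.
SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : 10.0.200.102
Engine status                      : unknown stale-data
Score                              : 2400

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : 10.0.200.101
Engine status                      : {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score                              : 0
"""

def parse_vm_status(text):
    """Split --vm-status output into one {field: value} dict per host block."""
    hosts, current = [], None
    for line in text.splitlines():
        m = re.match(r"--== Host (\d+) status ==--", line.strip())
        if m:
            current = {"host_id": m.group(1)}
            hosts.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")  # split only at the first colon
            current[key.strip()] = value.strip()
    return hosts

def nagios_state(hosts):
    """Nagios convention: 0 = OK, 2 = CRITICAL."""
    if any("'health': 'good'" in h.get("Engine status", "") for h in hosts):
        return 0, "OK: a host reports a healthy engine"
    return 2, "CRITICAL: no host reports a healthy engine"

code, message = nagios_state(parse_vm_status(SAMPLE))
print(message)  # neither sample host reports 'good' health
```

A real plugin would run the command via subprocess and sys.exit(code) so
Nagios picks up the state.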

Is it possible or planned to make the whole ha feature optional? I
really really really hate cluster software as it causes more trouble
than standalone machines, and in my case the hosted-engine ha feature
really causes trouble (I haven't had a hardware or network outage
yet, only issues with the hosted-engine ha agent). I don't need any ha
feature for hosted engine. I just want to run engine virtualized on
oVirt and if engine vm fails (e.g. because of issues with a host) I'll
restart it on another node.


Hi, you can:
1. edit /etc/ovirt-hosted-engine-ha/{agent,broker}-log.conf and tweak
the logger as you like
2. or kill the ovirt-ha-broker && ovirt-ha-agent services
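In case it helps, those files are ordinary Python logging fileConfig files,
so quieting the agent generally means raising the level on the root logger.
A sketch only — the handler names below are assumptions, keep whatever your
installed file already lists:

```ini
; /etc/ovirt-hosted-engine-ha/agent-log.conf (sketch; handler names are
; assumptions -- keep the values from the file shipped on your host)
[logger_root]
level=WARNING        ; raise from INFO to drop the periodic state chatter
handlers=syslog,logfile
```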


Thanks for the information.
So engine is able to run when ovirt-ha-broker and ovirt-ha-agent aren't
running?



- yes, it might cause some problems if you set up another host for 
hosted engine and run the agent on the other host, but as long as you 
don't have the agent running anywhere or you don't need to migrate the 
engine vm, you should be fine.


--Jirka



Regards,
René



--Jirka


Thanks,
René






___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine

2014-04-16 Thread Jiri Moskovcak
There is no config file, but if you really want to change it, then you 
can play with the values in:


/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/constants.py 
 (or python 2.7 depending on your setup)


In the case where you pull the plug, it should be driven by
ENGINE_BAD_HEALTH_EXPIRATION_SECS
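To make the pointer concrete, the knob is just a module-level constant; a
sketch of what the edit looks like (the 600-second value is purely
illustrative, not the shipped default — check constants.py on your host):

```python
# Sketch of the tunable in
# /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/constants.py
# (python2.7 path on newer setups). The value below is an illustrative
# assumption, not the shipped default.
ENGINE_BAD_HEALTH_EXPIRATION_SECS = 600  # seconds of bad engine health tolerated
                                         # before the VM is restarted elsewhere
```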


--Jirka

On 04/16/2014 04:43 AM, Andrew Lau wrote:

Not that I know of... someone else may know. Sorry

On Wed, Apr 16, 2014 at 10:18 AM, Maurice James mja...@media-node.com wrote:

Is there any way to change that 10 minute delay?

- Original Message -
From: Andrew Lau and...@andrewklau.com
To: Maurice James mja...@media-node.com
Cc: users users@ovirt.org
Sent: Tuesday, April 15, 2014 7:24:40 PM
Subject: Re: [ovirt-users] Hosted engine

Technically yes, I believe there's a 10 minute or so delay before
it'll come up on the other host.
Pretty sure iSCSI is not available for now, only NFS.

On Wed, Apr 16, 2014 at 4:08 AM, Maurice James mja...@media-node.com wrote:


Scenario
I have a hosted engine setup with 2 nodes with shared iscsi storage.

Question:
If I yanked the plug on the host that the hosted engine vm is running on,
will it come back up on the remaining host without any intervention?

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users




Re: [ovirt-users] hosted engine health check issues

2014-04-15 Thread Jiri Moskovcak

On 04/14/2014 10:50 AM, René Koch wrote:

Hi,

I have some issues with hosted engine status.

oVirt hosts think that hosted engine is down because it seems that hosts
can't write to hosted-engine.lockspace due to glusterfs issues (or at
least I think so).

Here's the output of vm-status:

# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : False
Hostname   : 10.0.200.102
Host ID: 1
Engine status  : unknown stale-data
Score  : 2400
Local maintenance  : False
Host timestamp : 1397035677
Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1397035677 (Wed Apr  9 11:27:57 2014)
 host-id=1
 score=2400
 maintenance=False
 state=EngineUp


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : 10.0.200.101
Host ID: 2
Engine status  : {'reason': 'vm not running on this
host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score  : 0
Local maintenance  : False
Host timestamp : 1397464031
Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
 host-id=2
 score=0
 maintenance=False
 state=EngineUnexpectedlyDown
 timeout=Mon Apr 14 10:35:05 2014

oVirt engine is sending me 2 emails every 10 minutes with the following
subjects:
- ovirt-hosted-engine state transition EngineDown-EngineStart
- ovirt-hosted-engine state transition EngineStart-EngineUp

In oVirt webadmin I can see the following message:
VM HostedEngine is down. Exit message: internal error Failed to acquire
lock: error -243.

These messages are really annoying as oVirt isn't doing anything with
hosted engine - I have an uptime of 9 days in my engine vm.

So my questions are now:
Is it intended to send out these messages and detect that ovirt engine
is down (which is false anyway), but not to restart the vm?

How can I disable notifications? I'm planning to write a Nagios plugin
which parses the output of hosted-engine --vm-status, and only Nagios
should notify me, not the hosted-engine script.

Is it possible or planned to make the whole ha feature optional? I
really really really hate cluster software as it causes more trouble
than standalone machines, and in my case the hosted-engine ha feature
really causes trouble (I haven't had a hardware or network outage
yet, only issues with the hosted-engine ha agent). I don't need any ha
feature for hosted engine. I just want to run engine virtualized on
oVirt and if engine vm fails (e.g. because of issues with a host) I'll
restart it on another node.


Hi, you can:
1. edit /etc/ovirt-hosted-engine-ha/{agent,broker}-log.conf and tweak 
the logger as you like

2. or kill the ovirt-ha-broker && ovirt-ha-agent services

--Jirka


Thanks,
René




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Hosted Engine recovery failure of all HA - nodes

2014-04-09 Thread Jiri Moskovcak

On 04/08/2014 06:09 PM, Daniel Helgenberger wrote:

Hello,

I have an oVirt 3.4 hosted engine lab setup which I am evaluating for
production use.

I simulated an ungraceful shutdown of all HA nodes (powercut) while
the engine was running. After powering up, the system did not recover
by itself (or so it seemed).
I had to restart the ovirt-hosted-ha service (which was in a locked
state) and then manually run 'hosted-engine --vm-start'.

What is the supposed procedure after a shutdown (graceful / ungraceful)
of Hosted-Engine HA nodes? Should the engine recover by itself? Should
the running VM's be restarted automatically?


When this happens the agent should start the engine VM and the engine 
should take care of restarting the VMs which were running on that 
restarted host and are marked as HA. Can you please provide the contents of 
/var/log/ovirt* from the host after the powercut when the engine VM 
doesn't come up?


Thanks,
Jirka



Thanks,
Daniel







___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users





Re: [ovirt-users] [Users] Hosted Engine recovery failure of all HA - nodes

2014-04-09 Thread Jiri Moskovcak

On 04/09/2014 02:32 PM, Daniel Helgenberger wrote:

On Mi, 2014-04-09 at 09:18 +0200, Jiri Moskovcak wrote:

On 04/08/2014 06:09 PM, Daniel Helgenberger wrote:

Hello,

I have an oVirt 3.4 hosted engine lab setup which I am evaluating for
production use.

I simulated an ungraceful shutdown of all HA nodes (powercut) while
the engine was running. After powering up, the system did not recover
by itself (or so it seemed).
I had to restart the ovirt-hosted-ha service (which was in a locked
state) and then manually run 'hosted-engine --vm-start'.

What is the supposed procedure after a shutdown (graceful / ungraceful)
of Hosted-Engine HA nodes? Should the engine recover by itself? Should
the running VM's be restarted automatically?


When this happens the agent should start the engine VM and the engine
should take care of restarting the VMs which were running on that
restarted host and are marked as HA. Can you please provide the contents of
/var/log/ovirt* from the host after the powercut when the engine VM
doesn't come up?


Hello Jirka,

I accidentally already send the message without pointing out the
interesting part; this is:

 start logging ha-agent after reboot:
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:33,862::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) 
ovirt-hosted-engine-ha agent 1.1.2-1 started
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:33,936::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
 Found certificate common name: 192.168.50.201
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:33,937::hosted_engine::363::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
 Initializing ha-broker connection
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:33,937::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Starting monitor ping, options {'addr': '192.168.50.1'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:33,939::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Success, id 139700911299600
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:33,939::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 
'ovirtmgmt', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,013::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Success, id 139700911300304
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,013::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,015::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Success, id 139700911300112
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,015::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': 
'e68a11c8-1251-4c13-9e3b-3847bbb4fa3d', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,018::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Success, id 139700911300240
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,018::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': 
'e68a11c8-1251-4c13-9e3b-3847bbb4fa3d', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,024::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
 Success, id 139700723857104
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,024::hosted_engine::386::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
 Broker initialized, all submonitors started
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:53:34,312::hosted_engine::430::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_cond_start_service)
 Starting vdsmd
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::CRITICAL::2014-04-08 
15:53:34,442::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could 
not start ha-agent
(10 min nothing)
 here I did a 'service ovirt-hosted-ha start'
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 
15:59:16,698::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) 
ovirt-hosted-engine-ha agent 1.1.2-1

Re: [Users] hosted engine help

2014-03-25 Thread Jiri Moskovcak

On 03/13/2014 05:25 PM, Jason Brooks wrote:



- Original Message -

From: Greg Padgett gpadg...@redhat.com
To: Jason Brooks jbro...@redhat.com
Cc: Sandro Bonazzola sbona...@redhat.com, users@ovirt.org, Martin Sivak 
msi...@redhat.com
Sent: Tuesday, March 11, 2014 7:52:42 AM
Subject: Re: [Users] hosted engine help

On 03/11/2014 04:09 PM, Sandro Bonazzola wrote:

On 07/03/2014 01:10, Jason Brooks wrote:

Hey everyone, I've been testing out oVirt 3.4 w/ hosted engine, and
while I've managed to bring the engine up, I've only been able to do it
manually, using hosted-engine --vm-start.

The ovirt-ha-agent service fails reliably for me, erroring out with
"RequestError: Request failed: success".

I've pasted error passages from the ha agent and vdsm logs below.

Any pointers?

Regards, Jason

***

ovirt-ha-agent.log

MainThread::CRITICAL::2014-03-06
18:48:30,622::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
Could not start ha-agent
Traceback (most recent call last):
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py,
line 97, in run
  self._run_agent()
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py,
line 154, in _run_agent

hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 303, in start_monitoring
  for old_state, state, delay in self.fsm:
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py,
line 125, in next
  new_data = self.refresh(self._state.data)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,
line 77, in refresh
  stats.update(self.hosted_engine.collect_stats())
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 623, in collect_stats
  constants.SERVICE_TYPE)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,
line 171, in get_stats_from_storage
  result = self._checked_communicate(request)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,
line 198, in _checked_communicate
  raise RequestError("Request failed: {0}".format(msg))
RequestError: Request failed: success


vdsm.log

Thread-29::ERROR::2014-03-06 18:48:11,101::API::1607::vds::(_getHaInfo)
failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
File /usr/share/vdsm/API.py, line 1598, in _getHaInfo
  stats = instance.get_all_stats()
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py,
line 86, in get_all_stats
  constants.SERVICE_TYPE)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,
line 171, in get_stats_from_storage
  result = self._checked_communicate(request)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,
line 198, in _checked_communicate
  raise RequestError("Request failed: {0}".format(msg))
RequestError: Request failed: success



Greg, Martin, "Request failed: success"?


Hi Jason,

I talked to Martin about this and opened a bug [1]/submitted a patch [2].
   Based on your mail, I'm not sure if you experienced a race condition or
some other issue.  This patch should help the former case, but if you're
still experiencing problems then we would need to investigate further.


I made these changes to my install and now I get a different error:

MainThread::CRITICAL::2014-03-13 
12:05:47,749::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could 
not start ha-agent
Traceback (most recent call last):
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py, line 
97, in run
 self._run_agent()
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py, line 
154, in _run_agent
 hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
 line 303, in start_monitoring
 for old_state, state, delay in self.fsm:
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py, 
line 125, in next
 new_data = self.refresh(self._state.data)
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,
 line 77, in refresh
 stats.update(self.hosted_engine.collect_stats())
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
 line 623, in collect_stats
 constants.SERVICE_TYPE)
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 171, in get_stats_from_storage
 result = self._checked_communicate(request)
   File 
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py, 
line 191, in _checked_communicate
 return parts[1]
IndexError: list index out of range
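The IndexError above points at an unguarded parts[1]; a minimal sketch of
the failure mode and a defensive variant (the "success <payload>" wire
format is inferred from these tracebacks, not taken from the brokerlink
source):

```python
# The broker reply is assumed to look like "success <payload>" -- a bare
# "success" with no payload then breaks an unguarded split-and-index.
def parse_reply_naive(response):
    parts = response.split(" ", 1)
    return parts[1]  # IndexError on a bare "success"

def parse_reply_safe(response):
    status, _, payload = response.partition(" ")
    if status != "success":
        raise RuntimeError("Request failed: {0}".format(response))
    return payload  # empty string when the broker sends no payload

print(parse_reply_safe("success vm-status-data"))  # prints: vm-status-data
print(parse_reply_safe("success"))                 # prints an empty line
```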

I'm attaching my vdsm.log, agent.log and 

Re: [Users] hosted engine help

2014-03-25 Thread Jiri Moskovcak

On 03/25/2014 06:34 PM, Jiri Moskovcak wrote:

On 03/13/2014 05:25 PM, Jason Brooks wrote:



- Original Message -

From: Greg Padgett gpadg...@redhat.com
To: Jason Brooks jbro...@redhat.com
Cc: Sandro Bonazzola sbona...@redhat.com, users@ovirt.org,
Martin Sivak msi...@redhat.com
Sent: Tuesday, March 11, 2014 7:52:42 AM
Subject: Re: [Users] hosted engine help

On 03/11/2014 04:09 PM, Sandro Bonazzola wrote:

On 07/03/2014 01:10, Jason Brooks wrote:

Hey everyone, I've been testing out oVirt 3.4 w/ hosted engine, and
while I've managed to bring the engine up, I've only been able to
do it
manually, using hosted-engine --vm-start.

The ovirt-ha-agent service fails reliably for me, erroring out with
"RequestError: Request failed: success".

I've pasted error passages from the ha agent and vdsm logs below.

Any pointers?

Regards, Jason

***

ovirt-ha-agent.log

MainThread::CRITICAL::2014-03-06
18:48:30,622::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run)

Could not start ha-agent
Traceback (most recent call last):
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py,

line 97, in run
  self._run_agent()
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py,

line 154, in _run_agent

hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,

line 303, in start_monitoring
  for old_state, state, delay in self.fsm:
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py,

line 125, in next
  new_data = self.refresh(self._state.data)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,

line 77, in refresh
  stats.update(self.hosted_engine.collect_stats())
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,

line 623, in collect_stats
  constants.SERVICE_TYPE)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,

line 171, in get_stats_from_storage
  result = self._checked_communicate(request)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,

line 198, in _checked_communicate
  raise RequestError("Request failed: {0}".format(msg))
RequestError: Request failed: success


vdsm.log

Thread-29::ERROR::2014-03-06
18:48:11,101::API::1607::vds::(_getHaInfo)
failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
File /usr/share/vdsm/API.py, line 1598, in _getHaInfo
  stats = instance.get_all_stats()
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py,

line 86, in get_all_stats
  constants.SERVICE_TYPE)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,

line 171, in get_stats_from_storage
  result = self._checked_communicate(request)
File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,

line 198, in _checked_communicate
  raise RequestError("Request failed: {0}".format(msg))
RequestError: Request failed: success



Greg, Martin, "Request failed: success"?


Hi Jason,

I talked to Martin about this and opened a bug [1]/submitted a patch
[2].
   Based on your mail, I'm not sure if you experienced a race
condition or
some other issue.  This patch should help the former case, but if you're
still experiencing problems then we would need to investigate further.


I made these changes to my install and now I get a different error:

MainThread::CRITICAL::2014-03-13
12:05:47,749::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
Could not start ha-agent
Traceback (most recent call last):
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py,
line 97, in run
 self._run_agent()
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py,
line 154, in _run_agent

hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 303, in start_monitoring
 for old_state, state, delay in self.fsm:
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py,
line 125, in next
 new_data = self.refresh(self._state.data)
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py,
line 77, in refresh
 stats.update(self.hosted_engine.collect_stats())
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py,
line 623, in collect_stats
 constants.SERVICE_TYPE)
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,
line 171, in get_stats_from_storage
 result = self._checked_communicate(request)
   File
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py,
line 191, in _checked_communicate
 return parts[1]
IndexError: list index out of range

I'm

Re: [Users] Self-hosted engine setup ok but no engine vm running

2014-03-23 Thread Jiri Moskovcak

On 03/15/2014 03:03 AM, Giuseppe Ragusa wrote:

Hi all,
while testing further a from-scratch self-hosted-engine installation on
CentOS 6.5 (after two setup restarts: applying a workaround for a
missing pki directory and tweaking my own iptables rules to allow ping
towards the default gateway) on a physical node (oVirt 3.4.0_pre + GlusterFS
3.5.0beta4; NFS storage for engine VM), the process ends successfully
but the Engine VM is not found running afterwards.

I archived the whole /var/log directory and attached it here for completeness.

I'll wait a bit for questions or other hints/requests before trying any
further action.

Many thanks in advance for your assistance,
Giuseppe



According to the logs you ran into: 
https://bugzilla.redhat.com/show_bug.cgi?id=1075126 It's already fixed 
in ovirt-hosted-engine-ha-1.1.2.1


--Jirka




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users





Re: [Users] [Engine-devel] oVirt February 2014 Updates

2014-03-04 Thread Jiri Moskovcak

On 03/03/2014 05:15 PM, Antoni Segura Puimedon wrote:



- Original Message -

From: Itamar Heim ih...@redhat.com
To: users@ovirt.org
Sent: Monday, March 3, 2014 3:25:07 PM
Subject: [Engine-devel] oVirt February 2014 Updates

1. Releases

- oVirt 3.3.3 was released early in the month:
http://www.ovirt.org/OVirt_3.3.3_release_notes

- oVirt 3.3.4 about to release.
http://www.ovirt.org/OVirt_3.3.4_release_notes

- oVirt 3.4.0 about to release!

2. Events
- Leonardo Vaz is organizing ovirt attendance at FISL15, the largest
FOSS conference in LATAM which will happen from 7th to 10th of May in
Porto Alegre, Brazil:
http://softwarelivre.org/fisl15

- Allon Mureinik gave a presentation on DR with oVirt at devconf.cz
http://www.slideshare.net/AllonMureinik/dev-conf-ovirt-dr


Jiří Moskovcak also presented the oVirt scheduler (@Jiri, can you attach
the slides?)



- my slides (it's basically Gilad's slides from FOSDEM): 
http://jmoskovc.fedorapeople.org/scheduling_devconf.odp



I presented vdsm pluggable networking showing how to write parts of a 
configurator
and network hooks.
https://blog.antoni.me/devconf14/
(Better look at it with Chromium, firefox has a bug with svg files)



- oVirt workshop in korea slides (korean)
http://www.slideshare.net/rogan/20140208-ovirtkorea-01
https://www.facebook.com/groups/ovirt.korea

- Rogan also presented oVirt integration with OpenStack in OpenStack
day in Korea
http://alturl.com/m3jnx

- Pat Pierson posted on basic network setup
http://izen.ghostpeppersrus.com/setting-up-networks/

- Fosdem 2014 sessions (slides and videos) are at:
http://www.ovirt.org/FOSDEM_2014

- and some at Infrastructure.Next Ghent the week after FOSDEM.

3. oVirt Activity (software)

- oVirt Jenkins plugin by Dustin Kut Moy Cheung to control VM slaves
managed by ovirt/RHEV
https://github.com/thescouser89/ovirt-slaves-plugin

- Opaque oVirt/RHEV/Proxmox client and source code released
 https://play.google.com/store/apps/details?id=com.undatech.opaque

- great to see the NUMA push from HP:
http://www.ovirt.org/Features/NUMA_and_Virtual_NUMA
http://www.ovirt.org/Features/Detailed_NUMA_and_Virtual_NUMA

4. oVirt Activity (blogs, preso's)

- oVirt has been accepted as a mentoring project for the Google
Summer of Code 2014.

- Oved Ourfali posted on Importing Glance images as oVirt templates
http://alturl.com/h7xid

- v2v has seen many active discussions. Here's a post by Jon Archer on
how to Import regular kvm image to oVirt or RHEV
http://jonarcher.info/2014/02/import-regular-kvm-image-ovirt-rhev/

- great reviews on amazon.com for Getting Started with oVirt 3.3
http://alturl.com/5rk2p

- oVirt Deep Dive 3.3 slides (Chinese)
http://www.slideshare.net/mobile/johnwoolee/ovirt-deep-dive#

- oVirt intro video (russian)
http://alturl.com/it546

- how to install oVirt 3.3 on CentOS 6.5
http://www.youtube.com/watch?v=5i5ilSKsmbo

5. Related
- NetApp GA'd their Virtual Storage Console for RHEV, which is
implemented as an oVirt UI plugin (and then some)
http://captainkvm.com/2014/02/vsc-for-rhev-is-ga-today/#more-660
___
Engine-devel mailing list
engine-de...@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-devel



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users