Re: [ovirt-users] Safe to upgrade HE hosts from GUI?

2016-07-31 Thread Wee Sritippho

On 29/7/2559 17:07, Simone Tiraboschi wrote:

On Fri, Jul 29, 2016 at 11:35 AM, Wee Sritippho <we...@forest.go.th> wrote:

On 29/7/2559 15:50, Simone Tiraboschi wrote:

On Fri, Jul 29, 2016 at 6:31 AM, Wee Sritippho <we...@forest.go.th> wrote:

On 28/7/2559 15:54, Simone Tiraboschi wrote:

On Thu, Jul 28, 2016 at 10:41 AM, Wee Sritippho <we...@forest.go.th>
wrote:

On 21/7/2559 16:53, Simone Tiraboschi wrote:

On Thu, Jul 21, 2016 at 11:43 AM, Wee Sritippho <we...@forest.go.th>
wrote:


Can I just follow

http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
until step 3 and do everything else via GUI?

Yes, absolutely.


Hi, I upgraded a host (host02) via the GUI and now its score is 0. I restarted
the services but the result is still the same. I'm kind of lost now. What
should I do next?


Can you please attach ovirt-ha-agent logs?


Yes, here are the logs:
https://app.box.com/s/b4urjty8dsuj98n3ywygpk3oh5o7pbsh

Thanks Wee,
your issue is here:
MainThread::ERROR::2016-07-17

14:32:45,586::storage_server::143::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path)
The hosted-engine storage domain is already mounted on

'/rhev/data-center/mnt/glusterSD/host02.ovirt.forest.go.th:_hosted__engine/639e689c-8493-479b-a6eb-cc92b6fc4cf4'
with a path that is not supported anymore: the right path should be

'/rhev/data-center/mnt/glusterSD/host01.ovirt.forest.go.th:_hosted__engine/639e689c-8493-479b-a6eb-cc92b6fc4cf4'.

Did you manually try to work around the issue of a single entry point for
the gluster FS volume by using host01.ovirt.forest.go.th:_hosted__engine
and host02.ovirt.forest.go.th:_hosted__engine there?
This can cause a lot of confusion, since the code cannot detect
that the storage domain is the same, and you can end up with it mounted
twice in different locations, leading to a lot of issues.
The correct solution for that issue is the one described here:
https://bugzilla.redhat.com/show_bug.cgi?id=1298693#c20

Now, to get it fixed in your environment you have to hack a bit.
First, edit /etc/ovirt-hosted-engine/hosted-engine.conf on all your
hosted-engine hosts to ensure that the storage field always points to the
same entry point (host01, for instance).
Then on each host you can add something like:

mnt_options=backupvolfile-server=host02.ovirt.forest.go.th:host03.ovirt.forest.go.th,fetch-attempts=2,log-level=WARNING,log-file=/var/log/engine_domain.log

Then check the representation of your storage connection in the
storage_server_connections table of the engine DB and make sure that
connection refers to the entry point you used in hosted-engine.conf on
all your hosts; lastly, set the value of mount_options there as well.
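For reference, a minimal sketch of that check, assuming the chosen entry point
is host01 and that the engine database is the default 'engine' DB reachable
locally as the postgres user on the engine VM:

# on each hosted-engine host: confirm the entry point and the mount options
grep -E '^(storage|mnt_options)=' /etc/ovirt-hosted-engine/hosted-engine.conf

# on the engine VM: read-only check of the stored connection
sudo -u postgres psql engine -c "SELECT connection, mount_options FROM storage_server_connections;"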

Weird. The configuration on all hosts already refers to host01.

But you surely have a connection pointing to host02 somewhere; did
you try to manually deploy from the CLI, connecting the gluster volume on
host02?

If I recall correctly, yes.

Also, in the storage_server_connections table:

engine=> SELECT * FROM storage_server_connections;
 id            | bd78d299-c8ff-4251-8aab-432ce6443ae8
 connection    | host01.ovirt.forest.go.th:/hosted_engine
 user_name     |
 password      |
 iqn           |
 port          |
 portal        | 1
 storage_type  | 7
 mount_options |
 vfs_type      | glusterfs
 nfs_version   |
 nfs_timeo     |
 nfs_retrans   |
(1 row)



Please also tune the value of network.ping-timeout for your glusterFS
volume to avoid this:
   https://bugzilla.redhat.com/show_bug.cgi?id=1319657#c17
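For reference, a sketch of how that option can be set; the volume name
'hosted_engine' and the 10-second value are assumptions here, so follow the
value recommended in the bug comment above:

# on one of the gluster nodes
gluster volume set hosted_engine network.ping-timeout 10
gluster volume info hosted_engine    # confirm it under 'Options Reconfigured'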


--
Wee



--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Safe to upgrade HE hosts from GUI?

2016-07-29 Thread Wee Sritippho

On 29/7/2559 15:50, Simone Tiraboschi wrote:

On Fri, Jul 29, 2016 at 6:31 AM, Wee Sritippho <we...@forest.go.th> wrote:

On 28/7/2559 15:54, Simone Tiraboschi wrote:

On Thu, Jul 28, 2016 at 10:41 AM, Wee Sritippho <we...@forest.go.th> wrote:

On 21/7/2559 16:53, Simone Tiraboschi wrote:

On Thu, Jul 21, 2016 at 11:43 AM, Wee Sritippho <we...@forest.go.th>
wrote:


Can I just follow
http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
until step 3 and do everything else via GUI?

Yes, absolutely.


Hi, I upgraded a host (host02) via the GUI and now its score is 0. I restarted
the services but the result is still the same. I'm kind of lost now. What should I
do next?


Can you please attach ovirt-ha-agent logs?


Yes, here are the logs:
https://app.box.com/s/b4urjty8dsuj98n3ywygpk3oh5o7pbsh

Thanks Wee,
your issue is here:
MainThread::ERROR::2016-07-17
14:32:45,586::storage_server::143::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path)
The hosted-engine storage domain is already mounted on
'/rhev/data-center/mnt/glusterSD/host02.ovirt.forest.go.th:_hosted__engine/639e689c-8493-479b-a6eb-cc92b6fc4cf4'
with a path that is not supported anymore: the right path should be
'/rhev/data-center/mnt/glusterSD/host01.ovirt.forest.go.th:_hosted__engine/639e689c-8493-479b-a6eb-cc92b6fc4cf4'.

Did you manually try to work around the issue of a single entry point for
the gluster FS volume by using host01.ovirt.forest.go.th:_hosted__engine
and host02.ovirt.forest.go.th:_hosted__engine there?
This can cause a lot of confusion, since the code cannot detect
that the storage domain is the same, and you can end up with it mounted
twice in different locations, leading to a lot of issues.
The correct solution for that issue is the one described here:
https://bugzilla.redhat.com/show_bug.cgi?id=1298693#c20

Now, to get it fixed in your environment you have to hack a bit.
First, edit /etc/ovirt-hosted-engine/hosted-engine.conf on all your
hosted-engine hosts to ensure that the storage field always points to the
same entry point (host01, for instance).
Then on each host you can add something like:
mnt_options=backupvolfile-server=host02.ovirt.forest.go.th:host03.ovirt.forest.go.th,fetch-attempts=2,log-level=WARNING,log-file=/var/log/engine_domain.log

Then check the representation of your storage connection in the
storage_server_connections table of the engine DB and make sure that
connection refers to the entry point you used in hosted-engine.conf on
all your hosts; lastly, set the value of mount_options there as well.

Weird. The configuration on all hosts already refers to host01.

Also, in the storage_server_connections table:

engine=> SELECT * FROM storage_server_connections;
 id            | bd78d299-c8ff-4251-8aab-432ce6443ae8
 connection    | host01.ovirt.forest.go.th:/hosted_engine
 user_name     |
 password      |
 iqn           |
 port          |
 portal        | 1
 storage_type  | 7
 mount_options |
 vfs_type      | glusterfs
 nfs_version   |
 nfs_timeo     |
 nfs_retrans   |
(1 row)



Please also tune the value of network.ping-timeout for your glusterFS
volume to avoid this:
  https://bugzilla.redhat.com/show_bug.cgi?id=1319657#c17


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Safe to upgrade HE hosts from GUI?

2016-07-28 Thread Wee Sritippho

On 28/7/2559 15:54, Simone Tiraboschi wrote:
On Thu, Jul 28, 2016 at 10:41 AM, Wee Sritippho <we...@forest.go.th> wrote:


On 21/7/2559 16:53, Simone Tiraboschi wrote:

On Thu, Jul 21, 2016 at 11:43 AM, Wee Sritippho <we...@forest.go.th> wrote:

Can I just follow

http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
until step 3 and do everything else via GUI?

Yes, absolutely.


Hi, I upgraded a host (host02) via the GUI and now its score is 0.
I restarted the services but the result is still the same. I'm kind of
lost now. What should I do next?


Can you please attach ovirt-ha-agent logs?

Yes, here are the logs:
https://app.box.com/s/b4urjty8dsuj98n3ywygpk3oh5o7pbsh

--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Safe to upgrade HE hosts from GUI?

2016-07-28 Thread Wee Sritippho

On 21/7/2559 16:53, Simone Tiraboschi wrote:
On Thu, Jul 21, 2016 at 11:43 AM, Wee Sritippho <we...@forest.go.th> wrote:


Can I just follow

http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
until step 3 and do everything else via GUI?

Yes, absolutely.

Hi, I upgraded a host (host02) via the GUI and now its score is 0. I restarted
the services but the result is still the same. I'm kind of lost now. What
should I do next?


[root@host02 ~]# service vdsmd restart
Redirecting to /bin/systemctl restart  vdsmd.service
[root@host02 ~]# systemctl restart ovirt-ha-broker && systemctl restart 
ovirt-ha-agent

[root@host02 ~]# systemctl status ovirt-ha-broker
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability 
Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; 
enabled; vendor preset: disabled)

   Active: active (running) since Thu 2016-07-28 15:09:38 ICT; 20min ago
 Main PID: 4614 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
   └─4614 /usr/bin/python 
/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon


Jul 28 15:29:35 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
established
Jul 28 15:29:35 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
closed
Jul 28 15:29:35 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
established
Jul 28 15:29:35 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
closed
Jul 28 15:29:48 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
established
Jul 28 15:29:48 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
closed
Jul 28 15:29:48 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
established
Jul 28 15:29:48 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
closed
Jul 28 15:29:48 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
established
Jul 28 15:29:48 host02.ovirt.forest.go.th ovirt-ha-broker[4614]: 
INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection 
closed

[root@host02 ~]# systemctl status ovirt-ha-agent
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability 
Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; 
enabled; vendor preset: disabled)

   Active: active (running) since Thu 2016-07-28 15:28:34 ICT; 1min 19s ago
 Main PID: 11488 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
   └─11488 /usr/bin/python 
/usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon


Jul 28 15:29:52 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: 
/usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: 
DeprecationWarning: Dispatcher.pend...instead.
Jul 28 15:29:52 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: pending 
= getattr(dispatcher, 'pending', lambda: 0)
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: 
/usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: 
DeprecationWarning: Dispatcher.pend...instead.
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: pending 
= getattr(dispatcher, 'pending', lambda: 0)
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: 
/usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: 
DeprecationWarning: Dispatcher.pend...instead.
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: pending 
= getattr(dispatcher, 'pending', lambda: 0)
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: 
/usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: 
DeprecationWarning: Dispatcher.pend...instead.
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: pending 
= getattr(dispatcher, 'pending', lambda: 0)
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: 
ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error: 
'Attempt to call functi...rt agent
Jul 28 15:29:53 host02.ovirt.forest.go.th ovirt-ha-agent[11488]: 
ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Error: 'Attempt to call 
function: teardownIma...rt agent

Hint: Some lines were ellipsized, use -l to show in full.
[root@host01 ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : {"h

[ovirt-users] Safe to upgrade HE hosts from GUI?

2016-07-21 Thread Wee Sritippho

Hi,

I used to follow
http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine
when upgrading Hosted Engine (HE), but I always fail to make the engine VM
migrate to the freshly upgraded host as described in step 7. Furthermore,
the "update available" icon never disappears from the GUI.


So I thought using the GUI might be better for an amateur like me.

Can I just follow 
http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine 
until step 3 and do everything else via GUI?
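For context, the per-host part of that procedure (local maintenance, update,
then exit maintenance) is roughly the following; this is a paraphrase based on
this thread, not the authoritative document:

# on the host being upgraded
hosted-engine --set-maintenance --mode=local
yum update ovirt-hosted-engine-setup ovirt-hosted-engine-ha vdsm
# ... restart/reboot as needed, then once the host is back up:
hosted-engine --set-maintenance --mode=none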


Thank you,


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [hosted-engine] engine VM didn't migrate to the fresh upgraded host

2016-07-07 Thread Wee Sritippho

On 7/7/2559 15:41, Simone Tiraboschi wrote:

On Thu, Jul 7, 2016 at 7:17 AM, Wee Sritippho <we...@forest.go.th> wrote:

Hi, I followed this instruction:

http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine

However, when I exited the global maintenance mode in step 7 and waited for
about 15 minutes, the engine VM still didn't migrate to the freshly upgraded
host.

In your case it didn't migrate since both host-2 and host-3 were
already at 3400 points, so there wasn't any reason to migrate.


BTW, after step 6, did I have to take the host out of its local maintenance
mode? The instructions didn't state this, so I guessed it was a special case
when upgrading and didn't do anything.

hosted-engine --set-maintenance --mode=none will also exit local
maintenance mode.


I didn't type the command, but right-clicked on the engine VM and chose
'Disable Global HA Maintenance Mode'. Could this be the cause that
prevented the engine VM from migrating?


I'm now suffering from this. I didn't wait until the engine VM migrated, and
when upgrading the last host (the one the engine VM lives on), the host
got stuck at "Preparing For Maintenance" forever.



[root@host01 me]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 0
stopped: False
Local maintenance  : True
crc32  : 33cc9d8c
Host timestamp : 4993624


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : host02.ovirt.forest.go.th
Host ID: 2
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 6dc9b311
Host timestamp : 4244063


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : host03.ovirt.forest.go.th
Host ID: 3
Engine status  : {"health": "good", "vm": "up",
"detail": "up"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 29513baf
Host timestamp : 5537027

Thank you

--
Wee


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Only 1 VM migrated to another host after setting a host to maintenance mode - another VM didn't

2016-07-07 Thread Wee Sritippho
I'm sorry. After looking thoroughly, the ordinary VM *did* migrate; the
one that didn't migrate is the engine VM.


I've added agent.log & broker.log to the logs folder: 
https://app.box.com/s/ibi8pjxyxv4khek2menho1orumstek8y



On 7/7/2559 15:34, Wee Sritippho wrote:


Hi, I was trying to upgrade the hosted-engine environment from v3.6.5 
to v3.6.7.


Upgrading host01 and host02 went OK. But when I tried to put host03
(which has the engine VM on it) into maintenance mode, the process got stuck
- only the engine VM was migrated; another ordinary VM wasn't.


I also noticed that a lot of "Moving Host xxx to Maintenance" tasks were
gradually created but never finished.


Here are the log files: 
https://app.box.com/s/ibi8pjxyxv4khek2menho1orumstek8y


--
Wee


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Only 1 VM migrated to another host after setting a host to maintenance mode - another VM didn't

2016-07-07 Thread Wee Sritippho
Hi, I was trying to upgrade the hosted-engine environment from v3.6.5 to 
v3.6.7.


Upgrading host01 and host02 went OK. But when I tried to put host03
(which has the engine VM on it) into maintenance mode, the process got stuck -
only the engine VM was migrated; another ordinary VM wasn't.


I also noticed that a lot of "Moving Host xxx to Maintenance" tasks were
gradually created but never finished.


Here are the log files: 
https://app.box.com/s/ibi8pjxyxv4khek2menho1orumstek8y


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [hosted-engine] engine VM didn't migrate to the fresh upgraded host

2016-07-06 Thread Wee Sritippho

Hi, I followed this instruction:

http://www.ovirt.org/documentation/how-to/hosted-engine/#upgrade-hosted-engine

However, when I exited the global maintenance mode in step 7 and waited
for about 15 minutes, the engine VM still didn't migrate to the freshly
upgraded host.


BTW, after step 6, did I have to take the host out of its local
maintenance mode? The instructions didn't state this, so I guessed it was a
special case when upgrading and didn't do anything.


[root@host01 me]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : {"reason": "vm not running on this 
host", "health": "bad", "vm": "down", "detail": "unknown"}

Score  : 0
stopped: False
Local maintenance  : True
crc32  : 33cc9d8c
Host timestamp : 4993624


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : host02.ovirt.forest.go.th
Host ID: 2
Engine status  : {"reason": "vm not running on this 
host", "health": "bad", "vm": "down", "detail": "unknown"}

Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 6dc9b311
Host timestamp : 4244063


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : host03.ovirt.forest.go.th
Host ID: 3
Engine status  : {"health": "good", "vm": "up", 
"detail": "up"}

Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 29513baf
Host timestamp : 5537027

Thank you

--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Adding another host to my cluster

2016-05-12 Thread Wee Sritippho
Hi,

I used to have a similar problem where one of my hosts couldn't be deployed due to
the absence of the ovirtmgmt bridge. Simone said it's a bug
(https://bugzilla.redhat.com/1323465) which would be fixed in 3.6.6.

This is what I've done to solve it:

1. In the web UI, set the failed host to maintenance.
2. Remove it.
3. In that host, run a script from 
https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install
4. Install ovirt-hosted-engine-setup again.
5. Redeploy again.

Hope that helps
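For reference, steps 4-5 roughly correspond to the following on the affected
host (a sketch; the cleanup script itself lives at the URL in step 3 and is
not reproduced here):

yum install -y ovirt-hosted-engine-setup   # step 4
hosted-engine --deploy                     # step 5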

On 11 May 2016 at 22:48:58 GMT+07:00, Gervais de Montbrun wrote:
>Hi Folks,
>
>I hate to reply to my own message, but I'm really hoping someone can
>help me with my issue
>http://lists.ovirt.org/pipermail/users/2016-May/039690.html
>
>
>Does anyone have a suggestion for me? If there is any more information
>that I can provide that would help you to help me, please advise.
>
>Cheers,
>Gervais
>
>
>
>> On May 9, 2016, at 1:42 PM, Gervais de Montbrun
>> wrote:
>> 
>> Hi All,
>> 
>> I'm trying to add a third host into my oVirt cluster. I have hosted
>engine setup on the first two. It's failing to finish the hosted-engine
>--deploy on this third host. I wiped the server and did a CentOS 7
>minimum install and ran it again to have a clean machine.
>> 
>> My setup:
>> CentOS 7 clean install
>> yum install -y
>http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
>
>> yum install -y ovirt-hosted-engine-setup
>> yum upgrade -y && reboot
>> systemctl disable NetworkManager ; systemctl stop NetworkManager ;
>systemctl disable firewalld ; systemctl stop firewalld
>> hosted-engine --deploy
>> 
>> hosted-engine --deploy always throws an error:
>> [ ERROR ] The VDSM host was found in a failed state. Please check
>engine and bootstrap installation logs.
>> [ ERROR ] Unable to add Cultivar2 to the manager
>> and then echo's
>> [ INFO  ] Waiting for VDSM hardware info
>> ...
>> [ ERROR ] Failed to execute stage 'Closing up': VDSM did not start
>within 120 seconds
>> [ INFO  ] Stage: Clean up
>> [ INFO  ] Generating answer file
>'/var/lib/ovirt-hosted-engine-setup/answers/answers-20160509131103.conf'
>> [ INFO  ] Stage: Pre-termination
>> [ INFO  ] Stage: Termination
>> [ ERROR ] Hosted Engine deployment failed: this system is not
>reliable, please check the issue, fix and redeploy
>>  Log file is located at
>/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160509130658-qb8ev0.log
>> 
>> Full output of hosted-engine --deploy included in the attached zip
>file.
>> I've also included vdsm.log (There is more than one tries worth of
>tries in there).
>> You'll also find the
>ovirt-hosted-engine-setup-20160509130658-qb8ev0.log listed above.
>> 
>> This is my "test" setup. Cultivar0 is my first host and my nfs server
>for storage. I have two hosts in the setup already and everything is
>working fine. The host does show up in the oVirt admin, but shows
>"Installed Failed"
>> 
>> 
>> Trying to reinstall from within the interface just fails again.
>> 
>> The ovirt bridge interface is not configured and there are no config
>files in /etc/sysconfig/network-scripts related to ovirt.
>> 
>> OS:
>> [root@cultivar2 ovirt-hosted-engine-setup]# cat /etc/redhat-release 
>> CentOS Linux release 7.2.1511 (Core) 
>> 
>> [root@cultivar2 ovirt-hosted-engine-setup]# uname -a
>> Linux cultivar2.grove.silverorange.com
> 3.10.0-327.13.1.el7.x86_64
>#1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>> 
>> Versions:
>> [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i ovirt
>> libgovirt-0.3.3-1.el7_2.1.x86_64
>> ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
>> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
>> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
>> ovirt-release36-007-1.noarch
>> ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
>> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
>> ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
>> [root@cultivar2 ovirt-hosted-engine-setup]# 
>> [root@cultivar2 ovirt-hosted-engine-setup]# 
>> [root@cultivar2 ovirt-hosted-engine-setup]# 
>> [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i virt
>> libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64
>> virt-viewer-2.0-6.el7.x86_64
>> libgovirt-0.3.3-1.el7_2.1.x86_64
>> libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64
>> ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
>> fence-virt-0.3.2-2.el7.x86_64
>> virt-what-1.13-6.el7.x86_64
>> libvirt-python-1.2.17-2.el7.x86_64
>> libvirt-daemon-1.2.17-13.el7_2.4.x86_64
>> libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64
>> libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64
>> libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64
>> 

Re: [ovirt-users] Fencing failed, fence agent ipmilan used instead of ilo4

2016-05-10 Thread Wee Sritippho

Found a workaround.

I changed the fence agent type from "ilo4" to "ipmilan", then added
"lanplus=1,power_wait=30" (without quotes) to the options.


Now the host can be fenced successfully, and all HA VMs on that host are
restarted on other hosts.


I did a small experiment with the power_wait parameter; here are the results:
- power_wait=60 : HA VMs are restarted and pingable ~2:45 minutes
after the connection is lost
- power_wait=30 : HA VMs are restarted and pingable ~2:15 minutes
after the connection is lost
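For reference, the same agent and options can be tested by hand from the proxy
host before relying on them; a sketch using the fence agent's stdin interface
(the iLO credentials are placeholders):

fence_ipmilan <<'EOF'
ipaddr=172.16.3.5
login=ILO_USER
passwd=ILO_PASSWORD
lanplus=1
power_wait=30
action=status
EOF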


On 10/5/2559 12:52, Wee Sritippho wrote:

Hi,

I'm running an oVirt hosted-engine environment on 3 hosts. To test
the VMs' HA functionality, I shut down host02's link, which one of my HA
VMs is running on, using this command:


2016-05-10 09:59:19 ICT [root@host02 ~]# ip link set bond0 down

A few seconds later, an attempt to fence host02 was issued, and this
entry appeared in the web UI event tab: "May 10, 2016 10:00:34 ...
Executing power management status on Host hosted_engine_2 using Proxy
Host hosted_engine_1 and Fence Agent ipmilan:172.16.3.5.". The IP
"172.16.3.5" was correct, but the Fence Agent "ipmilan" was not.


Even though a failure message "May 10, 2016 10:00:36 ... Execution of
power management status on Host hosted_engine_2 using Proxy Host
hosted_engine_1 and Fence Agent ipmilan:172.16.3.5 failed." appeared in
the web UI event tab, host02 was successfully powered off.


The last message in the web UI event tab is "May 10, 2016 10:00:40 AM
... Host hosted_engine_2 is rebooting.", but the host wasn't actually
rebooted - I had to boot it manually using the iLO web UI.


How can I fix this issue in order to make the VMs' HA work?

Thank you.

Here are my power management settings:
hosted_engine_1 -> ilo4 : 172.16.3.4
hosted_engine_2 -> ilo4 : 172.16.3.5
hosted_engine_3 -> ilo4 : 172.16.3.6

Here are the log files:
https://app.box.com/s/fs5let8955rjbcuxuy0p42ixj4dzou6m

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-backend-3.6.5.3-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-setup-base-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-backup-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-3.6.5.3-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-3.6.5.3-1.el7.centos.noarch
ovirt-engine-restapi-3.6.5.3-1.el7.centos.noarch
ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.11-1.el7.noarch
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-lib-3.6.5.3-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch 


ovirt-engine-userportal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.5.3-1.el7.centos.noarch

[root@host03 ~]# rpm -qa | grep ovirt
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch

[root@host03 ~]# rpm -qa | grep vdsm
vdsm-cli-4.17.26-1.el7.noarch
vdsm-4.17.26-1.el7.noarch
vdsm-infra-4.17.26-1.el7.noarch
vdsm-xmlrpc-4.17.26-1.el7.noarch
vdsm-yajsonrpc-4.17.26-1.el7.noarch
vdsm-hook-vmfex-dev-4.17.26-1.el7.noarch
vdsm-python-4.17.26-1.el7.noarch
vdsm-jsonrpc-4.17.26-1.el7.noarch



--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Fencing failed, fence agent ipmilan used instead of ilo4

2016-05-09 Thread Wee Sritippho

Hi,

I'm running an oVirt hosted-engine environment on 3 hosts. To test the VMs'
HA functionality, I shut down host02's link, which one of my HA VMs is
running on, using this command:


2016-05-10 09:59:19 ICT [root@host02 ~]# ip link set bond0 down

A few seconds later, an attempt to fence host02 was issued, and this entry
appeared in the web UI event tab: "May 10, 2016 10:00:34 ... Executing
power management status on Host hosted_engine_2 using Proxy Host
hosted_engine_1 and Fence Agent ipmilan:172.16.3.5.". The IP
"172.16.3.5" was correct, but the Fence Agent "ipmilan" was not.


Even though a failure message "May 10, 2016 10:00:36 ... Execution of
power management status on Host hosted_engine_2 using Proxy Host
hosted_engine_1 and Fence Agent ipmilan:172.16.3.5 failed." appeared in
the web UI event tab, host02 was successfully powered off.


The last message in the web UI event tab is "May 10, 2016 10:00:40 AM
... Host hosted_engine_2 is rebooting.", but the host wasn't actually
rebooted - I had to boot it manually using the iLO web UI.


How can I fix this issue in order to make the VMs' HA work?

Thank you.

Here are my power management settings:
hosted_engine_1 -> ilo4 : 172.16.3.4
hosted_engine_2 -> ilo4 : 172.16.3.5
hosted_engine_3 -> ilo4 : 172.16.3.6

Here are the log files:
https://app.box.com/s/fs5let8955rjbcuxuy0p42ixj4dzou6m

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-backend-3.6.5.3-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-setup-base-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-backup-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-3.6.5.3-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-3.6.5.3-1.el7.centos.noarch
ovirt-engine-restapi-3.6.5.3-1.el7.centos.noarch
ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.11-1.el7.noarch
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-lib-3.6.5.3-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-userportal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.5.3-1.el7.centos.noarch

[root@host03 ~]# rpm -qa | grep ovirt
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch

[root@host03 ~]# rpm -qa | grep vdsm
vdsm-cli-4.17.26-1.el7.noarch
vdsm-4.17.26-1.el7.noarch
vdsm-infra-4.17.26-1.el7.noarch
vdsm-xmlrpc-4.17.26-1.el7.noarch
vdsm-yajsonrpc-4.17.26-1.el7.noarch
vdsm-hook-vmfex-dev-4.17.26-1.el7.noarch
vdsm-python-4.17.26-1.el7.noarch
vdsm-jsonrpc-4.17.26-1.el7.noarch

--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

2016-05-04 Thread Wee Sritippho


On 4 May 2016 at 18:48:25 GMT+07:00, Martin Sivak <msi...@redhat.com> wrote:
>Hi,
>
>you have an ISO domain inside the hosted engine VM, don't you?
>
>MainThread::INFO::2016-05-04
>12:28:47,090::ovf_store::109::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>Extracting Engine VM OVF from the OVF_STORE
>MainThread::INFO::2016-05-04
>12:38:47,504::ovf_store::116::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>OVF_STORE volume path:
>/rhev/data-center/mnt/blockSD/d2dad0e9-4f7d-41d6-b61c-487d44ae6d5d/images/157b67ef-1a29-4e51-9396-79d3425b7871/a394b440-91bb-4c7c-b344-146240d66a43
>
>There is a 10 minute gap between two log lines. We log something every
>10 seconds..
>
>Please check https://bugzilla.redhat.com/show_bug.cgi?id=1332813 to
>see if it might be the same issue.

Yes, exactly the same issue.

Thank you.

>Regards
>
>--
>Martin Sivak
>SLA / oVirt
>
>
>On Wed, May 4, 2016 at 8:34 AM, Wee Sritippho <we...@forest.go.th>
>wrote:
>> I've tried again and made sure all hosts have same clock.
>>
>> After added all 3 hosts, I tested it by shutting down host01. The
>engine was
>> restarted on host02 in less than 2 minutes. I enabled and tested
>power
>> management on all hosts (using ilo4), then tried disabling host02's
>network
>> to test the fencing. Waited for about 5 minutes and saw in the
>console that
>> host02 wasn't fenced. I thought the fencing didn't work and enabled
>the
>> network again. host02 was then fenced immediately after the network
>was
>> enabled (didn't know why) and the engine was never restarted, even
>when
>> host02 is up and running again. I have to start the engine vm
>manually by
>> running "hosted-engine --vm-start" on host02.
>>
>> I thought it might have something to do with ilo4, so I disabled
>power
>> management for all hosts and tried to poweroff host02 again. After
>about 10
>> minutes, the engine still won't start, so I manually start it on
>host01
>> instead.
>>
>> Here are my recent actions:
>>
>> 2016-05-04 12:25:51 ICT - run hosted-engine --vm-status on host01, vm
>is
>> running on host01
>> 2016-05-04 12:28:32 ICT - run reboot on host01, engine vm is down
>> 2016-05-04 12:34:57 ICT - run hosted-engine --vm-status on host01,
>engine
>> status on every hosts is "unknown stale-data", host01's score=0,
>> stopped=true
>> 2016-05-04 12:37:30 ICT - host01 is pingable
>> 2016-05-04 12:41:09 ICT - run hosted-engine --vm-status on host02,
>engine
>> status on every hosts is "unknown stale-data", all hosts' score=3400,
>> stopped=false
>> 2016-05-04 12:43:29 ICT - run hosted-engine --vm-status on host02, vm
>is
>> running on host01
>>
>> Log files: https://app.box.com/s/jjgn14onv19e1qi82mkf24jl2baa2l9s
>>
>>
>> On 1/5/2559 19:32, Yedidyah Bar David wrote:
>>>
>>> It's very hard to understand your flow when time moves backwards.
>>>
>>> Please try again from a clean state. Make sure all hosts have same
>clock.
>>> Then document the exact time you do stuff - starting/stopping a
>host,
>>> checking status, etc.
>>>
>>> Some things to check from your logs:
>>>
>>> in agent.host01.log:
>>>
>>> MainThread::INFO::2016-04-25
>>>
>>>
>15:32:41,370::states::488::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>>> Engine down and local host has best score (3400), attempting to
>start
>>> engine VM
>>> ...
>>> MainThread::INFO::2016-04-25
>>>
>>>
>15:32:44,276::hosted_engine::1147::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
>>> Engine VM started on localhost
>>> ...
>>> MainThread::INFO::2016-04-25
>>>
>>>
>15:32:58,478::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>>> Score is 0 due to unexpected vm shutdown at Mon Apr 25 15:32:58 2016
>>>
>>> Why?
>>>
>>> Also, in agent.host03.log:
>>>
>>> MainThread::INFO::2016-04-25
>>>
>>>
>15:29:53,218::states::488::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>>> Engine down and local host has best score (3400), attempting to
>start
>>> engine VM
>>> MainThread::INFO::2016-04-25
>>>
>>>
>15:29:53,223::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>>> Trying: notify time=1461572993.22 type=state_tra

Re: [ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

2016-05-04 Thread Wee Sritippho

I've tried again and made sure all hosts have same clock.

After adding all 3 hosts, I tested it by shutting down host01. The engine
was restarted on host02 in less than 2 minutes. I enabled and tested
power management on all hosts (using ilo4), then tried disabling
host02's network to test the fencing. I waited for about 5 minutes and saw
in the console that host02 wasn't fenced. I thought the fencing didn't
work and enabled the network again. host02 was then fenced immediately
after the network was enabled (I don't know why) and the engine was never
restarted, even when host02 was up and running again. I had to start the
engine VM manually by running "hosted-engine --vm-start" on host02.


I thought it might have something to do with ilo4, so I disabled power
management for all hosts and tried to power off host02 again. After about
10 minutes, the engine still wouldn't start, so I manually started it on
host01 instead.


Here are my recent actions:

2016-05-04 12:25:51 ICT - run hosted-engine --vm-status on host01, vm is 
running on host01

2016-05-04 12:28:32 ICT - run reboot on host01, engine vm is down
2016-05-04 12:34:57 ICT - run hosted-engine --vm-status on host01,
engine status on every host is "unknown stale-data", host01's score=0,
stopped=true

2016-05-04 12:37:30 ICT - host01 is pingable
2016-05-04 12:41:09 ICT - run hosted-engine --vm-status on host02,
engine status on every host is "unknown stale-data", all hosts'
score=3400, stopped=false
2016-05-04 12:43:29 ICT - run hosted-engine --vm-status on host02, vm is 
running on host01


Log files: https://app.box.com/s/jjgn14onv19e1qi82mkf24jl2baa2l9s

On 1/5/2559 19:32, Yedidyah Bar David wrote:

It's very hard to understand your flow when time moves backwards.

Please try again from a clean state. Make sure all hosts have same clock.
Then document the exact time you do stuff - starting/stopping a host,
checking status, etc.

Some things to check from your logs:

in agent.host01.log:

MainThread::INFO::2016-04-25
15:32:41,370::states::488::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down and local host has best score (3400), attempting to start
engine VM
...
MainThread::INFO::2016-04-25
15:32:44,276::hosted_engine::1147::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
Engine VM started on localhost
...
MainThread::INFO::2016-04-25
15:32:58,478::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
Score is 0 due to unexpected vm shutdown at Mon Apr 25 15:32:58 2016

Why?

Also, in agent.host03.log:

MainThread::INFO::2016-04-25
15:29:53,218::states::488::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down and local host has best score (3400), attempting to start
engine VM
MainThread::INFO::2016-04-25
15:29:53,223::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1461572993.22 type=state_transition
detail=EngineDown-EngineStart hostname='host03.ovirt.forest.go.th'
MainThread::ERROR::2016-04-25
15:30:23,253::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate)
Connection closed: Connection timed out

Why?

Also, in addition to the actions you stated, you changed the maintenance mode a lot.

You can try something like this to get some interesting lines from agent.log:

egrep -i 'start eng|shut|vm started|vm running|vm is running on|
maintenance detected|migra'
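For example, against the default agent log location, that would be something
like (the path is the package default, not stated in this thread):

egrep -i 'start eng|shut|vm started|vm running|vm is running on|maintenance detected|migra' /var/log/ovirt-hosted-engine-ha/agent.log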

Best,

On Mon, Apr 25, 2016 at 12:27 PM, Wee Sritippho <we...@forest.go.th> wrote:

The hosted engine storage is located in an external Fibre Channel SAN.


On 25/4/2559 16:19, Martin Sivak wrote:

Hi,

it seems that all nodes lost access to storage for some reason after
the host was killed. Where is your hosted engine storage located?

Regards

--
Martin Sivak
SLA / oVirt


On Mon, Apr 25, 2016 at 10:58 AM, Wee Sritippho <we...@forest.go.th>
wrote:

Hi,

From the hosted-engine FAQ, the engine VM should be up and running about
5 minutes after its host is forcibly powered off. However, after updating
oVirt 3.6.4 to 3.6.5, the engine VM won't restart automatically even after
10+ minutes (I already made sure that global maintenance mode is set to
none). I initially thought it was a time-sync issue, so I installed and
enabled ntp on the hosts and engine. However, the issue still persists.

###Versions:
[root@host01 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
[root@host01 ~]# rpm -qa | grep vdsm
vdsm-infra-4.17.26-0.el7.centos.noarch
vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
vdsm-gluster-4.17.26-0.e

Re: [ovirt-users] Additional hosted-engine host deployment failed due to message timeout

2016-05-03 Thread Wee Sritippho

On 3/5/2559 13:25, knarra wrote:

On 05/03/2016 08:35 AM, Wee Sritippho wrote:

Hi,

I'm making a fresh hosted-engine installation on 3 hosts. The first 2
hosts succeeded, but the 3rd one got stuck at the termination state
"Installing Host hosted_engine_3. Stage: Termination.", then, 3
minutes later, "VDSM hosted_engine_3 command failed: Message timeout
which can be caused by communication issues". Currently, the 3rd host's
status in the web UI is stuck at "Installing".


How can I proceed? Could I just run this script
<https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install>
on the 3rd host, and run "hosted-engine --deploy" again?
You could do service ovirt-engine restart and remove the host from the 
UI. Redeploying the second time should work fine.

Thank you. This is what I did, and it succeeded:
1. Set host03 to maintenance.
2. Removed host03.
3. Redeployed host03 with "hosted-engine --deploy", but it failed with "[
ERROR ] Failed to execute stage 'Programs detection': Hosted Engine HA
services are already running on this system. Hosted Engine cannot be
deployed on a host already running those services."
4. Ran the script from
https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install
5. Installed ovirt-hosted-engine-setup again.
6. Redeployed again.


log files: https://app.box.com/s/a5typfe6cbozs9uo9osg68gtmq8793t6

[me@host03 ~]$ rpm -qa | grep ovirt
ovirt-release36-007-1.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch

[me@host03 ~]$ rpm -qa | grep vdsm
vdsm-hook-vmfex-dev-4.17.26-1.el7.noarch
vdsm-xmlrpc-4.17.26-1.el7.noarch
vdsm-infra-4.17.26-1.el7.noarch
vdsm-yajsonrpc-4.17.26-1.el7.noarch
vdsm-python-4.17.26-1.el7.noarch
vdsm-4.17.26-1.el7.noarch
vdsm-cli-4.17.26-1.el7.noarch
vdsm-jsonrpc-4.17.26-1.el7.noarch

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-backend-3.6.5.3-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-setup-base-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-backup-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-3.6.5.3-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-3.6.5.3-1.el7.centos.noarch
ovirt-engine-restapi-3.6.5.3-1.el7.centos.noarch
ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-lib-3.6.5.3-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-userportal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.5.3-1.el7.centos.noarch

Thanks,
--
Wee


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users




--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Additional hosted-engine host deployment failed due to message timeout

2016-05-02 Thread Wee Sritippho

Hi,

I'm making a fresh hosted-engine installation on 3 hosts. The first 2 hosts
succeeded, but the 3rd one got stuck at the termination state "Installing Host
hosted_engine_3. Stage: Termination.", then, 3 minutes later, "VDSM
hosted_engine_3 command failed: Message timeout which can be caused by
communication issues". Currently, the 3rd host's status in the web UI is stuck
at "Installing".


How can I proceed? Could I just run this script
<https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install>
on the 3rd host, and run "hosted-engine --deploy" again?


log files: https://app.box.com/s/a5typfe6cbozs9uo9osg68gtmq8793t6

[me@host03 ~]$ rpm -qa | grep ovirt
ovirt-release36-007-1.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch

[me@host03 ~]$ rpm -qa | grep vdsm
vdsm-hook-vmfex-dev-4.17.26-1.el7.noarch
vdsm-xmlrpc-4.17.26-1.el7.noarch
vdsm-infra-4.17.26-1.el7.noarch
vdsm-yajsonrpc-4.17.26-1.el7.noarch
vdsm-python-4.17.26-1.el7.noarch
vdsm-4.17.26-1.el7.noarch
vdsm-cli-4.17.26-1.el7.noarch
vdsm-jsonrpc-4.17.26-1.el7.noarch

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-backend-3.6.5.3-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-setup-base-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-backup-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-3.6.5.3-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-3.6.5.3-1.el7.centos.noarch
ovirt-engine-restapi-3.6.5.3-1.el7.centos.noarch
ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-lib-3.6.5.3-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-userportal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.5.3-1.el7.centos.noarch

Thanks,

--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [hosted-engine] Dedicated LUN for hosted_engine data domain?

2016-04-26 Thread Wee Sritippho

Hi,

I used to install oVirt hosted-engine 3.5 using Fibre Channel storage.
It required a dedicated LUN, which couldn't be used by other VMs, to host
the hosted engine VM. Therefore, I had to create a small LUN (50 GB) for
the engine VM in addition to the LUN that I use to host other VMs.


Does this still apply in 3.6.x?

I'm currently using 3.6.4 and notice that I can now choose to create a
new domain using the LUN reserved for the hosted engine (I didn't choose
that LUN, but when I created my first data domain, the 'hosted_storage'
domain was created automatically and that LUN can't be selected by new
domains anymore), and I am able to deploy a new disk to 'hosted_storage'
like any other data domain.


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

2016-04-25 Thread Wee Sritippho

The hosted engine storage is located in an external Fibre Channel SAN.

On 25/4/2559 16:19, Martin Sivak wrote:

Hi,

it seems that all nodes lost access to storage for some reason after
the host was killed. Where is your hosted engine storage located?

Regards

--
Martin Sivak
SLA / oVirt


On Mon, Apr 25, 2016 at 10:58 AM, Wee Sritippho <we...@forest.go.th> wrote:

Hi,

From the hosted-engine FAQ, the engine VM should be up and running in about
5 minutes after its host is forcibly powered off. However, after updating oVirt
3.6.4 to 3.6.5, the engine VM won't restart automatically even after 10+
minutes (I already made sure that global maintenance mode is set to none). I
initially thought it was a time-sync issue, so I installed and enabled ntp on
the hosts and engine. However, the issue still persists.

###Versions:
[root@host01 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
[root@host01 ~]# rpm -qa | grep vdsm
vdsm-infra-4.17.26-0.el7.centos.noarch
vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
vdsm-gluster-4.17.26-0.el7.centos.noarch
vdsm-python-4.17.26-0.el7.centos.noarch
vdsm-yajsonrpc-4.17.26-0.el7.centos.noarch
vdsm-4.17.26-0.el7.centos.noarch
vdsm-cli-4.17.26-0.el7.centos.noarch
vdsm-xmlrpc-4.17.26-0.el7.centos.noarch
vdsm-hook-vmfex-dev-4.17.26-0.el7.centos.noarch

###Log files:
https://app.box.com/s/fkurmwagogwkv5smkwwq7i4ztmwf9q9r

###After host02 was killed:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 396766e0
Host timestamp : 4391


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : host02.ovirt.forest.go.th
Host ID: 2
Engine status  : {"health": "good", "vm": "up",
"detail": "up"}
Score  : 0
stopped: True
Local maintenance  : False
crc32  : 3a345b65
Host timestamp : 1458


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : host03.ovirt.forest.go.th
Host ID: 3
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 4c34b0ed
Host timestamp : 11958

###After host02 was killed for a while:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : False
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : unknown stale-data
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 72e4e418
Host timestamp : 4415


--== Host 2 status ==--

Status up-to-date  : False
Hostname   : host02.ovirt.forest.go.th
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : 3a345b65
Host timestamp : 1458


--== Host 3 status ==--

Status up-to-date  : False
Hostname   : host03.ovirt.forest.go.th
Host ID: 3
Engine status  : unknown stale-data
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 4c34b0ed
Host timestamp : 11958

###After host02 was up again completely:
[roo

[ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

2016-04-25 Thread Wee Sritippho

Hi,

From the hosted-engine FAQ, the engine VM should be up and running in
about 5 minutes after its host is forcibly powered off. However, after
updating oVirt 3.6.4 to 3.6.5, the engine VM won't restart automatically
even after 10+ minutes (I already made sure that global maintenance mode
is set to none). I initially thought it was a time-sync issue, so I
installed and enabled ntp on the hosts and engine. However, the issue
still persists.
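For reference, the global maintenance state was checked and cleared with
something like the following (a sketch; the grep just filters the relevant
lines of the status output):

hosted-engine --set-maintenance --mode=none
hosted-engine --vm-status | grep -i -E 'maintenance|Engine status|Score'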


###Versions:
[root@host01 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
[root@host01 ~]# rpm -qa | grep vdsm
vdsm-infra-4.17.26-0.el7.centos.noarch
vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
vdsm-gluster-4.17.26-0.el7.centos.noarch
vdsm-python-4.17.26-0.el7.centos.noarch
vdsm-yajsonrpc-4.17.26-0.el7.centos.noarch
vdsm-4.17.26-0.el7.centos.noarch
vdsm-cli-4.17.26-0.el7.centos.noarch
vdsm-xmlrpc-4.17.26-0.el7.centos.noarch
vdsm-hook-vmfex-dev-4.17.26-0.el7.centos.noarch

###Log files:
https://app.box.com/s/fkurmwagogwkv5smkwwq7i4ztmwf9q9r

###After host02 was killed:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : {"reason": "vm not running on this 
host", "health": "bad", "vm": "down", "detail": "unknown"}

Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 396766e0
Host timestamp : 4391


--== Host 2 status ==--

Status up-to-date  : True
Hostname   : host02.ovirt.forest.go.th
Host ID: 2
Engine status  : {"health": "good", "vm": "up", 
"detail": "up"}

Score  : 0
stopped: True
Local maintenance  : False
crc32  : 3a345b65
Host timestamp : 1458


--== Host 3 status ==--

Status up-to-date  : True
Hostname   : host03.ovirt.forest.go.th
Host ID: 3
Engine status  : {"reason": "vm not running on this 
host", "health": "bad", "vm": "down", "detail": "unknown"}

Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 4c34b0ed
Host timestamp : 11958

###After host02 was killed for a while:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : False
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : unknown stale-data
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 72e4e418
Host timestamp : 4415


--== Host 2 status ==--

Status up-to-date  : False
Hostname   : host02.ovirt.forest.go.th
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : 3a345b65
Host timestamp : 1458


--== Host 3 status ==--

Status up-to-date  : False
Hostname   : host03.ovirt.forest.go.th
Host ID: 3
Engine status  : unknown stale-data
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 4c34b0ed
Host timestamp : 11958

###After host02 was up again completely:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date  : True
Hostname   : host01.ovirt.forest.go.th
Host ID: 1
Engine status  : {"reason": "vm not running on this 
host", "health": "bad", "vm": "down", "detail": "unknown"}

Score  : 0
stopped: False
Local maintenance  : False
crc32  : f5728fca
Host timestamp : 


--== Host 2 status ==--


Re: [ovirt-users] [hosted-engine] engine failed to start after rebooted

2016-04-22 Thread Wee Sritippho
Here is host01's broker.log:
https://gist.github.com/weeix/d73aa8506b296c27110747464ea33312/raw/e73938f4dce3591006b07e6ea61760831f4a2f18/broker.log

On 22 April 2016 at 15:04:40 GMT+07:00, Simone Tiraboschi 
<stira...@redhat.com> wrote:
>On Fri, Apr 22, 2016 at 9:46 AM, Simone Tiraboschi
><stira...@redhat.com> wrote:
>> On Fri, Apr 22, 2016 at 9:44 AM, Wee Sritippho <we...@forest.go.th>
>wrote:
>>> Hi,
>>>
>>> I was upgrading oVirt from 3.6.4.1 to 3.6.5. The engine-vm was
>running on
>>> host02. These are the steps that I've done:
>>>
>>> 1. Set hosted engine maintenance mode to global
>>> 2. Accessed engine-vm and upgraded oVirt to latest version
>>> 3. Ran 'reboot' in the engine-vm
>>> 4. After about 10 minutes, the engine-vm still didn't boot, so I
>set hosted
>>> engine maintenance mode back to none.
>>
>> This is absolutely normal: in global maintenance mode the agent will
>> not bring up the VM.
>>
>>> 5. After another 10 minutes, the engine-vm still didn't boot, so I
>>> restarted host02, host01 and then host03 before the engine-vm became
>>> accessible again. I then had to activate host01 and host03 again.
>>
>> This instead is pretty strange: exiting the maintenance mode an host
>> should bring up the engine VM.
>
>OK,
>it didn't start on host02 since it was in local maintenance mode:
>MainThread::INFO::2016-04-23
>01:08:12,597::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>Current state LocalMaintenance (score: 0)
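>
>(A minimal sketch of clearing that, assuming host02 is otherwise healthy,
>would be to drop local maintenance on host02 itself:
>
>  [root@host02 ~]# hosted-engine --set-maintenance --mode=none
>)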
>
>The issue on host01 is here:
>
>MainThread::INFO::2016-04-23
>01:22:14,608::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>Trying: notify time=1461349334.61 type=state_transition
>detail=GlobalMaintenance-ReinitializeFSM
>hostname='host01.ovirt.forest.go.th'
>MainThread::ERROR::2016-04-23
>01:22:44,638::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate)
>Connection closed: Connection timed out
>
>The agent failed talking with the broker service (can you please also
>attach broker logs from host01?).
>Rebooting the host simply restarted also the broker and so the engine
>VM went up.
>Now the issue is why the broker went down and didn't restart.
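>
>(Just as a sketch of where to look, assuming the default unit and log names:
>
>  systemctl status ovirt-ha-broker
>  journalctl -u ovirt-ha-broker
>  tail -n 200 /var/log/ovirt-hosted-engine-ha/broker.log
>)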
>
>
>>> Here are the log files from ovirt-hosted-engine-ha folder:
>>>   - host01:
>https://gist.github.com/weeix/d73aa8506b296c27110747464ea33312
>>>   - host02:
>https://gist.github.com/weeix/c1b7033f07fb104fdd483cf7ea3a7852
>>>
>>> How do I correctly restart the engine-vm when we need to?
>>>
>>> --
>>> Wee
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users

-- 
Wee Sritippho (วีร์ ศรีทิพโพธิ์)
Computer Technical Officer, Practitioner Level
Information Center, Royal Forest Department
Tel. 025614292-3 ext. 5621
Mobile 0864678919
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [hosted-engine] engine failed to start after rebooted

2016-04-22 Thread Wee Sritippho

Hi,

I was upgrading oVirt from 3.6.4.1 to 3.6.5. The engine-vm was running 
on host02. These are the steps I took:


1. Set hosted engine maintenance mode to global
2. Accessed engine-vm and upgraded oVirt to latest version
3. Ran 'reboot' in the engine-vm
4. After about 10 minutes, the engine-vm still didn't boot, so I set 
hosted engine maintenance mode back to none.
5. After another 10 minutes, the engine-vm still didn't boot, so I 
restarted host02, host01 and then host03 before the engine-vm became 
accessible again. I then had to activate host01 and host03 again.


Here are the log files from ovirt-hosted-engine-ha folder:
  - host01: https://gist.github.com/weeix/d73aa8506b296c27110747464ea33312
  - host02: https://gist.github.com/weeix/c1b7033f07fb104fdd483cf7ea3a7852

How do I correctly restart the engine-vm when we need to?
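
(For context, the sequence I would expect to be correct - though I'm not 
sure, hence the question - is roughly:

[root@host01 ~]# hosted-engine --set-maintenance --mode=global
(reboot or shut down the engine-vm)
[root@host01 ~]# hosted-engine --vm-start
[root@host01 ~]# hosted-engine --set-maintenance --mode=none
)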

--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [hosted-engine] Metadata too new error when adding 2nd host

2016-04-21 Thread Wee Sritippho


On 20 April 2016 at 18:29:26 GMT+07:00, Yedidyah Bar David 
<d...@redhat.com> wrote:
>On Wed, Apr 20, 2016 at 1:42 PM, Wee Sritippho <we...@forest.go.th>
>wrote:
>> Hi Didi & Martin,
>>
>> I followed your instructions and was able to add the 2nd host. Thank
>you :)
>>
>> This is what I've done:
>>
>> [root@host01 ~]# hosted-engine --set-maintenance --mode=global
>>
>> [root@host01 ~]# systemctl stop ovirt-ha-agent
>>
>> [root@host01 ~]# systemctl stop ovirt-ha-broker
>>
>> [root@host01 ~]# find /rhev -name hosted-engine.metadata
>>
>/rhev/data-center/mnt/blockSD/47e3e4ac-534a-4e11-b14e-27ecb4585431/ha_agent/hosted-engine.metadata
>>
>/rhev/data-center/mnt/blockSD/336dc4a3-f65c-4a67-bc42-1f73597564cf/ha_agent/hosted-engine.metadata
>>
>> [root@host01 ~]# ls -al
>>
>/rhev/data-center/mnt/blockSD/47e3e4ac-534a-4e11-b14e-27ecb4585431/ha_agent/hosted-engine.metadata
>> lrwxrwxrwx. 1 vdsm kvm 132 Apr 20 02:56
>>
>/rhev/data-center/mnt/blockSD/47e3e4ac-534a-4e11-b14e-27ecb4585431/ha_agent/hosted-engine.metadata
>> ->
>>
>/var/run/vdsm/storage/47e3e4ac-534a-4e11-b14e-27ecb4585431/d92632bf-8c15-44ba-9aa8-4a39dcb81e8d/4761bb8d-779e-4378-8b13-7b12f96f5c56
>>
>> [root@host01 ~]# ls -al
>>
>/rhev/data-center/mnt/blockSD/336dc4a3-f65c-4a67-bc42-1f73597564cf/ha_agent/hosted-engine.metadata
>> lrwxrwxrwx. 1 vdsm kvm 132 Apr 21 03:40
>>
>/rhev/data-center/mnt/blockSD/336dc4a3-f65c-4a67-bc42-1f73597564cf/ha_agent/hosted-engine.metadata
>> ->
>>
>/var/run/vdsm/storage/336dc4a3-f65c-4a67-bc42-1f73597564cf/49d6ee16-cfa0-47f2-b461-125bc6f614db/89ee314d-33ce-43fb-9a66-0852c5f675d3
>>
>> [root@host01 ~]# dd if=/dev/zero
>>
>of=/var/run/vdsm/storage/336dc4a3-f65c-4a67-bc42-1f73597564cf/49d6ee16-cfa0-47f2-b461-125bc6f614db/89ee314d-33ce-43fb-9a66-0852c5f675d3
>> bs=1M
>> dd: error writing
>>
>‘/var/run/vdsm/storage/336dc4a3-f65c-4a67-bc42-1f73597564cf/49d6ee16-cfa0-47f2-b461-125bc6f614db/89ee314d-33ce-43fb-9a66-0852c5f675d3’:
>> No space left on device
>> 129+0 records in
>> 128+0 records out
>> 134217728 bytes (134 MB) copied, 0.246691 s, 544 MB/s
>>
>> [root@host01 ~]# systemctl start ovirt-ha-broker
>>
>> [root@host01 ~]# systemctl start ovirt-ha-agent
>>
>> [root@host01 ~]# hosted-engine --set-maintenance --mode=none
>>
>> (Found 2 metadata files but the first one is red when I used 'ls -al'
>so I
>> assume it is a leftover from the previous failed installation and
>didn't
>> touch it)
>>
>> BTW, how to properly clean the FC storage before using it with oVirt?
>I used
>> "parted /dev/mapper/wwid mklabel msdos" to destroy the partition
>table.
>> Isn't that enough?
>
>Even this should not be needed in 3.6. Did you start with 3.6? Or
>upgraded
>from a previous version?

I started with 3.6.1. The first deployment failed due to a corrupted OS when I 
tried to restart the vm with option 3 (power off & restart vm) before trying to 
install ovirt-engine on it. I then chose another option to destroy the vm and 
quit the setup, destroyed the FC LUN's partition table, and then ran 
hosted-engine --deploy on the 1st host again, this time successfully.

>Also please verify that output of 'hosted-engine --vm-status' makes
>sense.
>
>Thanks,
>
>>
>>
>> On 20/4/2559 15:11, Martin Sivak wrote:
>>>>
>>>> Assuming you never deployed a host with ID 52, this is likely a
>result of
>>>> a
>>>> corruption or dirt or something like that.
>>>> I see that you use FC storage. In previous versions, we did not
>clean
>>>> such
>>>> storage, so you might have dirt left.
>>>
>>> This is the exact reason for an error like yours. Using dirty block
>>> storage. Please stop all hosted engine tooling (both agent and
>broker)
>>> and fill the metadata drive with zeros.
>>>
>>> You will have to find the proper hosted-engine.metadata file (which
>>> will be a symlink) under /rhev:
>>>
>>> Example:
>>>
>>> [root@dev-03 rhev]# find . -name hosted-engine.metadata
>>>
>>>
>>>
>./data-center/mnt/str-01.rhev.lab.eng.brq.redhat.com:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
>>>
>>> [root@dev-03 rhev]# ls -al
>>>
>>>
>./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
>>>
>>> lrwxrwxrwx. 1 vdsm kvm 201 Mar 15 15:00
>>>

Re: [ovirt-users] [hosted-engine] Metadata too new error when adding 2nd host

2016-04-20 Thread Wee Sritippho

Hi Didi & Martin,

I followed your instructions and was able to add the 2nd host. Thank you :)

This is what I've done:

[root@host01 ~]# hosted-engine --set-maintenance --mode=global

[root@host01 ~]# systemctl stop ovirt-ha-agent

[root@host01 ~]# systemctl stop ovirt-ha-broker

[root@host01 ~]# find /rhev -name hosted-engine.metadata
/rhev/data-center/mnt/blockSD/47e3e4ac-534a-4e11-b14e-27ecb4585431/ha_agent/hosted-engine.metadata
/rhev/data-center/mnt/blockSD/336dc4a3-f65c-4a67-bc42-1f73597564cf/ha_agent/hosted-engine.metadata

[root@host01 ~]# ls -al 
/rhev/data-center/mnt/blockSD/47e3e4ac-534a-4e11-b14e-27ecb4585431/ha_agent/hosted-engine.metadata
lrwxrwxrwx. 1 vdsm kvm 132 Apr 20 02:56 
/rhev/data-center/mnt/blockSD/47e3e4ac-534a-4e11-b14e-27ecb4585431/ha_agent/hosted-engine.metadata 
-> 
/var/run/vdsm/storage/47e3e4ac-534a-4e11-b14e-27ecb4585431/d92632bf-8c15-44ba-9aa8-4a39dcb81e8d/4761bb8d-779e-4378-8b13-7b12f96f5c56


[root@host01 ~]# ls -al 
/rhev/data-center/mnt/blockSD/336dc4a3-f65c-4a67-bc42-1f73597564cf/ha_agent/hosted-engine.metadata
lrwxrwxrwx. 1 vdsm kvm 132 Apr 21 03:40 
/rhev/data-center/mnt/blockSD/336dc4a3-f65c-4a67-bc42-1f73597564cf/ha_agent/hosted-engine.metadata 
-> 
/var/run/vdsm/storage/336dc4a3-f65c-4a67-bc42-1f73597564cf/49d6ee16-cfa0-47f2-b461-125bc6f614db/89ee314d-33ce-43fb-9a66-0852c5f675d3


[root@host01 ~]# dd if=/dev/zero 
of=/var/run/vdsm/storage/336dc4a3-f65c-4a67-bc42-1f73597564cf/49d6ee16-cfa0-47f2-b461-125bc6f614db/89ee314d-33ce-43fb-9a66-0852c5f675d3 
bs=1M
dd: error writing 
‘/var/run/vdsm/storage/336dc4a3-f65c-4a67-bc42-1f73597564cf/49d6ee16-cfa0-47f2-b461-125bc6f614db/89ee314d-33ce-43fb-9a66-0852c5f675d3’: 
No space left on device

129+0 records in
128+0 records out
134217728 bytes (134 MB) copied, 0.246691 s, 544 MB/s

[root@host01 ~]# systemctl start ovirt-ha-broker

[root@host01 ~]# systemctl start ovirt-ha-agent

[root@host01 ~]# hosted-engine --set-maintenance --mode=none

(I found 2 metadata files, but the first one was shown in red when I used 
'ls -al', so I assumed it was a leftover from the previous failed 
installation and didn't touch it.)


BTW, how do I properly clean the FC storage before using it with oVirt? I 
used "parted /dev/mapper/wwid mklabel msdos" to destroy the partition 
table. Isn't that enough?
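
(If it isn't, I suppose a more thorough wipe would look something like the 
following - just a sketch, with "wwid" standing in for the multipath device 
as above, and it obviously destroys everything on the LUN:

[root@host01 ~]# wipefs -a /dev/mapper/wwid
[root@host01 ~]# dd if=/dev/zero of=/dev/mapper/wwid bs=1M count=512
)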


On 20/4/2559 15:11, Martin Sivak wrote:

Assuming you never deployed a host with ID 52, this is likely a result of a
corruption or dirt or something like that.
I see that you use FC storage. In previous versions, we did not clean such
storage, so you might have dirt left.

This is the exact reason for an error like yours. Using dirty block
storage. Please stop all hosted engine tooling (both agent and broker)
and fill the metadata drive with zeros.

You will have to find the proper hosted-engine.metadata file (which
will be a symlink) under /rhev:

Example:

[root@dev-03 rhev]# find . -name hosted-engine.metadata

./data-center/mnt/str-01.rhev.lab.eng.brq.redhat.com:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata

[root@dev-03 rhev]# ls -al
./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata

lrwxrwxrwx. 1 vdsm kvm 201 Mar 15 15:00
./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
-> 
/rhev/data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/images/6ab3f215-f234-4cd4-b9d4-8680767c3d99/dcbfa48d-8543-42d1-93dc-aa40855c4855

And use (for example) dd if=/dev/zero of=/path/to/metadata bs=1M to
clean it - But be CAREFUL to not touch any other file or disk you
might find.

Then restart the hosted engine tools and all should be fine.



Martin


On Wed, Apr 20, 2016 at 8:20 AM, Yedidyah Bar David <d...@redhat.com> wrote:

On Wed, Apr 20, 2016 at 7:15 AM, Wee Sritippho <we...@forest.go.th> wrote:

Hi,

I used CentOS-7-x86_64-Minimal-1511.iso to install the hosts and the engine.

The 1st host and the hosted-engine were installed successfully, but the 2nd
host failed with this error message:

"Failed to execute stage 'Setup validation': Metadata version 2 from host 52
too new for this agent (highest compatible version: 1)"

Assuming you never deployed a host with ID 52, this is likely a result of a
corruption or dirt or something like that.

What do you get on host 1 running 'hosted-engine --vm-status'?

I see that you use FC storage. In previous versions, we did not clean such
storage, so you might have dirt left. See also [1]. You can try cleaning
using [2].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1238823
[2] 
https://www.ovirt.org/documentation/how-to/hosted-engine/#lockspace-corrupted-recovery-procedure


Here are the package versions:

[root@host02 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch

[ovirt-users] [hosted-engine] Metadata version too new error when adding 2nd host

2016-04-19 Thread Wee Sritippho

Hi,

I used CentOS-7-x86_64-Minimal-1511.iso to install the hosts and the engine.

The 1st host and the hosted-engine were installed successfully, but the 
2nd host setup failed with this error message:


"Failed to execute stage 'Setup validation': Metadata version 2 from 
host 52 too new for this agent (highest compatible version: 1)"


Here are the package versions:

[root@host02 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.1-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.4.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-setup-base-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.4.1-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-tools-3.6.4.1-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.4.1-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.4.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-backend-3.6.4.1-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.4.1-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-3.6.4.1-1.el7.centos.noarch
ovirt-engine-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.11-1.el7.noarch
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-websocket-proxy-3.6.4.1-1.el7.centos.noarch
ovirt-engine-restapi-3.6.4.1-1.el7.centos.noarch
ovirt-engine-userportal-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.4.1-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-engine-lib-3.6.4.1-1.el7.centos.noarch


Here are the log files: 
https://gist.github.com/weeix/1743f88d3afe1f405889a67ed4011141


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [hosted-engine] Metadata too new error when adding 2nd host

2016-04-19 Thread Wee Sritippho

Hi,

I used CentOS-7-x86_64-Minimal-1511.iso to install the hosts and the engine.

The 1st host and the hosted-engine were installed successfully, but the 
2nd host failed with this error message:


"Failed to execute stage 'Setup validation': Metadata version 2 from 
host 52 too new for this agent (highest compatible version: 1)"


Here are the package versions:

[root@host02 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.1-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.4.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-setup-base-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.4.1-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-tools-3.6.4.1-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.4.1-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.4.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-backend-3.6.4.1-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.4.1-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-3.6.4.1-1.el7.centos.noarch
ovirt-engine-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.11-1.el7.noarch
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-websocket-proxy-3.6.4.1-1.el7.centos.noarch
ovirt-engine-restapi-3.6.4.1-1.el7.centos.noarch
ovirt-engine-userportal-3.6.4.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.4.1-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-engine-lib-3.6.4.1-1.el7.centos.noarch


Here are the log files: 
https://gist.github.com/weeix/1743f88d3afe1f405889a67ed4011141


--
Wee

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [hosted-engine] The VDSM host was found in a failed state

2016-03-23 Thread Wee Sritippho

Hi Didi,

It was indeed the iptables issue. I forgot to open the udp ports.
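
(Roughly, the extra rules in /etc/sysconfig/iptables were along the lines of 
the udp console ports listed further down - a sketch, not the exact file:

-A INPUT -p udp --dport 5900 -j ACCEPT
-A INPUT -p udp --dport 5901 -j ACCEPT
)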

Here are the versions of relevant packages:

Host:
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-engine-appliance-3.6-20160301.1.el7.centos.noarch
ovirt-release36-005-1.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.4.3-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.3.4-1.el7.centos.noarch
vdsm-jsonrpc-4.17.23-1.el7.noarch
vdsm-infra-4.17.23-1.el7.noarch
vdsm-hook-vmfex-dev-4.17.23-1.el7.noarch
vdsm-cli-4.17.23-1.el7.noarch
vdsm-gluster-4.17.23-1.el7.noarch
vdsm-yajsonrpc-4.17.23-1.el7.noarch
vdsm-python-4.17.23-1.el7.noarch
vdsm-4.17.23-1.el7.noarch
vdsm-xmlrpc-4.17.23-1.el7.noarch

Engine (ovirt-engine-appliance-3.6-20160301.1.el7.centos.noarch):
ovirt-engine-lib-3.6.3.4-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.3.4-1.el7.centos.noarch
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-tools-3.6.3.4-1.el7.centos.noarch
ovirt-engine-setup-3.6.3.4-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.3.4-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-base-3.6.3.4-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.3.4-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.3.4-1.el7.centos.noarch
ovirt-engine-backend-3.6.3.4-1.el7.centos.noarch
ovirt-engine-restapi-3.6.3.4-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.3.4-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.3.4-1.el7.centos.noarch
ovirt-engine-3.6.3.4-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.11-1.el7.noarch
ovirt-release36-003-1.noarch
ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-8.0.4-1.el7.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.3.4-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.3.4-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.3.4-1.el7.centos.noarch
ovirt-engine-userportal-3.6.3.4-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch

And here are the log files (press 'Raw' to view full file):
https://gist.github.com/anonymous/24627289549e35317b7f

Thank you,
Wee


On 23/3/2559 14:11, Yedidyah Bar David wrote:

On Wed, Mar 23, 2016 at 6:40 AM, Wee Sritippho <we...@forest.go.th> wrote:

Hi,

I'm installing oVirt hosted-engine using fibre channel storage. During the
deployment I got this error:

 [ ERROR ] The VDSM host was found in a failed state. Please check engine
and bootstrap installation logs.
 [ ERROR ] Unable to add hosted_engine_1 to the manager

Tried to reinstall the host via web GUI, but got this error:

 Host hosted_engine_1 installation failed. Host is not reachable.

How do I fix this?


You were asked:

 iptables was detected on your computer, do you wish setup to configure it?

and replied 'No'. So it later told you:

  The following network ports should be opened:
  tcp:5900
  tcp:5901
  udp:5900
  udp:5901
  An example of the required configuration for iptables
can be found at:
  /etc/ovirt-hosted-engine/iptables.example

Did you?

Also, your vdsm.log has lots of noise found on the hosted-engine storage.
Something like:

https://bugzilla.redhat.com/show_bug.cgi?id=1238823

Please provide versions of relevant packages on host and engine vm and HA logs
(/var/log/ovirt-hosted-engine-ha/*).

Adding Martin.


>> P.S. The log files were about 10 MB so I zipped them all

Thanks.

You can also upload to some file-sharing service and post a link, might be
more comfortable for some of the subscribers of this list.

Best,



---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [hosted-engine] admin@internal password for additional nodes?

2016-02-25 Thread Wee Sritippho

Hi,

I'm trying to deploy a 2nd host in my hosted-engine environment, but at 
some point the setup asks me to type a password for admin@internal 
again. Do I need to type the same password that I chose when deploying 
the 1st host? If not, would it replace the old one?


Thank you,
Wee

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [hosted-engine] Error creating a glusterfs storage domain

2016-02-24 Thread Wee Sritippho

Hi,

Wow. How dumb of me. I just realized that I answered "Yes" to this 
configuration question:


iptables was detected on your computer, do you wish setup to configure 
it? (Yes, No)[Yes]:


So the hosted-engine setup configured my empty iptables to allow just 
the necessary ports (excluding glusterfs).


I solved this by editing /etc/sysconfig/iptables to:

# oVirt+glusterfs firewall configuration.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
# vdsm
-A INPUT -p tcp --dport 54321 -j ACCEPT
# rpc.statd
-A INPUT -p tcp --dport 111 -j ACCEPT
-A INPUT -p udp --dport 111 -j ACCEPT
# SSH
-A INPUT -p tcp --dport 22 -j ACCEPT
# snmp
-A INPUT -p udp --dport 161 -j ACCEPT
# libvirt tls
-A INPUT -p tcp --dport 16514 -j ACCEPT
# serial consoles
-A INPUT -p tcp -m multiport --dports 2223 -j ACCEPT
# guest consoles
-A INPUT -p tcp -m multiport --dports 5900:6923 -j ACCEPT
# migration
-A INPUT -p tcp -m multiport --dports 49152:49216 -j ACCEPT
# glusterfs
-A INPUT -p tcp --dport 24007:24008 -j ACCEPT
-A INPUT -p tcp --dport 38465:38467 -j ACCEPT
# nfs
-A INPUT -p tcp --dport 2049 -j ACCEPT

# Reject any other input traffic
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -m physdev ! --physdev-is-bridged -j REJECT --reject-with 
icmp-host-prohibited

COMMIT

Then I restarted iptables and ran 'hosted-engine --deploy' again. This 
time, I made sure to answer "No" when the setup asked whether it should 
alter iptables. The deployment was a success, although with some 
errors:


[ ERROR ] The VDSM host was found in a failed state. Please check engine 
and bootstrap installation logs.

[ ERROR ] Unable to add hosted_engine_1 to the manager

I was somehow able to solve them by manually SSHing from the engine to the 
host, so that the host's key fingerprint was added to the engine's 
known_hosts. Then I logged into the engine's web UI and manually 
reinstalled hosted_engine_1 with the "Automatically configure host 
firewall" option deselected (since I had already included all of its 
configuration in my iptables file).
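
(The manual SSH step was just something like this from the engine VM, 
accepting the host key fingerprint when prompted - host name as in my setup:

[root@engine ~]# ssh root@host01.ovirt.forest.go.th
)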


I also set the virt group profile on the storage domain's volume as you 
suggested.


Thank you very much for guiding me.
Wee

On 23/2/2559 17:49, Sahina Bose wrote:

The error indicates : OSError: [Errno 30] Read-only file system

Can you check the output of "gluster volume status gv0" on 
host01.ovirt.forest.go.th? Please make sure that the firewall is not 
blocking the gluster ports from communicating on the 3 nodes.


On a different note, since you are using gv0 as storage domain, set 
the virt group profile on this volume - "gluster volume set gv0 group 
virt"


On 02/23/2016 01:39 PM, Wee Sritippho wrote:

Hi,

I'm trying to deploy an oVirt Hosted Engine environment using this 
glusterfs volume:


# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 37bba03b-7276-421a-8960-81e28196ebde
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host01.ovirt.forest.go.th:/data/brick1/gv0
Brick2: host03.ovirt.forest.go.th:/data/brick1/gv0
Brick3: host02.ovirt.forest.go.th:/data/brick1/gv0
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
performance.readdir-ahead: on

But the deployment failed with this error message:

[ ERROR ] Failed to execute stage 'Misc configuration': Error 
creating a storage domain: ('storageType=7, 
sdUUID=be5f66d8-57ef-43c8-90a5-e9132e0c95b4, 
domainName=hosted_storage, domClass=1, 
typeSpecificArg=host01.ovirt.forest.go.th:/gv0 domVersion=3',)


I tried to figure out what was happening via the log files:

Line ~7243 of vdsm.log
Line ~2930 of ovirt-hosted-engine-setup-20160223204857-585hqv.log

But I couldn't make sense of them at all.

Please guide me on how to solve this problem.

Here is my environment:

CentOS Linux release 7.2.1511 (Core)
ovirt-hosted-engine-setup-1.3.2.3-1.el7.centos.noarch
vdsm-4.17.18-1.el7.noarch
glusterfs-3.7.8-1.el7.x86_64

Thank you,
Wee


---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users






---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] 3.6: GlusterFS data domain failover?

2015-11-20 Thread Wee Sritippho
Hi,

My POC environment has 2 hosts - host A and host B, both CentOS 7, with 
oVirt 3.6 self-hosted engine installed. I manually created a 2-brick 
GlusterFS volume using both hosts and added it to my datacenter.

I tried shutting down host A. The hosted-engine restarted on host B 
within 3 minutes, which is very cool. However, the GlusterFS data 
domain, for which I had set both 'Use Host' and 'Path' to host A, went 
down along with it.

Here are my questions:
1. How can I enable failover for the GlusterFS data domain?
2. How can I revert to the state before adding the data domain? 
The data domain is extremely persistent - I can't edit or delete it. I put 
it into maintenance mode but am still unable to detach or destroy it because 
it requires me to remove the datacenter first. I tried, but I can't remove 
the datacenter either.
3. Why can't I add another GlusterFS data domain? When I choose 
'GlusterFS' as my 'Storage Type', every text field becomes grayed out.
4. When host A came back up, I noticed that 'Use Host' had changed 
from host A to host B instead. Is this expected behavior?

Regards,
Wee Sritippho

P.S.: Please excuse my poor English.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users