[ovirt-users] Redeploying hosted engine from backup

2021-09-06 Thread Artem Tambovskiy
Hello,

I just had an issue with my cluster with a self-hosted engine (the hosted-engine
VM is not coming up) and decided to redeploy it, as I have a backup.

I just tried hosted-engine --deploy --restore-from-file=engine.backup

but the script is asking questions like the DC name and cluster name, which I
can't recall correctly.
What will be the consequences of a wrong answer? Is there any chance to
get this info from the hosts or from the backup file?
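
For what it's worth, the only way I can think of to dig those names out is to
open the backup up directly - a rough sketch only, assuming engine.backup is
the usual tar archive produced by engine-backup and that the engine DB dump
inside it can be read with pg_restore (the file names below are illustrative,
check the real layout first):

mkdir /tmp/enginebackup
tar -xvf engine.backup -C /tmp/enginebackup
tar -tvf engine.backup                     # check the actual layout
# if the engine DB dump is in pg_dump custom format, turn it into plain SQL:
pg_restore /tmp/enginebackup/db/engine_backup.db > /tmp/engine.sql
# the data center and cluster names should show up around these tables:
grep -inE 'storage_pool|cluster' /tmp/engine.sql | head -n 50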
-- 
Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/O7PK465YAK2ZNZM532WR5ACLKAEA2Q54/


[ovirt-users] Does anyone have a positive experience with physical host to oVirt conversion?

2019-06-18 Thread Artem Tambovskiy
Hello,

I would just like to check whether it is really possible to convert an old
CentOS 6 based physical box into an oVirt 4.3 VM. I haven't been able to find
any success stories on this, and the process seems a bit complicated.
As I understand it, I need a virt-v2v conversion proxy + an image with virt-p2v
running on the physical host which needs to be converted. I'm a bit lost on how
I can then get the converted VM into the oVirt cluster.
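
For what it's worth, the rough flow as I understand it so far (a sketch only,
not something I have verified - the disk path, export domain and network name
below are placeholders): boot the physical box from the virt-p2v ISO, point it
at a conversion host that has virt-v2v installed, and have the result written
to an oVirt export storage domain, roughly equivalent to:

# manual virt-v2v equivalent of what the p2v conversion server would run;
# disk path, export domain and network name are placeholders
virt-v2v -i disk /tmp/p2v-copied-disk.img \
  -o rhv -os nfs.example.com:/export/export_domain \
  -of qcow2 -n ovirtmgmt
# the converted VM then appears in the export storage domain and can be
# imported into the cluster from the Administration Portal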

-- 
Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SW5U3TX5OO2D2Q74TH4EGBTSJBGS5NDJ/


[ovirt-users] Re: Can't bring upgraded to 4.3 host back to cluster

2019-06-11 Thread Artem Tambovskiy
Shani,

supervdsm is failing too.

[root@ovirt1 vdsm]# systemctl status supervdsmd
● supervdsmd.service - Auxiliary vdsm service for running helper functions
as root
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static;
vendor preset: enabled)
   Active: failed (Result: start-limit) since Tue 2019-06-11 16:18:16 MSK;
5s ago
  Process: 176025 ExecStart=/usr/share/vdsm/daemonAdapter
/usr/share/vdsm/supervdsmd --sockfile /var/run/vdsm/svdsm.sock
(code=exited, status=1/FAILURE)
 Main PID: 176025 (code=exited, status=1/FAILURE)

Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: Unit supervdsmd.service entered
failed state.
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: supervdsmd.service failed.
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: supervdsmd.service holdoff time
over, scheduling restart.
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: Stopped Auxiliary vdsm service
for running helper functions as root.
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: start request repeated too
quickly for supervdsmd.service
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: Failed to start Auxiliary vdsm
service for running helper functions as root.
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: Unit supervdsmd.service entered
failed state.
Jun 11 16:18:16 ovirt1.telia.ru systemd[1]: supervdsmd.service failed.


supervdsm.log is full of messages like
logfile::DEBUG::2019-06-11 16:18:46,379::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:04,401::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:06,289::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:17,535::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:21,528::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:24,541::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:42,543::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:19:57,442::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:20:18,539::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:20:32,041::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})
logfile::DEBUG::2019-06-11 16:20:41,051::concurrent::193::root::(run) START
thread  (func=>, args=(), kwargs={})

Regards,
Artem


On Tue, Jun 11, 2019 at 3:59 PM Shani Leviim  wrote:

> +Dan Kenigsberg 
>
> Hi Artem,
> Thanks for the log.
>
> It seems that this error message appears quite a lot:
> 2019-06-11 12:10:35,283+0300 ERROR (MainThread) [root] Panic: Connect to
> supervdsm service failed: [Errno 2] No such file or directory (panic:29)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line
> 86, in _connect
> self._manager.connect, Exception, timeout=60, tries=3)
>   File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line
> 58, in retry
> return func()
>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 500, in
> connect
> conn = Client(self._address, authkey=self._authkey)
>   File "/usr/lib64/python2.7/multiprocessing/connection.py", line 173, in
> Client
> c = SocketClient(address)
>   File "/usr/lib64/python2.7/multiprocessing/connection.py", line 308, in
> SocketClient
> s.connect(address)
>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
> return getattr(self._sock,name)(*args)
> error: [Errno 2] No such file or directory
>
> Can you please verify that the 'supervdsmd.service' is running?
>
>
> *Regards,*
>
> *Shani Leviim*
>
>
> On Tue, Jun 11, 2019 at 3:04 PM Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Hi Shani,
>>
>> yes, you are right - I can do ssh form aby to any hosts in the cluster.
>> vdsm.log attached.
>> I have tried to restart vdsm manually and even done a host restart
>> several times with no success.
>> Host activation fails all the time ...
>>
>> Thank you in advance for your help!
>> Regard,
>> Artem
>>
>> On Tue, Jun 11, 2019 at 10:51 AM Shani Leviim  wrote:
>>
>>> Hi Artem,
>>> According to oVirt documentation [1], hosts on the same cluster should
>>> be reachable from one to each other.
>>>
>>> Can you please share your vdsm log?
>>> I suppose you do manage to ssh that inactive host (correct me if I'm
>>> wrong).
>>> While getting the vdsm log,

[ovirt-users] Re: Can't bring upgraded to 4.3 host back to cluster

2019-06-11 Thread Artem Tambovskiy
Hi Shani,

yes, you are right - I can ssh from any host to any other host in the cluster.
vdsm.log is attached.
I have tried to restart vdsm manually and even rebooted the host several
times with no success.
Host activation fails all the time ...
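
(For reference, the restart/check sequence I have been running on the host is
roughly the following - nothing special, just the standard services:)

systemctl restart supervdsmd vdsmd
systemctl status supervdsmd vdsmd --no-pager
journalctl -u supervdsmd -u vdsmd -b --no-pager | tail -n 100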

Thank you in advance for your help!
Regards,
Artem

On Tue, Jun 11, 2019 at 10:51 AM Shani Leviim  wrote:

> Hi Artem,
> According to oVirt documentation [1], hosts on the same cluster should be
> reachable from one to each other.
>
> Can you please share your vdsm log?
> I suppose you do manage to ssh that inactive host (correct me if I'm
> wrong).
> While getting the vdsm log, maybe try to restart the network and vdsmd
> services on the host.
>
> Another thing you can try on the UI is putting the host on maintenance and
> then activate it.
>
> [1]
> https://www.ovirt.org/documentation/admin-guide/chap-Clusters.html#introduction-to-clusters
>
>
> *Regards,*
>
> *Shani Leviim*
>
>
> On Mon, Jun 10, 2019 at 4:42 PM Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Hello,
>>
>> May I ask you for and advise?
>> I'm running a small oVirt cluster and couple of months ago I decided to
>> do an upgrade from oVirt 4.2.8 to 4.3 and having an issues since that time.
>> I can only guess what I did wrong - probably one of the problems that I
>> haven't switched the cluster from iptables to firewalld. But this is just
>> my guess.
>>
>> The problem is that I have upgraded the engine and one host, and then I
>> done an upgrade of second host I can't bring it to active state. Looks like
>> VDSM can't detect the network and fails to start. I even tried to reinstall
>> the hosts from UI (I have seen that the packages being installed) but
>> again, VDSM doesn't startup at the end and reinstallation fails.
>>
>> Looking at hosts process list I see  script *wait_for_ipv4s*  hanging
>> forever.
>>
>> vdsm    8603      1  6 16:26 ?        00:00:00 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
>> root    8630      1  0 16:26 ?        00:00:00 /bin/sh /usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
>> root    8645   8630  6 16:26 ?        00:00:00 /usr/bin/python2 /usr/libexec/vdsm/wait_for_ipv4s
>> root    8688      1 30 16:27 ?        00:00:00 /usr/bin/python2 /usr/share/vdsm/supervdsmd --sockfile /var/run/vdsm/svdsm.sock
>> vdsm    8715      1  0 16:27 ?        00:00:00 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
>>
>> The all hosts in cluster are reachable from each other ...  That could be
>> the issue?
>>
>> Thank you in advance!
>> --
>> Regards,
>> Artem
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/TQX3LN2TEM4DECKKUMMRCWXTRM6BGIAB/
>>
>

-- 
Regards,
Artem


vdsm.tar.bzip2
Description: Binary data
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/U65YDKV4P6IFXENCQOCGNR23KXTM6HHD/


[ovirt-users] Can't bring upgraded to 4.3 host back to cluster

2019-06-10 Thread Artem Tambovskiy
Hello,

May I ask you for some advice?
I'm running a small oVirt cluster, and a couple of months ago I decided to do
an upgrade from oVirt 4.2.8 to 4.3 and have been having issues since then. I
can only guess what I did wrong - probably one of the problems is that I
haven't switched the cluster from iptables to firewalld. But this is just
my guess.

The problem is that I upgraded the engine and one host, and then when I did
the upgrade of the second host I couldn't bring it back to the active state.
It looks like VDSM can't detect the network and fails to start. I even tried
to reinstall the host from the UI (I have seen the packages being installed),
but again VDSM doesn't start up at the end and the reinstallation fails.

Looking at the host's process list, I see the script wait_for_ipv4s hanging
forever.

vdsm    8603      1  6 16:26 ?        00:00:00 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
root    8630      1  0 16:26 ?        00:00:00 /bin/sh /usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
root    8645   8630  6 16:26 ?        00:00:00 /usr/bin/python2 /usr/libexec/vdsm/wait_for_ipv4s
root    8688      1 30 16:27 ?        00:00:00 /usr/bin/python2 /usr/share/vdsm/supervdsmd --sockfile /var/run/vdsm/svdsm.sock
vdsm    8715      1  0 16:27 ?        00:00:00 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker

All hosts in the cluster are reachable from each other ... Could that be
the issue?
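
For anyone comparing notes, a minimal check of what wait_for_ipv4s is
presumably waiting for (an IPv4 address on the management bridge) could look
like this on the stuck host:

ip -4 addr show ovirtmgmt     # does the management bridge have an IPv4 address?
ip route show                 # is the default route there?
journalctl -u vdsmd -u supervdsmd -b --no-pager | tail -n 100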

Thank you in advance!
-- 
Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/TQX3LN2TEM4DECKKUMMRCWXTRM6BGIAB/


[ovirt-users] Re: Hosts not coming back into oVirt

2019-03-22 Thread Artem Tambovskiy
Hi,

I have exactly the same issue after upgrading from 4.2.8 to 4.3.2. I can
reach the host from the SHE, but VDSM is constantly failing to start on the
host after the upgrade.

Thu, 21 Mar 2019, 19:48, Simone Tiraboschi :

>
>
> On Thu, Mar 21, 2019 at 3:47 PM Arif Ali  wrote:
>
>> Hi all,
>>
>> Recently deployed oVirt version 4.3.1
>>
>> It's in a self-hosted engine environment
>>
>> Used the steps via cockpit to install the engine, and was able to add
>> the rest of the oVirt nodes without any specific problems
>>
>> We tested the HA of the hosted-engine without a problem, and then at one
>> point turned off the machine that was hosting the engine, to mimic a
>> failure and see how it goes; the VM was able to move over successfully,
>> but some of the oVirt hosts started to go into Unassigned. From a total of 6
>> oVirt hosts, I have 4 of them in this state.
>>
>> Clicking on the host, I see the following message in the events. I can
>> get to the hosts via the engine, and ping the machine, so not sure what
>> it's doing that it's no longer working
>>
>> VDSM  command Get Host Capabilities failed: Message timeout which
>> can be caused by communication issues
>>
>> Mind you, I have been trying to resolve this issue since Monday, and
>> have tried various things, like rebooting and re-installing the oVirt
>> hosts, without having much luck
>>
>> So any assistance on this would be grateful, maybe I've missed something
>> really simple, and I am overlooking it
>>
>
> Can you please check that VDSM is correctly running on that nodes?
> Are you able to correctly reach that nodes from the engine VM?
>
>
>>
>> --
>> regards,
>>
>> Arif Ali
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/FYG7NEV24JCCR4RIXLOMZ2CAPYAH4GDH/
>>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/Y7YVXSLFJ3XCQSPJSPQ2K2OCCMS2F465/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FN23MNIIKFLVJPBVHU3T236X7S6I42HK/


[ovirt-users] Host unresponsive after upgrade 4.2.8 -> 4.3.2 failed

2019-03-19 Thread Artem Tambovskiy
Hello,

I just started upgrading my small cluster from 4.2.8 to 4.3.2 and ended up in
a situation where one of the hosts is not working after the upgrade.
For some reason vdsmd is not starting up; I have tried to restart it
manually with no luck:

Any ideas on what could be the reason?

[root@ovirt2 log]# systemctl restart vdsmd
A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
[root@ovirt2 log]# journalctl -xe
-- Unit ovirt-ha-agent.service has finished shutting down.
Mar 19 15:47:47 ovirt2.domain.org systemd[1]: Starting Virtual Desktop
Server Manager...
-- Subject: Unit vdsmd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit vdsmd.service has begun starting up.
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running mkdirs
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running configure_coredump
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running configure_vdsm_logs
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running wait_for_network
Mar 19 15:47:47 ovirt2.domain.org supervdsmd[56716]: Supervdsm failed to
start: 'module' object has no attribute 'Accounting'
Mar 19 15:47:47 ovirt2.domain.org python2[56716]: detected unhandled Python
exception in '/usr/share/vdsm/supervdsmd'
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: Duplicate: core
backtrace
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: DUP_OF_DIR:
/var/tmp/abrt/Python-2019-03-19-14:23:04-17292
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: Deleting problem
directory Python-2019-03-19-15:47:47-56716 (dup of
Python-2019-03-19-14:23:04-17292
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: Traceback (most
recent call last):
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/share/vdsm/supervdsmd", line 26, in <module>
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]:
supervdsm_server.main(sys.argv[1:])
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 294, in
main
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: module_name))
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: __import__(name)
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/lib/python2.7/site-packages/vdsm/supervdsm_api/systemd.py", line 34,
in <module>
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]:
cmdutils.Accounting.CPU,
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: AttributeError:
'module' object has no attribute 'Accounting'
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service: main
process exited, code=exited, status=1/FAILURE
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Unit supervdsmd.service
entered failed state.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service failed.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service holdoff
time over, scheduling restart.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Cannot add dependency job for
unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Stopped Auxiliary vdsm
service for running helper functions as root.
-- Subject: Unit supervdsmd.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit supervdsmd.service has finished shutting down.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Started Auxiliary vdsm
service for running helper functions as root.
-- Subject: Unit supervdsmd.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit supervdsmd.service has finished starting up.
-- 
-- The start-up result is done.
Mar 19 15:47:50 ovirt2.domain.org supervdsmd[56757]: Supervdsm failed to
start: 'module' object has no attribute 'Accounting'
Mar 19 15:47:50 ovirt2.domain.org python2[56757]: detected unhandled Python
exception in '/usr/share/vdsm/supervdsmd'

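One thing worth checking here (just a guess on my side, not a confirmed cause)
is whether the vdsm subpackages ended up at mixed versions after the failed
upgrade, since supervdsmd importing a cmdutils module that has no Accounting
attribute looks like new code running against an old module:

rpm -qa 'vdsm*' | sort     # all vdsm subpackages should report the same version
yum clean all
yum update 'vdsm*'         # re-sync them if they do not
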

-- 
Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RXQ7ZH2EZ74CO3VID7PXAXO6CHK4BXH3/


[ovirt-users] VM clone network interfaces names changes

2018-10-12 Thread Artem Tambovskiy
Hello,

I have a question indirectly related to oVirt - I have a VM with CentOS 6
running on my cluster, which has 6 virtual interfaces (eth0 - eth5). Now it's
time to do an upgrade to a CentOS 7 base, and I did a VM clone to test the
upgrade process and was a bit surprised to see that the interface names are
now shifted to eth6 - eth11. I'm afraid that I'll run out of digits soon :)

Anyway, how do I change the interface names back to the originals and perhaps
prevent them from changing further? I do understand that the MAC addresses
have changed, but I don't get why that changes the interface names.

Cleaning up everything from /etc/udev/rules.d/70-persistent-net.rules
didn't really help ..
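
In case it is useful to anyone, the workaround I am going to try next is
pinning the names to the clone's new MAC addresses explicitly (a sketch only -
the MAC addresses below are placeholders, the real ones come from "ip link"):

# regenerate the persistent-net rules on the clone (CentOS 6 style), one line per NIC
cat > /etc/udev/rules.d/70-persistent-net.rules <<'EOF'
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:1a:4a:xx:xx:01", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:1a:4a:xx:xx:02", NAME="eth1"
EOF
# then update HWADDR= in /etc/sysconfig/network-scripts/ifcfg-eth0 .. ifcfg-eth5
# to match the new MACs and reboot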

Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VF7FANR24CQS4DHDBLYXAUTHL3G42FR2/


[ovirt-users] lost connection to hosted engine

2018-10-02 Thread Artem Tambovskiy
Hi,

I just ran into an issue during a cluster upgrade from 4.2.4 to 4.2.6.1. I'm
running a small cluster with 2 hosts and gluster storage. Once I upgraded one
of the hosts to 4.2.6.1 something went wrong (it looks like it tried to start
the HE instance) and I can't connect to the hosted engine any longer.

As I can see, HostedEngine is still running on the second host (along with
another 7 VMs), but I can't stop it.
ovirt-ha-agent and ovirt-ha-broker are failing to start. hosted-engine
--vm-status gives nothing but the error message
"The hosted engine configuration has not been retrieved from shared
storage. Please ensure that ovirt-ha-agent is running and the storage
server is reachable."

ps -ef shows plenty of vdsm processes in a defunct state, which is probably
the reason why the agent and broker can't start. I'm just wondering what is
the best way to start problem resolution here, to minimize downtime for the
running VMs?

Restart vdsm and then try restarting the agent and broker again, or just
reboot the whole host?
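
To be concrete, the first option I have in mind is just restarting things in
order on the upgraded host and re-checking, roughly:

systemctl restart supervdsmd vdsmd
systemctl restart ovirt-ha-broker ovirt-ha-agent
hosted-engine --vm-status     # check whether the metadata is readable again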

Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BKU2N2UOEHWJ3XKJ5DRTERKBTQZ4X7EB/


[ovirt-users] ovirt host upgrade 4.2.2 -> 4.2.3

2018-05-13 Thread Artem Tambovskiy
Hello,

I'm upgrading my cluster from 4.2.2 to 4.2.3. The HE upgrade went well, but
I'm having some issues with the host upgrades: for some reason yum is
complaining about conflicts during the transaction check:


Transaction check error:
  file /usr/share/cockpit/networkmanager/manifest.json from install of
cockpit-system-160-3.el7.centos.noarch conflicts with file from package
cockpit-networkmanager-160-1.el7.centos.noarch

Any ideas about the reason for this?
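
The workaround I am considering (not verified yet - it just removes the
obsolete subpackage whose file is now owned by cockpit-system, then retries
the update):

yum remove cockpit-networkmanager
yum update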

Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org


Re: [ovirt-users] Hosted engine VDSM issue with sanlock

2018-03-29 Thread Artem Tambovskiy
Hi,

How many hosts do you have? Check hosted-engine.conf on all hosts, including
the one you have the problem with, and check whether all host_id values are
unique. It might happen that you have several hosts with host_id=1.
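
A quick way to compare them across all hosts (host names below are
placeholders, assuming root ssh access):

for h in host1 host2 host3; do
  echo -n "$h: "
  ssh root@"$h" grep '^host_id=' /etc/ovirt-hosted-engine/hosted-engine.conf
done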

Regards,
Artem

Wed, 28 Mar 2018, 20:49, Jamie Lawrence :

> I still can't resolve this issue.
>
> I have a host that is stuck in a cycle; it will be marked non responsive,
> then come back up, ending with an "finished activation" message in the GUI.
> Then it repeats.
>
> The root cause seems to be sanlock.  I'm just unclear on why it started or
> how to resolve it. The only "approved" knob I'm aware of is
> --reinitialize-lockspace and the manual equivalent, neither of which fix
> anything.
>
> Anyone have a guess?
>
> -j
>
> - - - vdsm.log - - - -
>
> 2018-03-28 10:38:22,207-0700 INFO  (monitor/b41eb20) [storage.SANLock]
> Acquiring host id for domain b41eb20a-eafb-481b-9a50-a135cf42b15e (id=1,
> async=True) (clusterlock:284)
> 2018-03-28 10:38:22,208-0700 ERROR (monitor/b41eb20) [storage.Monitor]
> Error acquiring host id 1 for domain b41eb20a-eafb-481b-9a50-a135cf42b15e
> (monitor:568)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 565, in _acquireHostId
> self.domain.acquireHostId(self.hostId, async=True)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 828, in
> acquireHostId
> self._manifest.acquireHostId(hostId, async)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 453, in
> acquireHostId
> self._domainLock.acquireHostId(hostId, async)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py",
> line 315, in acquireHostId
> raise se.AcquireHostIdFailure(self._sdUUID, e)
> AcquireHostIdFailure: Cannot acquire host id:
> (u'b41eb20a-eafb-481b-9a50-a135cf42b15e', SanlockException(22, 'Sanlock
> lockspace add failure', 'Invalid argument'))
> 2018-03-28 10:38:23,078-0700 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
> call Host.ping2 succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:23,085-0700 INFO  (jsonrpc/6) [vdsm.api] START
> repoStats(domains=[u'b41eb20a-eafb-481b-9a50-a135cf42b15e'])
> from=::1,54450, task_id=186d7e8b-7b4e-485d-a9e0-c0cb46eed621 (api:46)
> 2018-03-28 10:38:23,085-0700 INFO  (jsonrpc/6) [vdsm.api] FINISH repoStats
> return={u'b41eb20a-eafb-481b-9a50-a135cf42b15e': {'code': 0, 'actual':
> True, 'version': 4, 'acquired': False, 'delay': '0.000812547', 'lastCheck':
> '0.4', 'valid': True}} from=::1,54450,
> task_id=186d7e8b-7b4e-485d-a9e0-c0cb46eed621 (api:52)
> 2018-03-28 10:38:23,086-0700 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
> call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:23,092-0700 WARN  (vdsm.Scheduler) [Executor] Worker
> blocked:  action= at 0x1d44150>
> timeout=15, duration=150 at 0x7f076c05fb90> task#=83985 at 0x7f082c08e510>,
> traceback:
> File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
>   self.__bootstrap_inner()
> File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
>   self.run()
> File: "/usr/lib64/python2.7/threading.py", line 765, in run
>   self.__target(*self.__args, **self.__kwargs)
> File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line
> 194, in run
>   ret = func(*args, **kwargs)
> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in
> _run
>   self._execute_task()
> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in
> _execute_task
>   task()
> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in
> __call__
>   self._callable()
> File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213,
> in __call__
>   self._func()
> File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 578,
> in __call__
>   stats = hostapi.get_stats(self._cif, self._samples.stats())
> File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 77, in
> get_stats
>   ret['haStats'] = _getHaInfo()
> File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in
> _getHaInfo
>   stats = instance.get_all_stats()
> File:
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
> line 93, in get_all_stats
>   stats = broker.get_stats_from_storage()
> File:
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 135, in get_stats_from_storage
>   result = self._proxy.get_stats()
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>   return self.__send(self.__name, args)
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1587, in __request
>   verbose=self.__verbose
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>   return self.single_request(host, handler, request_body, verbose)
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
>   response = h.getresponse(buffering=True)
> File: 

Re: [ovirt-users] Issue with deploy HE on another host 4.1

2018-03-02 Thread Artem Tambovskiy
Hello Krzysztof,

As I can see, both hosts have the same host_id=1, which is causing the conflict.

You need to fix this manually on the newly deployed host and restart
ovirt-ha-agent.
You may run the following command on the engine VM in order to find the
correct host_id values for your hosts:

sudo -u postgres psql -d engine -c 'select vds_name, vds_spm_id from vds'

Once you have fixed the host_id and restarted the agents, I would advise
checking sanlock client status in order to see that there are no conflicts
and the hosts are using the correct host_id values.
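
Roughly like this on the newly deployed host (the host_id value below is only
an example - take the real vds_spm_id for that host from the query above):

sed -i 's/^host_id=.*/host_id=2/' /etc/ovirt-hosted-engine/hosted-engine.conf
systemctl restart ovirt-ha-broker ovirt-ha-agent
sanlock client status   # the hosted-engine lockspace should now show the right id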

Regards,
Artem

Fri, 2 Mar 2018, 17:10, Krzysztof Wajda :

> Hi,
>
> I have an issue with Hosted Engine when I try to deploy via gui on another
> host. There is no errors after deploy but in GUI I see only "Not active"
> status HE, and hosted-engine --status shows only 1 node (on both nodes same
> output). In hosted-engine.conf I see that host_id is the same as it is on
> primary host with HE !? Issue looks quite similar like in
>
> http://lists.ovirt.org/pipermail/users/2018-February/086932.html
>
> Here is config file on newly deployed node :
>
> ca_cert=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> gateway=192.168.8.1
> iqn=
> conf_image_UUID=f2813205-4b0c-45f3-a9cb-3748f61d2194
> ca_cert=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> sdUUID=7e7a275c-6939-4f79-85f6-d695209951ea
> connectionUUID=81a2f9a3-2efe-448f-b305-e22543068044
> conf_volume_UUID=d6b7e25c-9912-47ff-b104-9d424b9f34b8
> user=
> host_id=1
> bridge=ovirtmgmt
> metadata_image_UUID=fe95f22e-b468-4adf-a754-21d419ae3e67
> spUUID=----
> mnt_options=
> fqdn=dev-ovirtengine0.somedomain.it
> portal=
> vm_disk_id=febde231-92cc-4599-8f55-816f63132739
> metadata_volume_UUID=7ebaf268-15ec-4c76-ba89-b5e2dc143830
> vm_disk_vol_id=e3920b18-4467-44f8-b2d0-629b3b1d1a58
> domainType=fc
> port=
> console=vnc
> ca_subject="C=EN, L=Test, O=Test, CN=Test"
> password=
> vmid=3f7d9c1d-6c3e-4b96-b85d-d240f3bf9b76
> lockspace_image_UUID=49e318ad-63a3-4efd-977c-33b8c4c93728
> lockspace_volume_UUID=91bcb5cf-006c-42b4-b419-6ac9f841f50a
> vdsm_use_ssl=true
> storage=None
> conf=/var/run/ovirt-hosted-engine-ha/vm.conf
>
> This is original one:
>
> fqdn=dev-ovirtengine0.somedomain.it
> vm_disk_id=febde231-92cc-4599-8f55-816f63132739
> vm_disk_vol_id=e3920b18-4467-44f8-b2d0-629b3b1d1a58
> vmid=3f7d9c1d-6c3e-4b96-b85d-d240f3bf9b76
> storage=None
> mnt_options=
> conf=/var/run/ovirt-hosted-engine-ha/vm.conf
> host_id=1
> console=vnc
> domainType=fc
> spUUID=----
> sdUUID=7e7a275c-6939-4f79-85f6-d695209951ea
> connectionUUID=81a2f9a3-2efe-448f-b305-e22543068044
> ca_cert=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> ca_subject="C=EN, L=Test, O=Test, CN=Test"
> vdsm_use_ssl=true
> gateway=192.168.8.1
> bridge=ovirtmgmt
> metadata_volume_UUID=7ebaf268-15ec-4c76-ba89-b5e2dc143830
> metadata_image_UUID=fe95f22e-b468-4adf-a754-21d419ae3e67
> lockspace_volume_UUID=91bcb5cf-006c-42b4-b419-6ac9f841f50a
> lockspace_image_UUID=49e318ad-63a3-4efd-977c-33b8c4c93728
> conf_volume_UUID=d6b7e25c-9912-47ff-b104-9d424b9f34b8
> conf_image_UUID=f2813205-4b0c-45f3-a9cb-3748f61d2194
>
> # The following are used only for iSCSI storage
> iqn=
> portal=
> user=
> password=
> port=
>
> Packages:
>
> ovirt-imageio-daemon-1.0.0-1.el7.noarch
> ovirt-host-deploy-1.6.7-1.el7.centos.noarch
> ovirt-release41-4.1.9-1.el7.centos.noarch
> ovirt-setup-lib-1.1.4-1.el7.centos.noarch
> ovirt-hosted-engine-ha-2.1.8-1.el7.centos.noarch
> ovirt-hosted-engine-setup-2.1.4-1.el7.centos.noarch
> ovirt-vmconsole-1.0.4-1.el7.centos.noarch
> ovirt-vmconsole-host-1.0.4-1.el7.centos.noarch
> ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
> ovirt-imageio-common-1.0.0-1.el7.noarch
>
> Output from agent.log
>
> MainThread::INFO::2018-03-02
> 15:01:47,279::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Success, id 140493346760912
> MainThread::INFO::2018-03-02
> 15:01:51,011::brokerlink::179::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(set_storage_domain)
> Success, id 140493346759824
> MainThread::INFO::2018-03-02
> 15:01:51,011::hosted_engine::601::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
> Broker initialized, all submonitors started
> MainThread::INFO::2018-03-02
> 15:01:51,045::hosted_engine::704::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
> Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file:
> /var/run/vdsm/storage/7e7a275c-6939-4f79-85f6-d695209951ea/49e318ad-63a3-4efd-977c-33b8c4c93728/91bcb5cf-006c-42b4-b419-6ac9f841f50a)
> MainThread::INFO::2018-03-02
> 15:04:12,058::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
> Failed to acquire the lock. Waiting '5's before the next attempt
>
> Regards
>
> Krzysztof
>
> ___
> Users mailing list
> Users@ovirt.org
> 

[ovirt-users] Question about sanlock lockspaces

2018-02-22 Thread Artem Tambovskiy
Hello,

I'm still troubleshooting my cluster and trying to figure out which
lockspaces should be present and which shouldn't.

If the HE VM is not running, both ovirt-ha-agent and ovirt-ha-broker are down,
and storage has been disconnected with hosted-engine --disconnect-storage,
should I see anything related to the HE storage domain in the
sanlock client status output?

For some reason on one host I don't see anything, while the second one still
reports a present lockspace for the HE storage domain. Is this normal?

[root@ovirt1 ~]# sanlock client status
daemon b1d7fea2-e8a9-4645-b449-97702fc3808e.ovirt1.tel
p -1 helper
p -1 listener
p -1 status
p 3763
p 62861 quaggaVM
p 63111 powerDNS
p 107818 pjsip_freepbx_14
p 109092 revizorro_dev
p 109589 routerVM
s a40cc3a9-54d6-40fd-acee-525ef29c8ce3:2:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_data/a40cc3a9-54d6-40fd-acee-525ef29c8ce3/dom_md/ids:0
s 4a7f8717-9bb0-4d80-8016-498fa4b88162:1:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_engine/4a7f8717-9bb0-4d80-8016-498fa4b88162/dom_md/ids:0
r a40cc3a9-54d6-40fd-acee-525ef29c8ce3:SDM:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_data/a40cc3a9-54d6-40fd-acee-525ef29c8ce3/dom_md/leases:1048576:49
p 3763

As it looks to me, the lockspace
4a7f8717-9bb0-4d80-8016-498fa4b88162:1:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_engine/4a7f8717-9bb0-4d80-8016-498fa4b88162/dom_md/ids:0
shouldn't be present, and it doesn't match the host_id, but maybe I'm
wrong here...

Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Fwd: why host is not capable to run HE?

2018-02-21 Thread Artem Tambovskiy
I took the HE VM down and stopped the ovirt-ha-agents on both hosts.
I tried hosted-engine --reinitialize-lockspace; the command just silently
executes and I'm not sure whether it is doing anything at all.
I also tried to clean the metadata. On one host it went fine; on the second
host it always fails with the following messages:

INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:VDSM domain
monitor status: PENDING
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:VDSM domain
monitor status: PENDING
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:VDSM domain
monitor status: PENDING
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:VDSM domain
monitor status: PENDING
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed to
start monitoring domain (sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162,
host_id=2): timeout during domain acquisition
ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent call
last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
line 191, in _run_agent
return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
line 67, in action_clean
return he.clean(options.force_cleanup)
  File 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 345, in clean
self._initialize_domain_monitor()
  File 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 829, in _initialize_domain_monitor
raise Exception(msg)
Exception: Failed to start monitoring domain
(sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162,
host_id=2): timeout during domain acquisition

ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt
'0'
ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors occurred,
giving up. Please review the log and consider filing a bug.
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down

I'm not an expert when it comes to reading the sanlock output, but it looks a
bit strange to me:

from first host (host_id=2)

[root@ovirt1 ~]# sanlock client status
daemon b1d7fea2-e8a9-4645-b449-97702fc3808e.ovirt1.tel
p -1 helper
p -1 listener
p -1 status
p 3763
p 62861 quaggaVM
p 63111 powerDNS
p 107818 pjsip_freepbx_14
p 109092 revizorro_dev
p 109589 routerVM
s hosted-engine:2:/var/run/vdsm/storage/4a7f8717-9bb0-4d80-
8016-498fa4b88162/093faa75-5e33-4559-84fa-1f1f8d48153b/
911c7637-b49d-463e-b186-23b404e50769:0
s a40cc3a9-54d6-40fd-acee-525ef29c8ce3:2:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_data/a40cc3a9-54d6-40fd-acee-525ef29c8ce3/dom_md/ids:0
s 4a7f8717-9bb0-4d80-8016-498fa4b88162:1:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_engine/4a7f8717-9bb0-4d80-8016-498fa4b88162/dom_md/ids:0
r a40cc3a9-54d6-40fd-acee-525ef29c8ce3:SDM:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_data/a40cc3a9-54d6-40fd-acee-525ef29c8ce3/dom_md/leases:1048576:49
p 3763


from second host (host_id=1)

[root@ovirt2 ~]# sanlock client status
daemon 9263e081-e5ea-416b-866a-0a73fe32fe16.ovirt2.tel
p -1 helper
p -1 listener
p 150440 CentOS-Desk
p 151061 centos-dev-box
p 151288 revizorro_nfq
p 151954 gitlabVM
p -1 status
s hosted-engine:1:/var/run/vdsm/storage/4a7f8717-9bb0-4d80-
8016-498fa4b88162/093faa75-5e33-4559-84fa-1f1f8d48153b/
911c7637-b49d-463e-b186-23b404e50769:0
s a40cc3a9-54d6-40fd-acee-525ef29c8ce3:1:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_data/a40cc3a9-54d6-40fd-acee-525ef29c8ce3/dom_md/ids:0
s 4a7f8717-9bb0-4d80-8016-498fa4b88162:1:/rhev/data-center/mnt/glusterSD/
ovirt2.telia.ru\:_engine/4a7f8717-9bb0-4d80-8016-498fa4b88162/dom_md/ids:0
ADD

Not sure if there is a problem with the lockspace
4a7f8717-9bb0-4d80-8016-498fa4b88162,
but both hosts show 1 as the host_id here. Is this correct? Shouldn't they
have different IDs here?

Once the ha-agents have been started, hosted-engine --vm-status shows
'unknown-stale-data' for the second host, and HE just doesn't start on the
second host at all.
Host redeployment hasn't helped either.

Any advice on this?
Regards,
Artem


On Mon, Feb 19, 2018 at 9:32 PM, Artem Tambovskiy <
artem.tambovs...@gmail.com> wrote:

> Thanks Martin.
>
> As you suggested I updated hosted-engine.conf with correct host_id values
> and restarted ovirt-ha-agent services on both hosts and now I run into the
> problem with  status "unknown-stale-data" :(
> And second host still doesn't looks as capable to run HE.
>
> Should I stop HE VM, bring down ovirt-ha-agents and reinitialize-lockspace
> and start ovirt-ha-agents again?
>
> Regards,
> Artem
>
>
>
> On Mon, Feb 19, 2018 at 6:45 PM, Martin Sivak <msi...@redhat.com> wrote:
>
>> Hi Artem,
>>
>> just a restart of ovirt-ha-agent services should be enough.
>>
>> Best regards
>>
>&g

[ovirt-users] Fwd: Fwd: why host is not capable to run HE?

2018-02-19 Thread Artem Tambovskiy
Thanks Martin.

As you suggested, I updated hosted-engine.conf with the correct host_id values
and restarted the ovirt-ha-agent services on both hosts, and now I run into
the problem with the status "unknown-stale-data" :(
And the second host still doesn't look capable of running HE.

Should I stop the HE VM, bring down the ovirt-ha-agents, reinitialize the
lockspace and start the ovirt-ha-agents again?

Regards,
Artem



On Mon, Feb 19, 2018 at 6:45 PM, Martin Sivak <msi...@redhat.com> wrote:

> Hi Artem,
>
> just a restart of ovirt-ha-agent services should be enough.
>
> Best regards
>
> Martin Sivak
>
> On Mon, Feb 19, 2018 at 4:40 PM, Artem Tambovskiy
> <artem.tambovs...@gmail.com> wrote:
> > Ok, understood.
> > Once I set correct host_id on both hosts how to take changes in force?
> With
> > minimal downtime? Or i need reboot both hosts anyway?
> >
> > Regards,
> > Artem
> >
> > On 19 Feb 2018 at 18:18, "Simone Tiraboschi"
> > <stira...@redhat.com> wrote:
> >
> >>
> >>
> >> On Mon, Feb 19, 2018 at 4:12 PM, Artem Tambovskiy
> >> <artem.tambovs...@gmail.com> wrote:
> >>>
> >>>
> >>> Thanks a lot, Simone!
> >>>
> >>> This is clearly shows a problem:
> >>>
> >>> [root@ov-eng ovirt-engine]# sudo -u postgres psql -d engine -c 'select
> >>> vds_name, vds_spm_id from vds'
> >>> vds_name | vds_spm_id
> >>> -+
> >>>  ovirt1.local |  2
> >>>  ovirt2.local |  1
> >>> (2 rows)
> >>>
> >>> While hosted-engine.conf on ovirt1.local have host_id=1, and
> ovirt2.local
> >>> host_id=2. So totally opposite values.
> >>> So how to get this fixed in the simple way? Update the engine DB?
> >>
> >>
> >> I'd suggest to manually fix /etc/ovirt-hosted-engine/hosted-engine.conf
> on
> >> both the hosts
> >>
> >>>
> >>>
> >>> Regards,
> >>> Artem
> >>>
> >>> On Mon, Feb 19, 2018 at 5:37 PM, Simone Tiraboschi <
> stira...@redhat.com>
> >>> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Feb 19, 2018 at 12:13 PM, Artem Tambovskiy
> >>>> <artem.tambovs...@gmail.com> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> Last weekend my cluster suffered form a massive power outage due to
> >>>>> human mistake.
> >>>>> I'm using SHE setup with Gluster, I managed to bring the cluster up
> >>>>> quickly, but once again I have a problem with duplicated host_id
> >>>>> (https://bugzilla.redhat.com/show_bug.cgi?id=1543988) on second
> host and due
> >>>>> to this second host is not capable to run HE.
> >>>>>
> >>>>> I manually updated file hosted_engine.conf with correct host_id and
> >>>>> restarted agent & broker - no effect. Than I rebooted the host
> itself -
> >>>>> still no changes. How to fix this issue?
> >>>>
> >>>>
> >>>> I'd suggest to run this command on the engine VM:
> >>>> sudo -u postgres scl enable rh-postgresql95 --  psql -d engine -c
> >>>> 'select vds_name, vds_spm_id from vds'
> >>>> (just  sudo -u postgres psql -d engine -c 'select vds_name, vds_spm_id
> >>>> from vds'  if still on 4.1) and check
> >>>> /etc/ovirt-hosted-engine/hosted-engine.conf on all the involved host.
> >>>> Maybe you can also have a leftover configuration file on undeployed
> >>>> host.
> >>>>
> >>>> When you find a conflict you should manually bring down sanlock
> >>>> In doubt a reboot of both the hosts will solve for sure.
> >>>>
> >>>>
> >>>>>
> >>>>>
> >>>>> Regards,
> >>>>> Artem
> >>>>>
> >>>>> ___
> >>>>> Users mailing list
> >>>>> Users@ovirt.org
> >>>>> http://lists.ovirt.org/mailman/listinfo/users
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> ___
> >>> Users mailing list
> >>> Users@ovirt.org
> >>> http://lists.ovirt.org/mailman/listinfo/users
> >>>
> >>
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Fwd: why host is not capable to run HE?

2018-02-19 Thread Artem Tambovskiy
Ok, understood.
Once I set the correct host_id on both hosts, how do I bring the changes into
effect? With minimal downtime? Or do I need to reboot both hosts anyway?

Regards,
Artem

On 19 Feb 2018 at 18:18, "Simone Tiraboschi" <stira...@redhat.com>
wrote:

>
>
> On Mon, Feb 19, 2018 at 4:12 PM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>>
>> Thanks a lot, Simone!
>>
>> This is clearly shows a problem:
>>
>> [root@ov-eng ovirt-engine]# sudo -u postgres psql -d engine -c 'select
>> vds_name, vds_spm_id from vds'
>> vds_name | vds_spm_id
>> -+
>>  ovirt1.local |  2
>>  ovirt2.local |  1
>> (2 rows)
>>
>> While hosted-engine.conf on ovirt1.local have host_id=1, and
>> ovirt2.local host_id=2. So totally opposite values.
>> So how to get this fixed in the simple way? Update the engine DB?
>>
>
> I'd suggest to manually fix /etc/ovirt-hosted-engine/hosted-engine.conf
> on both the hosts
>
>
>>
>> Regards,
>> Artem
>>
>> On Mon, Feb 19, 2018 at 5:37 PM, Simone Tiraboschi <stira...@redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, Feb 19, 2018 at 12:13 PM, Artem Tambovskiy <
>>> artem.tambovs...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> Last weekend my cluster suffered form a massive power outage due to
>>>> human mistake.
>>>> I'm using SHE setup with Gluster, I managed to bring the cluster up
>>>> quickly, but once again I have a problem with duplicated host_id  (
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1543988) on second host
>>>> and due to this second host is not capable to run HE.
>>>>
>>>> I manually updated file hosted_engine.conf with correct host_id and
>>>> restarted agent & broker - no effect. Than I rebooted the host itself -
>>>> still no changes. How to fix this issue?
>>>>
>>>
>>> I'd suggest to run this command on the engine VM:
>>> sudo -u postgres scl enable rh-postgresql95 --  psql -d engine -c
>>> 'select vds_name, vds_spm_id from vds'
>>> (just  sudo -u postgres psql -d engine -c 'select vds_name, vds_spm_id
>>> from vds'  if still on 4.1) and check 
>>> /etc/ovirt-hosted-engine/hosted-engine.conf
>>> on all the involved host.
>>> Maybe you can also have a leftover configuration file on undeployed host.
>>>
>>> When you find a conflict you should manually bring down sanlock
>>> In doubt a reboot of both the hosts will solve for sure.
>>>
>>>
>>>
>>>>
>>>> Regards,
>>>> Artem
>>>>
>>>> ___
>>>> Users mailing list
>>>> Users@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Fwd: why host is not capable to run HE?

2018-02-19 Thread Artem Tambovskiy
Thanks a lot, Simone!

This clearly shows the problem:

[root@ov-eng ovirt-engine]# sudo -u postgres psql -d engine -c 'select
vds_name, vds_spm_id from vds'
vds_name | vds_spm_id
--------------+------------
 ovirt1.local |  2
 ovirt2.local |  1
(2 rows)

While hosted-engine.conf on ovirt1.local has host_id=1, and ovirt2.local has
host_id=2 - exactly the opposite values.
So how do I get this fixed in the simplest way? Update the engine DB?

Regards,
Artem

On Mon, Feb 19, 2018 at 5:37 PM, Simone Tiraboschi <stira...@redhat.com>
wrote:

>
>
> On Mon, Feb 19, 2018 at 12:13 PM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Hello,
>>
>> Last weekend my cluster suffered form a massive power outage due to human
>> mistake.
>> I'm using SHE setup with Gluster, I managed to bring the cluster up
>> quickly, but once again I have a problem with duplicated host_id  (
>> https://bugzilla.redhat.com/show_bug.cgi?id=1543988) on second host and
>> due to this second host is not capable to run HE.
>>
>> I manually updated file hosted_engine.conf with correct host_id and
>> restarted agent & broker - no effect. Than I rebooted the host itself -
>> still no changes. How to fix this issue?
>>
>
> I'd suggest to run this command on the engine VM:
> sudo -u postgres scl enable rh-postgresql95 --  psql -d engine -c 'select
> vds_name, vds_spm_id from vds'
> (just  sudo -u postgres psql -d engine -c 'select vds_name, vds_spm_id
> from vds'  if still on 4.1) and check 
> /etc/ovirt-hosted-engine/hosted-engine.conf
> on all the involved host.
> Maybe you can also have a leftover configuration file on undeployed host.
>
> When you find a conflict you should manually bring down sanlock
> In doubt a reboot of both the hosts will solve for sure.
>
>
>
>>
>> Regards,
>> Artem
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] why host is not capable to run HE?

2018-02-19 Thread Artem Tambovskiy
Hello,

Last weekend my cluster suffered from a massive power outage due to a human
mistake.
I'm using a SHE setup with Gluster. I managed to bring the cluster up
quickly, but once again I have a problem with a duplicated host_id (
https://bugzilla.redhat.com/show_bug.cgi?id=1543988) on the second host, and
due to this the second host is not capable of running HE.

I manually updated the hosted-engine.conf file with the correct host_id and
restarted the agent & broker - no effect. Then I rebooted the host itself -
still no changes. How can I fix this issue?

Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-21 Thread Artem Tambovskiy
Hello Kasturi,

Yes, I set global maintenance mode intentionally.
I ran out of ideas troubleshooting my cluster and decided to undeploy
the hosted engine from the second host, clean the installation and add it to
the cluster again.
Also, I cleaned the metadata with hosted-engine --clean-metadata --host-id=2
--force-clean. But once I added the second host to the cluster again, it
doesn't show the capability to run the hosted engine, and doesn't even appear
in the output of hosted-engine --vm-status:
[root@ovirt1 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date  : True
Hostname   : ovirt1.telia.ru
Host ID: 1
Engine status  : {"health": "good", "vm": "up", "detail": "up"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : a23c7cbd
local_conf_timestamp   : 848931
Host timestamp : 848930
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=848930 (Mon Jan 22 09:53:29 2018)
host-id=1
score=3400
vm_conf_refresh_time=848931 (Mon Jan 22 09:53:29 2018)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False

On the redeployed second host I see unknown-stale-data again, and the second
host doesn't show up as hosted-engine capable.
[root@ovirt2 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date  : False
Hostname   : ovirt1.telia.ru
Host ID: 1
Engine status  : unknown stale-data
Score  : 0
stopped: False
Local maintenance  : False
crc32  : 18765f68
local_conf_timestamp   : 848951
Host timestamp : 848951
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=848951 (Mon Jan 22 09:53:49 2018)
host-id=1
score=0
vm_conf_refresh_time=848951 (Mon Jan 22 09:53:50 2018)
conf_on_shared_storage=True
maintenance=False
state=ReinitializeFSM
stopped=False


Really strange situation ...

Regards,
Artem



On Mon, Jan 22, 2018 at 9:46 AM, Kasturi Narra <kna...@redhat.com> wrote:

> Hello Artem,
>
> Any reason why you chose hosted-engine undeploy action for the second
> host ? I see that the cluster is in global maintenance mode, was this
> intended ?
>
> command to clear the entries from hosted-engine --vm-status is "hosted-engine
> --clean-metadata --host-id= --force-clean"
>
> Hope this helps !!
>
> Thanks
> kasturi
>
>
> On Fri, Jan 19, 2018 at 12:07 AM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Hi,
>>
>> Ok, i decided to remove second host from the cluster.
>> I reinstalled from webUI it with hosted-engine action UNDEPLOY, and
>> removed it from the cluster aftewards.
>> All VM's are fine hosted engine running ok,
>> But hosted-engine --vm-status still showing 2 hosts.
>>
>> How I can clean the traces of second host in a correct way?
>>
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : True
>> Hostname   : ovirt1.telia.ru
>> Host ID: 1
>> Engine status  : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : 1b1b6f6d
>> local_conf_timestamp   : 545385
>> Host timestamp : 545385
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=545385 (Thu Jan 18 21:34:25 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=545385 (Thu Jan 18 21:34:25 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=GlobalMaintenance
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : False
>> Hostname   : ovirt1.telia.ru
>> Host ID: 2
>> Engine status  : unknown stale-data
>> Score  : 0
>> stopped: True
>> Local maintenance 

Re: [ovirt-users] correct settings for gluster based storage domain

2018-01-19 Thread Artem Tambovskiy
Ok,
Alexey, you have picked the third option and are leaving the host selection to
the DNS resolver.

But in general, solution 2 should also work, right?

Regards,
Artem



On Fri, Jan 19, 2018 at 4:50 PM, Николаев Алексей <
alexeynikolaev.p...@yandex.ru> wrote:

> https://ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_
> Engine/
>
>
> For Gluster storage, specify the full address, using either the FQDN or IP
> address, and path name of the shared storage domain.
>
> *Important:* Only replica 3 Gluster storage is supported. Ensure the
> following configuration has been made:
>
>-
>
>In the /etc/glusterfs/glusterd.vol file on all three Gluster servers,
>set rpc-auth-allow-insecure to on.
>
>  option rpc-auth-allow-insecure on
>
>-
>
>Configure the volume as follows:
>
>  gluster volume set volume cluster.quorum-type auto
>  gluster volume set volume network.ping-timeout 10
>  gluster volume set volume auth.allow \*
>  gluster volume set volume group virt
>  gluster volume set volume storage.owner-uid 36
>  gluster volume set volume storage.owner-gid 36
>  gluster volume set volume server.allow-insecure on
>
>
>
> I have problems with hosted engine storage on gluster replica 3 arbiter
> with oVIrt 4.1.
> I recommend update oVirt to 4.2. I have no problems with 4.2.
>
>
> 19.01.2018, 16:43, "Artem Tambovskiy" <artem.tambovs...@gmail.com>:
>
>
> I'm still troubleshooting the my oVirt 4.1.8 cluster and idea came to my
> mind that I have an issue with storage settings for hosted_engine storage
> domain.
>
> But in general if I have a 2 ovirt nodes running gluster + 3rd host as
> arbiter, how the settings should looks like?
>
> lets say I have a 3 nodes:
> ovirt1.domain.com (gluster + ovirt)
> ovirt2.domain.com (gluster + ovirt)
> ovirt3.domain.com (gluster)
>
> How the correct storage domain config should looks like?
>
> Option 1:
>  /etc/ovirt-hosted-engine/hosted-engine.conf
> 
> storage=ovirt1.domain.com:/engine
> mnt_options=backup-volfile-servers=ovirt2.domain.com:ovirt3.domain.com
>
> Option 2:
>  /etc/ovirt-hosted-engine/hosted-engine.conf
> 
> storage=localhost:/engine
> mnt_options=backup-volfile-servers=ovirt1.domain.com:ovirt2.domain.com:o
> virt3.domain.com
>
> Option 3:
> Setup a DNS record gluster.domain.com pointing to IP addresses of gluster
> nodes
>
>  /etc/ovirt-hosted-engine/hosted-engine.conf
> 
> storage=gluster.domain.com:/engine
> mnt_options=
>
> Of course its related not only to hosted engine domain, but to all gluster
> based storage domains.
>
> Thank you in advance!
> Regards,
> Artem
>
> ,
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] correct settings for gluster based storage domain

2018-01-19 Thread Artem Tambovskiy
I'm still troubleshooting my oVirt 4.1.8 cluster, and the idea came to my
mind that I have an issue with the storage settings for the hosted_engine
storage domain.

But in general, if I have 2 oVirt nodes running gluster + a 3rd host as
arbiter, what should the settings look like?

Let's say I have 3 nodes:
ovirt1.domain.com (gluster + ovirt)
ovirt2.domain.com (gluster + ovirt)
ovirt3.domain.com (gluster)

What should the correct storage domain config look like?

Option 1:
 /etc/ovirt-hosted-engine/hosted-engine.conf

storage=ovirt1.domain.com:/engine
mnt_options=backup-volfile-servers=ovirt2.domain.com:ovirt3.domain.com

Option 2:
 /etc/ovirt-hosted-engine/hosted-engine.conf

storage=localhost:/engine
mnt_options=backup-volfile-servers=ovirt1.domain.com:ovirt2.domain.com:o
virt3.domain.com

Option 3:
Setup a DNS record gluster.domain.com pointing to IP addresses of gluster
nodes

 /etc/ovirt-hosted-engine/hosted-engine.conf

storage=gluster.domain.com:/engine
mnt_options=

Of course this relates not only to the hosted engine domain, but to all
gluster-based storage domains.
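
Whichever option is correct, I guess the backup-volfile-servers part can at
least be sanity-checked by mounting the volume manually for a moment (a
throwaway test, using the names from option 1):

mkdir -p /mnt/gluster-test
mount -t glusterfs -o backup-volfile-servers=ovirt2.domain.com:ovirt3.domain.com \
  ovirt1.domain.com:/engine /mnt/gluster-test
df -h /mnt/gluster-test
umount /mnt/gluster-test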

Thank you in advance!
Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-18 Thread Artem Tambovskiy
Hi,

Ok, I decided to remove the second host from the cluster.
I reinstalled it from the web UI with the hosted-engine action UNDEPLOY, and
removed it from the cluster afterwards.
All VMs are fine and the hosted engine is running ok,
but hosted-engine --vm-status is still showing 2 hosts.

How can I clean up the traces of the second host in the correct way?


--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date  : True
Hostname   : ovirt1.telia.ru
Host ID: 1
Engine status  : {"health": "good", "vm": "up",
"detail": "up"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 1b1b6f6d
local_conf_timestamp   : 545385
Host timestamp : 545385
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=545385 (Thu Jan 18 21:34:25 2018)
host-id=1
score=3400
vm_conf_refresh_time=545385 (Thu Jan 18 21:34:25 2018)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False


--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date  : False
Hostname   : ovirt1.telia.ru
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : c7037c03
local_conf_timestamp   : 7530
Host timestamp : 7530
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=7530 (Fri Jan 12 16:10:12 2018)
host-id=2
score=0
vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
conf_on_shared_storage=True
maintenance=False
state=AgentStopped
stopped=True


!! Cluster is in GLOBAL MAINTENANCE mode !!
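
For reference, one way the stale entry is usually dropped - a hedged sketch,
assuming host ID 2 is the removed host as shown above and that the cluster
stays in global maintenance while doing it:

    # run on the remaining HE host; --force-clean is needed because the
    # agent on the removed host is no longer running to clean up after itself
    hosted-engine --clean-metadata --host-id=2 --force-clean
    hosted-engine --vm-status   # the "Host 2" entry should then disappear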

Thank you in advance!
Regards,
Artem


On Wed, Jan 17, 2018 at 6:47 PM, Artem Tambovskiy <
artem.tambovs...@gmail.com> wrote:

> Hello,
>
> Any further suggestions on how to fix the issue and get the HA setup working?
> Could completely removing the second host from the cluster (including removing
> all oVirt configuration files and packages) and adding it again solve the
> issue? Or might it completely ruin the cluster?
>
> Regards,
> Artem
>
> On 16 Jan 2018 at 17:00, "Artem Tambovskiy" <
> artem.tambovs...@gmail.com> wrote:
>
> Hi Martin,
>>
>> Thanks for feedback.
>>
>> All hosts and hosted-engine running 4.1.8 release.
>> The strange thing : I can see that host ID is set to 1 on both hosts at
>> /etc/ovirt-hosted-engine/hosted-engine.conf file.
>> I have no idea how this happen, the only thing I have changed recently is
>> that I have changed mnt_options in order to add backup-volfile-servers
>> by using hosted-engine --set-shared-config command
>>
>> Both agent and broker are running on second host
>>
>> [root@ovirt2 ovirt-hosted-engine-ha]# ps -ef | grep ovirt-ha-
>> vdsm  42331  1 26 14:40 ?00:31:35 /usr/bin/python
>> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
>> vdsm  42332  1  0 14:40 ?00:00:16 /usr/bin/python
>> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
>>
>> but I saw some tracebacks during the broker start
>>
>> [root@ovirt2 ovirt-hosted-engine-ha]# systemctl status ovirt-ha-broker -l
>> ● ovirt-ha-broker.service - oVirt Hosted Engine High Availability
>> Communications Broker
>>Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service;
>> enabled; vendor preset: disabled)
>>Active: active (running) since Tue 2018-01-16 14:40:15 MSK; 1h 58min
>> ago
>>  Main PID: 42331 (ovirt-ha-broker)
>>CGroup: /system.slice/ovirt-ha-broker.service
>>└─42331 /usr/bin/python 
>> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
>> --no-daemon
>>
>> Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Started oVirt Hosted Engine
>> High Availability Communications Broker.
>> Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Starting oVirt Hosted Engine
>> High Availability Communications Broker...
>> Jan 16 14:40:16 ovirt2.telia.ru ovirt-ha-broker[42331]: ovirt-ha-broker
>> ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error
>> handling request, data: 'set-storage-domain FilesystemBackend
>> dom_type=glusterfs sd_uuid=4a

Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-17 Thread Artem Tambovskiy
Hello,

Any further suggestions on how to fix the issue and get the HA setup working?
Could completely removing the second host from the cluster (including removing
all oVirt configuration files and packages) and adding it again solve the
issue? Or might it completely ruin the cluster?

Regards,
Artem

On 16 Jan 2018 at 17:00, "Artem Tambovskiy" <
artem.tambovs...@gmail.com> wrote:

> Hi Martin,
>
> Thanks for feedback.
>
> All hosts and hosted-engine running 4.1.8 release.
> The strange thing : I can see that host ID is set to 1 on both hosts at
> /etc/ovirt-hosted-engine/hosted-engine.conf file.
> I have no idea how this happen, the only thing I have changed recently is
> that I have changed mnt_options in order to add backup-volfile-servers
> by using hosted-engine --set-shared-config command
>
> Both agent and broker are running on second host
>
> [root@ovirt2 ovirt-hosted-engine-ha]# ps -ef | grep ovirt-ha-
> vdsm  42331  1 26 14:40 ?00:31:35 /usr/bin/python
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
> vdsm  42332  1  0 14:40 ?00:00:16 /usr/bin/python
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
>
> but I saw some tracebacks during the broker start
>
> [root@ovirt2 ovirt-hosted-engine-ha]# systemctl status ovirt-ha-broker -l
> ● ovirt-ha-broker.service - oVirt Hosted Engine High Availability
> Communications Broker
>Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service;
> enabled; vendor preset: disabled)
>Active: active (running) since Tue 2018-01-16 14:40:15 MSK; 1h 58min ago
>  Main PID: 42331 (ovirt-ha-broker)
>CGroup: /system.slice/ovirt-ha-broker.service
>└─42331 /usr/bin/python 
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
> --no-daemon
>
> Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Started oVirt Hosted Engine
> High Availability Communications Broker.
> Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Starting oVirt Hosted Engine
> High Availability Communications Broker...
> Jan 16 14:40:16 ovirt2.telia.ru ovirt-ha-broker[42331]: ovirt-ha-broker
> ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error
> handling request, data: 'set-storage-domain FilesystemBackend
> dom_type=glusterfs sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162'
> Traceback (most
> recent call last):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
> line 166, in handle
> data)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
> line 299, in _dispatch
>
> .set_storage_domain(client, sd_type, **options)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
> line 66, in set_storage_domain
>
> self._backends[client].connect()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 462, in connect
> self._dom_type)
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 107, in get_domain_path
> " in
> {1}".format(sd_uuid, parent))
>
> BackendFailureException: path to storage domain 
> 4a7f8717-9bb0-4d80-8016-498fa4b88162
> not found in /rhev/data-center/mnt/glusterSD
>
>
>
> I have tried to issue hosted-engine --connect-storage on second host
> followed by agent & broker restart
> But there is no any visible improvements.
>
> Regards,
> Artem
>
>
>
>
>
>
>
> On Tue, Jan 16, 2018 at 4:18 PM, Martin Sivak <msi...@redhat.com> wrote:
>
>> Hi everybody,
>>
>> there are couple of things to check here.
>>
>> - what version of hosted engine agent is this? The logs look like
>> coming from 4.1
>> - what version of engine is used?
>> - check the host ID in /etc/ovirt-hosted-engine/hosted-engine.conf on
>> both hosts, the numbers must be different
>> - it looks like the agent or broker on host 2 is not active (or there
>> would be a report)
>> - the second host does not see data from the first host (unknown
>> stale-data), wait for a minute and check again, then check the storage
>

Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-16 Thread Artem Tambovskiy
Hi Martin,

Thanks for the feedback.

All hosts and the hosted engine are running the 4.1.8 release.
The strange thing: I can see that the host ID is set to 1 on both hosts in
the /etc/ovirt-hosted-engine/hosted-engine.conf file.
I have no idea how this happened; the only thing I have changed recently is
the mnt_options, in order to add backup-volfile-servers
by using the hosted-engine --set-shared-config command.
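
A quick hedged check for the duplicate-ID problem, run on each HE host (the
two values must differ, e.g. host_id=1 and host_id=2):

    grep '^host_id' /etc/ovirt-hosted-engine/hosted-engine.conf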

Both the agent and the broker are running on the second host:

[root@ovirt2 ovirt-hosted-engine-ha]# ps -ef | grep ovirt-ha-
vdsm  42331  1 26 14:40 ?00:31:35 /usr/bin/python
/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
vdsm  42332  1  0 14:40 ?00:00:16 /usr/bin/python
/usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon

but I saw some tracebacks during the broker start

[root@ovirt2 ovirt-hosted-engine-ha]# systemctl status ovirt-ha-broker -l
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability
Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service;
enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-01-16 14:40:15 MSK; 1h 58min ago
 Main PID: 42331 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
   └─42331 /usr/bin/python
/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon

Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Started oVirt Hosted Engine
High Availability Communications Broker.
Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Starting oVirt Hosted Engine
High Availability Communications Broker...
Jan 16 14:40:16 ovirt2.telia.ru ovirt-ha-broker[42331]: ovirt-ha-broker
ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error
handling request, data: 'set-storage-domain FilesystemBackend
dom_type=glusterfs sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162'
Traceback (most
recent call last):
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
line 166, in handle
data)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py",
line 299, in _dispatch

.set_storage_domain(client, sd_type, **options)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
line 66, in set_storage_domain

self._backends[client].connect()
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
line 462, in connect
self._dom_type)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
line 107, in get_domain_path
" in
{1}".format(sd_uuid, parent))

BackendFailureException: path to storage domain
4a7f8717-9bb0-4d80-8016-498fa4b88162 not found in
/rhev/data-center/mnt/glusterSD



I have tried to issue hosted-engine --connect-storage on the second host,
followed by an agent & broker restart,
but there are no visible improvements.
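
For the record, the checks suggested in the quoted reply below, written out
as commands (a hedged sketch, to be run on the affected host):

    hosted-engine --set-maintenance --mode=global
    hosted-engine --vm-status
    hosted-engine --connect-storage
    sanlock client status   # the hosted-engine lockspace should be listed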

Regards,
Artem







On Tue, Jan 16, 2018 at 4:18 PM, Martin Sivak <msi...@redhat.com> wrote:

> Hi everybody,
>
> there are couple of things to check here.
>
> - what version of hosted engine agent is this? The logs look like
> coming from 4.1
> - what version of engine is used?
> - check the host ID in /etc/ovirt-hosted-engine/hosted-engine.conf on
> both hosts, the numbers must be different
> - it looks like the agent or broker on host 2 is not active (or there
> would be a report)
> - the second host does not see data from the first host (unknown
> stale-data), wait for a minute and check again, then check the storage
> connection
>
> And then the general troubleshooting:
>
> - put hosted engine in global maintenance mode (and check that it is
> visible from the other host using he --vm-status)
> - mount storage domain (hosted-engine --connect-storage)
> - check sanlock client status to see if proper lockspaces are present
>
> Best regards
>
> Martin Sivak
>
> On Tue, Jan 16, 2018 at 1:16 PM, Derek Atkins <de...@ihtfp.com> wrote:
> > Why are both hosts reporting as ovirt 1?
> > Look at the hostname fields to see what mean.
> >
> > -derek
> > Sent using my mobile device. Please excuse any typos.
> >
> > On January 16, 2018 7:11:09 AM Artem Tambovskiy <
> artem.tambovs...@gmail.com>
> > wrote:
> >>
> >> Hello,
> >>
> >> Yes, I followed exactly

Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-16 Thread Artem Tambovskiy
ks ?
>
> 1) Move the host to maintenance
> 2) click on reinstall
> 3) provide the password
> 4) uncheck 'automatically configure host firewall'
> 5) click on 'Deploy' tab
> 6) click Hosted Engine deployment as 'Deploy'
>
> And once the host installation is done, wait till the active score of the
> host shows 3400 in the general tab then check hosted-engine --vm-status.
>
> Thanks
> kasturi
>
> On Mon, Jan 15, 2018 at 4:57 PM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Hello,
>>
>> I have uploaded 2 archives with all relevant logs to shared hosting
>> files from host 1  (which is currently running all VM's including
>> hosted_engine)  -  https://yadi.sk/d/PttRoYV63RTvhK
>> files from second host - https://yadi.sk/d/UBducEsV3RTvhc
>>
>> I have tried to restart both ovirt-ha-agent and ovirt-ha-broker but it
>> gives no effect. I have also tried to shutdown hosted_engine VM, stop
>> ovirt-ha-agent and ovirt-ha-broker  services disconnect storage and connect
>> it again  - no effect as well.
>> Also I tried to reinstall second host from WebGUI - this lead to the
>> interesting situation - now  hosted-engine --vm-status  shows that both
>> hosts have the same address.
>>
>> [root@ovirt1 ~]# hosted-engine --vm-status
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : True
>> Hostname   : ovirt1.telia.ru
>> Host ID: 1
>> Engine status  : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : a7758085
>> local_conf_timestamp   : 259327
>> Host timestamp : 259327
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=259327 (Mon Jan 15 14:06:48 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date  : False
>> Hostname   : ovirt1.telia.ru
>> Host ID: 2
>> Engine status  : unknown stale-data
>> Score  : 0
>> stopped: True
>> Local maintenance  : False
>> crc32  : c7037c03
>> local_conf_timestamp   : 7530
>> Host timestamp : 7530
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=7530 (Fri Jan 12 16:10:12 2018)
>> host-id=2
>> score=0
>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=AgentStopped
>> stopped=True
>>
>> Gluster seems working fine. all gluster nodes showing connected state.
>>
>> Any advises on how to resolve this situation are highly appreciated!
>>
>> Regards,
>> Artem
>>
>>
>> On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <kna...@redhat.com>
>> wrote:
>>
>>> Hello Artem,
>>>
>>> Can you check if glusterd service is running on host1 and all
>>> the peers are in connected state ? If yes, can you restart ovirt-ha-agent
>>> and broker services and check if things are working fine ?
>>>
>>> Thanks
>>> kasturi
>>>
>>> On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy <
>>> artem.tambovs...@gmail.com> wrote:
>>>
>>>> Explored logs on both hosts.
>>>> broker.log shows no errors.
>>>>
>>>> agent.log looking not good:
>>>>
>>>> on host1 (which running hosted engine) :
>>>>
>>>> MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::ovir
>>>> t_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most
>>>> recent call last):
>>>>   F

Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-15 Thread Artem Tambovskiy
Hello,

I have uploaded 2 archives with all relevant logs to shared hosting:
files from host 1 (which is currently running all VMs, including the
hosted_engine) - https://yadi.sk/d/PttRoYV63RTvhK
files from the second host - https://yadi.sk/d/UBducEsV3RTvhc

I have tried to restart both ovirt-ha-agent and ovirt-ha-broker, but it
has no effect. I have also tried to shut down the hosted_engine VM, stop the
ovirt-ha-agent and ovirt-ha-broker services, disconnect the storage and
connect it again - no effect as well.
Also, I tried to reinstall the second host from the web GUI - this led to an
interesting situation - now hosted-engine --vm-status shows that both
hosts have the same address.

[root@ovirt1 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date  : True
Hostname   : ovirt1.telia.ru
Host ID: 1
Engine status  : {"health": "good", "vm": "up",
"detail": "up"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : a7758085
local_conf_timestamp   : 259327
Host timestamp : 259327
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=259327 (Mon Jan 15 14:06:48 2018)
host-id=1
score=3400
vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineUp
stopped=False


--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date  : False
Hostname   : ovirt1.telia.ru
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : c7037c03
local_conf_timestamp   : 7530
Host timestamp : 7530
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=7530 (Fri Jan 12 16:10:12 2018)
host-id=2
score=0
vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
conf_on_shared_storage=True
maintenance=False
state=AgentStopped
stopped=True

Gluster seems to be working fine; all gluster nodes show the connected state.
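
A hedged sketch of the checks behind that statement, with the engine volume
name as used elsewhere in this archive:

    gluster peer status
    gluster volume status engine
    gluster volume heal engine info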

Any advice on how to resolve this situation is highly appreciated!

Regards,
Artem


On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <kna...@redhat.com> wrote:

> Hello Artem,
>
> Can you check if glusterd service is running on host1 and all the
> peers are in connected state ? If yes, can you restart ovirt-ha-agent and
> broker services and check if things are working fine ?
>
> Thanks
> kasturi
>
> On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Explored logs on both hosts.
>> broker.log shows no errors.
>>
>> agent.log looking not good:
>>
>> on host1 (which running hosted engine) :
>>
>> MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::ovir
>> t_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most
>> recent call last):
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 191, in _run_agent
>> return action(he)
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 64, in action_proper
>> return he.start_monitoring()
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 411, in start_monitoring
>> self._initialize_sanlock()
>>   File 
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 749, in _initialize_sanlock
>> "Failed to initialize sanlock, the number of errors has"
>> SanlockInitializationError: Failed to initialize sanlock, the number of
>> errors has exceeded the limit
>>
>> MainThread::ERROR::2018-01-12 21:51:03,884::agent::206::ovir
>> t_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart
>> agent
>> MainThread::WARNING::2018-01-12 21:51:08,889::agent::209::ovir
>> t_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent,
>> attempt '1'
>> MainThread::INFO::2018-01-12 21:51:08,919::hosted_engine::2
>> 42::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
>> Found cer

Re: [ovirt-users] hosted-engine unknow stale-data

2018-01-12 Thread Artem Tambovskiy
)
Connecting the storage
MainThread::INFO::2018-01-12
22:02:29,586::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server)
Validating storage server


Any suggestions on how to resolve this?

regards,
Artem


On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy <
artem.tambovs...@gmail.com> wrote:

> Trying to fix one thing I broke another :(
>
> I fixed mnt_options for hosted engine storage domain and installed latest
> security patches to my hosts and hosted engine. All VM's up and running,
> but  hosted_engine --vm-status reports about issues:
>
> [root@ovirt1 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : ovirt2
> Host ID: 1
> Engine status  : unknown stale-data
> Score  : 0
> stopped: False
> Local maintenance  : False
> crc32  : 193164b8
> local_conf_timestamp   : 8350
> Host timestamp : 8350
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=8350 (Fri Jan 12 19:03:54 2018)
> host-id=1
> score=0
> vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineUnexpectedlyDown
> stopped=False
> timeout=Thu Jan  1 05:24:43 1970
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : ovirt1.telia.ru
> Host ID: 2
> Engine status  : unknown stale-data
> Score  : 0
> stopped: True
> Local maintenance  : False
> crc32  : c7037c03
> local_conf_timestamp   : 7530
> Host timestamp : 7530
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=7530 (Fri Jan 12 16:10:12 2018)
> host-id=2
> score=0
> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=AgentStopped
> stopped=True
> [root@ovirt1 ~]#
>
>
>
> from second host situation looks a bit different:
>
>
> [root@ovirt2 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : ovirt2
> Host ID: 1
> Engine status  : {"reason": "vm not running on this
> host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score  : 0
> stopped: False
> Local maintenance  : False
> crc32  : 78eabdb6
> local_conf_timestamp   : 8403
> Host timestamp : 8402
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=8402 (Fri Jan 12 19:04:47 2018)
> host-id=1
> score=0
> vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineUnexpectedlyDown
> stopped=False
> timeout=Thu Jan  1 05:24:43 1970
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date  : False
> Hostname   : ovirt1.telia.ru
> Host ID: 2
> Engine status  : unknown stale-data
> Score  : 0
> stopped: True
> Local maintenance  : False
> crc32  : c7037c03
> local_conf_timestamp   : 7530
> Host timestamp : 7530
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=7530 (Fri Jan 12 16:10:12 2018)
> host-id=2
> score=0
> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
> conf_on_shared_storage=True
> 

[ovirt-users] (no subject)

2018-01-12 Thread Artem Tambovskiy
Trying to fix one thing, I broke another :(

I fixed the mnt_options for the hosted engine storage domain and installed the
latest security patches on my hosts and hosted engine. All VMs are up and
running, but hosted-engine --vm-status reports issues:

[root@ovirt1 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date  : False
Hostname   : ovirt2
Host ID: 1
Engine status  : unknown stale-data
Score  : 0
stopped: False
Local maintenance  : False
crc32  : 193164b8
local_conf_timestamp   : 8350
Host timestamp : 8350
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=8350 (Fri Jan 12 19:03:54 2018)
host-id=1
score=0
vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineUnexpectedlyDown
stopped=False
timeout=Thu Jan  1 05:24:43 1970


--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date  : False
Hostname   : ovirt1.telia.ru
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : c7037c03
local_conf_timestamp   : 7530
Host timestamp : 7530
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=7530 (Fri Jan 12 16:10:12 2018)
host-id=2
score=0
vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
conf_on_shared_storage=True
maintenance=False
state=AgentStopped
stopped=True
[root@ovirt1 ~]#



From the second host the situation looks a bit different:


[root@ovirt2 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date  : True
Hostname   : ovirt2
Host ID: 1
Engine status  : {"reason": "vm not running on this
host", "health": "bad", "vm": "down", "detail": "unknown"}
Score  : 0
stopped: False
Local maintenance  : False
crc32  : 78eabdb6
local_conf_timestamp   : 8403
Host timestamp : 8402
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=8402 (Fri Jan 12 19:04:47 2018)
host-id=1
score=0
vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineUnexpectedlyDown
stopped=False
timeout=Thu Jan  1 05:24:43 1970


--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date  : False
Hostname   : ovirt1.telia.ru
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : c7037c03
local_conf_timestamp   : 7530
Host timestamp : 7530
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=7530 (Fri Jan 12 16:10:12 2018)
host-id=2
score=0
vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
conf_on_shared_storage=True
maintenance=False
state=AgentStopped
stopped=True


The web GUI shows that the engine is running on host ovirt1.
Gluster looks fine:
[root@ovirt1 ~]# gluster volume status engine
Status of volume: engine
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick ovirt1.telia.ru:/oVirt/engine 49169 0  Y
3244
Brick ovirt2.telia.ru:/oVirt/engine 49179 0  Y
20372
Brick ovirt3.telia.ru:/oVirt/engine 49206 0  Y
16609
Self-heal Daemon on localhost   N/A   N/AY
117868
Self-heal Daemon on ovirt2.telia.ru N/A   N/AY
20521
Self-heal Daemon on ovirt3  N/A   N/AY
25093

Task Status of Volume engine
--
There are no active volume tasks

How to resolve this issue?

Re: [ovirt-users] mount_options for hosted_engine storage domain

2018-01-12 Thread Artem Tambovskiy
Thanks a lot, Simone!

hosted-engine --set-shared-config mnt_options
backup-volfile-servers=host1.domain.com:host2.domain.com --type=he_conf
 solved my issue!

Regards,
Artem

On Fri, Jan 12, 2018 at 3:39 PM, Simone Tiraboschi <stira...@redhat.com>
wrote:

>
>
> On Fri, Jan 12, 2018 at 1:22 PM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Hi,
>>
>> I have deployed a small cluster with 2 ovirt hosts and GlusterFS cluster
>> some time ago. And recently during software upgrade I noticed that I made
>> some mistakes during the installation:
>>
>> if the host which was deployed first will be taken down for upgrade
>> (powered off or rebooted) the engine becomes unavailable (even all VM's and
>> hosted engine were migrated to second host in advance).
>>
>> I was thinking that this is due to missing mnt_options=backup-volfile--se
>> rvers=host1.domain.com;host2.domain.com option for hosted engine storage
>> domain.
>> Is there any good way to fix this? I have tried
>> edit /etc/ovirt-hosted-engine/hosted-engine.conf manually to add missing
>> mnt_options but after while I noticed that those changes are gone.
>>
>
> The master copy used at host-deploy time is on the shared storage domain,
> you can change it with:
> hosted-engine --set-shared-config mnt_options backup-volfile-servers=
> host1.domain.com:host2.domain.com
>
> And then edit /etc/ovirt-hosted-engine/hosted-engine.conf and restart
> ovirt-ha-agent on existing HE hosts.
>
>
>
>>
>> Any suggestions?
>>
>> Thanks in advance!
>> Artem
>>
>>
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] mount_options for hosted_engine storage domain

2018-01-12 Thread Artem Tambovskiy
Hi,

I deployed a small cluster with 2 oVirt hosts and a GlusterFS cluster
some time ago. And recently, during a software upgrade, I noticed that I made
some mistakes during the installation:

if the host which was deployed first is taken down for an upgrade
(powered off or rebooted), the engine becomes unavailable (even when all VMs
and the hosted engine were migrated to the second host in advance).

I think this is due to the missing
mnt_options=backup-volfile-servers=host1.domain.com:host2.domain.com
option for the hosted engine storage domain.
Is there any good way to fix this? I have tried to
edit /etc/ovirt-hosted-engine/hosted-engine.conf manually to add the missing
mnt_options, but after a while I noticed that those changes are gone.

Any suggestions?

Thanks in advance!
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Q: Partitioning - oVirt 4.1 & GlusterFS 2-node System

2017-12-13 Thread Artem Tambovskiy
Hi,

AFAIK, during hosted engine deployment the installer will check the GlusterFS
replica type, and replica 3 is a mandatory requirement. Previously, I was
advised on this mailing list to look at a DRBD solution if you don't
have a third node to run a GlusterFS replica 3 volume.

On 14 Dec 2017 at 1:51, "Andrei V"  wrote:

> Hi, Donny,
>
> Thanks for the link.
>
> Am I understanding correctly that I need at least a 3-node system to run in
> failover mode? So far I plan to deploy only 2 nodes, either with a hosted
> or with a bare metal engine.
>
> *The key thing to keep in mind regarding host maintenance and downtime is
> that this converged  three node system relies on having at least two of the
> nodes up at all times. If you bring down  two machines at once, you'll run
> afoul of the Gluster quorum rules that guard us from split-brain states in
> our storage, the volumes served by your remaining host will go read-only,
> and the VMs stored on those volumes will pause and require a shutdown and
> restart in order to run again.*
>
> What happens if in 2-node glusterfs system (with hosted engine) one node
> goes down?
> Bare metal engine can manage this situation, but I'm not sure about hosted
> engine.
>
>
> On 12/13/2017 11:17 PM, Donny Davis wrote:
>
> I would start here
> https://ovirt.org/blog/2017/04/up-and-running-with-ovirt-
> 4.1-and-gluster-storage/
>
> Pretty good basic guidance.
>
> Also with software defined storage its recommended their are at least two
> "storage" nodes and one arbiter node to maintain quorum.
>
> On Wed, Dec 13, 2017 at 3:45 PM, Andrei V  wrote:
>
>> Hi,
>>
>> I'm going to setup relatively simple 2-node system with oVirt 4.1,
>> GlusterFS, and several VMs running.
>> Each node going to be installed on dual Xeon system with single RAID 5.
>>
>> oVirt node installer uses relatively simple default partitioning scheme.
>> Should I leave it as is, or there are better options?
>> I never used GlusterFS before, so any expert opinion is very welcome.
>>
>> Thanks in advance.
>> Andrei
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Standalone Gluster Storage

2017-12-13 Thread Artem Tambovskiy
Hi,

I just updated almost all storage domains with the backup-volfile-servers mount
option; the last one remaining is the hosted_storage domain, which serves the
hosted engine VM. I wonder if this domain also needs to be configured with the
backup-volfile-servers option? If so, how do I do this - I can't put this
domain into maintenance via the web UI.
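
For reference, the hosted_storage mount options are kept in the hosted engine
shared config rather than in the regular storage domain editor; a hedged
sketch based on the mount_options thread above (server names are examples):

    hosted-engine --set-shared-config mnt_options \
        backup-volfile-servers=gluster2.example.com:gluster3.example.com --type=he_conf
    # then update /etc/ovirt-hosted-engine/hosted-engine.conf and restart
    # ovirt-ha-agent on each hosted engine host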

Regards,
Artem

On Wed, Dec 13, 2017 at 9:03 AM, Sahina Bose <sab...@redhat.com> wrote:

> The backup-volfile-servers as an additional mount option should handle the
> case where one of the servers goes down - storage domain should continue to
> be available.
> The servers specified for this option can be the servers participating in
> your volume. For instance, the set of unique servers from the "gluster
> volume info" command.
>
> If even with this mount option, you're facing an issue - please log a bug
> with gluster mount logs and vdsm logs.
>
> thanks
> sahina
>
> On Wed, Dec 13, 2017 at 12:37 AM, Beau Sapach <bsap...@ualberta.ca> wrote:
>
>> We did use the backup-volfile-servers option but still had trouble.  We
>> were simply adding all servers in the cluster as backups, is there a best
>> practice that should be followed?
>>
>> On Tue, Dec 12, 2017 at 8:59 AM, Artem Tambovskiy <
>> artem.tambovs...@gmail.com> wrote:
>>
>>> I did exactly the same mistake with my standalone GlusterFS cluster and
>>> now need to take down all Storage Domains in order to fix this mistake.
>>> Probably, worth to add a few words about this in Installation guide!
>>>
>>> On Tue, Dec 12, 2017 at 4:52 PM, Simone Tiraboschi <stira...@redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Dec 11, 2017 at 8:44 PM, Beau Sapach <bsap...@ualberta.ca>
>>>> wrote:
>>>>
>>>>> We've been doing some experimenting with gluster, and have built a
>>>>> stand-alone gluster cluster (not managed by oVirt).  We've been able to
>>>>> create a storage domain backed by that gluster cluster and run VMs with
>>>>> their disks on that storage.
>>>>>
>>>>> The problem we have is that when we take a gluster node down for
>>>>> updates, maintenance etc. the entire storage domain goes offline in oVirt.
>>>>> Other gluster clients, that is servers connecting directly to the gluster
>>>>> cluster don't seem to notice if one node goes offline.
>>>>>
>>>>> Is anyone else using gluster storage in oVirt that is not managed
>>>>> within oVirt?
>>>>>
>>>>
>>>> Did you set also the backup-volfile-servers mount option?
>>>>
>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Beau Sapach
>>>>> *System Administrator | Information Technology Services | University
>>>>> of Alberta Libraries*
>>>>> *Phone: 780.492.4181 | Email: beau.sap...@ualberta.ca*
>>>>>
>>>>>
>>>>> ___
>>>>> Users mailing list
>>>>> Users@ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>
>>>>>
>>>>
>>>> ___
>>>> Users mailing list
>>>> Users@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
>>
>> --
>> Beau Sapach
>> *System Administrator | Information Technology Services | University of
>> Alberta Libraries*
>> *Phone: 780.492.4181 | Email: beau.sap...@ualberta.ca*
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Standalone Gluster Storage

2017-12-12 Thread Artem Tambovskiy
I made exactly the same mistake with my standalone GlusterFS cluster and now
need to take down all Storage Domains in order to fix it.
It is probably worth adding a few words about this to the Installation Guide!

On Tue, Dec 12, 2017 at 4:52 PM, Simone Tiraboschi 
wrote:

>
>
> On Mon, Dec 11, 2017 at 8:44 PM, Beau Sapach  wrote:
>
>> We've been doing some experimenting with gluster, and have built a
>> stand-alone gluster cluster (not managed by oVirt).  We've been able to
>> create a storage domain backed by that gluster cluster and run VMs with
>> their disks on that storage.
>>
>> The problem we have is that when we take a gluster node down for updates,
>> maintenance etc. the entire storage domain goes offline in oVirt.  Other
>> gluster clients, that is servers connecting directly to the gluster cluster
>> don't seem to notice if one node goes offline.
>>
>> Is anyone else using gluster storage in oVirt that is not managed within
>> oVirt?
>>
>
> Did you set also the backup-volfile-servers mount option?
>
>
>>
>>
>> --
>> Beau Sapach
>> *System Administrator | Information Technology Services | University of
>> Alberta Libraries*
>> *Phone: 780.492.4181 | Email: beau.sap...@ualberta.ca*
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] extending cloud Images in oVirt

2017-11-23 Thread Artem Tambovskiy
I have a question indirectly related to oVirt. I need to move one old setup
into a VM running in my oVirt cluster. The VM was based on Debian 8.9, so I
took a Debian cloud image from
https://cdimage.debian.org/cdimage/openstack/8.9.8-20171105/, uploaded it
into my cluster and attached it to a VM. All looks good but ... the disk
shows only 2G and I indeed need more disk space. I tried to edit the disk and
add more space - and it didn't work.

Any ideas on how to extend those cloud images?
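
One hedged approach is to grow the image before uploading it and then grow
the filesystem inside the guest; the file name and device below are examples,
and growpart assumes cloud-utils-growpart is installed in the guest:

    # on the machine holding the downloaded image
    qemu-img resize debian-8.9.8-openstack-amd64.qcow2 +20G
    # inside the guest after boot, if the root partition did not auto-grow
    growpart /dev/vda 1
    resize2fs /dev/vda1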

Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Non-responsive host, VM's are still running - how to resolve?

2017-11-14 Thread Artem Tambovskiy
', 'memUsage': '49', 'guestFQDN': '', 'memoryStats': {u'swap_out':
'0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '549844',
u'mem_free': '1054040', u'mem_buffers': '2080', u'swap_in': '0',
u'swap_total': '4064252', u'pageflt': '148', u'mem_total': '1815524',
u'mem_unused': '502116'}, 'session': 'Unknown', 'netIfaces': [],
'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}
Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
ignoring event u'|virt|VM_status|ca2815c5-f815-469d-869d-a8fe1cb8c2e7'
args={u'ca2815c5-f815-469d-869d-a8fe1cb8c2e7': {'status': 'Up', 'username':
'Unknown', 'memUsage': '14', 'guestFQDN': '', 'memoryStats': {u'swap_out':
'0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '497136',
u'mem_free': '1801440', u'mem_buffers': '102108', u'swap_in': '0',
u'swap_total': '1046524', u'pageflt': '64', u'mem_total': '2046116',
u'mem_unused': '1202196'}, 'session': 'Unknown', 'netIfaces': [],
'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}

On Tue, Nov 14, 2017 at 8:49 PM, Darrell Budic <bu...@onholyground.com>
wrote:

> Try restarting vdsmd from the shell, “systemctl restart vdsmd”.
>
>
> ------
> *From:* Artem Tambovskiy <artem.tambovs...@gmail.com>
> *Subject:* [ovirt-users] Non-responsive host, VM's are still running -
> how to resolve?
> *Date:* November 14, 2017 at 11:23:32 AM CST
> *To:* users
>
> Apparently, i lost the host which was running hosted-engine and another 4
> VM's exactly during migration of second host from bare-metal to second host
> in the cluster. For some reason first host entered the "Non reponsive"
> state. The interesting thing is that hosted-engine and all other VM's up
> and running, so its like a communication problem between hosted-engine and
> host.
>
> The engine.log at hosted-engine is full of following messages:
>
> 2017-11-14 17:06:43,158Z INFO  
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient]
> (SSL Stomp Reactor) [] Connecting to ovirt2/80.239.162.106
> 2017-11-14 17:06:43,159Z ERROR 
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (DefaultQuartzScheduler9) [50938c3] Command 'GetAllVmStatsVDSCommand(HostName
> = ovirt2.telia.ru, VdsIdVDSCommandParametersBase:{runAsync='true',
> hostId='3970247c-69eb-4bd8-b263-9100703a8243'})' execution failed:
> java.net.NoRouteToHostException: No route to host
> 2017-11-14 17:06:43,159Z INFO  [org.ovirt.engine.core.
> vdsbroker.monitoring.PollVmStatsRefresher] (DefaultQuartzScheduler9)
> [50938c3] Failed to fetch vms info for host 'ovirt2.telia.ru' - skipping
> VMs monitoring.
> 2017-11-14 17:06:45,929Z INFO  
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient]
> (SSL Stomp Reactor) [] Connecting to ovirt2/80.239.162.106
> 2017-11-14 17:06:45,930Z ERROR 
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler2) [6080f1cc] Command 
> 'GetCapabilitiesVDSCommand(HostName
> = ovirt2.telia.ru, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> hostId='3970247c-69eb-4bd8-b263-9100703a8243', vds='Host[ovirt2.telia.ru,
> 3970247c-69eb-4bd8-b263-9100703a8243]'})' execution failed: 
> java.net.NoRouteToHostException:
> No route to host
> 2017-11-14 17:06:45,930Z ERROR [org.ovirt.engine.core.
> vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler2) [6080f1cc]
> Failure to refresh host 'ovirt2.telia.ru' runtime info: 
> java.net.NoRouteToHostException:
> No route to host
> 2017-11-14 17:06:48,933Z INFO  
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient]
> (SSL Stomp Reactor) [] Connecting to ovirt2/80.239.162.106
> 2017-11-14 17:06:48,934Z ERROR 
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler6) [1a64dfea] Command 
> 'GetCapabilitiesVDSCommand(HostName
> = ovirt2.telia.ru, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> hostId='3970247c-69eb-4bd8-b263-9100703a8243', vds='Host[ovirt2.telia.ru,
> 3970247c-69eb-4bd8-b263-9100703a8243]'})' execution failed: 
> java.net.NoRouteToHostException:
> No route to host
> 2017-11-14 17:06:48,934Z ERROR [org.ovirt.engine.core.
> vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler6) [1a64dfea]
> Failure to refresh host 'ovirt2.telia.ru' runtime info: 
> java.net.NoRouteToHostException:
> No route to host
> 2017-11-14 17:06:50,931Z INFO  
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient]
> (SSL Stomp Reactor) [] Connecting to ovirt2/80.239.162.106
> 2017-11-14 17:06:50,932Z ERROR 
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand]
> (DefaultQuartzScheduler4) [6b19d168] Command 'SpmStatusVDSCommand(HostName
> = ovirt2.telia.ru, SpmStatusVDSCom

[ovirt-users] Non-responsive host, VM's are still running - how to resolve?

2017-11-14 Thread Artem Tambovskiy
Apparently, I lost the host which was running the hosted engine and another 4
VMs, exactly during the migration of the second host from bare metal into the
cluster. For some reason the first host entered the "Non responsive"
state. The interesting thing is that the hosted engine and all other VMs are up
and running, so it looks like a communication problem between the hosted
engine and the host.
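
Given the NoRouteToHostException entries below, a few hedged first checks from
the engine VM towards the unresponsive host (host name taken from the logs;
54321 is the usual vdsm port):

    ping -c 3 ovirt2.telia.ru
    nc -zv ovirt2.telia.ru 54321
    # and on the host itself
    systemctl status vdsmd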

The engine.log on the hosted engine is full of the following messages:

2017-11-14 17:06:43,158Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 17:06:43,159Z ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
(DefaultQuartzScheduler9) [50938c3] Command
'GetAllVmStatsVDSCommand(HostName = ovirt2.telia.ru,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='3970247c-69eb-4bd8-b263-9100703a8243'})' execution failed:
java.net.NoRouteToHostException: No route to host
2017-11-14 17:06:43,159Z INFO
[org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
(DefaultQuartzScheduler9) [50938c3] Failed to fetch vms info for host '
ovirt2.telia.ru' - skipping VMs monitoring.
2017-11-14 17:06:45,929Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 17:06:45,930Z ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler2) [6080f1cc] Command
'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='3970247c-69eb-4bd8-b263-9100703a8243',
vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
execution failed: java.net.NoRouteToHostException: No route to host
2017-11-14 17:06:45,930Z ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler2) [6080f1cc] Failure to refresh host '
ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
host
2017-11-14 17:06:48,933Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 17:06:48,934Z ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler6) [1a64dfea] Command
'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='3970247c-69eb-4bd8-b263-9100703a8243',
vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
execution failed: java.net.NoRouteToHostException: No route to host
2017-11-14 17:06:48,934Z ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler6) [1a64dfea] Failure to refresh host '
ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
host
2017-11-14 17:06:50,931Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 17:06:50,932Z ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand]
(DefaultQuartzScheduler4) [6b19d168] Command 'SpmStatusVDSCommand(HostName
= ovirt2.telia.ru, SpmStatusVDSCommandParameters:{runAsync='true',
hostId='3970247c-69eb-4bd8-b263-9100703a8243',
storagePoolId='5a044257-02ec-0382-0243-01f2'})' execution failed:
java.net.NoRouteToHostException: No route to host
2017-11-14 17:06:50,939Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 17:06:50,940Z ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler4) [6b19d168]
IrsBroker::Failed::GetStoragePoolInfoVDS
2017-11-14 17:06:50,940Z ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand]
(DefaultQuartzScheduler4) [6b19d168] Command 'GetStoragePoolInfoVDSCommand(
GetStoragePoolInfoVDSCommandParameters:{runAsync='true',
storagePoolId='5a044257-02ec-0382-0243-01f2',
ignoreFailoverLimit='true'})' execution failed: IRSProtocolException:
2017-11-14 17:06:51,937Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 17:06:51,938Z ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler7) [7f23a3bd] Command
'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='3970247c-69eb-4bd8-b263-9100703a8243',
vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
execution failed: java.net.NoRouteToHostException: No route to host
2017-11-14 17:06:51,938Z ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler7) [7f23a3bd] Failure to refresh host '
ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
host
2017-11-14 17:06:54,941Z INFO
[org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
[] Connecting to ovirt2/80.239.162.106
2017-11-14 

Re: [ovirt-users] Host Power Management Configuration questions

2017-11-14 Thread Artem Tambovskiy
Hi,

In the engine.log, the following appears:

2017-11-14 12:04:33,081+03 ERROR
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-184)
[32fe1ce0-2e25-4e2e-a6bf-59f39a65b2f1] Can not run fence action on host
'ovirt.prod.env', no suitable proxy host was found.
2017-11-14 12:04:36,534+03 INFO
 [org.ovirt.engine.core.bll.hostdeploy.UpdateVdsCommand] (default task-186)
[d83ce46d-ce89-4804-aba1-761103e93e8c] Running command: UpdateVdsCommand
internal: false. Entities affected :  ID:
a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d Type: VDSAction group
EDIT_HOST_CONFIGURATION with role type ADMIN
2017-11-14 12:04:36,704+03 ERROR
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-186)
[d83ce46d-ce89-4804-aba1-761103e93e8c] Can not run fence action on host
'ovirt.prod.env', no suitable proxy host was found.
2017-11-14 12:04:36,705+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error: null
2017-11-14 12:04:36,705+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error: null
2017-11-14 12:04:36,705+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error null
2017-11-14 12:04:36,705+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error null
2017-11-14 12:04:36,720+03 WARN
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] EVENT_ID:
VDS_ALERT_PM_HEALTH_CHECK_START_MIGHT_FAIL(9,010), Correlation ID: null,
Call Stack: null, Custom Event ID: -1, Message: Health check on Host
 indicates that future attempts to Start this host using
Power-Management are expected to fail.
2017-11-14 12:04:36,720+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error: null
2017-11-14 12:04:36,720+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error: null
2017-11-14 12:04:36,720+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error null
2017-11-14 12:04:36,720+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] Failed to get vds
'a9bb1c6f-b9c9-4dc3-a24e-b83b2004552d', error null
2017-11-14 12:04:36,731+03 WARN
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] EVENT_ID:
VDS_ALERT_PM_HEALTH_CHECK_STOP_MIGHT_FAIL(9,011), Correlation ID: null,
Call Stack: null, Custom Event ID: -1, Message: Health check on Host
 indicates that future attempts to Stop this host using
Power-Management are expected to fail.
2017-11-14 12:04:36,765+03 WARN
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] EVENT_ID:
KDUMP_DETECTION_NOT_CONFIGURED_ON_VDS(617), Correlation ID: null, Call
Stack: null, Custom Event ID: -1, Message: Kdump integration is enabled for
host ovirt.prod.env, but kdump is not configured properly on host.
2017-11-14 12:04:36,781+03 INFO
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-186) [d83ce46d-ce89-4804-aba1-761103e93e8c] EVENT_ID:
USER_UPDATE_VDS(43), Correlation ID: d83ce46d-ce89-4804-aba1-761103e93e8c,
Call Stack: null, Custom Event ID: -1, Message: Host ovirt.prod.env
configuration was updated by arta00@internal-authz.

Just let me know if more logs are needed.

Regards,
Artem

On Tue, Nov 14, 2017 at 11:52 AM, Martin Perina <mper...@redhat.com> wrote:

> Hi,
>
> could you please provide engine logs so we can investigate?
>
> Thanks
>
> Martin
>
>
> On Tue, Nov 14, 2017 at 9:33 AM, Artem Tambovskiy <
> artem.tambovs...@gmail.com> wrote:
>
>> Trying to configure power management for a certain host and fence agent
>> always fail when I'm pressing Test button.
>>
>>  At the same time from command line on the same host all looks good:
>>
>> [root@ovirt ~]# fence_ipmilan -a 172.16.22.1 -l user -p pwd -o status -v
>> -P
>> Executing: /usr/bin/ipmitool -I lanplus -H 

[ovirt-users] Host Power Management Configuration questions

2017-11-14 Thread Artem Tambovskiy
I am trying to configure power management for a certain host, and the fence
agent test always fails when I press the Test button.

At the same time, from the command line on the same host all looks good:

[root@ovirt ~]# fence_ipmilan -a 172.16.22.1 -l user -p pwd -o status -v -P
Executing: /usr/bin/ipmitool -I lanplus -H 172.16.22.1 -p 623 -U user -P
pwd -L ADMINISTRATOR chassis power status

0 Chassis Power is on

Status: ON
[root@ovirt ~]#

What could be the reason?

Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Hosted-Engine environment, strange messages in event log

2017-11-12 Thread Artem Tambovskiy
Any suggestions on what could be the reason for these strange messages
(repeating every hour) in the web GUI event log:

Nov 11, 2017 7:07:01 PM Status of host ovirt2.prod.env was set to Up.
Nov 11, 2017 7:06:54 PM Failed to update OVF disks
94b7554b-4c18-4296-b795-98ca6c0fb251, 002af29c-58df-493d-a45c-5009d4dfc1de,
OVF data isn't updated on those OVF stores (Data Center oVirtDC, Storage
Domain oVirtMigration).
Nov 11, 2017 7:06:53 PM Failed to update OVF disks
c7bb37de-739b-4899-8d23-d9197f81b596, OVF data isn't updated on those OVF
stores (Data Center oVirtDC, Storage Domain oVirtStorageData).
Nov 11, 2017 7:06:53 PM Host ovirt2.prod.env is not responding. Host cannot
be fenced automatically because power management for the host is disabled.
Nov 11, 2017 6:06:57 PM Status of host ovirt2.prod.env was set to Up.
Nov 11, 2017 6:06:51 PM Failed to update OVF disks
94b7554b-4c18-4296-b795-98ca6c0fb251, 002af29c-58df-493d-a45c-5009d4dfc1de,
OVF data isn't updated on those OVF stores (Data Center oVirtDC, Storage
Domain oVirtMigration).
Nov 11, 2017 6:06:48 PM Failed to update OVF disks
c7bb37de-739b-4899-8d23-d9197f81b596, OVF data isn't updated on those OVF
stores (Data Center oVirtDC, Storage Domain oVirtStorageData).
Nov 11, 2017 6:06:48 PM Host ovirt2.prod.env is not responding. Host cannot
be fenced automatically because power management for the host is disabled.

Power Management is not configured yet; I am planning to test it in the coming
week. So far I have one host serving 5 VMs + the hosted_engine. I was planning
to add one more host to the cluster this week.
All storage domains are running on a GlusterFS cluster.

Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted Engine installation + GlusterFS cluster

2017-11-09 Thread Artem Tambovskiy
One more thing - firewall rules.

For 3 gluster bricks I have configured the following:
firewall-cmd --zone=public --add-port=24007-24009/tcp
--add-port=49152-49664/tcp --permanent

and this seems not to be enough. I have to stop the firewall in order to make
the cluster work.

I have noticed ports in the 490xx range being used by gluster; any ideas on
that documented range?

 lsof -i | grep gluster | grep "490"
glusterfs 32301root   10u  IPv4 148985  0t0  TCP
ovirt1:49159->ovirt1:49099 (ESTABLISHED)
glusterfs 32301root   17u  IPv4 153084  0t0  TCP
ovirt1:49159->ovirt2:49096 (ESTABLISHED)
glusterfs 46346root   17u  IPv4 156437  0t0  TCP
ovirt1:49161->ovirt1:49093 (ESTABLISHED)
glusterfs 46346root   18u  IPv4 149985  0t0  TCP
ovirt1:49161->ovirt2:49090 (ESTABLISHED)
glusterfs 46380root8u  IPv4 151389  0t0  TCP
ovirt1:49090->ovirt3:49161 (ESTABLISHED)
glusterfs 46380root   11u  IPv4 148986  0t0  TCP
ovirt1:49091->ovirt2:49161 (ESTABLISHED)
glusterfs 46380root   21u  IPv4 153074  0t0  TCP
ovirt1:49099->ovirt1:49159 (ESTABLISHED)
glusterfs 46380root   25u  IPv4 153075  0t0  TCP
ovirt1:49097->ovirt2:49160 (ESTABLISHED)
glusterfs 46380root   26u  IPv4 153076  0t0  TCP
ovirt1:49095->ovirt3:49159 (ESTABLISHED)
glusterfs 46380root   27u  IPv4 153077  0t0  TCP
ovirt1:49093->ovirt1:49161 (ESTABLISHED)
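
A hedged alternative to opening individual port ranges is the glusterfs
firewalld service definition (assuming the one shipped with glusterfs-server
is present on these hosts), which is meant to cover the brick port range
gluster actually uses:

    firewall-cmd --zone=public --add-service=glusterfs --permanent
    firewall-cmd --reload
    firewall-cmd --zone=public --list-services   # verify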

Regards,
Artem

On Thu, Nov 9, 2017 at 3:56 PM, Artem Tambovskiy <artem.tambovs...@gmail.com
> wrote:

> Hi,
>
> Just realized that I probably went the wrong way. I reinstalled
> everything from scratch and added 4 volumes (hosted_engine, data, export,
> iso). All looks good so far.
> But if I go to the Cluster properties and tick the checkbox "Enable Cluster
> Service", the host will be marked as Non-Operational. Am I messing things
> up?
> Or am I fine as long as I already have a Data (Master) Storage Domain
> over GlusterFS?
>
> Regards,
> Artem
>
> On Thu, Nov 9, 2017 at 2:46 PM, Fred Rolland <froll...@redhat.com> wrote:
>
>> Hi,
>>
>> The steps for this kind of setup are described in [1].
>> However it seems you have already succeeded in installing, so maybe you
>> need some additional steps [2]
>> Did you add a storage domain that will act as Master Domain? It is
>> needed, then the initial Storage Domain should be imported automatically.
>>
>>
>> [1] https://www.ovirt.org/blog/2017/04/up-and-running-with-ovirt
>> -4.1-and-gluster-storage/
>> [2] https://www.ovirt.org/documentation/gluster-hyperconverged/
>> chap-Additional_Steps/
>>
>> On Thu, Nov 9, 2017 at 10:50 AM, Artem Tambovskiy <
>> artem.tambovs...@gmail.com> wrote:
>>
>>> Yet another attempt to get help with hosted-engine deployment on a
>>> glusterfs cluster.
>>> I already spent a day trying to bring such a setup to work, with no
>>> luck.
>>>
>>> The hosted engine was deployed successfully, but I can't activate the
>>> host; the storage domain for the host is missing and I can't even add it.
>>> So either something went wrong during deployment or my glusterfs cluster
>>> isn't configured properly.
>>>
>>> What are the prerequisites for this?
>>>
>>> - glusterfs cluster of 3 nodes with replica 3 volume
>>> - Any specific volume configs?
>>> - how many volumes should I prepare for hosted engine deployment?
>>>
>>> Any other thoughts?
>>>
>>> Regards,
>>> Artem
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Hosted Engine installation + GlusterFS cluster

2017-11-09 Thread Artem Tambovskiy
Yet another attempt to get help with hosted-engine deployment on a
glusterfs cluster.
I already spent a day trying to bring such a setup to work, with no
luck.

The hosted engine was deployed successfully, but I can't activate the
host; the storage domain for the host is missing and I can't even add it.
So either something went wrong during deployment or my glusterfs cluster
isn't configured properly.

What are the prerequisites for this?

- glusterfs cluster of 3 nodes with replica 3 volume
- Any specific volume configs?
- how many volumes should I prepare for hosted engine deployment?

Any other thoughts?
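In other words, would something along these lines be enough on the gluster
side? (Just a sketch: hostnames and brick paths are placeholders, and the
volume options are ones I have seen referenced for oVirt storage domains
rather than anything I have verified myself.)

# dedicated replica 3 volume for the hosted engine, one brick per host
gluster volume create engine replica 3 \
    host1:/gluster/engine/brick host2:/gluster/engine/brick host3:/gluster/engine/brick
# options referenced for virt workloads (vdsm/kvm run as uid/gid 36)
gluster volume set engine group virt
gluster volume set engine storage.owner-uid 36
gluster volume set engine storage.owner-gid 36
gluster volume set engine network.ping-timeout 30
gluster volume start engine

A second volume created the same way could then be added in the engine as the
master data domain.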

Regards,
Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Advise needed: building cheap HA oVirt cluster with just 2 physical servers

2017-11-08 Thread Artem Tambovskiy
Hi,

Can anyone share their experience with deploying hosted-engine on a GlusterFS
cluster?

I managed to set up a GlusterFS cluster and started to deploy the hosted
engine. At the first stage I was beaten by firewall rules - the deployment
process was interrupted at the GlusterFS config stage.
After fixing the rules I got the engine up and running, but the host is still
in a non-operational state. I logged in to the Web UI and see 2 action items:
"Gluster command failed on server" and "Gluster status is disconnected for
this server".

This is a bit strange, since the cluster was properly detected during
deployment and the deployment script was supposed to configure the cluster.

  --== STORAGE CONFIGURATION ==--

  Please specify the storage you would like to use (glusterfs,
iscsi, fc, nfs3, nfs4)[nfs3]: glusterfs
[ INFO  ] Please note that Replica 3 support is required for the shared
storage.
  Please specify the full shared storage connection path to use
(example: host:/path): .x.xx:/oVirt
[ INFO  ] GlusterFS replica 3 Volume detected
  Do you want to configure this host and its cluster for gluster?
(Yes, No) [No]: Yes
[ INFO  ] GlusterFS replica 3 Volume detected

Any ideas how to fix this?
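Is checking the following on the affected host the right direction? (Just a
sketch of what I can think of, nothing conclusive yet.)

# peer membership as this host sees it
gluster peer status
# state of the gluster management daemon itself
systemctl status glusterd
# active firewall rules, to rule out a repeat of the earlier firewall problem
firewall-cmd --list-all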

Thanks in advance!
Regards,
Artem


On Fri, Nov 3, 2017 at 2:28 PM, Martin Sivak <msi...@redhat.com> wrote:

> Hi,
>
> cockpit is enabled by default when you use ovirt-node. You will
> probably have to install the necessary cockpit packages yourself on
> pure CentOS - you will need cockpit and ovirt + gdeploy cockpit
> plugins (sadly I do not recall the exact package names).
>
> With regards to arbiter and the wizard.. I really do not know, but I
> will alert my colleagues who might have more detailed knowledge of the
> gluster part.
>
> Denis, Sahina: can you please help me here?
>
> Best regards
>
> Martin Sivak
>
> On Fri, Nov 3, 2017 at 11:29 AM, Artem Tambovskiy
> <artem.tambovs...@gmail.com> wrote:
> > Thanks for the article, Martin!
> > Any chance to configure a third host to act as a GlusterFS Arbiter only
> > using this wizard?
> >
> > And a stupid question - how do I get this wizard up and running? I've got
> > everything installed and nothing is running on port 9090 :)
> >
> > Regards,
> > Artem
> >
> > On Fri, Nov 3, 2017 at 12:49 PM, Martin Sivak <msi...@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >> you should take a look at the hyper converged way of installing oVirt.
> >> We have a cockpit wizard that does almost everything for you:
> >>
> >>
> >> https://www.ovirt.org/documentation/gluster-
> hyperconverged/chap-Deploying_Hyperconverged/
> >>
> >> It uses three hosts and collocates the VMs together with Gluster
> storage.
> >>
> >> Best regards
> >>
> >> --
> >> Martin Sivak
> >> SLA  /oVirt
> >>
> >> On Fri, Nov 3, 2017 at 8:39 AM, Artem Tambovskiy
> >> <artem.tambovs...@gmail.com> wrote:
> >> > Thanks Eduardo!
> >> >
> >> > I think I can find a third server to build a glusterFS storage. So the
> >> > first step will be to install a self-hosted engine on the new server and
> >> > start building a glusterFS storage. Is there any easy way to migrate the
> >> > existing 5 VM's running on the second bare-metal oVirt host? I found it a
> >> > little bit tricky to move oVirt backups between the hosts (at least I
> >> > failed to replicate the existing VM's on the second server).
> >> >
> >> > Regards,
> >> > Artem
> >> >
> >> >
> >> >
> >> > On Fri, Nov 3, 2017 at 10:24 AM, Eduardo Mayoral <emayo...@arsys.es>
> >> > wrote:
> >> >>
> >> >> For HA you will need some kind of storage available to all the
> compute
> >> >> nodes in the cluster. If you have no external storage and few nodes,
> I
> >> >> think
> >> >> your best option for storage is gluster , and the minimum number of
> >> >> nodes
> >> >> you will need for HA is 3 (the third gluster node can be
> metadata-only,
> >> >> but
> >> >> you still need that third node to give you quorum, avoid split-brains
> >> >> and
> >> >> have something that you can call "HA" with a straight face.
> >> >>
> >> >> Eduardo Mayoral Jimeno (emayo...@arsys.es)
> >> >> Administrador de sistemas. De

Re: [ovirt-users] Advise needed: building cheap HA oVirt cluster with just 2 physical servers

2017-11-03 Thread Artem Tambovskiy
Thanks for the article, Martin!
Any chance to configure a third host to act as a GlusterFS Arbiter only using
this wizard?

And a stupid question - how do I get this wizard up and running? I've got
everything installed and nothing is running on port 9090 :)
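Is something like the following the expected way to get it listening?
(Package names are my guess for plain CentOS, so please correct me if they
are off.)

# packages for the cockpit oVirt/gluster deployment plugins - an assumption
yum install cockpit cockpit-ovirt-dashboard gdeploy
# cockpit is socket-activated, so enable the socket rather than a daemon
systemctl enable --now cockpit.socket
# open port 9090 (a "cockpit" firewalld service should be available)
firewall-cmd --add-service=cockpit --permanent && firewall-cmd --reload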

Regards,
Artem

On Fri, Nov 3, 2017 at 12:49 PM, Martin Sivak <msi...@redhat.com> wrote:

> Hi,
>
> you should take a look at the hyper converged way of installing oVirt.
> We have a cockpit wizard that does almost everything for you:
>
> https://www.ovirt.org/documentation/gluster-hyperconverged/chap-Deploying_
> Hyperconverged/
>
> It uses three hosts and collocates the VMs together with Gluster storage.
>
> Best regards
>
> --
> Martin Sivak
> SLA  /oVirt
>
> On Fri, Nov 3, 2017 at 8:39 AM, Artem Tambovskiy
> <artem.tambovs...@gmail.com> wrote:
> > Thanks Eduardo!
> >
> > I think I can find a third server to build a glusterFS storage. So the
> > first step will be to install a self-hosted engine on the new server and
> > start building a glusterFS storage. Is there any easy way to migrate the
> > existing 5 VM's running on the second bare-metal oVirt host? I found it a
> > little bit tricky to move oVirt backups between the hosts (at least I
> > failed to replicate the existing VM's on the second server).
> >
> > Regards,
> > Artem
> >
> >
> >
> > On Fri, Nov 3, 2017 at 10:24 AM, Eduardo Mayoral <emayo...@arsys.es>
> wrote:
> >>
> >> For HA you will need some kind of storage available to all the compute
> >> nodes in the cluster. If you have no external storage and few nodes, I
> think
> >> your best option for storage is gluster , and the minimum number of
> nodes
> >> you will need for HA is 3 (the third gluster node can be metadata-only,
> but
> >> you still need that third node to give you quorum, avoid split-brains
> and
> >> have something that you can call "HA" with a straight face.
> >>
> >> Eduardo Mayoral Jimeno (emayo...@arsys.es)
> >> Systems administrator. Platforms department. Arsys internet.
> >> +34 941 620 145 ext. 5153
> >>
> >> On 03/11/17 08:10, Artem Tambovskiy wrote:
> >>
> >> Looking for design advice on oVirt provisioning. I'm running a PoC lab
> >> on a single bare-metal host (it was set up with just a Local Storage
> >> domain), and now I'd like to rebuild the setup by making a cluster of 2
> >> physical servers, with no external storage array available. What are the
> >> options here? Are there any options to build a cheap HA cluster with just
> >> 2 servers?
> >>
> >> Thanks in advance!
> >>
> >> Artem
> >>
> >>
> >> ___
> >> Users mailing list
> >> Users@ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >>
> >>
> >
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Advise needed: building cheap HA oVirt cluster with just 2 physical servers

2017-11-03 Thread Artem Tambovskiy
Good point! Need to focus on this first

Thanks,
Artem

On Fri, Nov 3, 2017 at 10:50 AM, Karli Sjöberg <ka...@inparadise.se> wrote:

> On fre, 2017-11-03 at 10:39 +0300, Artem Tambovskiy wrote:
> > Thanks Eduardo!
> >
> > I think I can find a third server to build a glusterFS storage. So
> > the first step will be to install a self-hosted engine on the new
> > server and start building a glusterFS storage.
>
> You'll need to build the Gluster storage first, as you'll want to
> install the Hosted Engine _in_ the HA Gluster storage, right?
>
> /K
>
> > Is there any easy way to migrate the existing 5 VM's running on the
> > second bare-metal oVirt host? I found it a little bit tricky to move
> > oVirt backups between the hosts (at least I failed to replicate the
> > existing VM's on the second server).
> >
> > Regards,
> > Artem
> >
> >
> >
> > On Fri, Nov 3, 2017 at 10:24 AM, Eduardo Mayoral <emayo...@arsys.es>
> > wrote:
> > > For HA you will need some kind of storage available to all the
> > > compute nodes in the cluster. If you have no external storage and
> > > few nodes, I think your best option for storage is gluster , and
> > > the minimum number of nodes you will need for HA is 3 (the third
> > > gluster node can be metadata-only, but you still need that third
> > > node to give you quorum, avoid split-brains and have something that
> > > you can call "HA" with a straight face.
> > > Eduardo Mayoral Jimeno (emayo...@arsys.es)
> > > Systems administrator. Platforms department. Arsys
> > > internet.
> > > +34 941 620 145 ext. 5153
> > > On 03/11/17 08:10, Artem Tambovskiy wrote:
> > > > Looking for design advice on oVirt provisioning. I'm running a
> > > > PoC lab on a single bare-metal host (it was set up with just a
> > > > Local Storage domain), and now I'd like to rebuild the setup by
> > > > making a cluster of 2 physical servers, with no external storage
> > > > array available. What are the options here? Are there any options
> > > > to build a cheap HA cluster with just 2 servers?
> > > >
> > > > Thanks in advance!
> > > >
> > > > Artem
> > > >
> > > >
> > > > ___
> > > > Users mailing list
> > > > Users@ovirt.org
> > > > http://lists.ovirt.org/mailman/listinfo/users
> > >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Advise needed: building cheap HA oVirt cluster with just 2 physical servers

2017-11-03 Thread Artem Tambovskiy
Thanks Eduardo!

I think I can find a third server to build a glusterFS storage. So the
first step will be to install a self-hosted engine on the new server and
start building a glusterFS storage. Is there any easy way to migrate the
existing 5 VM's running on the second bare-metal oVirt host? I found it a
little bit tricky to move oVirt backups between the hosts (at least I failed
to replicate the existing VM's on the second server).
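If it helps, the volume layout I have in mind once the third box is in place
is roughly the following (a sketch with placeholder hostnames and brick paths;
the arbiter variant is how I understood your metadata-only third node, in case
that box ends up being a small one):

# replica 3 with the third brick as arbiter - it stores metadata only,
# so the third machine can be much smaller than the two data nodes
gluster volume create data replica 3 arbiter 1 \
    server1:/gluster/data/brick server2:/gluster/data/brick arbiter1:/gluster/data/brick
gluster volume start data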

Regards,
Artem



On Fri, Nov 3, 2017 at 10:24 AM, Eduardo Mayoral <emayo...@arsys.es> wrote:

> For HA you will need some kind of storage available to all the compute
> nodes in the cluster. If you have no external storage and few nodes, I
> think your best option for storage is gluster , and the minimum number of
> nodes you will need for HA is 3 (the third gluster node can be
> metadata-only, but you still need that third node to give you quorum, avoid
> split-brains and have something that you can call "HA" with a straight face.
>
> Eduardo Mayoral Jimeno (emayo...@arsys.es)
> Systems administrator. Platforms department. Arsys internet.
> +34 941 620 145 ext. 5153
>
> On 03/11/17 08:10, Artem Tambovskiy wrote:
>
> Looking for design advice on oVirt provisioning. I'm running a PoC lab
> on a single bare-metal host (it was set up with just a Local Storage
> domain), and now I'd like to rebuild the setup by making a cluster of 2
> physical servers, with no external storage array available. What are the
> options here? Are there any options to build a cheap HA cluster with just
> 2 servers?
>
> Thanks in advance!
>
> Artem
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Advise needed: building cheap HA oVirt cluster with just 2 physical servers

2017-11-03 Thread Artem Tambovskiy
Looking for design advice on oVirt provisioning. I'm running a PoC lab on a
single bare-metal host (it was set up with just a Local Storage domain), and
now I'd like to rebuild the setup by making a cluster of 2 physical servers,
with no external storage array available. What are the options here? Are
there any options to build a cheap HA cluster with just 2 servers?

Thanks in advance!

Artem
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users