[ovirt-users] Re: Cannot restart ovirt after massive failure.

2021-08-13 Thread Gilboa Davara
Shabbat Shalom,

On Wed, Aug 11, 2021 at 10:03 AM Yedidyah Bar David  wrote:

> On Tue, Aug 10, 2021 at 9:20 PM Gilboa Davara  wrote:
> >
> > Hello,
> >
> > Many thanks again for taking the time to try and help me recover this
> machine (even though it would have been far easier to simply redeploy it...)
> >
> >> >
> >> >
> >> > Sadly enough, it seems that --clean-metadata requires an active agent.
> >> > E.g.
> >> > $ hosted-engine --clean-metadata
> >> > The hosted engine configuration has not been retrieved from shared
> storage. Please ensure that ovirt-ha-agent
> >> > is running and the storage server is reachable.
> >>
> >> Did you try to search the net/list archives?
> >
> >
> > Yes. All of them seem to repeat the same clean-metadata command (which
> fails).
>
> I suppose we need better documentation. Sorry. Perhaps open a
> bug/issue about that.
>

Done.
https://bugzilla.redhat.com/show_bug.cgi?id=1993575


>
> >
> >>
> >>
> >> >
> >> > Can I manually delete the metadata state files?
> >>
> >> Yes, see e.g.:
> >>
> >> https://lists.ovirt.org/pipermail/users/2016-April/072676.html
> >>
> >> As an alternative to the 'find' command there, you can also find the
> IDs with:
> >>
> >> $ grep metadata /etc/ovirt-hosted-engine/hosted-engine.conf
> >>
> >> Best regards,
> >> --
> >> Didi
> >
> >
> > Yippie! Success (At least it seems that way...)
> >
> > Following https://lists.ovirt.org/pipermail/users/2016-April/072676.html,
> > I stopped the broker and agent services, archived the existing hosted
> > metadata files, and created an empty 1GB metadata file using dd (dd
> > if=/dev/zero of=/run/vdsm/storage// bs=1M count=1024), making
> > double sure the permissions (0660 / 0644), owner (vdsm:kvm) and SELinux
> > labels (restorecon, just in case) stayed the same.
> > Let everything settle down.
> > Restarted the services
> > ... and everything is up again :)
> >
> > I plan to let the engine run overnight with zero VMs (making sure all
> backups are fully up-to-date).
> > Once done, I'll return to normal (until I replace this setup with a
> normal multi-node setup).
> >
> > Many thanks again!
>
> Glad to hear that, welcome, thanks for the report!
>
> More tests you might want to do before starting your real VMs:
>
> - Set and later clear global maintenance from each host, and see that this
> propagates to the others (both 'hosted-engine --vm-status' and agent.log)
>
> - Migrate the engine VM between the hosts and see that this propagates
>
> - Shut down the engine VM without global maintenance and see that it's
> started automatically.
>
> But I do not think all of this is mandatory, if 'hosted-engine --vm-status'
> looks ok on all hosts.
>
> I'd still be careful with other things that might have been corrupted,
> though - obviously can't tell you what/where...
>
>
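
For reference, the first test above amounts to roughly the following (a
sketch, run from any hosted-engine host):

$ hosted-engine --set-maintenance --mode=global   # set global maintenance
$ hosted-engine --vm-status                       # each host should now report it
$ hosted-engine --set-maintenance --mode=none     # clear it again
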
The host is back to normal.
The logs look clean (minus some odd SMTP errors).

Either way, I'm already in the process of replacing this setup with a real
3-host + Gluster setup, so I just need this machine to survive the next
couple of weeks :)
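
For the archives, the recovery above boils down to roughly the following (a
sketch; the storage-domain/volume UUIDs come from
/etc/ovirt-hosted-engine/hosted-engine.conf, and the exact path under
/run/vdsm/storage/ will differ per setup -- the placeholder names below are
illustrative):

$ systemctl stop ovirt-ha-agent ovirt-ha-broker
$ grep metadata /etc/ovirt-hosted-engine/hosted-engine.conf  # find the IDs
$ cd /run/vdsm/storage/SD_UUID/IMG_UUID                      # illustrative path
$ mv METADATA_FILE METADATA_FILE.bak                         # keep the old file
$ dd if=/dev/zero of=METADATA_FILE bs=1M count=1024          # fresh empty 1GB file
$ chown vdsm:kvm METADATA_FILE                               # same owner as before
$ chmod 0660 METADATA_FILE                                   # same permissions
$ restorecon METADATA_FILE                                   # fix SELinux labels
$ systemctl start ovirt-ha-broker ovirt-ha-agent
$ hosted-engine --vm-status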

- Gilboa
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/N4WIQXDW2ITLT2KDDH4LI4RTZRP2XWTY/


[ovirt-users] Re: Hosted engine on HCI cluster is not running

2021-08-13 Thread David White via Users
Of course, right when I sent this email, I went back to one of my consoles,
re-ran "hosted-engine --vm-status", and saw that it was up. I can confirm my
hosted engine is now online and healthy.

So to recap: restarting vdsmd solved my problem.

I provided lots of details in the Bugzilla, and I generated an sosreport on two 
of my three systems prior to restarting vdsmd.
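
For anyone searching the archives later, the fix amounted to roughly this on
each of the three hosts (a sketch; the hostname is illustrative):

[root@host ~]# systemctl restart vdsmd
[root@host ~]# hosted-engine --vm-status   # give the agent a minute to settle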

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Friday, August 13th, 2021 at 9:31 PM, David White 
 wrote:

> I have updated the Bugzilla with all of the details I included below, as well 
> as additional details.
> 

> I figured better to err on the side of providing too many details than not 
> enough.
> 

> For the oVirt list's edification, I will note that restarting vdsmd on all 3 
> hosts did fix the problem -- to an extent. Unfortunately, my hosted-engine is 
> still not starting (although I can now clearly connect to the hosted-engine 
> storage), and I see this output every time I try to start the hosted-engine:
> 

> [root@cha2-storage ~]# hosted-engine --vm-start
> 

> Command VM.getStats with args {'vmID': 
> 'ffd77d79-a699-455e-88e2-f55ee53166ef'} failed:
> 

> (code=1, message=Virtual machine does not exist: {'vmId': 
> 'ffd77d79-a699-455e-88e2-f55ee53166ef'})
> 

> VM in WaitForLaunch
> 

> I'm not sure if that's because I screwed up when I was doing Gluster 
> maintenance, or what.
> 

> But at this point, does this mean I have to re-deploy the hosted engine?
> 

> To confirm, if I re-deploy the hosted engine, will all of my regular VMs 
> remain intact? I have over 20 VMs in this environment, and it would be a 
> major deal to have to rebuild all 20+ of those VMs.
> 

> Sent with ProtonMail Secure Email.
> 

> ‐‐‐ Original Message ‐‐‐
> 

> On Friday, August 13th, 2021 at 2:41 PM, Nir Soffer nsof...@redhat.com wrote:
> 

> > On Fri, Aug 13, 2021 at 9:13 PM David White via Users users@ovirt.org wrote:
> > 

> > > Hello,
> > > 

> > > It appears that my Manager / hosted-engine isn't working, and I'm unable 
> > > to get it to start.
> > > 

> > > I have a 3-node HCI cluster, but right now, Gluster is only running on 1 
> > > host (so no replication).
> > > 

> > > I was hoping to upgrade / replace the storage on my 2nd host today, but 
> > > aborted that maintenance when I found that I couldn't even get into the 
> > > Manager.
> > > 

> > > The storage is mounted, but here's what I see:
> > > 

> > > [root@cha2-storage dwhite]# hosted-engine --vm-status
> > > 

> > > The hosted engine configuration has not been retrieved from shared 
> > > storage. Please ensure that ovirt-ha-agent is running and the storage 
> > > server is reachable.
> > > 

> > > [root@cha2-storage dwhite]# systemctl status ovirt-ha-agent
> > > 

> > > ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability 
> > > Monitoring Agent
> > > 

> > > Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; 
> > > vendor preset: disabled)
> > > 

> > > Active: active (running) since Fri 2021-08-13 11:10:51 EDT; 2h 44min ago
> > > 

> > > Main PID: 3591872 (ovirt-ha-agent)
> > > 

> > > Tasks: 1 (limit: 409676)
> > > 

> > > Memory: 21.5M
> > > 

> > > CGroup: /system.slice/ovirt-ha-agent.service
> > > 

> > > └─3591872 /usr/libexec/platform-python 
> > > /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
> > > 

> > > Aug 13 11:10:51 cha2-storage.mgt.barredowlweb.com systemd[1]: Started 
> > > oVirt Hosted Engine High Availability Monitoring Agent.
> > > 

> > > Any time I try to do anything like connect the engine storage, disconnect 
> > > the engine storage, or connect to the console, it just sits there, and 
> > > doesn't do anything, and I eventually have to Ctrl-C out of it.
> > > 

> > > Maybe I have to be patient? When I Ctrl-C, I get a traceback error:
> > > 

> > > [root@cha2-storage dwhite]# hosted-engine --console
> > > 

> > > ^CTraceback (most recent call last):
> > > 

> > > File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
> > > 

> > > "__main__", mod_spec)
> > > 

> > > 

> > > File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
> > > 

> > > exec(code, run_globals)
> > > 

> > > File 
> > > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py",
> > >  line 214, in 
> > > 

> > > [root@cha2-storage dwhite]# args.command(args)
> > > 

> > > File 
> > > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py",
> > >  line 42, in func
> > > 

> > > f(*args, **kwargs)
> > > 

> > > File 
> > > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py",
> > >  line 91, in checkVmStatus
> > > 

> > > cli = ohautil.connect_vdsm_json_rpc()
> > > 

> > > File 
> > > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> > > line 472, in connect_vdsm_json_rpc
> > > 

> > > __vdsm_json_rpc_connect(logger, timeout)
> > > 

> > > File 
> > > 

[ovirt-users] Re: Hosted engine on HCI cluster is not running

2021-08-13 Thread David White via Users
I have updated the Bugzilla with all of the details I included below, as well 
as additional details. 

I figured better to err on the side of providing too many details than not 
enough. 


For the oVirt list's edification, I will note that restarting vdsmd on all 3 
hosts did fix the problem -- to an extent. Unfortunately, my hosted-engine is 
still not starting (although I can now clearly connect to the hosted-engine 
storage), and I see this output every time I try to start the hosted-engine:

[root@cha2-storage ~]# hosted-engine --vm-start
Command VM.getStats with args {'vmID': 'ffd77d79-a699-455e-88e2-f55ee53166ef'} 
failed:
(code=1, message=Virtual machine does not exist: {'vmId': 
'ffd77d79-a699-455e-88e2-f55ee53166ef'})
VM in WaitForLaunch

I'm not sure if that's because I screwed up when I was doing Gluster 
maintenance, or what.
But at this point, does this mean I have to re-deploy the hosted engine?
To confirm, if I re-deploy the hosted engine, will all of my regular VMs remain 
intact? I have over 20 VMs in this environment, and it would be a major deal to 
have to rebuild all 20+ of those VMs.
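
For reference, one way I can cross-check that VM id (a sketch; it assumes the
vmid key in /etc/ovirt-hosted-engine/hosted-engine.conf holds the engine VM's
UUID, and that vdsm-client is installed alongside vdsm):

[root@cha2-storage ~]# grep ^vmid /etc/ovirt-hosted-engine/hosted-engine.conf
[root@cha2-storage ~]# vdsm-client Host getVMList   # should include that UUID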

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐

On Friday, August 13th, 2021 at 2:41 PM, Nir Soffer  wrote:

> On Fri, Aug 13, 2021 at 9:13 PM David White via Users users@ovirt.org wrote:
> 

> > Hello,
> > 

> > It appears that my Manager / hosted-engine isn't working, and I'm unable to 
> > get it to start.
> > 

> > I have a 3-node HCI cluster, but right now, Gluster is only running on 1 
> > host (so no replication).
> > 

> > I was hoping to upgrade / replace the storage on my 2nd host today, but 
> > aborted that maintenance when I found that I couldn't even get into the 
> > Manager.
> > 

> > The storage is mounted, but here's what I see:
> > 

> > [root@cha2-storage dwhite]# hosted-engine --vm-status
> > 

> > The hosted engine configuration has not been retrieved from shared storage. 
> > Please ensure that ovirt-ha-agent is running and the storage server is 
> > reachable.
> > 

> > [root@cha2-storage dwhite]# systemctl status ovirt-ha-agent
> > 

> > ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring 
> > Agent
> > 

> > Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; 
> > vendor preset: disabled)
> > 

> > Active: active (running) since Fri 2021-08-13 11:10:51 EDT; 2h 44min ago
> > 

> > Main PID: 3591872 (ovirt-ha-agent)
> > 

> > Tasks: 1 (limit: 409676)
> > 

> > Memory: 21.5M
> > 

> > CGroup: /system.slice/ovirt-ha-agent.service
> > 

> > └─3591872 /usr/libexec/platform-python 
> > /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
> > 

> > Aug 13 11:10:51 cha2-storage.mgt.barredowlweb.com systemd[1]: Started oVirt 
> > Hosted Engine High Availability Monitoring Agent.
> > 

> > Any time I try to do anything like connect the engine storage, disconnect 
> > the engine storage, or connect to the console, it just sits there, and 
> > doesn't do anything, and I eventually have to Ctrl-C out of it.
> > 

> > Maybe I have to be patient? When I Ctrl-C, I get a traceback error:
> > 

> > [root@cha2-storage dwhite]# hosted-engine --console
> > 

> > ^CTraceback (most recent call last):
> > 

> > File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
> > 

> > "__main__", mod_spec)
> > 

> > 

> > File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
> > 

> > exec(code, run_globals)
> > 

> > File 
> > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py",
> >  line 214, in 
> > 

> > [root@cha2-storage dwhite]# args.command(args)
> > 

> > File 
> > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py",
> >  line 42, in func
> > 

> > f(*args, **kwargs)
> > 

> > File 
> > "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py",
> >  line 91, in checkVmStatus
> > 

> > cli = ohautil.connect_vdsm_json_rpc()
> > 

> > File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> > line 472, in connect_vdsm_json_rpc
> > 

> > __vdsm_json_rpc_connect(logger, timeout)
> > 

> > File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> > line 395, in __vdsm_json_rpc_connect
> > 

> > timeout=timeout)
> > 

> > File "/usr/lib/python3.6/site-packages/vdsm/client.py", line 154, in connect
> > 

> > outgoing_heartbeat=outgoing_heartbeat, nr_retries=nr_retries)
> > 

> > File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 426, 
> > in SimpleClient
> > 

> > nr_retries, reconnect_interval)
> > 

> > File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 448, 
> > in StandAloneRpcClient
> > 

> > client = StompClient(utils.create_connected_socket(host, port, sslctx),
> > 

> > File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 379, in 
> > create_connected_socket
> > 

> > sock.connect((host, port))
> > 

> > File "/usr/lib64/python3.6/ssl.py", line 1068, in 

[ovirt-users] Re: Hosted engine on HCI cluster is not running

2021-08-13 Thread Nir Soffer
On Fri, Aug 13, 2021 at 9:13 PM David White via Users  wrote:
>
> Hello,
> It appears that my Manager / hosted-engine isn't working, and I'm unable to 
> get it to start.
>
> I have a 3-node HCI cluster, but right now, Gluster is only running on 1 host 
> (so no replication).
> I was hoping to upgrade / replace the storage on my 2nd host today, but 
> aborted that maintenance when I found that I couldn't even get into the 
> Manager.
>
> The storage is mounted, but here's what I see:
>
> [root@cha2-storage dwhite]# hosted-engine --vm-status
> The hosted engine configuration has not been retrieved from shared storage. 
> Please ensure that ovirt-ha-agent is running and the storage server is 
> reachable.
>
> [root@cha2-storage dwhite]# systemctl status ovirt-ha-agent
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring 
> Agent
>Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; 
> vendor preset: disabled)
>Active: active (running) since Fri 2021-08-13 11:10:51 EDT; 2h 44min ago
> Main PID: 3591872 (ovirt-ha-agent)
> Tasks: 1 (limit: 409676)
>Memory: 21.5M
>CGroup: /system.slice/ovirt-ha-agent.service
>└─3591872 /usr/libexec/platform-python 
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
>
> Aug 13 11:10:51 cha2-storage.mgt.barredowlweb.com systemd[1]: Started oVirt 
> Hosted Engine High Availability Monitoring Agent.
>
>
> Any time I try to do anything like connect the engine storage, disconnect the 
> engine storage, or connect to the console, it just sits there, and doesn't do 
> anything, and I eventually have to Ctrl-C out of it.
> Maybe I have to be patient? When I Ctrl-C, I get a traceback error:
>
> [root@cha2-storage dwhite]# hosted-engine --console
> ^CTraceback (most recent call last):
>   File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
>
> "__main__", mod_spec)
>   File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
> exec(code, run_globals)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 214, in 
> [root@cha2-storage dwhite]# args.command(args)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 42, in func
> f(*args, **kwargs)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 91, in checkVmStatus
> cli = ohautil.connect_vdsm_json_rpc()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 472, in connect_vdsm_json_rpc
> __vdsm_json_rpc_connect(logger, timeout)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 395, in __vdsm_json_rpc_connect
> timeout=timeout)
>   File "/usr/lib/python3.6/site-packages/vdsm/client.py", line 154, in connect
> outgoing_heartbeat=outgoing_heartbeat, nr_retries=nr_retries)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 426, 
> in SimpleClient
> nr_retries, reconnect_interval)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 448, 
> in StandAloneRpcClient
> client = StompClient(utils.create_connected_socket(host, port, sslctx),
>   File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 379, in 
> create_connected_socket
> sock.connect((host, port))
>   File "/usr/lib64/python3.6/ssl.py", line 1068, in connect
> self._real_connect(addr, False)
>   File "/usr/lib64/python3.6/ssl.py", line 1059, in _real_connect
> self.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
> self._sslobj.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
> self._sslobj.do_handshake()
>
>
>
> This is what I see in /var/log/ovirt-hosted-engine-ha/broker.log:
>
> MainThread::WARNING::2021-08-11 
> 10:24:41,596::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>  Can't connect vdsm storage: Connection to storage server failed
> MainThread::ERROR::2021-08-11 
> 10:24:41,596::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Failed initializing the broker: Connection to storage server failed
> MainThread::ERROR::2021-08-11 
> 10:24:41,598::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Traceback (most recent call last):
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 64, in run
> self._storage_broker_instance = self._get_storage_broker()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 143, in _get_storage_broker
> return storage_broker.StorageBroker()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
>  line 97, in __init__
> self._backend.connect()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
>  line 375, in connect
> 

[ovirt-users] Hosted engine on HCI cluster is not running

2021-08-13 Thread David White via Users
Hello,
It appears that my Manager / hosted-engine isn't working, and I'm unable to get 
it to start.

I have a 3-node HCI cluster, but right now, Gluster is only running on 1 host 
(so no replication).
I was hoping to upgrade / replace the storage on my 2nd host today, but aborted 
that maintenance when I found that I couldn't even get into the Manager.

The storage is mounted, but here's what I see:

> [root@cha2-storage dwhite]# hosted-engine --vm-status
> The hosted engine configuration has not been retrieved from shared storage. 
> Please ensure that ovirt-ha-agent is running and the storage server is 
> reachable.
> 

> [root@cha2-storage dwhite]# systemctl status ovirt-ha-agent
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring 
> Agent
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; 
> vendor preset: disabled)
>    Active: active (running) since Fri 2021-08-13 11:10:51 EDT; 2h 44min ago
> Main PID: 3591872 (ovirt-ha-agent)
>     Tasks: 1 (limit: 409676)
>    Memory: 21.5M
>    CGroup: /system.slice/ovirt-ha-agent.service
>    └─3591872 /usr/libexec/platform-python 
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
> 

> Aug 13 11:10:51 cha2-storage.mgt.barredowlweb.com systemd[1]: Started oVirt 
> Hosted Engine High Availability Monitoring Agent.

Any time I try to do anything like connect the engine storage, disconnect the 
engine storage, or connect to the console, it just sits there, and doesn't do 
anything, and I eventually have to Ctrl-C out of it.
Maybe I have to be patient? When I Ctrl-C, I get a traceback error:

> [root@cha2-storage dwhite]# hosted-engine --console
> ^CTraceback (most recent call last):
>   File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
> 

>     "__main__", mod_spec)
>   File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 214, in 
> [root@cha2-storage dwhite]# args.command(args)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 42, in func
>     f(*args, **kwargs)
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", 
> line 91, in checkVmStatus
>     cli = ohautil.connect_vdsm_json_rpc()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 472, in connect_vdsm_json_rpc
>     __vdsm_json_rpc_connect(logger, timeout)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 395, in __vdsm_json_rpc_connect
>     timeout=timeout)
>   File "/usr/lib/python3.6/site-packages/vdsm/client.py", line 154, in connect
>     outgoing_heartbeat=outgoing_heartbeat, nr_retries=nr_retries)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 426, 
> in SimpleClient
>     nr_retries, reconnect_interval)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 448, 
> in StandAloneRpcClient
>     client = StompClient(utils.create_connected_socket(host, port, sslctx),
>   File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 379, in 
> create_connected_socket
>     sock.connect((host, port))
>   File "/usr/lib64/python3.6/ssl.py", line 1068, in connect
>     self._real_connect(addr, False)
>   File "/usr/lib64/python3.6/ssl.py", line 1059, in _real_connect
>     self.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
>     self._sslobj.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
>     self._sslobj.do_handshake()

This is what I see in /var/log/ovirt-hosted-engine-ha/broker.log:

> MainThread::WARNING::2021-08-11 
> 10:24:41,596::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>  Can't connect vdsm storage: Connection to storage server failed
> MainThread::ERROR::2021-08-11 
> 10:24:41,596::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Failed initializing the broker: Connection to storage server failed
> MainThread::ERROR::2021-08-11 
> 10:24:41,598::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Traceback (most recent call last):
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 64, in run
>     self._storage_broker_instance = self._get_storage_broker()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 143, in _get_storage_broker
>     return storage_broker.StorageBroker()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
>  line 97, in __init__
>     self._backend.connect()
>   File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
>  line 375, in connect
>     sserver.connect_storage_server()
>   File 
> 

[ovirt-users] oVirt Hosted Engine Offline Deployment

2021-08-13 Thread Andrew Lamarra
Hi there. I'm trying to get oVirt up & running on a server in a network that 
has no access to the Internet. We do host mirrors of the CentOS 7 & 8 repos (no 
Stream). I first tried the latest version of oVirt 4.4; however, I ran into an 
issue when trying to do the Hosted Engine deployment. The command I used was 
"hosted-engine --deploy --ansible-extra-vars=he_offline_deployment=true", but 
at some point I get the following error:

[ INFO  ] DNF Errors during downloading metadata for repository 
'ovirt-4.4-centos-ceph-pacific':
   - Status code: 404 for 
http://mirror.centos.org/centos/8-stream/storage/x86_64/ceph-pacific/repodata/repomd.xml
[ ERROR ] DNF Failed to download metadata for repo 
'ovirt-4.4-centos-ceph-pacific': Cannot download repomd.xml: Cannot download 
repodata/repomd.xml: All mirrors were tried

Seems like it's still trying to download something from the CentOS 8 Stream 
repo. So I figured that, since we have the CentOS 7 repos hosted, I'd try oVirt 
v4.3. However, I don't see an option to do an "offline deployment" with this 
version; it says the "--ansible-extra-vars" option is invalid. I tried doing a 
regular deployment, but it tries downloading from the oVirt repo:

[ INFO ] TASK [ovirt.hosted_engine_setup : Install ovirt-engine-appliance rpm]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 10, "changed": false, 
"msg": "Failure talking to yum: cannot retrieve metalink for repository: 
ovirt-4.3-epel/x86_64. Please verify its path and try again"}

I can bring over a mirror of the ovirt-4.3 repo. I also see that there are some 
other repos in the "ovirt-4.3-dependencies.repo" file. Are those required for 
deploying the hosted engine VM as well? Or does anyone know if there's a way to 
do this deployment completely offline?
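
One workaround I'm considering for the 4.4 case (untested; it assumes the
deployment doesn't actually need the Ceph packages) is to disable the
Stream-only repo before deploying, or to point its baseurl in
/etc/yum.repos.d/ at a local mirror:

# dnf config-manager --set-disabled ovirt-4.4-centos-ceph-pacific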

Thank you for your time.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GDWCIZ3X3OKE4WXCWUNKQVG7GOVP5AW4/


[ovirt-users] [ANN] oVirt 4.4.8 Fifth Release Candidate is now available for testing

2021-08-13 Thread Lev Veyde
oVirt 4.4.8 Fifth Release Candidate is now available for testing

The oVirt Project is pleased to announce the availability of oVirt 4.4.8
Fifth Release Candidate for testing, as of August 13th, 2021.

This update is the eighth in a series of stabilization updates to the 4.4
series.

Documentation

   - If you want to try oVirt as quickly as possible, follow the instructions
     on the Download page.
   - For complete installation, administration, and usage instructions, see
     the oVirt Documentation.
   - For upgrading from a previous version, see the oVirt Upgrade Guide.
   - For a general overview of oVirt, see About oVirt.

Important notes before you try it

Please note this is a pre-release build.

The oVirt Project makes no guarantees as to its suitability or usefulness.

This pre-release must not be used in production.

Installation instructions

For installation instructions and additional information please refer to:

https://ovirt.org/documentation/

This release is available now on x86_64 architecture for:

* Red Hat Enterprise Linux 8.4 or similar

* CentOS Stream 8

This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
for:

* Red Hat Enterprise Linux 8.4 or similar

* CentOS Stream 8

* oVirt Node 4.4 based on CentOS Stream 8 (available for x86_64 only)

See the release notes [1] for installation instructions and a list of new
features and bugs fixed.

Notes:

- oVirt Appliance is already available based on CentOS Stream 8

- oVirt Node NG is already available based on CentOS Stream 8

Additional Resources:

* Read more about the oVirt 4.4.8 release highlights:
http://www.ovirt.org/release/4.4.8/

* Get more oVirt project updates on Twitter: https://twitter.com/ovirt

* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/


[1] http://www.ovirt.org/release/4.4.8/

[2] http://resources.ovirt.org/pub/ovirt-4.4-pre/iso/

-- 

Lev Veyde

Senior Software Engineer, RHCE | RHCVA | MCITP

Red Hat Israel



l...@redhat.com | lve...@redhat.com

TRIED. TESTED. TRUSTED. 
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/SJLG26HX5PHSD5YJO4WMM4FSK62CKDLD/