[ovirt-users] Re: Failing to restore a backup

2020-07-15 Thread Yedidyah Bar David
On Wed, Jul 15, 2020 at 6:21 PM Andrea Chierici wrote:
>
> Dear all,
> I think I finally understood the issue, even if I don't know how to fix it.
>
> Trying to install a new HE from a backup I get the error:
>  "The host has been set in non_operational status, please check engine logs, 
> more info can be found in the engine logs, fix accordingly and re-deploy."
>
> The host, not the hosted engine. This is more clear in another log:
> Host  is set to Non-Operational, it is missing the 
> following networks: 'iscsi_net,sgsi_iscsi,sgsi_priv,sgsi_vpn'
>
> The fact is that those networks are present on the host:
> # ip addr
> 
> 26: sgsi_priv:  mtu 1500 qdisc noqueue state 
> UP group default qlen 1000
> link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
> inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
> 28: sgsi_vpn:  mtu 1500 qdisc noqueue state 
> UP group default qlen 1000
> link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
> inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
>valid_lft forever preferred_lft forever
>
> The other two are configured on ovirt but not configurable on bare metal 
> system, indeed if I issue "ip addr" on a production host I don't see those 
> nets at all: I am puzzled. The problem is definitely this one, can anyone 
> provide any suggestion on how to proceed?
> Why is it complaining about sgsi_priv and sgsi_vpn that are not missing at 
> all?

If you pass --restore-from-file, you should be prompted at some point
with the following (copied from the code; I didn't test this recently):

'Pause the execution after adding this host to the '
'engine?\n'
'You will be able to iteratively connect to '
'the restored engine in order to manually '
'review and remediate its configuration before '
'proceeding with the deployment:\nplease ensure that '
'all the datacenter hosts and storage domain are '
'listed as up or in maintenance mode before '
'proceeding.\nThis is normally not required when '
'restoring an up to date and coherent backup. '
'(@VALUES@)[@DEFAULT@]: '

Were you? If so, you can reply 'Yes', and then, later on, you should
get a message:

  - name: Pause the execution to let the user interactively reconfigure the host
  - name: Let the user connect to the bootstrap engine to manually fix host configuration
    msg: >-
      You can now connect to {{ bootstrap_engine_url }} and
      check the status of this host and
      eventually remediate it, please continue only when the
      host is listed as 'up'

  - name: Pause execution until {{ he_setup_lock_file.path }} is removed, delete it once ready to proceed

At this point, the deploy process will wait until you remove this
file before continuing.
You can then log in to the engine admin UI and change whatever is
needed on the host, including configuring networks, until you manage
to bring it 'Up'.
Then remove the file.
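
The pause described above amounts to waiting for a file to disappear.
A rough shell equivalent, purely as an illustration: wait_until_removed
and its polling interval are hypothetical names and not part of
hosted-engine, and the real lock file path is whatever the deploy
prints in place of {{ he_setup_lock_file.path }}.

```shell
#!/bin/sh
# Hypothetical helper: block until the given file no longer exists.
# It only mimics the behaviour described above; it is not the code
# that the hosted-engine deploy actually runs.
wait_until_removed() {
    lock_file=$1
    interval=${2:-1}    # seconds between checks (assumed default)
    while [ -e "$lock_file" ]; do
        sleep "$interval"
    done
    echo "lock file $lock_file removed, continuing"
}
```

From a second terminal on the host, removing the file with rm is what
lets the waiting side continue.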

Good luck and best regards,

>
> Andrea

[ovirt-users] Re: Failing to restore a backup

2020-07-15 Thread Andrea Chierici

Dear all,
I think I finally understood the issue, even if I don't know how to fix it.

Trying to install a new HE from a backup I get the error:
 "The host has been set in non_operational status, please check engine 
logs, more info can be found in the engine logs, fix accordingly and 
re-deploy."


*The host, not the hosted engine*. This is clearer in another log:
Host  is set to Non-Operational, it is missing the following networks: 'iscsi_net,sgsi_iscsi,sgsi_priv,sgsi_vpn'


The fact is that those networks are present on the host:
# ip addr

26: sgsi_priv:  mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
28: sgsi_vpn:  mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
       valid_lft forever preferred_lft forever

The other two are configured in oVirt but are not configurable on the 
bare-metal system: if I run "ip addr" on a production host I don't see 
those networks at all, so I am puzzled. This is definitely the problem; 
can anyone suggest how to proceed?
Why is it complaining about sgsi_priv and sgsi_vpn, which are not 
missing at all?
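
To compare the names in the engine message with the devices the kernel 
knows about, something like the following sketch could help. The 
function names missing_networks and not_in are made up for this 
example, and it assumes the message arrives on a single line. It only 
compares names: the engine's check may rely on what it has recorded for 
the host rather than on what "ip addr" shows, so a match here is a hint, 
not proof.

```shell
#!/bin/sh
# Extract the comma-separated names from a message of the form
#   ... it is missing the following networks: 'a,b,c'
# and print them one per line.
missing_networks() {
    printf '%s\n' "$1" |
        sed -n "s/.*missing the following networks: '\([^']*\)'.*/\1/p" |
        tr ',' '\n'
}

# Print the names from list $1 (one per line) that do not appear as a
# whole line in list $2.
not_in() {
    printf '%s\n' "$1" | while IFS= read -r name; do
        printf '%s\n' "$2" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}
```

The second list can come from "ls /sys/class/net", which prints one 
device name per line on a Linux host.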


Andrea



--
Andrea Chierici - INFN-CNAF 
Viale Berti Pichat 6/2, 40127 BOLOGNA
Office Tel: +39 051 2095463 
SkypeID ataruz
--

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IH2ISMET6PKCP3VUNNTOJZH74F5AEHF2/


[ovirt-users] Re: Failing to restore a backup

2020-07-15 Thread Yedidyah Bar David
On Tue, Jul 14, 2020 at 6:04 PM Andrea Chierici wrote:
>
> Hi,
> thank you for your help.
>
>
> I think this is not a critical failure, and is not what failed the restore.
>
>>
>>
>>
>> Recently I tried the 4.3.11 beta and 4.4.1 and the error now is different:
>>
>> [ INFO  ] Upgrading CA\n[ ERROR ] Failed to execute stage 'Misc 
>> configuration': (2, 'No such file or directory')\n[ INFO  ] DNF Performing 
>> DNF transaction rollback\n
>
>
> This is part of 'engine-setup' output, which 'hosted-engine' runs inside the 
> engine VM. If you can access the engine VM, you can try finding more 
> information in /var/log/ovirt-engine/setup/* there. Otherwise, the 
> hosted-engine deploy script might have managed to get a copy to 
> /var/log/ovirt-hosted-engine-setup/engine-logs*. Please check/share these. 
> Thanks.
>
>
> Unfortunately the installation procedures when exiting, deletes the vm, hence 
> I can't log in.

Are you sure? Did you check with 'ps', searching qemu processes?

If it's still up, but still using a local IP address, you can find it
by searching the hosted-engine logs for 'local_vm_ip' and log in there
from the host.
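
As a sketch of those two checks in shell: the function names are 
invented for this example, and the default directory is the 
/var/log/ovirt-hosted-engine-setup path mentioned in this thread. The 
exact form in which 'local_vm_ip' appears in the logs is not assumed; 
every matching line is simply printed.

```shell
#!/bin/sh
# List running qemu processes, if any. The bracket in the pattern keeps
# grep from matching its own command line.
list_qemu() {
    ps -ef | grep '[q]emu'
}

# Print every log line mentioning local_vm_ip under the given directory.
find_local_vm_ip() {
    log_dir=${1:-/var/log/ovirt-hosted-engine-setup}
    grep -rh 'local_vm_ip' "$log_dir"
}
```

Calling find_local_vm_ip without an argument searches the default 
directory; pass another directory if the logs were copied elsewhere.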

> Here are the ERROR messages I got on the logs copied on the host:
>
> engine.log:2020-07-08 15:05:04,178+02 ERROR 
> [org.ovirt.engine.core.bll.pm.FenceProxyLocator] 
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-89) 
> [45a7e7f3] Can not run fence action on host '', no 
> suitable proxy host was found.

That's ok.

>
> server.log:2020-07-08 15:09:23,081+02 ERROR 
> [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) RESTEASY002010: 
> Failed to execute: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
> server.log:2020-07-08 15:14:19,804+02 ERROR 
> [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) RESTEASY002010: 
> Failed to execute: javax.ws.rs.WebApplicationException: HTTP 404 Not Found

This probably indicates a problem, but I agree it's not very helpful.

> grep: setup: Is a directory

Right - so please search inside it.

Also please check the hosted-engine deploy logs themselves.
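
The "Is a directory" message is what grep prints when it is handed a 
directory without being told to recurse. A minimal sketch of searching 
inside it instead (errors_in is a made-up name; the pattern ERROR is 
just the string seen in the excerpts above):

```shell
#!/bin/sh
# Recursively print lines containing ERROR under a directory, prefixed
# with the file name and line number.
errors_in() {
    grep -rn 'ERROR' "$1"
}
```

For example, errors_in could be pointed at the directory holding the 
copied engine logs, including its setup subdirectory.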

>
> Not very helpful.
>
>
>
>>
>>
>> I simply can't figure out what file is missing.
>> If, as a test, I try to install the HE without restoring the backup, the 
>> installation goes smoothly to the end, but at that point I can't restore the 
>> backup, as far as I can understand.
>
>
> Another option is to do the restore manually. To find relevant information, 
> search the net for "enginevm_before_engine_setup".
>
>
> Later I will give it a try.

Good luck and best regards,
-- 
Didi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BRMKX5LC2LIHPRWY77A3SIT6F7ZLECBU/


[ovirt-users] Re: Failing to restore a backup

2020-07-14 Thread Andrea Chierici

Hi,
thank you for your help.

> I think this is not a critical failure, and is not what failed the 
> restore.

>> Recently I tried the 4.3.11 beta and 4.4.1 and the error now is
>> different:
>>
>> [ INFO  ] Upgrading CA\n[ ERROR ] Failed to execute stage 'Misc
>> configuration': (2, 'No such file or directory')\n[ INFO  ] DNF
>> Performing DNF transaction rollback\n

> This is part of 'engine-setup' output, which 'hosted-engine' runs 
> inside the engine VM. If you can access the engine VM, you can try 
> finding more information in /var/log/ovirt-engine/setup/* there. 
> Otherwise, the hosted-engine deploy script might have managed to get a 
> copy to /var/log/ovirt-hosted-engine-setup/engine-logs*. Please 
> check/share these. Thanks.

Unfortunately the installation procedure deletes the VM when it exits, 
so I can't log in.

Here are the ERROR messages I found in the logs copied to the host:

engine.log:2020-07-08 15:05:04,178+02 ERROR 
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] 
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-89) 
[45a7e7f3] Can not run fence action on host '', no 
suitable proxy host was found.

server.log:2020-07-08 15:09:23,081+02 ERROR 
[org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) 
RESTEASY002010: Failed to execute: javax.ws.rs.WebApplicationException: 
HTTP 404 Not Found
server.log:2020-07-08 15:14:19,804+02 ERROR 
[org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) 
RESTEASY002010: Failed to execute: javax.ws.rs.WebApplicationException: 
HTTP 404 Not Found

grep: setup: Is a directory

Not very helpful.

>> I simply can't figure out what file is missing.
>> If, as a test, I try to install the HE without restoring the
>> backup, the installation goes smoothly to the end, but at that
>> point I can't restore the backup, as far as I can understand.

> Another option is to do the restore manually. To find relevant 
> information, search the net for "enginevm_before_engine_setup".

Later I will give it a try.

Andrea

--
Andrea Chierici - INFN-CNAF 
Viale Berti Pichat 6/2, 40127 BOLOGNA
Office Tel: +39 051 2095463 
SkypeID ataruz
--

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B5253D2ZVAJIY7Y3J3LVKYFJEPTSTCDY/


[ovirt-users] Re: Failing to restore a backup

2020-07-14 Thread Yedidyah Bar David
On Tue, Jul 14, 2020 at 4:40 PM Anton Louw via Users wrote:

> Hi Andrea,
>
> I have had sleepless nights with the same issue, but eventually figured it
> out. The two commands I used are below:
>
> 1. Backup the configs
>
> engine-backup --scope=all --mode=backup --file=Full --log=Log_Full
>
> 2. Restore the configs
>
> engine-backup --mode=restore --file=Full --log=Log_Full --provision-db
> --provision-dwh-db --restore-permissions
>

Andrea asked about hosted-engine, where you do not (by default) run restore
manually yourself.


>
>
> I ran the second command after I built a new HE from scratch. You just
> need to download the backup files from the original and copy them to the
> new HE before you restore.
>
>
>
> Give it a bash and see if it works for you.
>
>
>
> Cheers
>
>
>
> *Anton Louw*
> *Cloud Engineer: Storage and Virtualization* at *Vox*
> --
> *T:*  087 805  | *D:* 087 805 1572
> *M:* N/A
> *E:* anton.l...@voxtelecom.co.za
> *A:* Rutherford Estate, 1 Scott Street, Waverley, Johannesburg
> www.vox.co.za
>

[ovirt-users] Re: Failing to restore a backup

2020-07-14 Thread Anton Louw via Users
Hi Andrea,

I have had sleepless nights with the same issue, but eventually figured it out. 
The two commands I used are below:


  1.  Backup the configs
engine-backup --scope=all --mode=backup --file=Full --log=Log_Full


  2.  Restore the configs
engine-backup --mode=restore --file=Full --log=Log_Full --provision-db 
--provision-dwh-db --restore-permissions

I ran the second command after I built a new HE from scratch. You just need to 
download the backup files from the original and copy them to the new HE before 
you restore.

Give it a bash and see if it works for you.

Cheers


Anton Louw
Cloud Engineer: Storage and Virtualization
__
D: 087 805 1572 | M: N/A
A: Rutherford Estate, 1 Scott Street, Waverley, Johannesburg
anton.l...@voxtelecom.co.za

www.vox.co.za



From: Andrea Chierici 
Sent: 14 July 2020 14:54
To: users@ovirt.org
Subject: [ovirt-users] Failing to restore a backup

Dear all,
I'm rather new to the list, but not to oVirt, which I have been using 
profitably since 2014.
I have a problem with an oVirt instance and I am desperately seeking help.

I run a 4.3 self-hosted engine installation, with 8 hypervisors and iSCSI 
storage.
Since the storage is not very reliable, I bought a Dell PowerVault to 
move everything to. Moving the VMs was no problem; the problem came up with 
the hosted engine.
I've read a lot of documentation, and the procedure I think I must follow 
involves backing up the current HE, powering it off, and installing a new 
host on which to create a new HE by restoring the backup.
The command I used to generate the backup is:
engine-backup --mode=backup --file=file_name --log=log_file_name

and the command used to restore it on the new HE is:
hosted-engine --deploy --restore-from-file=backup/file_name

The problem appears while restoring the backup.

With versions prior to 4.3.11 and also with 4.4.0 I got the error:
2020-06-25 15:17:34,950+0200 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
    "ansible_result": {
        "_ansible_no_log": false,
        "changed": false,
        "invocation": {
            "module_args": {
                "ca_file": null,
                "compress": true,
                "headers": null,
                "hostname": null,
                "insecure": null,
                "kerberos": false,
                "ovirt_auth": {
                    "ansible_facts": {
                        "ovirt_auth": {
                            "ca_file": null,
                            "compress": true,
                            "headers": null,
                            "insecure": true,
                            "kerberos": false,
                            "timeout": 0,
                            "token": "1f5Zoys35sQmLb2MiEg6bhWm2rDJULFan3eBK0juJJR3S-nXtN_b31jac1sZ0KRz3d1KSDmr8tyf7ExNe_pqJg",
                            "url": "https://ovirt-sgsi.cnaf.infn.it/ovirt-engine/api"
                        }
                    },
                    "attempts": 1,
                    "changed": false,
                    "failed": false
                },
                "password": null,
                "state": "absent",
                "timeout": 0,
                "token": null,
                "url": null,
                "username": null
            }
        },
        "msg": "You must specify either 'url' or 'hostname'."
    },
    "ansible_task": "Always revoke the SSO token",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 3
}


Recently I tried the 4.3.11 beta and 4.4.1 and the error now is different:

[ INFO  ] Upgrading CA
[ ERROR ] Failed to execute stage 'Misc configuration': (2, 'No such file or directory')
[ INFO  ] DNF Performing DNF transaction rollback

I simply can't figure out what file is missing.
If, as a test, I try to install the HE without restoring the backup, the 
installation goes smoothly to the end, but at that point I can't restore the 
backup, as far as I can understand.

Any hint on what I may be missing?
Thanks,
Andrea



--

Andrea Chierici - INFN-CNAF

Viale Berti Pichat 6/2, 40127 BOLOGNA

Office Tel: +39 051 2095463

SkypeID ataruz

--


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q4GGDSLBACN4M6X7Q3L4ATW6YQE5P2RJ/