[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
I think that this can be resolved with a remote (r)syslog system and proper documentation. I would be happy to write a short crash course, but I will definitely need assistance from a more experienced person. So far (7 months later), I still struggle to find my errors and to work out where the log level is controlled.

Best Regards,
Strahil Nikolov

On Jun 18, 2019 12:18, m...@brendanh.com wrote:
>
> "trade-off between time [developing] and time spent debugging such cases when
> they do happen"
>
> Your call. All I know is, it took me over a month to install oVirt,
> including three weeks of one-to-one time with Simone Tiraboschi from Red Hat.
> He sent me eleven emails but eventually gave up, baffled as me. It
> shouldn't be this hard. Others who are having this or similar problems will
> just abandon oVirt: ("I'm just wondering if I should cut my losses with
> oVirt"):
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/PZJYNAKPYNQUTTNAXG57HTMQHATWYQGZ/
>
> The biggest challenge is to find the relevant error. What would be useful is
> a log aggregator. If oVirt had a journalctl type app running on the host
> that tails ALL the logs including the engine logs from hosted-engine (via
> ssh), everything would be in one place and easy to spot. Currently, you need
> fairly detailed knowledge of the architecture and install process to (i) find
> the log files (ii) whittle down to the one displaying the problem. Yes, I
> know you guys have a log-packaging app that compresses them up, so they can
> be sent to Red Hat for inspection (does this even include hosted-engine
> logs?). But with a journalling app, users would be able to spot the error
> themselves and most of the time (if it's not a new bug), fix it on their own
> like I did. And yes, I know once set up, users will have their own log
> aggregator in the form of Kibana, Splunk, etc but these don't help during the
> initial install.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/NLS4XOAB4WH73O4IFYH75KJH23FVFRJH/
[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
"trade-off between time [developing] and time spent debugging such cases when they do happen" Your call. All I know is, it took me over a month to install oVirt, including three weeks of one-to-one time with Simone Tiraboschi from Red Hat. He sent me eleven emails but eventually gave up, baffled as me. It shouldn't be this hard. Others who are having this or similar problems will just abandon oVirt: ("I'm just wondering if I should cut my losses with oVirt"): https://lists.ovirt.org/archives/list/users@ovirt.org/thread/PZJYNAKPYNQUTTNAXG57HTMQHATWYQGZ/ The biggest challenge is to find the relevant error. What would be useful is a log aggregator. If oVirt had a journalctl type app running on the host that tails ALL the logs including the engine logs from hosted-engine (via ssh), everything would be in one place and easy to spot. Currently, you need fairly detailed knowledge of the architecture and install process to (i) find the log files (ii) whittle down to the one displaying the problem. Yes, I know you guys have a log-packaging app that compresses them up, so they can be sent to Red Hat for inspection (does this even include hosted-engine logs?). But with a journalling app, users would be able to spot the error themselves and most of the time (if it's not a new bug), fix it on their own like I did. And yes, I know once set up, users will have their own log aggregator in the form of Kibana, Splunk, etc but these don't help during the initial install. ___ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/4732OJ7DJFKI474B5TZOYGP4GXBRPTZS/
[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
On Mon, Jun 17, 2019 at 1:38 PM wrote:
>
> Googling the pertinent text from the above long error:
> duplicate key value violates unique constraint "name_server_pkey"
> led me to this bug report:
> https://bugzilla.redhat.com/show_bug.cgi?id=1530944
> and the discovery I had a duplicate DNS IP address in /etc/resolv.conf
> Removing this and adding the host again worked :-)
> But it shouldn't have been this hard to install oVirt. May I suggest
> tolerance of duplicate DNS IPs be added?
> In above bug report, Yaniv Kaul says won't fix because it's user error.
> Perhaps, but the oVirt installer should do a modicum of hand-holding IMO.

Perhaps. I am not saying this does not make sense. It's a question of trade-off - between the time spent preparing a patch, testing it, potentially having to maintain it in the future, etc., and the time spent debugging such cases when they do happen, and the other damage caused (not much, in code related only to new setups - it has a very low chance of, e.g., corrupting data).

While in many cases we do try to hand-hold, especially (admittedly) where there is more potential for damage, there is a limit to how much we can do. Or, in other words (also having been a sysadmin for 15 years in the past, and also seeing many support cases in oVirt/RHV), there is practically _no_ limit to the weird things a user can do to the system...

That said, you know the "standard" answer in such cases... (Verified - that's the hard part - ) Patches are welcome! :-)

Best regards,
--
Didi

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MP5JLB2W2PBQBDWUHG6EH2X2SE47CFUA/
[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
Googling the pertinent text from the above long error:

duplicate key value violates unique constraint "name_server_pkey"

led me to this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1530944
and the discovery that I had a duplicate DNS IP address in /etc/resolv.conf. Removing this and adding the host again worked :-)

But it shouldn't have been this hard to install oVirt. May I suggest that tolerance of duplicate DNS IPs be added? In the above bug report, Yaniv Kaul says won't fix because it's user error. Perhaps, but the oVirt installer should do a modicum of hand-holding IMO.

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/FAIHQ5CTMBNE2MHVOC6IAQBXCNJ2UINZ/
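For anyone who wants to rule this cause out before adding a host, a small Python sketch along these lines could flag repeated nameserver entries. It is not part of the oVirt installer; `duplicate_nameservers` is a made-up helper name, and treating everything after `#` or `;` as a comment is a simplifying assumption.

```python
from collections import Counter


def duplicate_nameservers(resolv_conf_text):
    """Return nameserver addresses listed more than once, in first-seen order."""
    addresses = []
    for line in resolv_conf_text.splitlines():
        # Simplification: cut the line at the first '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            addresses.append(fields[1])
    counts = Counter(addresses)
    # dict.fromkeys keeps one copy of each address in original order.
    return [addr for addr in dict.fromkeys(addresses) if counts[addr] > 1]
```

Calling it with the contents of /etc/resolv.conf on the host should return an empty list when there are no duplicates. In the case described above it would be expected to return the repeated DNS address.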
[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
Hi Edward, you're hitting [1] - the fix will be included in the next appliance.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1718399

On Monday, June 17, 2019, Edward Berger wrote:
> The hosted engine is created in two steps, first as a 192.168.x.x address
> as a local VM on the host, then it gets copied over to shared storage and
> gets the real IP address you assigned in the setup wizard. So that part
> is normal behavior.
>
> I had a recent hosted engine installation failure with oVirt node 4.3.4,
> where the local VM was stuck trying to yum install yum-utils, but couldn't
> because it is behind a firewall, so I ssh'd into the local VM, added a
> proxy line to /etc/yum.conf, kill -HUP'd the bad process and manually
> re-ran the yum install command and it was able to complete the hosted
> engine installation.
>
> If that's not the issue, maybe your node's network config is not something
> the installer expects like preconfigured bridge when it wants to do the
> bridge configuration for itself, or a bond type not supported...
>
> On Sun, Jun 16, 2019 at 12:12 PM wrote:
> > Hi,
> >
> > I've been failing to install hosted-engine on oVirt Node for a long time.
> > I'm now trying on a Coffee Lake Xeon-based system, having previously tried
> > on Broadwell-E.
> >
> > Using the webui or hosted-engine --deploy gives a similar result.
> > The error in the title occurs when using the webui. Using hosted-engine
> > --deploy shows:
> > [ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
> > [ INFO ] ok: [localhost]
> > [ INFO ] TASK [ovirt.hosted_engine_setup : Check host status]
> > [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
> >
> > Despite the failure, the oVirt webui can be browsed on https://:6900,
> > but the host has status "unassigned". The Node webui (https://:9090)
> > has the engine VM running, but when I log in to its console, I see its IP is
> > 192.168.122.123, not the DHCP-reserved IP address (on our 10.0.8.x
> > network), which doesn't seem right. I suspect some problem with DHCP, but
> > I don't know how to fix it. Any ideas?
> >
> > vdsm.log shows:
> > 2019-06-16 15:06:39,117+ INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:709)
> > 2019-06-16 15:06:44,122+ INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=7f984b0d-9765-457e-ac8e-c5cd0bdf73d2 (api:48)
> > 2019-06-16 15:06:44,122+ INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=7f984b0d-9765-457e-ac8e-c5cd0bdf73d2 (api:54)
> > 2019-06-16 15:06:44,122+ INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:709)
> > 2019-06-16 15:06:48,258+ INFO (periodic/1) [vdsm.api] START repoStats(domains=()) from=internal, task_id=0526307b-bb37-4eff-94d6-910ac0d64933 (api:48)
> > 2019-06-16 15:06:48,258+ INFO (periodic/1) [vdsm.api] FINISH repoStats return={} from=internal, task_id=0526307b-bb37-4eff-94d6-910ac0d64933 (api:54)
> > 2019-06-16 15:06:49,126+ INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=0d5b359e-1a4c-4cc0-87a1-4a41e91ba356 (api:48)
> > 2019-06-16 15:06:49,126+ INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=0d5b359e-1a4c-4cc0-87a1-4a41e91ba356 (api:54)
> > 2019-06-16 15:06:49,126+ INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:709)
> > 2019-06-16 15:06:53,040+ INFO (jsonrpc/5) [api.host] START getAllVmStats() from=::1,50104 (api:48)
> > 2019-06-16 15:06:53,041+ INFO (jsonrpc/5) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,50104 (api:54)
> > 2019-06-16 15:06:53,041+ INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312)
> > 2019-06-16 15:06:54,132+ INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=99c33317-7753-4d24-a10b-b716adcdaf76 (api:48)
> > 2019-06-16 15:06:54,132+ INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=99c33317-7753-4d24-a10b-b716adcdaf76 (api:54)
> > 2019-06-16 15:06:54,132+ INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:709)
> > 2019-06-16 15:06:59,134+ INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=8f5679a1-8734-491d-b925-7387effe4726 (api:48)
> > 2019-06-16 15:06:59,134+ INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=8f5679a1-8734-491d-b925-7387effe4726 (api:54)
> > 2019-06-16 15:06:59,134+ I
[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
Actually, I think I've found the most detailed error in /var/log/ovirt-engine/engine.log. It occurs at exactly the time the webui errors ("Failed to configure management network on host"):

2019-06-16 23:16:47,888+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [dd6ed2d] Failed in 'CollectVdsNetworkDataAfterInstallationVDS' method, for vds: 'host'; host: 'host.example.com': CallableStatementCallback; SQL [{call insertnameserver(?, ?, ?)}ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO name_server( address, position, dns_resolver_configuration_id) VALUES ( v_address, v_position, v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO name_server( address, position, dns_resolver_configuration_id) VALUES ( v_address, v_position, v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement

2019-06-16 23:16:47,888+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [dd6ed2d] Command 'CollectVdsNetworkDataAfterInstallationVDSCommand(HostName = host, CollectHostNetworkDataVdsCommandParameters:{hostId='5e4bbac0-a0b4-458f-bb9e-f959efaf810f', vds='Host[host,5e4bbac0-a0b4-458f-bb9e-f959efaf810f]'})' execution failed: CallableStatementCallback; SQL [{call insertnameserver(?, ?, ?)}ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO name_server( address, position, dns_resolver_configuration_id) VALUES ( v_address, v_position, v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO name_server( address, position, dns_resolver_configuration_id) VALUES ( v_address, v_position, v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement

2019-06-16 23:16:47,888+01 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [dd6ed2d] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.springframework.dao.DuplicateKeyException: CallableStatementCallback; SQL [{call insertnameserver(?, ?, ?)}ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO name_server( address, position, dns_resolver_configuration_id) VALUES ( v_address, v_position, v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO name_server( address, position, dns_resolver_configuration_id) VALUES ( v_address, v_position, v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement (Failed with error ENGINE and code 5001)
    at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:]
    at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
    at org.ovirt.engine.core.bll.network.NetworkConfigurator.refreshNetworkConfiguration(NetworkConfigurator.java:129) [bll.jar:]
    at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.co
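The useful part of the excerpt above is buried in a lot of repetition: the constraint name and the key values. The following Python sketch shows one way to pull just those out of a chunk of engine.log text. The regular expression is written against the message wording shown in this post only, and `find_duplicate_keys` is a hypothetical helper, not something shipped with oVirt.

```python
import re

# Matches the PostgreSQL message as it appears in the excerpt above:
#   duplicate key value violates unique constraint "<name>"
#   Detail: Key (<columns>)=(<values>) already exists.
DUPLICATE_KEY = re.compile(
    r'duplicate key value violates unique constraint "(?P<constraint>[^"]+)"'
    r"\s+Detail: Key \((?P<columns>[^)]*)\)=\((?P<values>[^)]*)\) already exists"
)


def find_duplicate_keys(log_text):
    """Return unique (constraint, values) pairs in order of first appearance."""
    seen = []
    for match in DUPLICATE_KEY.finditer(log_text):
        values = tuple(v.strip() for v in match.group("values").split(","))
        entry = (match.group("constraint"), values)
        if entry not in seen:
            seen.append(entry)
    return seen
```

Run over the three log entries quoted here, this should collapse the repeated messages into a single pair naming `name_server_pkey` and the address 10.0.8.1, which is the detail that points at a repeated DNS server.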
[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"
Not sure if it's related, but while adding the host, about a minute before the timeout/error occurs, /var/log/vdsm/mom.log shows:

2019-06-16 22:58:46,799 - mom - ERROR - Failed to initialize MOM threads
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in run
    hypervisor_iface = self.get_hypervisor_interface()
  File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217, in get_hypervisor_interface
    return module.instance(self.config)
  File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcclientInterface.py", line 96, in instance
    return JsonRpcVdsmClientInterface()
  File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcclientInterface.py", line 31, in __init__
    self._vdsm_api = client.connect(host="localhost")
  File "/usr/lib/python2.7/site-packages/vdsm/client.py", line 157, in connect
    raise ConnectionError(host, port, use_tls, timeout, e)
ConnectionError: Connection to localhost:54321 with use_tls=True, timeout=60 failed: [Errno 111] Connection refused

Is this the cause or is this benign?

As for your suggestion for the engine-VM, for now I've switched to using plain CentOS instead of Node, and this shows the problem occurs prior to adding the engine-VM. The failure occurs in the stage before: adding a new host.

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/HVOY6LN4HUBGJNA33W5MVHCMZ7BSG76U/
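One way to narrow down a "Connection refused" like the one in the traceback above is to check whether anything is listening on the port at that moment. The Python sketch below does only a plain TCP connect. It does not speak TLS or the vdsm protocol, so a `True` result means a listener exists and nothing more, and `port_is_open` is a name chosen for this example.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On the host, `port_is_open("localhost", 54321)` uses the host and port named in the mom.log error. If it returns `False` only briefly while the host is being added, that would suggest mom simply started before the service on that port was ready. This message does not establish whether that is what happened here.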