[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-18 Thread Strahil
I think that this can be resolved by a remote (r)syslog system and proper 
documentation.

I would be happy to write a short crash course, but I will definitely need 
assistance from a more experienced person.
So far (7 months later), I still struggle to find my errors and where to 
control the log level.

Best Regards,
Strahil Nikolov

On Jun 18, 2019 12:18, m...@brendanh.com wrote:
>
> "trade-off between time [developing] and time spent debugging such cases when 
> they do happen" 
>
> Your call.  All I know is, it took me over a month to install oVirt, 
> including three weeks of one-to-one time with Simone Tiraboschi from Red Hat. 
>  He sent me eleven emails but eventually gave up, as baffled as I was.  It 
> shouldn't be this hard.  Others who are having this or similar problems will 
> just abandon oVirt: ("I'm just wondering if I should cut my losses with 
> oVirt"): 
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/PZJYNAKPYNQUTTNAXG57HTMQHATWYQGZ/
>  
>
> The biggest challenge is to find the relevant error. What would be useful is 
> a log aggregator.  If oVirt had a journalctl type app running on the host 
> that tails ALL the logs including the engine logs from hosted-engine (via 
> ssh), everything would be in one place and easy to spot.  Currently, you need 
> fairly detailed knowledge of the architecture and install process to (i) find 
> the log files (ii) whittle down to the one displaying the problem.  Yes, I 
> know you guys have a log-packaging app that compresses them up, so they can 
> be sent to Red Hat for inspection (does this even include hosted-engine 
> logs?).  But with a journalling app, users would be able to spot the error 
> themselves and most of the time (if it's not a new bug), fix it on their own 
> like I did.  And yes, I know once set up, users will have their own log 
> aggregator in the form of Kibana, Splunk, etc but these don't help during the 
> initial install.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NLS4XOAB4WH73O4IFYH75KJH23FVFRJH/


[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-18 Thread me
"trade-off between time [developing] and time spent debugging such cases when 
they do happen"

Your call.  All I know is, it took me over a month to install oVirt, including 
three weeks of one-to-one time with Simone Tiraboschi from Red Hat.  He sent me 
eleven emails but eventually gave up, as baffled as I was.  It shouldn't be this 
hard.  Others who are having this or similar problems will just abandon oVirt: 
("I'm just wondering if I should cut my losses with oVirt"):
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/PZJYNAKPYNQUTTNAXG57HTMQHATWYQGZ/

The biggest challenge is to find the relevant error. What would be useful is a 
log aggregator.  If oVirt had a journalctl type app running on the host that 
tails ALL the logs including the engine logs from hosted-engine (via ssh), 
everything would be in one place and easy to spot.  Currently, you need fairly 
detailed knowledge of the architecture and install process to (i) find the log 
files (ii) whittle down to the one displaying the problem.  Yes, I know you 
guys have a log-packaging app that compresses them up, so they can be sent to 
Red Hat for inspection (does this even include hosted-engine logs?).  But with 
a journalling app, users would be able to spot the error themselves and most of 
the time (if it's not a new bug), fix it on their own like I did.  And yes, I 
know once set up, users will have their own log aggregator in the form of 
Kibana, Splunk, etc but these don't help during the initial install.
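
As a rough illustration of the "everything in one place" idea above, here is a minimal Python sketch that pulls ERROR lines out of several logs and orders them by time. It is not an oVirt tool. The function name collect_errors is made up for this example, the timestamp pattern is only assumed from the log excerpts quoted in this thread, and time zone offsets are ignored, so logs written in different zones would need adjusting first.

```python
import re
from datetime import datetime

# Leading "YYYY-MM-DD HH:MM:SS,mmm" stamp, as seen in the logs quoted in this thread.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def collect_errors(logs, level="ERROR"):
    """Return (timestamp, source, line) tuples for lines containing `level`, oldest first.

    `logs` maps a label (for example a file path) to an iterable of text lines.
    Lines that do not start with a recognizable timestamp are skipped, so
    continuation lines of a wrapped message are not reported on their own.
    """
    found = []
    for source, lines in logs.items():
        for line in lines:
            match = STAMP.match(line)
            if match and level in line:
                stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
                stamp = stamp.replace(microsecond=int(match.group(2)) * 1000)
                found.append((stamp, source, line.rstrip("\n")))
    # Sort on the timestamp only; entries with equal stamps keep their input order.
    return sorted(found, key=lambda item: item[0])
```

To try it, read each log of interest into a list of lines (for example /var/log/ovirt-engine/engine.log copied from the engine VM and /var/log/vdsm/mom.log from the host), pass them in as a dict keyed by path, and print the returned tuples.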
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4732OJ7DJFKI474B5TZOYGP4GXBRPTZS/


[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-18 Thread Yedidyah Bar David
On Mon, Jun 17, 2019 at 1:38 PM  wrote:
>
> Googling the pertinent text from the above long error:
> duplicate key value violates unique constraint "name_server_pkey"
> led me to this bug report:
> https://bugzilla.redhat.com/show_bug.cgi?id=1530944
> and the discovery I had a duplicate DNS IP address in /etc/resolv.conf
> Removing this and adding the host again worked :-)
> But it shouldn't have been this hard to install oVirt.  May I suggest 
> tolerance of duplicate DNS IPs be added?
> In above bug report, Yaniv Kaul says won't fix because it's user error.  
> Perhaps, but the oVirt installer should do a modicum of hand-holding IMO.

Perhaps. I am not saying this does not make sense. It's a question of
trade-off - between the time spent preparing a patch, testing it,
potentially having to maintain it in the future, etc., and the time
spent debugging such cases when they do happen, and the other damage
caused (not much, in code related only to new setups - it has very low
chances to e.g. corrupt data etc.).

While in many cases we do try to hand-hold, admittedly especially in
cases of more potential for damage, there is a limit to how much we
can do. Or, in other words (also being a sysadmin for 15 years in the
past, and also seeing many support cases in oVirt/RHV), there is
practically _no_ limit to how much a user can do weird things to the
system...

That said, you know the "standard" answer in such cases... (Verified -
that's the hard part - ) Patches are welcome! :-)

Best regards,
-- 
Didi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MP5JLB2W2PBQBDWUHG6EH2X2SE47CFUA/


[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-17 Thread me
Googling the pertinent text from the above long error:
duplicate key value violates unique constraint "name_server_pkey"
led me to this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1530944
and the discovery I had a duplicate DNS IP address in /etc/resolv.conf
Removing this and adding the host again worked :-)
But it shouldn't have been this hard to install oVirt.  May I suggest tolerance 
of duplicate DNS IPs be added?
In above bug report, Yaniv Kaul says won't fix because it's user error.  
Perhaps, but the oVirt installer should do a modicum of hand-holding IMO.
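
For anyone who wants to check for this before adding a host, here is a minimal Python sketch that reports nameserver addresses listed more than once in resolv.conf-style text. It is only an illustration, not something the oVirt installer does, and the function name duplicate_nameservers is made up for this example.

```python
def duplicate_nameservers(resolv_conf_text):
    """Return nameserver addresses that appear more than once, in first-seen order."""
    seen = set()
    duplicates = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Comment lines in resolv.conf start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            address = fields[1]
            if address in seen and address not in duplicates:
                duplicates.append(address)
            seen.add(address)
    return duplicates
```

Calling it with the contents of /etc/resolv.conf returns an empty list when every nameserver entry is unique. Any address it returns is a candidate for the duplicate key error described above.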
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FAIHQ5CTMBNE2MHVOC6IAQBXCNJ2UINZ/


[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-16 Thread Yuval Turgeman
Hi Edward, you're hitting [1] - it will be included in the next appliance

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1718399


On Monday, June 17, 2019, Edward Berger  wrote:

> The hosted engine is created in two steps, first as a 192.168.x.x address
> as a local VM on the host, then it gets copied over to shared storage and
> gets the real IP address you assigned in the setup wizard.  So that part
> is normal behavior.
>
> I had a recent hosted engine installation failure with oVirt node 4.3.4,
> where the local VM was stuck trying to yum install yum-utils, but couldn't
> because it is behind a firewall, so I ssh'd into the local VM, added a
> proxy line to /etc/yum.conf, kill -HUP'd the bad process and manually
> re-ran the yum install command and it was able to complete the hosted
> engine installation.
>
> If that's not the issue, maybe your node's network config is not something
> the installer expects like preconfigured bridge when it wants to do the
> bridge configuration for itself, or a bond type not supported...
>
>
> On Sun, Jun 16, 2019 at 12:12 PM  wrote:
>
> Hi,
>
> I've been failing to install hosted-engine on oVirt Node for a long time.
> I'm now trying on a Coffee Lake Xeon-based system, having previously tried
> on Broadwell-E.
>
> Using either the webui or hosted-engine --deploy gives a similar result.
> The error in the title occurs when using the webui.  Using hosted-engine
> --deploy shows:
> [ INFO  ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
> [ INFO  ] ok: [localhost]
> [ INFO  ] TASK [ovirt.hosted_engine_setup : Check host status]
> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The
> host has been set in non_operational status, please check engine logs, fix
> accordingly and re-deploy.\n"}
>
> Despite the failure, the oVirt webui can be browsed at https://:6900,
> but the host has status "unassigned".  The Node webui (https://:9090)
> has the engine VM running, but when I login to its console, I see its IP is
> 192.168.122.123, not the DHCP-reserved IP address (on our 10.0.8.x
> network), which doesn't seem right.  I suspect some problem with DHCP, but
> I don't know how to fix.  Any ideas?
>
> vdsm.log shows:
> 2019-06-16 15:06:39,117+ INFO  (vmrecovery) [vds] recovery: waiting
> for storage pool to go up (clientIF:709)
> 2019-06-16 15:06:44,122+ INFO  (vmrecovery) [vdsm.api] START
> getConnectedStoragePoolsList(options=None) from=internal,
> task_id=7f984b0d-9765-457e-ac8e-c5cd0bdf73d2 (api:48)
> 2019-06-16 15:06:44,122+ INFO  (vmrecovery) [vdsm.api] FINISH
> getConnectedStoragePoolsList return={'poollist': []} from=internal,
> task_id=7f984b0d-9765-457e-ac8e-c5cd0bdf73d2 (api:54)
> 2019-06-16 15:06:44,122+ INFO  (vmrecovery) [vds] recovery: waiting
> for storage pool to go up (clientIF:709)
> 2019-06-16 15:06:48,258+ INFO  (periodic/1) [vdsm.api] START
> repoStats(domains=()) from=internal, 
> task_id=0526307b-bb37-4eff-94d6-910ac0d64933
> (api:48)
> 2019-06-16 15:06:48,258+ INFO  (periodic/1) [vdsm.api] FINISH
> repoStats return={} from=internal, 
> task_id=0526307b-bb37-4eff-94d6-910ac0d64933
> (api:54)
> 2019-06-16 15:06:49,126+ INFO  (vmrecovery) [vdsm.api] START
> getConnectedStoragePoolsList(options=None) from=internal,
> task_id=0d5b359e-1a4c-4cc0-87a1-4a41e91ba356 (api:48)
> 2019-06-16 15:06:49,126+ INFO  (vmrecovery) [vdsm.api] FINISH
> getConnectedStoragePoolsList return={'poollist': []} from=internal,
> task_id=0d5b359e-1a4c-4cc0-87a1-4a41e91ba356 (api:54)
> 2019-06-16 15:06:49,126+ INFO  (vmrecovery) [vds] recovery: waiting
> for storage pool to go up (clientIF:709)
> 2019-06-16 15:06:53,040+ INFO  (jsonrpc/5) [api.host] START
> getAllVmStats() from=::1,50104 (api:48)
> 2019-06-16 15:06:53,041+ INFO  (jsonrpc/5) [api.host] FINISH
> getAllVmStats return={'status': {'message': 'Done', 'code': 0},
> 'statsList': (suppressed)} from=::1,50104 (api:54)
> 2019-06-16 15:06:53,041+ INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
> call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312)
> 2019-06-16 15:06:54,132+ INFO  (vmrecovery) [vdsm.api] START
> getConnectedStoragePoolsList(options=None) from=internal,
> task_id=99c33317-7753-4d24-a10b-b716adcdaf76 (api:48)
> 2019-06-16 15:06:54,132+ INFO  (vmrecovery) [vdsm.api] FINISH
> getConnectedStoragePoolsList return={'poollist': []} from=internal,
> task_id=99c33317-7753-4d24-a10b-b716adcdaf76 (api:54)
> 2019-06-16 15:06:54,132+ INFO  (vmrecovery) [vds] recovery: waiting
> for storage pool to go up (clientIF:709)
> 2019-06-16 15:06:59,134+ INFO  (vmrecovery) [vdsm.api] START
> getConnectedStoragePoolsList(options=None) from=internal,
> task_id=8f5679a1-8734-491d-b925-7387effe4726 (api:48)
> 2019-06-16 15:06:59,134+ INFO  (vmrecovery) [vdsm.api] FINISH
> getConnectedStoragePoolsList return={'poollist': []} from=internal,
> task_id=8f5679a1-8734-491d-b925-7387effe4726 (api:54)
> 2019-06-16 15:06:59,134+ I

[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-16 Thread me
Actually, I think I've found the most detailed error in 
/var/log/ovirt-engine/engine.log.  It occurs at exactly the time the webui 
errors ("Failed to configure management network on host"):

2019-06-16 23:16:47,888+01 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand]
 (EE-ManagedThreadFactory-engine-Thread-1) [dd6ed2d] Failed in 
'CollectVdsNetworkDataAfterInstallationVDS' method, for vds: 'host'; host: 
'host.example.com': CallableStatementCallback; SQL [{call insertnameserver(?, 
?, ?)}ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, 
address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO
name_server(
  address,
  position,
  dns_resolver_configuration_id)
VALUES (
  v_address,
  v_position,
  v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at 
SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: 
duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, 
address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO
name_server(
  address,
  position,
  dns_resolver_configuration_id)
VALUES (
  v_address,
  v_position,
  v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at 
SQL statement
2019-06-16 23:16:47,888+01 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand]
 (EE-ManagedThreadFactory-engine-Thread-1) [dd6ed2d] Command 
'CollectVdsNetworkDataAfterInstallationVDSCommand(HostName = host, 
CollectHostNetworkDataVdsCommandParameters:{hostId='5e4bbac0-a0b4-458f-bb9e-f959efaf810f',
 vds='Host[host,5e4bbac0-a0b4-458f-bb9e-f959efaf810f]'})' execution failed: 
CallableStatementCallback; SQL [{call insertnameserver(?, ?, ?)}ERROR: 
duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, 
address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO
name_server(
  address,
  position,
  dns_resolver_configuration_id)
VALUES (
  v_address,
  v_position,
  v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at 
SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: 
duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, 
address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO
name_server(
  address,
  position,
  dns_resolver_configuration_id)
VALUES (
  v_address,
  v_position,
  v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at 
SQL statement
2019-06-16 23:16:47,888+01 ERROR 
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] 
(EE-ManagedThreadFactory-engine-Thread-1) [dd6ed2d] Exception: 
org.ovirt.engine.core.common.errors.EngineException: EngineException: 
org.springframework.dao.DuplicateKeyException: CallableStatementCallback; SQL 
[{call insertnameserver(?, ?, ?)}ERROR: duplicate key value violates unique 
constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, 
address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO
name_server(
  address,
  position,
  dns_resolver_configuration_id)
VALUES (
  v_address,
  v_position,
  v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at 
SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: 
duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, 
address)=(5e4bbac0-a0b4-458f-bb9e-f959efaf810f, 10.0.8.1) already exists.
  Where: SQL statement "INSERT INTO
name_server(
  address,
  position,
  dns_resolver_configuration_id)
VALUES (
  v_address,
  v_position,
  v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at 
SQL statement (Failed with error ENGINE and code 5001)
at 
org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) 
[bll.jar:]
at 
org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33)
 [bll.jar:]
at 
org.ovirt.engine.core.bll.network.NetworkConfigurator.refreshNetworkConfiguration(NetworkConfigurator.java:129)
 [bll.jar:]
at 
org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.co

[ovirt-users] Re: Hosted engine setup: "Failed to configure management network on host Local due to setup networks failure"

2019-06-16 Thread me
Not sure if it's related, but while adding the host, about a minute before the 
timeout/error occurs, /var/log/vdsm/mom.log shows:

2019-06-16 22:58:46,799 - mom - ERROR - Failed to initialize MOM threads
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in run
    hypervisor_iface = self.get_hypervisor_interface()
  File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217, in get_hypervisor_interface
    return module.instance(self.config)
  File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcclientInterface.py", line 96, in instance
    return JsonRpcVdsmClientInterface()
  File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcclientInterface.py", line 31, in __init__
    self._vdsm_api = client.connect(host="localhost")
  File "/usr/lib/python2.7/site-packages/vdsm/client.py", line 157, in connect
    raise ConnectionError(host, port, use_tls, timeout, e)
ConnectionError: Connection to localhost:54321 with use_tls=True, timeout=60 failed: [Errno 111] Connection refused

Is this the cause or is this benign?
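
One way to narrow this down is to check whether anything is listening on the port named in the traceback (localhost:54321) at the time MOM starts. The Python sketch below is only an illustration. It opens a plain TCP connection and does not attempt the TLS handshake that the real client uses, so it can only tell "connection refused" apart from "something accepts connections". The function name port_accepts_connections is made up for this example.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running port_accepts_connections("localhost", 54321) on the host shortly after the error would show whether the listener is still down or whether MOM simply started before it was ready.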

As for your suggestion for the engine-VM, for now I've switched to using plain 
CentOS instead of Node, and this shows the problem occurs prior to adding the 
engine-VM.  The failure occurs in the stage before: adding a new host.  
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/HVOY6LN4HUBGJNA33W5MVHCMZ7BSG76U/