Re: [ovirt-users] hosted engine install fails on useless DHCP lookup

2018-02-15 Thread Simone Tiraboschi
On Thu, Feb 15, 2018 at 1:08 AM, Jamie Lawrence 
wrote:

> > On Feb 14, 2018, at 1:27 AM, Simone Tiraboschi 
> wrote:
> > On Wed, Feb 14, 2018 at 2:11 AM, Jamie Lawrence <
> jlawre...@squaretrade.com> wrote:
> > Hello,
> >
> > I'm seeing the hosted engine install fail on an Ansible playbook step.
> Log below. I tried looking at the file specified for retry, below
> (/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry);
> it contains the word, 'localhost'.
> >
> > The log below didn't contain anything I could see that was actionable;
> given that it was an ansible error, I hunted down the config and enabled
> logging. On this run the error was different - the installer log was the
> same, but the reported error (from the installer) changed.
> >
> > The first time, the installer said:
> >
> > [ INFO  ] TASK [Wait for the host to become non operational]
> > [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts":
> {"ovirt_hosts": []}, "attempts": 150, "changed": false}
> > [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> > [ INFO  ] Stage: Clean up
> >
> > 'localhost' here is not an issue by itself: the playbook is executed on
> the host against the same host over a local connection so localhost is
> absolutely fine there.
> >
> > Maybe you hit this one:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1540451
>
> That seems likely.
>

At that point the engine VM is up, but you can reach it only from that host
since it's on a NATted network.
I'd suggest connecting to the engine VM from there and checking the
host-deploy logs.


>
>
> > It seems NetworkManager related but still not that clear.
> > Stopping NetworkManager and starting network before the deployment seems
> to help.
>
> Tried this, got the same results.
>
> [snip]
> > Anyone see what is wrong here?
> >
> > This is absolutely fine.
> > The new ansible based flow (also called node zero) uses an engine
> running on a local virtual machine to bootstrap the system.
> > The bootstrap local VM runs over libvirt default natted network with its
> own dhcp instance, that's why we are consuming it.
> > The locally running engine will create a target virtual machine on the
> shared storage and that one will be instead configured as you specified.
>
> Thanks for the context - that's useful, and presumably explains why
> 192.168 addresses  (which we don't use) are appearing in the logs.
>
> Not being entirely sure where to go from here, I guess I'll spend the
> evening figuring out ansible-ese in order to try to figure out why it is
> blowing chunks.
>
> Thanks for the note.
>
> -j
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hosted engine install fails on useless DHCP lookup

2018-02-14 Thread Jamie Lawrence
> On Feb 14, 2018, at 1:27 AM, Simone Tiraboschi  wrote:
> On Wed, Feb 14, 2018 at 2:11 AM, Jamie Lawrence  
> wrote:
> Hello,
> 
> I'm seeing the hosted engine install fail on an Ansible playbook step. Log 
> below. I tried looking at the file specified for retry, below 
> (/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry); it 
> contains the word, 'localhost'.
> 
> The log below didn't contain anything I could see that was actionable; given 
> that it was an ansible error, I hunted down the config and enabled logging. 
> On this run the error was different - the installer log was the same, but the 
> reported error (from the installer) changed.
> 
> The first time, the installer said:
> 
> [ INFO  ] TASK [Wait for the host to become non operational]
> [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": 
> []}, "attempts": 150, "changed": false}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing 
> ansible-playbook
> [ INFO  ] Stage: Clean up
> 
> 'localhost' here is not an issue by itself: the playbook is executed on the 
> host against the same host over a local connection so localhost is absolutely 
> fine there.
> 
> Maybe you hit this one:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540451

That seems likely. 


> It seems NetworkManager related but still not that clear.
> Stopping NetworkManager and starting network before the deployment seems to 
> help.

Tried this, got the same results.

[snip]
> Anyone see what is wrong here?
> 
> This is absolutely fine.
> The new ansible based flow (also called node zero) uses an engine running on 
> a local virtual machine to bootstrap the system.
> The bootstrap local VM runs over libvirt default natted network with its own 
> dhcp instance, that's why we are consuming it.
> The locally running engine will create a target virtual machine on the shared 
> storage and that one will be instead configured as you specified.

Thanks for the context - that's useful, and presumably explains why 192.168 
addresses (which we don't use) are appearing in the logs.

Not being entirely sure where to go from here, I guess I'll spend the evening 
figuring out ansible-ese in order to try to figure out why it is blowing chunks.

Thanks for the note. 

-j


Re: [ovirt-users] hosted engine install fails on useless DHCP lookup

2018-02-14 Thread Simone Tiraboschi
On Wed, Feb 14, 2018 at 2:11 AM, Jamie Lawrence 
wrote:

> Hello,
>
> I'm seeing the hosted engine install fail on an Ansible playbook step. Log
> below. I tried looking at the file specified for retry, below
> (/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry);
> it contains the word, 'localhost'.
>
> The log below didn't contain anything I could see that was actionable;
> given that it was an ansible error, I hunted down the config and enabled
> logging. On this run the error was different - the installer log was the
> same, but the reported error (from the installer) changed.
>
> The first time, the installer said:
>
> [ INFO  ] TASK [Wait for the host to become non operational]
> [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts":
> []}, "attempts": 150, "changed": false}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> [ INFO  ] Stage: Clean up
>

'localhost' here is not an issue by itself: the playbook is executed on the
host against the same host over a local connection so localhost is
absolutely fine there.

Maybe you hit this one:
https://bugzilla.redhat.com/show_bug.cgi?id=1540451

It seems NetworkManager related but still not that clear.
Stopping NetworkManager and starting network before the deployment seems to
help.


>
>
> Second:
>
> [ INFO  ] TASK [Get local vm ip]
> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true,
> "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:11:e7:bd | awk
> '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.093840", "end":
> "2018-02-13 16:53:08.658556", "rc": 0, "start": "2018-02-13
> 16:53:08.564716", "stderr": "", "stderr_lines": [], "stdout": "",
> "stdout_lines": []}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
> [ INFO  ] Stage: Clean up
>
>
>
>  Ansible log below; as with that second snippet, it appears that it was
> trying to parse out a host name from virsh's list of DHCP leases, couldn't,
> and died.
>
> Which makes sense: I gave it a static IP, and unless I'm missing
> something, setup should not have been doing that. I verified that the
> answer file has the IP:
>
> OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.181.26.150/24
>
> Anyone see what is wrong here?
>

This is absolutely fine.
The new ansible based flow (also called node zero) uses an engine running
on a local virtual machine to bootstrap the system.
The bootstrap local VM runs on libvirt's default NATted network, which has
its own DHCP instance; that's why we are consuming it.
The locally running engine will create a target virtual machine on the
shared storage and that one will be instead configured as you specified.
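
To make the failing "Get local vm ip" step easier to follow, here is a
minimal Python sketch of what the shell pipeline in that task does (grep for
the MAC, take the fifth whitespace-separated field, drop the prefix length).
The function name and the sample lease table are hypothetical and only
illustrate the usual column layout of `virsh -r net-dhcp-leases default`; the
MAC is the one from the log. This is not the playbook's own code.

```python
# Hypothetical sample of `virsh -r net-dhcp-leases default` output; the
# lease values are made up for illustration, only the MAC is from the log.
SAMPLE_LEASES = """\
 Expiry Time          MAC address        Protocol  IP address          Hostname  Client ID or DUID
---------------------------------------------------------------------------------------------------
 2018-02-13 17:53:08  00:16:3e:11:e7:bd  ipv4      192.168.122.100/24  -         -
"""


def find_lease_ip(leases_text, mac):
    """Return the IP leased to `mac`, or None when no lease line matches."""
    for line in leases_text.splitlines():
        if mac.lower() not in line.lower():   # grep -i <mac>
            continue
        fields = line.split()                 # awk-style whitespace splitting
        if len(fields) < 5:
            continue
        return fields[4].split("/")[0]        # awk '{ print $5 }' | cut -f1 -d'/'
    return None


if __name__ == "__main__":
    # A lease exists for the bootstrap VM's MAC.
    print(find_lease_ip(SAMPLE_LEASES, "00:16:3E:11:E7:BD"))
    # No lease yet: the empty result the task kept retrying on.
    print(find_lease_ip("", "00:16:3e:11:e7:bd"))
```

An empty lease table gives no match, which corresponds to the empty "stdout"
in the second error: the pipeline itself succeeded (rc 0) but the bootstrap
VM had not obtained a lease.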



>
> -j
>
>
> hosted-engine --deploy log:
>
> 2018-02-13 16:20:32,138-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Force host-deploy in offline mode]
> 2018-02-13 16:20:33,041-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 changed: [localhost]
> 2018-02-13 16:20:33,342-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [include_tasks]
> 2018-02-13 16:20:33,443-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 ok: [localhost]
> 2018-02-13 16:20:33,744-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Obtain SSO token using
> username/password credentials]
> 2018-02-13 16:20:35,248-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 ok: [localhost]
> 2018-02-13 16:20:35,550-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Add host]
> 2018-02-13 16:20:37,053-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 changed: [localhost]
> 2018-02-13 16:20:37,355-0800 INFO 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:100 TASK [Wait for the host to become non
> operational]
> 2018-02-13 16:27:48,895-0800 DEBUG 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:94 {u'_ansible_parsed': True,
> u'_ansible_no_log': False, u'changed': False, u'attempts': 150,
> u'invocation': {u'module_args': {u'pattern': u'name=
> ovirt-1.squaretrade.com', u'fetch_nested': False, u'nested_attributes':
> []}}, u'ansible_facts': {u'ovirt_hosts': []}}
> 2018-02-13 16:27:48,995-0800 ERROR 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:98 fatal: [localhost]: FAILED! =>
> {"ansible_facts": {"ovirt_hosts": []}, "attempts": 150, "changed": false}
> 2018-02-13 16:27:49,297-0800 DEBUG 
> otopi.ovirt_hosted_engine_setup.ansible_utils
> ansible_utils._process_output:94 PLAY RECAP [localhost] : ok: 42 changed:
> 17 unreachable: 0 skipped: 2 failed: 1
> 2018-02-13 

[ovirt-users] hosted engine install fails on useless DHCP lookup

2018-02-13 Thread Jamie Lawrence
Hello,

I'm seeing the hosted engine install fail on an Ansible playbook step. Log 
below. I tried looking at the file specified for retry, below 
(/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry); it 
contains the word, 'localhost'. 

The log below didn't contain anything I could see that was actionable; given 
that it was an ansible error, I hunted down the config and enabled logging. On 
this run the error was different - the installer log was the same, but the 
reported error (from the installer) changed. 

The first time, the installer said:

[ INFO  ] TASK [Wait for the host to become non operational]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": []}, 
"attempts": 150, "changed": false}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing 
ansible-playbook
[ INFO  ] Stage: Clean up


Second:

[ INFO  ] TASK [Get local vm ip]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true, 
"cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:11:e7:bd | awk '{ 
print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.093840", "end": "2018-02-13 
16:53:08.658556", "rc": 0, "start": "2018-02-13 16:53:08.564716", "stderr": "", 
"stderr_lines": [], "stdout": "", "stdout_lines": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing 
ansible-playbook
[ INFO  ] Stage: Clean up



Ansible log below; as with that second snippet, it appears that it was trying 
to parse out an IP address from virsh's list of DHCP leases, couldn't, and died. 

Which makes sense: I gave it a static IP, and unless I'm missing something, 
setup should not have been doing that. I verified that the answer file has the 
IP:

OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.181.26.150/24
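
For anyone repeating that check, a minimal Python sketch, assuming the
answer file uses KEY=type:value lines as the entry above suggests. The helper
name is hypothetical, and 192.168.122.0/24 is assumed to be the range of
libvirt's 'default' network (its usual default, not confirmed for this host).

```python
import ipaddress


def parse_answer_line(line):
    """Split 'KEY=type:value' into (key, type, value)."""
    key, _, typed = line.strip().partition("=")
    value_type, _, value = typed.partition(":")
    return key, value_type, value


# Line copied from the answer file quoted above.
LINE = "OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.181.26.150/24"

# Assumption: libvirt's 'default' network uses its usual 192.168.122.0/24 range.
LIBVIRT_DEFAULT = ipaddress.ip_network("192.168.122.0/24")

key, value_type, value = parse_answer_line(LINE)
iface = ipaddress.ip_interface(value)

if __name__ == "__main__":
    print(key, value_type, iface.ip, iface.network)
    print("inside libvirt default range:", iface.ip in LIBVIRT_DEFAULT)
```

The configured static address is outside that range, so any 192.168.x
addresses in the logs would be coming from somewhere other than this setting.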

Anyone see what is wrong here?

-j


hosted-engine --deploy log:

2018-02-13 16:20:32,138-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 TASK [Force host-deploy in offline mode]
2018-02-13 16:20:33,041-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 changed: [localhost]
2018-02-13 16:20:33,342-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 TASK [include_tasks]
2018-02-13 16:20:33,443-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 ok: [localhost]
2018-02-13 16:20:33,744-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 TASK [Obtain SSO token using 
username/password credentials]
2018-02-13 16:20:35,248-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 ok: [localhost]
2018-02-13 16:20:35,550-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 TASK [Add host]
2018-02-13 16:20:37,053-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 changed: [localhost]
2018-02-13 16:20:37,355-0800 INFO otopi.ovirt_hosted_engine_setup.ansible_utils 
ansible_utils._process_output:100 TASK [Wait for the host to become non 
operational]
2018-02-13 16:27:48,895-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 
{u'_ansible_parsed': True, u'_ansible_no_log': False, u'changed': False, 
u'attempts': 150, u'invocation': {u'module_args': {u'pattern': 
u'name=ovirt-1.squaretrade.com', u'fetch_nested': False, u'nested_attributes': 
[]}}, u'ansible_facts': {u'ovirt_hosts': []}}
2018-02-13 16:27:48,995-0800 ERROR 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 
fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": []}, 
"attempts": 150, "changed": false}
2018-02-13 16:27:49,297-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 
PLAY RECAP [localhost] : ok: 42 changed: 17 unreachable: 0 skipped: 2 failed: 1
2018-02-13 16:27:49,397-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 
PLAY RECAP [ovirt-engine-1.squaretrade.com] : ok: 15 changed: 8 unreachable: 0 
skipped: 4 failed: 0
2018-02-13 16:27:49,498-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:180 
ansible-playbook rc: 2
2018-02-13 16:27:49,498-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:187 
ansible-playbook stdout:
2018-02-13 16:27:49,499-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:189  to retry, 
use: --limit 
@/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry

2018-02-13 16:27:49,499-0800 DEBUG 
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:190 
ansible-playbook stderr:
2018-02-13 16:27:49,500-0800 DEBUG otopi.context context._executeMethod:143 
method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File