Re: [ovirt-users] 3 strikes....

2017-12-28 Thread Michal Skrivanek


> On 28 Dec 2017, at 00:02, Blaster  wrote:
> 
> Well, I've spent the last 2.5 days trying to get oVirt 4.2 up and running.
> 
> I sneeze on it, vdsm has a conniption and there appears to be no way to 
> recover from it.
> 
> 1) Install 4.2.  Everything looks good.  Start copying over some 
> data... accidentally wipe out the master storage domain... It's gone.  The only 
> method Google could suggest was to re-initialize the data center.  Great, 
> I'd love to!  It's greyed out.  Can't get it back... Tried several hosted-engine 
> uninstall methods, including 
> /usr/sbin/ovirt-hosted-engine-cleanup and wiping out the storage.
> 
> re-run hosted-engine --deploy
>   All I get, over and over, in the vdsm log file while waiting for vdsm to 
> become operational is:
> 2017-12-27 16:36:22,150-0600 ERROR (periodic/3) [virt.periodic.Operation] 
>  operation failed 
> (periodic:215)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213, in 
> __call__
> self._func()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 522, in 
> __call__
> self._send_metrics()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 538, in 
> _send_metrics
> vm_sample.interval)
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 45, in 
> produce
> networks(vm, stats, first_sample, last_sample, interval)
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 322, in 
> networks
> if nic.name.startswith('hostdev'):
> AttributeError: name
Not relevant to your issue, just FYI: this error is not significant, and it is 
now fixed by 9a2f73a4384e1d72c3285ef88876e404ec8228ff.

> 2017-12-27 16:36:22,620-0600 INFO  (periodic/1) [vdsm.api] START 
> repoStats(domains=()) from=internal, 
> task_id=94688cf1-a991-433e-9e22-7065ed5dc1bf (api:46)
> 2017-12-27 16:36:22,620-0600 INFO  (periodic/1) [vdsm.api] FINISH repoStats 
> return={} from=internal, task_id=94688cf1-a991-433e-9e22-7065ed5dc1bf (api:52)
> 2017-12-27 16:36:22,621-0600 INFO  (periodic/1) [vdsm.api] START 
> multipath_health() from=internal, 
> task_id=9c680369-8f2a-439e-8fe5-b2a1e33c0706 (api:46)
> 2017-12-27 16:36:22,622-0600 INFO  (periodic/1) [vdsm.api] FINISH 
> multipath_health return={} from=internal, 
> task_id=9c680369-8f2a-439e-8fe5-b2a1e33c0706 (api:52)
> 2017-12-27 16:36:22,633-0600 ERROR (periodic/1) [root] failed to retrieve 
> Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted 
> Engine setup finished? (api:196)
> 2017-12-27 16:36:23,178-0600 INFO  (vmrecovery) [vdsm.api] START 
> getConnectedStoragePoolsList(options=None) from=internal, 
> task_id=a7e48a2f-8cb7-4ec5-acd7-452c8f0c522b (api:46)
> 2017-12-27 16:36:23,179-0600 INFO  (vmrecovery) [vdsm.api] FINISH 
> getConnectedStoragePoolsList return={'poollist': []} from=internal, 
> task_id=a7e48a2f-8cb7-4ec5-acd7-452c8f0c522b (api:52)
> 2017-12-27 16:36:23,179-0600 INFO  (vmrecovery) [vds] recovery: waiting for 
> storage pool to go up (clientIF:643)
> 
> sigh...reinstall 7.4 and do it all over again.
> 
> 2) Copying data to the master storage pool.  Didn't wipe it out this time, but 
> filled the volume instead.  Environment freezes.
>  vdsm can't start... infinite loop waiting for the storage pool again.  Tried 
> cleanup and redeploy.  Same problem as above.
> 7.4 reinstall #2, here we go...
> 
> 3) Up and running again.  Forgot to add my NIC card.  Shut it down.  Boot back 
> up.  vdsm sees the new network interfaces.
> For some reason, it switches ovirtmgmt over to one of the new interfaces, 
> which doesn't have a cable 
> attached to it.  Clean up the ifcfg- files and reboot.  ifcfg-ovirtmgmt is now 
> gone.  Recreate it and reboot.  The interface 
> comes alive, but vdsm is not starting.
> supervdsm log shows:
>  Multiple southbound ports per network detected, ignoring this network for 
> the QoS report (network: ovirtmgmt, ports: ['enp3s0', 'enp4s0'])
> restore-net::DEBUG::2017-12-27 13:10:39,815::cmdutils::150::root::(exec_cmd) 
> /usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
> restore-net::DEBUG::2017-12-27 13:10:39,856::cmdutils::158::root::(exec_cmd) 
> SUCCESS: <err> = ''; <rc> = 0
> restore-net::DEBUG::2017-12-27 13:10:39,863::vsctl::58::root::(commit) 
> Executing commands: /usr/bin/ovs-vsctl --oneline --format=json -- list Bridge 
> -- list Port -- list Interface
> restore-net::DEBUG::2017-12-27 13:10:39,864::cmdutils::150::root::(exec_cmd) 
> /usr/bin/ovs-vsctl --oneline --format=json -- list Bridge -- list Port -- 
> list Interface (cwd None)
> restore-net::DEBUG::2017-12-27 13:10:39,944::cmdutils::158::root::(exec_cmd) 
> SUCCESS: <err> = ''; <rc> = 0
> restore-net::ERROR::2017-12-27 
> 13:10:39,954::restore_net_config::454::root::(restore) unified restoration 
> failed.
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", 
> line 448, in restore
> unified_restoration()
>   File "/usr/lib/python2.7/site-packag

[ovirt-users] 3 strikes....

2017-12-27 Thread Blaster

Well, I've spent the last 2.5 days trying to get oVirt 4.2 up and running.

I sneeze on it, vdsm has a conniption and there appears to be no way to 
recover from it.


1) Install 4.2.  Everything looks good.  Start copying over some 
data... accidentally wipe out the master storage domain... It's gone.  The 
only method Google could suggest was to re-initialize the data center.  
Great, I'd love to!  It's greyed out.  Can't get it back... Tried several 
hosted-engine uninstall methods, including 

/usr/sbin/ovirt-hosted-engine-cleanup and wiping out the storage.

re-run hosted-engine --deploy (the cleanup-and-redeploy sequence is sketched 
after the log below).
  All I get, over and over, in the vdsm log file while waiting for vdsm to 
become operational is:
2017-12-27 16:36:22,150-0600 ERROR (periodic/3) [virt.periodic.Operation] 
 operation failed 
(periodic:215)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213, in 
__call__
self._func()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 522, in 
__call__
self._send_metrics()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 538, in 
_send_metrics
vm_sample.interval)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 45, in 
produce
networks(vm, stats, first_sample, last_sample, interval)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 322, in 
networks
if nic.name.startswith('hostdev'):
AttributeError: name
2017-12-27 16:36:22,620-0600 INFO  (periodic/1) [vdsm.api] START 
repoStats(domains=()) from=internal, 
task_id=94688cf1-a991-433e-9e22-7065ed5dc1bf (api:46)
2017-12-27 16:36:22,620-0600 INFO  (periodic/1) [vdsm.api] FINISH repoStats 
return={} from=internal, task_id=94688cf1-a991-433e-9e22-7065ed5dc1bf (api:52)
2017-12-27 16:36:22,621-0600 INFO  (periodic/1) [vdsm.api] START 
multipath_health() from=internal, task_id=9c680369-8f2a-439e-8fe5-b2a1e33c0706 
(api:46)
2017-12-27 16:36:22,622-0600 INFO  (periodic/1) [vdsm.api] FINISH 
multipath_health return={} from=internal, 
task_id=9c680369-8f2a-439e-8fe5-b2a1e33c0706 (api:52)
2017-12-27 16:36:22,633-0600 ERROR (periodic/1) [root] failed to retrieve 
Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted 
Engine setup finished? (api:196)
2017-12-27 16:36:23,178-0600 INFO  (vmrecovery) [vdsm.api] START 
getConnectedStoragePoolsList(options=None) from=internal, 
task_id=a7e48a2f-8cb7-4ec5-acd7-452c8f0c522b (api:46)
2017-12-27 16:36:23,179-0600 INFO  (vmrecovery) [vdsm.api] FINISH 
getConnectedStoragePoolsList return={'poollist': []} from=internal, 
task_id=a7e48a2f-8cb7-4ec5-acd7-452c8f0c522b (api:52)
2017-12-27 16:36:23,179-0600 INFO  (vmrecovery) [vds] recovery: waiting for 
storage pool to go up (clientIF:643)

sigh...reinstall 7.4 and do it all over again.
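
For anyone hitting the same wall, here is a minimal sketch of the 
cleanup-and-redeploy sequence attempted above. The NFS export path 
/exports/data is a hypothetical example and the HA service names are 
assumptions; adjust both to your storage and host before trying anything.

  # Stop the hosted-engine HA services if they are still running (assumed service names)
  systemctl stop ovirt-ha-agent ovirt-ha-broker

  # Remove the hosted-engine configuration and leftover local state
  /usr/sbin/ovirt-hosted-engine-cleanup

  # Wipe the old hosted-engine storage domain contents on the NFS server (hypothetical path!)
  rm -rf /exports/data/*

  # Start a fresh deployment
  hosted-engine --deploy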

2) Copying data to the master storage pool.  Didn't wipe it out this time, but 
filled the volume instead (a quick free-space check is sketched below).  
Environment freezes.
 vdsm can't start... infinite loop waiting for the storage pool again.  Tried 
cleanup and redeploy.  Same problem as above.
7.4 reinstall #2, here we go...
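
A sketch of a check that might have caught the full volume before the freeze, 
assuming a file-based (NFS) data domain that vdsm mounts under 
/rhev/data-center/mnt (the usual mount root; adjust the path for your setup):

  # Show free space on the storage-domain mounts before/while copying data in
  df -h /rhev/data-center/mnt/*

  # Or keep an eye on it during a large copy
  watch -n 10 'df -h /rhev/data-center/mnt/*'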

3) Up and running again.  Forgot to add my NIC card.  Shut it down.  Boot back 
up.  vdsm sees the new network interfaces.
For some reason, it switches ovirtmgmt over to one of the new interfaces, which 
doesn't have a cable 
attached to it.  Clean up the ifcfg- files and reboot.  ifcfg-ovirtmgmt is now 
gone.  Recreate it and reboot (an example ifcfg pair is sketched after the 
traceback below).  The interface comes alive, but vdsm is not starting.
supervdsm log shows:
 Multiple southbound ports per network detected, ignoring this network for the 
QoS report (network: ovirtmgmt, ports: ['enp3s0', 'enp4s0'])
restore-net::DEBUG::2017-12-27 13:10:39,815::cmdutils::150::root::(exec_cmd) 
/usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
restore-net::DEBUG::2017-12-27 13:10:39,856::cmdutils::158::root::(exec_cmd) SUCCESS: 
<err> = ''; <rc> = 0
restore-net::DEBUG::2017-12-27 13:10:39,863::vsctl::58::root::(commit) 
Executing commands: /usr/bin/ovs-vsctl --oneline --format=json -- list Bridge 
-- list Port -- list Interface
restore-net::DEBUG::2017-12-27 13:10:39,864::cmdutils::150::root::(exec_cmd) 
/usr/bin/ovs-vsctl --oneline --format=json -- list Bridge -- list Port -- list 
Interface (cwd None)
restore-net::DEBUG::2017-12-27 13:10:39,944::cmdutils::158::root::(exec_cmd) SUCCESS: 
<err> = ''; <rc> = 0
restore-net::ERROR::2017-12-27 
13:10:39,954::restore_net_config::454::root::(restore) unified restoration 
failed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", 
line 448, in restore
unified_restoration()
  File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", 
line 131, in unified_restoration
classified_conf = _classify_nets_bonds_config(available_config)
  File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", 
line 260, in _classify_nets_bonds_config
current_config = kernelconfig.KernelConfig(net_info)
  File "/usr/lib/python2.7/site-packages/vdsm/netw