Re: [ovirt-users] Host cannot connect to hosted storage domain

2017-09-29 Thread Alexander Witte
Hi Sandro,

Thanks so much for the reply.

Yeah, I ended up reinstalling the Engine, referencing the FQDN in the storage 
path.  That fixed all the issues.

Thanks a lot for the Hyperconverged link - at first glance it almost looks like 
the exact setup I am creating and might suit our needs exactly.

Thanks!

Alex Witte
Technology Specialist
Office:  647-349-8484
Mobile: 647-880-7048
www.baicanada.com


On Sep 29, 2017, at 2:21 AM, Sandro Bonazzola wrote:



2017-09-28 2:32 GMT+02:00 Alexander Witte:
Hi!

Question hopefully someone can help me out with:

In my Self Hosted Engine environment, the local storage domain DATA (NFS) that 
was created with the self-hosted engine installation has been configured as 
localhost:/shares


So it seems you're trying to do a Hyperconverged setup on local NFS. Please 
note this is not a supported configuration.
If you need to have the storage on the hosts you use for running Self Hosted 
Engine and other VMs, I would suggest following 
https://ovirt.org/documentation/gluster-hyperconverged/Gluster_Hyperconverged_Guide/



I suspect this is preventing me from adding any additional hosts to the oVirt 
Datacenter, as I am receiving a VDSM error that I cannot mount that domain.  I 
think since the domain is set as localhost it cannot be resolved correctly by 
any additional hosts...?

You are correct: since the NFS export is referenced as localhost:/, it is not 
accessible by other hosts.


 The ISO, Data (Master) and EXPORT domains are set with FQDN paths (e.g. 
FQDN:/shares/iso) and I am not seeing problems specific to them.

I am curious what the correct procedure is to change this hosted engine storage 
domain path from localhost:/shares to FQDN:/shares?  I have attempted this:

1) Put hosted engine in Global Maintenance Mode
2) Shutdown hosted engine
3) edit the /etc/ovirt-hosted-engine/hosted-engine.conf file and change:
storage=10.0.0.223:/shares   to
storage=menmaster.traindemo.local:/shares
4) Restart hosted engine
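
For reference, here is a minimal sketch of that sequence using the standard
hosted-engine CLI (the FQDN and path are the ones from this thread; the sed
command, the HA service restart and the note about other copies of the path
are assumptions on my part, not a tested procedure):

  # on the host currently running the engine VM, as root
  hosted-engine --set-maintenance --mode=global     # 1) global maintenance
  hosted-engine --vm-shutdown                       # 2) stop the engine VM

  # 3) point the HA agent at the FQDN instead of the IP/localhost
  sed -i 's|^storage=.*|storage=menmaster.traindemo.local:/shares|' \
      /etc/ovirt-hosted-engine/hosted-engine.conf
  systemctl restart ovirt-ha-broker ovirt-ha-agent  # pick up the new value

  hosted-engine --vm-start                          # 4) start the engine VM
  hosted-engine --set-maintenance --mode=none       # 5) leave maintenance

Keep in mind that the hosted_storage connection details are most likely also
recorded in the engine database and in the shared storage metadata, so editing
hosted-engine.conf alone may not be enough; on a fresh install, redeploying (as
suggested below) is the cleaner route.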


If this is a fresh install, I would suggest restarting from scratch following 
the Hyperconverged guide.


Although I’m not having any luck restarting the hosted engine afterwards, and 
running journalctl -u on the ovirt-ha-agent service is giving me this:

Sep 27 20:17:19 menmaster.traindemo.local ovirt-ha-agent[2052]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 409, in start_monitoring
    self._initialize_storage_images(force=True)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 651, in _initialize_storage_images
    img.teardown_images()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/image.py", line 218, in teardown_images
    volumeID=volUUID,
  File "/usr/lib/python2.7/site-packages/vdsm/jsonrpcvdscli.py", line 155, in _callMethod
    (methodName, args, e))
Exception: Attempt to call function: teardownImage with arguments: () error: 'teardownImage'
Sep 27 20:17:19 menmaster.traindemo.local ovirt-ha-agent[2052]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
Sep 27 20:17:24 menmaster.traindemo.local ovirt-ha-agent[2052]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Too many errors occurred, giving up. Please review the log and consider filing
Sep 27 20:17:24 menmaster.traindemo.local systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
Sep 27 20:17:24 menmaster.traindemo.local systemd[1]: Unit ovirt-ha-agent.service entered failed state.
Sep 27 20:17:24 menmaster.traindemo.local systemd[1]: 

Re: [ovirt-users] Host cannot connect to hosted storage domain

2017-09-29 Thread Sandro Bonazzola
2017-09-28 2:32 GMT+02:00 Alexander Witte:

> Hi!
>
> Question hopefully someone can help me out with:
>
> In my Self Hosted Engine environment, the local storage domain DATA (NFS)
> that was created with the self-hosted engine installation has been configured
> as localhost:/shares
>


So it seems you're trying to do a Hyperconverged setup on local NFS.
Please note this is not a supported configuration.
If you need to have the storage on the hosts you use for running Self
Hosted Engine and other VMs, I would suggest following
https://ovirt.org/documentation/gluster-hyperconverged/Gluster_Hyperconverged_Guide/



>
> I *suspect* this is preventing me from adding any additional hosts to the
> oVirt Datacenter, as I am receiving a VDSM error that I cannot mount that
> domain.  I think since the domain is set as localhost it cannot be resolved
> correctly by any additional hosts...?
>

You are correct: since the NFS export is referenced as localhost:/, it is not
accessible by other hosts.



>  The ISO, Data (Master) and EXPORT domains are set with FQDN paths (e.g.
> FQDN:/shares/iso) and I am not seeing problems specific to them.
>
> I am curious what the correct procedure is to change this hosted engine
> storage domain path from *localhost:/shares* to *FQDN:/shares*?  I have
> attempted this:
>
> 1) Put hosted engine in Global Maintenance Mode
> 2) Shutdown hosted engine
> 3) edit the /etc/ovirt-hosted-engine/hosted-engine.conf file and
> change:
> storage=10.0.0.223:/shares   to
> storage=menmaster.traindemo.local:/shares
> 4) Restart hosted engine
>
>
If this is a fresh install, I would suggest restarting from scratch
following the Hyperconverged guide.



> Although I’m not having any luck restarting the hosted engine afterwards, and
> running journalctl -u on the ovirt-ha-agent service is giving me this:
>
> Sep 27 20:17:19 menmaster.traindemo.local ovirt-ha-agent[2052]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>     return action(he)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
>     return he.start_monitoring()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 409, in start_monitoring
>     self._initialize_storage_images(force=True)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 651, in _initialize_storage_images
>     img.teardown_images()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/image.py", line 218, in teardown_images
>     volumeID=volUUID,
>   File "/usr/lib/python2.7/site-packages/vdsm/jsonrpcvdscli.py", line 155, in _callMethod
>     (methodName, args, e))
> Exception: Attempt to call function: teardownImage with arguments: () error: 'teardownImage'
> Sep 27 20:17:19 menmaster.traindemo.local ovirt-ha-agent[2052]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
> Sep 27 20:17:24 menmaster.traindemo.local ovirt-ha-agent[2052]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Too many errors occurred, giving up. Please review the log and consider filing
> Sep 27 20:17:24 menmaster.traindemo.local systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
> Sep 27 20:17:24 menmaster.traindemo.local systemd[1]: Unit ovirt-ha-agent.service entered failed state.
> Sep 27 20:17:24 menmaster.traindemo.local systemd[1]: ovirt-ha-agent.service failed.
> Sep 27 20:17:25 menmaster.traindemo.local systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
> Sep 27 20:17:25 menmaster.traindemo.local systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
> Sep 27 20:17:25 menmaster.traindemo.local systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
> Sep 27 20:17:35 menmaster.traindemo.local ovirt-ha-agent[2626]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.storage_server.StorageServer ERROR The hosted-engine storage domain is already mounted on '/rhev/d
> Sep 27 20:17:42 menmaster.traindemo.local ovirt-ha-agent[2626]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call 

Re: [ovirt-users] Host cannot connect to hosted storage domain

2017-09-28 Thread Alexander Witte
What is the correct procedure to change the hosted_storage NFS path?

Right now:
localhost:/shares
Change to:
menmaster.traindemo.local:/shares

1) Put VM in global maintenance
2) Shutdown VM
3) Edit /etc/ovirt-hosted-engine/hosted-engine.conf
4) Restart VM
5) Exit global maintenance

Is this correct?

I think having localhost in the storage domain path is preventing hosts from 
being added to the oVirt datacenter object.
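
A quick way to see why a localhost-based path cannot work from another host (a
sketch; run it on the second host, and the FQDN is the one used elsewhere in
this thread):

  # "localhost" resolves to the second host itself, not to the NFS server
  getent hosts localhost
  getent hosts menmaster.traindemo.local

  # so a localhost:/shares domain makes that host look for the export locally
  showmount -e localhost
  showmount -e menmaster.traindemo.local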

Thanks,

Alex Witte



On Sep 27, 2017, at 11:23 PM, Alexander Witte wrote:

OK, after a host reboot I was able to get the Engine VM up again and into the 
Web interface.  However, whenever I try to add a second host to the datacenter 
within oVirt I run into the error:

"Host mennode2 cannot access the Storage Domain(s) hosted_storage attached to 
the Data Center Train1.  Setting Host state to Non Operational."

Note:  I can successfully read (and mount) the NFS exports oVirt is complaining 
about:

[root@mennode2 ~]# showmount -e menmaster.traindemo.local
Export list for menmaster.traindemo.local:
/shares *
/shares/exports *
/shares/data *
/shares/isos *
[root@mennode2 ~]#

[root@mennode2 tmp]# mount -t nfs menmaster.traindemo.local:/shares test
[root@mennode2 tmp]# cd test
[root@mennode2 test]# ls
7d18ff24-57a3-4b4a-9934-0263191fe2e4  data  __DIRECT_IO_TEST__  exports  isos
[root@mennode2 test]#
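
Since the plain mount works, a check closer to what VDSM actually does may help
rule out NFS-version or permission problems (a sketch only; the mount options
are an approximation of oVirt's usual NFSv3 defaults, and it assumes the vdsm
user, uid 36, exists on the host):

[root@mennode2 tmp]# mkdir -p vdsmtest
[root@mennode2 tmp]# mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 menmaster.traindemo.local:/shares vdsmtest
[root@mennode2 tmp]# sudo -u vdsm touch vdsmtest/__write_test__ && echo "vdsm can write"
[root@mennode2 tmp]# sudo -u vdsm rm -f vdsmtest/__write_test__
[root@mennode2 tmp]# umount vdsmtest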

One thing I DO notice is that the hosted_storage domain path differs from the 
paths of my other exports.  I wonder if the second host would have issues 
resolving this?

Data           ==> menmaster.traindemo.local:/shares/data
Export         ==> menmaster.traindemo.local:/shares/exports
Hosted_storage ==> localhost:/shares
ISO            ==> menmaster.traindemo.local:/shares/exports
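
To confirm which path each host will actually use, a quick check (these are the
standard oVirt locations already visible in the conf file and logs below; the
conf file only exists on hosts where hosted-engine has been deployed):

  # path the HA agent uses for hosted_storage
  grep '^storage=' /etc/ovirt-hosted-engine/hosted-engine.conf

  # storage domains currently mounted by VDSM
  ls /rhev/data-center/mnt/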


Below I have copied excerpts from the VDSM and OVIRT-ENGINE logs and the output 
of the hosted-engine.conf file.  Any help in pinpointing the source of the 
problem is greatly appreciated!!


VDSM logs:

2017-09-27 23:10:01,700-0400 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call 
Host.getStats succeeded in 0.06 seconds (__init__:539)
2017-09-27 23:10:03,104-0400 INFO  (jsonrpc/5) [vdsm.api] START 
getSpmStatus(spUUID=u'59c7f8f3-0063-00a8-02c7-00f3', options=None) 
from=:::10.0.0.227,39748, flow_id=19e2dbb3, 
task_id=7a05a5bd-9b15-43be-890d-c6f5d7650e5c (api:46)
2017-09-27 23:10:03,109-0400 INFO  (jsonrpc/5) [vdsm.api] FINISH getSpmStatus 
return={'spm_st': {'spmId': 1, 'spmStatus': 'SPM', 'spmLver': 8L}} 
from=:::10.0.0.227,39748, flow_id=19e2dbb3, 
task_id=7a05a5bd-9b15-43be-890d-c6f5d7650e5c (api:52)
2017-09-27 23:10:03,110-0400 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call 
StoragePool.getSpmStatus succeeded in 0.01 seconds (__init__:539)
2017-09-27 23:10:03,189-0400 INFO  (jsonrpc/7) [vdsm.api] START 
getStoragePoolInfo(spUUID=u'59c7f8f3-0063-00a8-02c7-00f3', 
options=None) from=:::10.0.0.227,39866, flow_id=19e2dbb3, 
task_id=0f8b49e9-9d82-457e-a2a2-39dc7ed9f022 (api:46)
2017-09-27 23:10:03,196-0400 INFO  (jsonrpc/7) [vdsm.api] FINISH 
getStoragePoolInfo return={'info': {'name': 'No Description', 'isoprefix': 
u'/rhev/data-center/mnt/menmaster.traindemo.local:_shares_isos/da001a29-eca5-44d6-a097-129dd9be623f/images/----',
 'pool_status': 'connected', 'lver': 8L, 'domains': 
u'da001a29-eca5-44d6-a097-129dd9be623f:Active,f36157cc-b25a-400a-ab0f-a071e8a8eea7:Active,7d18ff24-57a3-4b4a-9934-0263191fe2e4:Active,795d4a1d-3ceb-4773-99de-8e7cf05112f3:Active',
 'master_uuid': u'f36157cc-b25a-400a-ab0f-a071e8a8eea7', 'version': '4', 
'spm_id': 1, 'type': 'NFS', 'master_ver': 1}, 'dominfo': 
{u'da001a29-eca5-44d6-a097-129dd9be623f': {'status': u'Active', 'diskfree': 
'1044166737920', 'isoprefix': 
u'/rhev/data-center/mnt/menmaster.traindemo.local:_shares_isos/da001a29-eca5-44d6-a097-129dd9be623f/images/----',
 'alerts': [], 'disktotal': '1049702170624', 'version': 0}, 
u'f36157cc-b25a-400a-ab0f-a071e8a8eea7': {'status': u'Active', 'diskfree': 
'1044166737920', 'isoprefix': '', 'alerts': [], 'disktotal': '1049702170624', 
'version': 4}, u'7d18ff24-57a3-4b4a-9934-0263191fe2e4': {'status': u'Active', 
'diskfree': '1044166737920', 'isoprefix': '', 'alerts': [], 'disktotal': 
'1049702170624', 'version': 4}, u'795d4a1d-3ceb-4773-99de-8e7cf05112f3': 
{'status': u'Active', 'diskfree': '1044166737920', 'isoprefix': '', 'alerts': 
[], 'disktotal': '1049702170624', 'version': 0}}} from=:::10.0.0.227,39866, 
flow_id=19e2dbb3, task_id=0f8b49e9-9d82-457e-a2a2-39dc7ed9f022 (api:52)
2017-09-27 23:10:03,198-0400 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call 
StoragePool.getInfo succeeded in 0.01 seconds (__init__:539)
2017-09-27 23:10:04,212-0400 INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:35980 
(protocoldetector:72)
2017-09-27 23:10:04,223-0400 INFO  (Reactor thread) 

Re: [ovirt-users] Host cannot connect to hosted storage domain

2017-09-27 Thread Alexander Witte
OK, after a host reboot I was able to get the Engine VM up again and into the 
Web interface.  However, whenever I try to add a second host to the datacenter 
within oVirt I run into the error:

"Host mennode2 cannot access the Storage Domain(s) hosted_storage attached to 
the Data Center Train1.  Setting Host state to Non Operational."

Note:  I can successfully read (and mount) the NFS exports oVirt is complaining 
about:

[root@mennode2 ~]# showmount -e menmaster.traindemo.local
Export list for menmaster.traindemo.local:
/shares *
/shares/exports *
/shares/data *
/shares/isos *
[root@mennode2 ~]#

[root@mennode2 tmp]# mount -t nfs menmaster.traindemo.local:/shares test
[root@mennode2 tmp]# cd test
[root@mennode2 test]# ls
7d18ff24-57a3-4b4a-9934-0263191fe2e4  data  __DIRECT_IO_TEST__  exports  isos
[root@mennode2 test]#

One thing I DO notice is that the hosted_storage domain path differs from the 
paths of my other exports.  I wonder if the second host would have issues 
resolving this?

Data           ==> menmaster.traindemo.local:/shares/data
Export         ==> menmaster.traindemo.local:/shares/exports
Hosted_storage ==> localhost:/shares
ISO            ==> menmaster.traindemo.local:/shares/exports


Below I have copied excerpts from the VDSM and OVIRT-ENGINE logs and the output 
of the hosted-engine.conf file.  Any help in pinpointing the source of the 
problem is greatly appreciated!!


VDSM logs:

2017-09-27 23:10:01,700-0400 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call 
Host.getStats succeeded in 0.06 seconds (__init__:539)
2017-09-27 23:10:03,104-0400 INFO  (jsonrpc/5) [vdsm.api] START 
getSpmStatus(spUUID=u'59c7f8f3-0063-00a8-02c7-00f3', options=None) 
from=:::10.0.0.227,39748, flow_id=19e2dbb3, 
task_id=7a05a5bd-9b15-43be-890d-c6f5d7650e5c (api:46)
2017-09-27 23:10:03,109-0400 INFO  (jsonrpc/5) [vdsm.api] FINISH getSpmStatus 
return={'spm_st': {'spmId': 1, 'spmStatus': 'SPM', 'spmLver': 8L}} 
from=:::10.0.0.227,39748, flow_id=19e2dbb3, 
task_id=7a05a5bd-9b15-43be-890d-c6f5d7650e5c (api:52)
2017-09-27 23:10:03,110-0400 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call 
StoragePool.getSpmStatus succeeded in 0.01 seconds (__init__:539)
2017-09-27 23:10:03,189-0400 INFO  (jsonrpc/7) [vdsm.api] START 
getStoragePoolInfo(spUUID=u'59c7f8f3-0063-00a8-02c7-00f3', 
options=None) from=:::10.0.0.227,39866, flow_id=19e2dbb3, 
task_id=0f8b49e9-9d82-457e-a2a2-39dc7ed9f022 (api:46)
2017-09-27 23:10:03,196-0400 INFO  (jsonrpc/7) [vdsm.api] FINISH 
getStoragePoolInfo return={'info': {'name': 'No Description', 'isoprefix': 
u'/rhev/data-center/mnt/menmaster.traindemo.local:_shares_isos/da001a29-eca5-44d6-a097-129dd9be623f/images/----',
 'pool_status': 'connected', 'lver': 8L, 'domains': 
u'da001a29-eca5-44d6-a097-129dd9be623f:Active,f36157cc-b25a-400a-ab0f-a071e8a8eea7:Active,7d18ff24-57a3-4b4a-9934-0263191fe2e4:Active,795d4a1d-3ceb-4773-99de-8e7cf05112f3:Active',
 'master_uuid': u'f36157cc-b25a-400a-ab0f-a071e8a8eea7', 'version': '4', 
'spm_id': 1, 'type': 'NFS', 'master_ver': 1}, 'dominfo': 
{u'da001a29-eca5-44d6-a097-129dd9be623f': {'status': u'Active', 'diskfree': 
'1044166737920', 'isoprefix': 
u'/rhev/data-center/mnt/menmaster.traindemo.local:_shares_isos/da001a29-eca5-44d6-a097-129dd9be623f/images/----',
 'alerts': [], 'disktotal': '1049702170624', 'version': 0}, 
u'f36157cc-b25a-400a-ab0f-a071e8a8eea7': {'status': u'Active', 'diskfree': 
'1044166737920', 'isoprefix': '', 'alerts': [], 'disktotal': '1049702170624', 
'version': 4}, u'7d18ff24-57a3-4b4a-9934-0263191fe2e4': {'status': u'Active', 
'diskfree': '1044166737920', 'isoprefix': '', 'alerts': [], 'disktotal': 
'1049702170624', 'version': 4}, u'795d4a1d-3ceb-4773-99de-8e7cf05112f3': 
{'status': u'Active', 'diskfree': '1044166737920', 'isoprefix': '', 'alerts': 
[], 'disktotal': '1049702170624', 'version': 0}}} from=:::10.0.0.227,39866, 
flow_id=19e2dbb3, task_id=0f8b49e9-9d82-457e-a2a2-39dc7ed9f022 (api:52)
2017-09-27 23:10:03,198-0400 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call 
StoragePool.getInfo succeeded in 0.01 seconds (__init__:539)
2017-09-27 23:10:04,212-0400 INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:35980 
(protocoldetector:72)
2017-09-27 23:10:04,223-0400 INFO  (Reactor thread) [ProtocolDetector.Detector] 
Detected protocol stomp from ::1:35980 (protocoldetector:127)
2017-09-27 23:10:04,224-0400 INFO  (Reactor thread) [Broker.StompAdapter] 
Processing CONNECT request (stompreactor:103)
2017-09-27 23:10:04,225-0400 INFO  (JsonRpc (StompReactor)) 
[Broker.StompAdapter] Subscribe command received (stompreactor:132)
2017-09-27 23:10:04,622-0400 INFO  (monitor/7d18ff2) [storage.SANLock] 
Acquiring host id for domain 7d18ff24-57a3-4b4a-9934-0263191fe2e4 (id=1, 
async=True) (clusterlock:288)
2017-09-27 23:10:04,623-0400 ERROR (monitor/7d18ff2) [storage.Monitor] Error 
acquiring host id 1 for