[ovirt-users] Re: HE deploy fails at "Initialize lockspace volume" step

2023-10-23 Thread Giuliano David
Just a note about the QNAP iSCSI target: I hope to get the chance to throw 
that device in the wastebin as soon as possible :) , but for the sake of 
complete information I have to say that I was able to add a new storage 
domain using QNAP iSCSI LUNs after the HE was set up on a different target.

So the problem was only within the HE deployment.



[ovirt-users] Re: HE deploy fails at "Initialize lockspace volume" step

2023-10-23 Thread Giuliano David

Thanks for your suggestion.
I checked the content of /etc/iscsi/initiatorname.iscsi on all my nodes, 
and the IQNs are unique.
The iSCSI target was a QNAP system with a 10Gb/s NIC. I set up a new 
target on a Debian server and the deploy ended successfully (there were 
many other iSCSI errors while deploying the HE, but I managed to solve them).
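
For anyone who wants to repeat the IQN uniqueness check across several nodes, here is a minimal Python sketch. It assumes the usual open-iscsi file format (one `InitiatorName=iqn...` line) and that the file contents have already been collected per host. The host names, IQN values and function names are made up for illustration.

```python
from collections import defaultdict


def parse_initiator_name(text):
    """Return the IQN from the text of an initiatorname.iscsi file, or None."""
    for line in text.splitlines():
        line = line.strip()
        # Comment lines start with '#' and are skipped by this check.
        if line.startswith("InitiatorName="):
            return line.split("=", 1)[1].strip()
    return None


def find_duplicate_iqns(files_by_host):
    """Map each IQN used by more than one host to the list of those hosts."""
    hosts_by_iqn = defaultdict(list)
    for host, text in files_by_host.items():
        iqn = parse_initiator_name(text)
        if iqn is not None:
            hosts_by_iqn[iqn].append(host)
    return {iqn: hosts for iqn, hosts in hosts_by_iqn.items() if len(hosts) > 1}


if __name__ == "__main__":
    # Hypothetical file contents, e.g. gathered over ssh from each node.
    collected = {
        "node1": "InitiatorName=iqn.1994-05.com.redhat:aaa\n",
        "node2": "InitiatorName=iqn.1994-05.com.redhat:bbb\n",
        "node3": "# cloned image\nInitiatorName=iqn.1994-05.com.redhat:aaa\n",
    }
    print(find_duplicate_iqns(collected))
```

An empty result means every node has its own IQN, which is what I found on my nodes.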

It is now clear to me that:
- The QNAP iSCSI target introduces something nasty that the oVirt HE 
deploy cannot manage.
- ovirt-hosted-engine-cleanup is not enough to clean the HE setup 
environment on the node: manually cleaning some directories and a reboot 
are necessary too.
- The ansible script used to deploy the oVirt HE is so fragile ... 
One thing should be implemented in it: the ability to suspend the script 
on failure, let the administrator fix the issue, and then resume the 
script from the failing step (instead of aborting the deployment and 
performing the cleanup, which messes up the logfile too).


Thanks

giuliano



[ovirt-users] Re: HE deploy fails at "Initialize lockspace volume" step

2023-10-21 Thread Strahil Nikolov via Users
The simplest thing to check is whether your host can discover and write to 
the LUN. Is it possible that more than one node has the same client IQN?

Best Regards,
Strahil Nikolov





On Friday, October 20, 2023, 12:38 PM, Giuliano David wrote:

Hi everyone.
I need help understanding a failure while deploying the hosted engine on a 
freshly installed oVirt 4.5.4 el8 node.
After the setup via the official ISO, I log in to the node via ssh and 
issue the command:

# hosted-engine --deploy --4 --ansible-extra-vars=he_offline_deployment=true
-- Note --
The extra ansible variable is the only way I found to prevent the 
deployed hosted engine from downloading the latest OS updates, which break 
Python compatibility between the ansible playbook on the deploying node 
and the ansible host in the deployed engine.
Without that extra variable the deployment fails for strange reasons.
-- End note --

The deployment proceeds until I specify an iSCSI target and a (free) LUN.
The playbook adds the storage domain, creates the HE disk and transfers 
the HE VM to the domain. Then an error occurs:

[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Initialize lockspace 
volume]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Workaround for 
ovirt-ha-broker start failures]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Initialize lockspace 
volume]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 5, "changed": 
true, "cmd": ["hosted-engine", "--reinitialize-lockspace", "--force"], 
"delta": "0:00:00.170053", "end": "2023-10-20 11:21:18.111299", "msg": 
"non-zero return code", "rc": 1, "start": "2023-10-20 11:21:17.941246", 
"stderr": "Traceback (most recent call last):\n  File 
\"/usr/lib64/python3.6/runpy.py\", line 193, in _run_module_as_main\n    
\"__main__\", mod_spec)\n File \"/usr/lib64/python3.6/runpy.py\", line 
85, in _run_code\n exec(code, run_globals)\n  File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py\",
 
line 30, in \n    ha_cli.reset_lockspace(force)\n File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py\", 
line 286, in reset_lockspace\n    stats = 
broker.get_stats_from_storage()\n  File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py\", 
line 148, in get_stats_from_storage\n    result = 
self._proxy.get_stats()\n  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1112, in __call__\n    
return self.__send(self.__name, args)\n  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1452, in __request\n    
verbose=self.__verbose\n  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1154, in request\n    
return self.single_request(host, handler, request_body, verbose)\n  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1166, in 
single_request\n    http_conn = self.send_request(host, handler, 
request_body, verbose)\n  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1279, in 
send_request\n    self.send_content(connection, request_body)\n File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1309, in 
send_content\n    connection.endheaders(request_body)\n  File 
\"/usr/lib64/python3.6/http/client.py\", line 1268, in endheaders\n    
self._send_output(message_body, encode_chunked=encode_chunked)\n  File 
\"/usr/lib64/python3.6/http/client.py\", line 1044, in _send_output\n    
self.send(msg)\n  File \"/usr/lib64/python3.6/http/client.py\", line 
982, in send\n self.connect()\n  File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py\", 
line 76, in connect\n 
self.sock.connect(base64.b16decode(self.host))\nFileNotFoundError: 
[Errno 2] No such file or directory", "stderr_lines": ["Traceback (most 
recent call last):", "  File \"/usr/lib64/python3.6/runpy.py\", line 
193, in _run_module_as_main", "    \"__main__\", mod_spec)", "  File 
\"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code", " exec(code, 
run_globals)", "  File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py\",
 
line 30, in ", "    ha_cli.reset_lockspace(force)", " File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py\", 
line 286, in reset_lockspace", "    stats = 
broker.get_stats_from_storage()", "  File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py\", 
line 148, in get_stats_from_storage", "    result = 
self._proxy.get_stats()", "  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1112, in __call__", "    
return self.__send(self.__name, args)", "  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1452, in __request", 
"    verbose=self.__verbose", "  File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1154, in request", "    
return self.single_request(host, handler, request_body, verbose)", "  
File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1166, in 
single_request", "    http_conn = self.send_request(host, handler, 
request_body, verbose)", "  File
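
The last frame reported in the stderr above (unixrpc.py, line 76) calls `self.sock.connect(base64.b16decode(self.host))` and ends in `FileNotFoundError: [Errno 2]`. The sketch below reproduces that failure mode in isolation. It assumes, as that call suggests, that the host field carries a base16-encoded Unix socket path. It uses no oVirt code, and the socket path and function name are hypothetical. One plausible reading, not confirmed in this thread, is that the broker's socket file did not exist when the client tried to connect.

```python
import base64
import os
import socket
import tempfile


def try_connect(encoded_host):
    """Decode a base16-encoded Unix socket path and try to connect to it."""
    path = base64.b16decode(encoded_host)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return "connected"
    except FileNotFoundError as exc:
        # Raised when no socket file exists at the decoded path.
        return f"FileNotFoundError: errno {exc.errno}"
    finally:
        sock.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical socket path that nothing is listening on.
        missing = os.path.join(tmp, "broker.socket")
        encoded = base64.b16encode(missing.encode()).decode()
        print(try_connect(encoded))
```

If the same thing happens on a real node, checking whether the ovirt-ha-broker service is running and whether its socket file exists would be a reasonable first step.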