On 19/12/2016 at 08:28, Sahina Bose wrote:
On Fri, Dec 16, 2016 at 11:00 PM, Nathanaël Blanchet <[email protected]> wrote:
On 16/12/2016 at 16:34, Sahina Bose wrote:
Failed to find host
'Host[guadalupe1,7a30c899-a317-479a-b07b-244bc2374485]' in
gluster peer list from
'Host[guadalupe1,7a30c899-a317-479a-b07b-244bc2374485]' on attempt 2
It looks like the gluster UUID saved in the oVirt engine DB does not
match the one returned by the CLI.
Was this host reinstalled?
You may need to remove the host from the engine and add it again. If
that doesn't work, you may need to manually change the UUID value in
the database (gluster_server table).
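To illustrate the kind of mismatch described above, here is a minimal Python sketch. It assumes the peer list has the same shape as the peerStatus return value quoted later in this thread (a list of dicts with 'status', 'hostname' and 'uuid'). The function name and the engine-side UUID are hypothetical placeholders; this is not how the engine itself performs the check.

```python
# Peer list copied from the supervdsm.log excerpt quoted in this thread.
PEERS = [
    {"status": "CONNECTED", "hostname": "10.34.101.56/24",
     "uuid": "c259c09b-8d7c-4b12-8745-677199877583"},
    {"status": "CONNECTED", "hostname": "guadalupe3.v100.abes.fr",
     "uuid": "6af67cd3-7931-446d-aaa2-ffea51325adc"},
    {"status": "CONNECTED", "hostname": "guadalupe1.v100.abes.fr",
     "uuid": "8eb485cd-31c4-4c3a-a315-3dc6d3ddc0c9"},
]

# Hypothetical placeholder for the gluster UUID stored in the engine DB.
ENGINE_UUID = "00000000-0000-0000-0000-000000000000"


def find_peer_mismatch(engine_uuid, hostname, peers):
    """Return None if engine_uuid is in the peer list, else a description."""
    if any(peer["uuid"] == engine_uuid for peer in peers):
        return None
    # The UUID is unknown to gluster; see whether the hostname is known.
    for peer in peers:
        if peer["hostname"] == hostname:
            return (f"{hostname}: engine has {engine_uuid}, "
                    f"gluster reports {peer['uuid']}")
    return f"{hostname}: not present in the peer list under any UUID"


if __name__ == "__main__":
    print(find_peer_mismatch(ENGINE_UUID, "guadalupe1.v100.abes.fr", PEERS))
```

If the function returns a description rather than None, the stored UUID and the one gluster reports have diverged, which is the situation where re-adding the host or correcting the database value comes into play.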
Removing the host did nothing. I did have to go into the
gluster_server table and remove every disconnected host UUID, but that
was not enough. I then had to remove the host and reinstall it as a
new host.
Thank you, I had spent a lot of time trying to solve this issue.
Sorry to hear that you had trouble with this. Could you explain a bit
how you got into this state?
Was it because you re-provisioned one of the gluster nodes and the
gluster UUID was reset without oVirt being aware of it? I would like
to fix or enhance the engine to handle this if it's a common enough
use case.
When I looked at the gluster_server table, I realized that there were
some (disconnected) hosts probed with the gluster network IP. At the
beginning, I didn't use the gluster network, so my hosts were probed on
the management network and all was fine. When I decided to move the
gluster traffic to a dedicated network (you answered me about it
here: https://www.mail-archive.com/[email protected]/msg37742.html), I
believed that the hosts would be probed with the new network IP, but
they weren't. So I manually probed them with the gluster IP, and I think
all my troubles come from there. I reinstalled vdsm, and nothing has
worked since that moment.
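As a rough way to spot peers that were probed by IP rather than by hostname, a sketch like the following could be run against a peer list shaped like the peerStatus output quoted further down in this thread. The function name is hypothetical, and it relies only on Python's standard ipaddress module.

```python
import ipaddress

# Peer list copied from the supervdsm.log excerpt quoted in this thread.
PEERS = [
    {"status": "CONNECTED", "hostname": "10.34.101.56/24",
     "uuid": "c259c09b-8d7c-4b12-8745-677199877583"},
    {"status": "CONNECTED", "hostname": "guadalupe3.v100.abes.fr",
     "uuid": "6af67cd3-7931-446d-aaa2-ffea51325adc"},
    {"status": "CONNECTED", "hostname": "guadalupe1.v100.abes.fr",
     "uuid": "8eb485cd-31c4-4c3a-a315-3dc6d3ddc0c9"},
]


def peers_recorded_by_ip(peers):
    """Return the peers whose 'hostname' field is an IP address or IP/prefix."""
    flagged = []
    for peer in peers:
        try:
            # ip_interface accepts both "10.0.0.1" and "10.0.0.1/24".
            ipaddress.ip_interface(peer["hostname"])
        except ValueError:
            continue  # Not an IP, so it was recorded by name.
        flagged.append(peer)
    return flagged


if __name__ == "__main__":
    for peer in peers_recorded_by_ip(PEERS):
        print(peer["hostname"], peer["uuid"])
```

Any entry this flags is a candidate for the duplicate or mismatched records described above, since the same host may also be known to the engine under its management-network name.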
On Fri, Dec 16, 2016 at 7:00 PM, Nathanaël Blanchet <[email protected]> wrote:
Here is an extract of the latest engine logs, thank you.
On 16/12/2016 at 14:02, Sahina Bose wrote:
Could you attach the engine log with this error?
On Fri, Dec 16, 2016 at 4:29 PM, Nathanaël Blanchet <[email protected]> wrote:
Hi,
I used to run a replica 3 gluster volume successfully, but since
the last 4.0.5 update the hosts can't connect to each other, with
the message: gluster [gluster peer status
guadalupe1.v100.abes.fr] command failed on server
guadalupe2.v100.abes.fr.
So host guadalupe1 can never come up.
When I run gluster peer probe manually, the peers are connected
as expected. I reinstalled vdsm and gluster, but it is still the
same.
I found this in supervdsm.log on guadalupe2:
MainProcess|jsonrpc.Executor/6::DEBUG::2016-12-16 11:53:21,429::supervdsmServer::99::SuperVdsm.ServerCallback::(wrapper)
return peerStatus with [{'status': 'CONNECTED',
'hostname': '10.34.101.56/24',
'uuid': 'c259c09b-8d7c-4b12-8745-677199877583'},
{'status': 'CONNECTED', 'hostname': 'guadalupe3.v100.abes.fr',
'uuid': '6af67cd3-7931-446d-aaa2-ffea51325adc'},
{'status': 'CONNECTED', 'hostname': 'guadalupe1.v100.abes.fr',
'uuid': '8eb485cd-31c4-4c3a-a315-3dc6d3ddc0c9'}]
MainProcess|jsonrpc.Executor/7::DEBUG::2016-12-16 11:53:21,490::supervdsmServer::92::SuperVdsm.ServerCallback::(wrapper)
call peerProbe with () {}
MainProcess|jsonrpc.Executor/7::DEBUG::2016-12-16 11:53:21,491::commands::68::root::(execCmd)
/usr/bin/taskset --cpu-list 0-63 /usr/sbin/gluster
--mode=script peer probe guadalupe1.v100.abes.fr --xml (cwd None)
MainProcess|jsonrpc.Executor/7::DEBUG::2016-12-16 11:53:21,570::commands::86::root::(execCmd) SUCCESS:
<err> = ''; <rc> = 0
MainProcess|jsonrpc.Executor/7::DEBUG::2016-12-16 11:53:21,570::supervdsmServer::99::SuperVdsm.ServerCallback::(wrapper)
return peerProbe with True
We can see that guadalupe2 already sees guadalupe1 as connected,
but peer probe is still executed (through taskset) against
guadalupe1, with the message
"Host guadalupe1.v100.abes.fr port 24007 already in peer list".
How can I tell guadalupe2 to stop trying to probe guadalupe1?
--
Nathanaël Blanchet
Supervision réseau
Pôle Infrastrutures Informatiques
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
[email protected]
_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users