I always used IP addresses instead of names when I added a peer. In the gluster peer status, I do see IP:
[root@DC-MTL-NAS-01 ~]# gluster peer status
Number of Peers: 2

Hostname: XXX.XXX.XXX.12
Uuid: ec1e10c1-0e38-4d2a-ab51-50fb0c67b6ee
State: Peer in Cluster (Connected)

Hostname: XXX.XXX.XXX.13
Uuid: eef75e55-170a-4621-9d6e-3b5c3a6e5561
State: Accepted peer request (Disconnected)

I can ping those IPs from any server.

From the Server 3 Gluster logs, I can see this:

[2017-10-24 12:31:33.012446] I [MSGID: 100030] [glusterfsd.c:2503:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.10.6 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2017-10-24 12:31:33.020739] I [MSGID: 106478] [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors set to 65536
[2017-10-24 12:31:33.020796] I [MSGID: 106479] [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working directory
[2017-10-24 12:31:33.029673] E [rpc-transport.c:283:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.10.6/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2017-10-24 12:31:33.029702] W [rpc-transport.c:287:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2017-10-24 12:31:33.029715] W [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2017-10-24 12:31:33.029731] E [MSGID: 106243] [glusterd.c:1720:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2017-10-24 12:31:33.032226] I [MSGID: 106228] [glusterd.c:500:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system [No such file or directory]
[2017-10-24 12:31:33.032816] I [MSGID: 106513] [glusterd-store.c:2201:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 31000
[2017-10-24 12:31:33.042393] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2017-10-24 12:31:33.042474] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2017-10-24 12:31:33.042501] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-24 12:31:33.082295] E [MSGID: 101075] [common-utils.c:307:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
[2017-10-24 12:31:33.082331] E [name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host dc-mtl-nas-01.elemenai.lan
[2017-10-24 12:31:33.082563] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: eef75e55-170a-4621-9d6e-3b5c3a6e5561
[2017-10-24 12:31:33.082589] I [MSGID: 106004] [glusterd-handler.c:5888:__glusterd_peer_rpc_notify] 0-management: Peer <server1.domain.lan> (<3e190322-78f1-4ef6-80f7-8f48d51c2263>), in state <Accepted peer request>, has disconnected from glusterd.
[2017-10-24 12:31:33.117581] E [MSGID: 106187] [glusterd-store.c:4566:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2017-10-24 12:31:33.117658] E [MSGID: 101019] [xlator.c:503:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2017-10-24 12:31:33.117678] E [MSGID: 101066] [graph.c:325:glusterfs_graph_init] 0-management: initializing translator failed
[2017-10-24 12:31:33.117696] E [MSGID: 101176] [graph.c:681:glusterfs_graph_activate] 0-graph: init failed
[2017-10-24 12:31:33.118208] W [glusterfsd.c:1360:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7f1a34ba1bcd] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x1b1) [0x7f1a34ba1a71] -->/usr/sbin/glusterd(cleanup_and_exit+0x6b) [0x7f1a34ba0f5b] ) 0-: received signum (1), shutting down

server1.domain.lan is the FQDN of server 1 (not the IP address).
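
Since the errors in the log come from getaddrinfo, here is a minimal sketch of how name resolution could be checked on node 3 for each name that shows up in the logs. It is only an illustration: the function name check_resolution and the hostnames passed on the command line are placeholders of mine, not anything provided by Gluster.

```python
import socket
import sys


def check_resolution(hostnames):
    """Map each hostname to its sorted resolved addresses, or None on failure."""
    results = {}
    for name in hostnames:
        try:
            # Standard resolver lookup; raises socket.gaierror when the
            # name cannot be resolved.
            infos = socket.getaddrinfo(name, None)
        except socket.gaierror:
            results[name] = None
        else:
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            results[name] = sorted({info[4][0] for info in infos})
    return results


if __name__ == "__main__":
    # Hostnames come from the command line (placeholders, e.g. the peer FQDNs).
    for name, addrs in check_resolution(sys.argv[1:]).items():
        print(f"{name}: {', '.join(addrs) if addrs else 'RESOLUTION FAILED'}")
```

Saved under any name (say, check_names.py, a hypothetical file name), it could be run on node 3 with the FQDNs from the log as arguments. A name reported as failed there would be worth checking in /etc/hosts or DNS on that node.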
Ludwig

On Tue, Oct 24, 2017 at 2:16 AM, Bartosz Zięba <[email protected]> wrote:

> Are you sure that all node names can be resolved on all other nodes?
> You need to use the names previously used in Gluster; check them with
> 'gluster peer status' or 'gluster pool list'.
>
> Regards,
> Bartosz
>
>
> Message written by Ludwig Gamache <[email protected]> on
> 24.10.2017, at 03:13:
>
> All,
>
> I am trying to add a third peer to my gluster install. The first 2 nodes
> have been running for many months with gluster 3.10.3-1.
>
> I recently installed the 3rd node with gluster 3.10.6-1. I was able to
> start the gluster daemon on it. Afterwards, I tried to add the peer from
> one of the 2 previous servers (gluster peer probe IPADDRESS).
>
> That first peer started the communication with the 3rd peer. At that
> point, the peer statuses were messed up. Server 1 saw both other servers as
> connected. Server 2 only saw server 1 as connected and did not have server
> 3 as a peer. Server 3 only had server 1 as a peer and saw it as
> disconnected.
>
> I also found errors in the gluster logs of server 3 that could not be done:
>
> [2017-10-24 00:15:20.090462] E [name.c:262:af_inet_client_get_remote_sockaddr]
> 0-management: DNS resolution failed on host HOST3.DOMAIN.lan
>
> I rebooted node 3 and now gluster does not even restart on that node. It
> keeps reporting name resolution problems. The 2 other nodes are active.
>
> However, I can ping the 3 servers (each from the others) using their DNS
> names.
>
> Any idea about what to look at?
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>

--
Ludwig Gamache
IT Director - Element AI
4200 St-Laurent, suite 1200
514-704-0564
