Re: [Gluster-users] Upgrade from 3.8.15 to 3.12.5

2018-02-19 Thread rwecker
Thanks, that fixed both issues.

Russell Wecker 
IT Director 
Southern Asia Pacific Division 


From: "Atin Mukherjee" <amukh...@redhat.com> 
To: "rwecker" <rwec...@ssd.org> 
Cc: "gluster-users" <gluster-users@gluster.org> 
Sent: Monday, February 19, 2018 4:51:56 PM 
Subject: Re: [Gluster-users] Upgrade from 3.8.15 to 3.12.5 

I believe the Peer Rejected issue is one we recently identified. It has been
fixed through https://bugzilla.redhat.com/show_bug.cgi?id=1544637 and the fix
is available in 3.12.6. Please upgrade to the latest version in the 3.12
series.


Re: [Gluster-users] Upgrade from 3.8.15 to 3.12.5

2018-02-19 Thread Atin Mukherjee
I believe the Peer Rejected issue is one we recently identified. It has been
fixed through https://bugzilla.redhat.com/show_bug.cgi?id=1544637 and the fix
is available in 3.12.6. Please upgrade to the latest version in the 3.12
series.
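
For readers checking whether a node already carries the fix, the comparison
against 3.12.6 can be sketched in Python. This is only an illustration: the
helper names parse_version and has_peer_reject_fix are made up for this
example, and the check only covers the 3.12 series named above. Whether other
release series carry the fix is not stated in this thread.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.12.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_peer_reject_fix(version, fixed_in="3.12.6"):
    """Return True if `version` is in the same major.minor series as
    `fixed_in` and is at least as new.

    Hypothetical helper: the default of 3.12.6 comes from the reply above;
    versions outside that series are reported as False because this thread
    says nothing about them.
    """
    v, f = parse_version(version), parse_version(fixed_in)
    return v[:2] == f[:2] and v >= f


if __name__ == "__main__":
    # The three versions mentioned in this thread.
    for candidate in ("3.8.15", "3.12.5", "3.12.6"):
        print(candidate, has_peer_reject_fix(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering "3.12.10"
before "3.12.6".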


[Gluster-users] Upgrade from 3.8.15 to 3.12.5

2018-02-18 Thread rwecker
Hi, 

I have a 3-node cluster (Found1, Found2, Found3) that I wanted to upgrade. I
upgraded one node from 3.8.15 to 3.12.5, and now I am having multiple problems
with the install. The two nodes that were not upgraded (Found1 and Found2) are
still working fine, but the upgraded node shows Peer Rejected (Connected) when
peer status is run. It also has multiple bricks that report "Transport
endpoint is not connected"; some bricks seem to work and some do not.

Any help would be appreciated.

Thanks


Here are the log files:


glusterd.log 
[2018-02-19 05:32:38.589150] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
[2018-02-19 05:32:38.589237] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
[2018-02-19 05:32:38.589264] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
[2018-02-19 05:32:38.609833] W [MSGID: 103071] [rdma.c:4630:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2018-02-19 05:32:38.609892] W [MSGID: 103055] [rdma.c:4939:init] 0-rdma.management: Failed to initialize IB Device
[2018-02-19 05:32:38.609919] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2018-02-19 05:32:38.610149] W [rpcsvc.c:1682:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2018-02-19 05:32:38.610178] E [MSGID: 106243] [glusterd.c:1769:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2018-02-19 05:32:49.737152] I [MSGID: 106513] [glusterd-store.c:2241:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30712
[2018-02-19 05:32:50.248992] I [MSGID: 106498] [glusterd-handler.c:3603:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2018-02-19 05:32:50.249097] I [MSGID: 106498] [glusterd-handler.c:3603:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2018-02-19 05:32:50.249161] W [MSGID: 106062] [glusterd-handler.c:3400:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2018-02-19 05:32:50.249206] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2018-02-19 05:32:50.249327] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-management: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-02-19 05:32:50.254789] W [MSGID: 106062] [glusterd-handler.c:3400:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2018-02-19 05:32:50.254831] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2018-02-19 05:32:50.254908] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-management: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-02-19 05:32:50.258683] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: de955a28-c230-4ada-98ba-a8f404ee8827
Final graph: 
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 10
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:
+------------------------------------------------------------------------------+
[2018-02-19 05:32:50.259384] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-02-19 05:32:50.284115] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30712
[2018-02-19 05:32:50.285320] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: a23fa00c-4c7c-436d-9d04-0c16941c, host: found2.ssd.org, port: 0
[2018-02-19 05:32:50.286561] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: b9fb5e3b-b638-4495-afee-36b465aea4e7, host: found1.ssd.org, port: 0
[2018-02-19 05:32:50.296816] I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: a23fa00c-4c7c-436d-9d04-0c16941c
[2018-02-19 05:32:50.298392] E [MSGID: 106010] [glusterd-utils.c:3374:glusterd_compare_friend_volume] 0-management: Version of Cksums VMData differ. local cksum = 1127272657, remote cksum = 3816303263 on peer found2.ssd.org
[2018-02-19
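
The two symptoms visible in this log, the RJT replies from the other peers and
the volume checksum mismatch, can be picked out mechanically. Below is a
minimal Python sketch, assuming each log entry sits on a single line. The
function name find_peer_problems and the regular expressions are written for
this example and are not part of Gluster. The sample lines are copied from the
log above.

```python
import re

# Matches "Received RJT from uuid: <uuid>, host: <host>, port: <n>".
RJT_RE = re.compile(
    r"Received RJT from uuid: (?P<uuid>[0-9a-f-]+), host: (?P<host>\S+),"
)
# Matches "local cksum = <n>, remote cksum = <n> on peer <host>".
CKSUM_RE = re.compile(
    r"local cksum = (?P<local>\d+), remote cksum = (?P<remote>\d+) "
    r"on peer (?P<host>\S+)"
)


def find_peer_problems(lines):
    """Return (rejecting_hosts, cksum_mismatches) found in glusterd log lines.

    rejecting_hosts is a list of host names that sent an RJT reply;
    cksum_mismatches is a list of (host, local_cksum, remote_cksum) tuples.
    """
    rejecting_hosts = []
    cksum_mismatches = []
    for line in lines:
        m = RJT_RE.search(line)
        if m:
            rejecting_hosts.append(m.group("host"))
        m = CKSUM_RE.search(line)
        if m:
            cksum_mismatches.append(
                (m.group("host"), int(m.group("local")), int(m.group("remote")))
            )
    return rejecting_hosts, cksum_mismatches


# Sample entries taken from the glusterd.log in this post, one entry per line.
SAMPLE = [
    "[2018-02-19 05:32:50.285320] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: a23fa00c-4c7c-436d-9d04-0c16941c, host: found2.ssd.org, port: 0",
    "[2018-02-19 05:32:50.286561] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: b9fb5e3b-b638-4495-afee-36b465aea4e7, host: found1.ssd.org, port: 0",
    "[2018-02-19 05:32:50.298392] E [MSGID: 106010] [glusterd-utils.c:3374:glusterd_compare_friend_volume] 0-management: Version of Cksums VMData differ. local cksum = 1127272657, remote cksum = 3816303263 on peer found2.ssd.org",
]

if __name__ == "__main__":
    hosts, mismatches = find_peer_problems(SAMPLE)
    print("Rejected by:", hosts)
    print("Checksum mismatches:", mismatches)
```

To run it against a real file, pass the open file object instead of SAMPLE,
for example find_peer_problems(open(path)) with path set to wherever
glusterd.log lives on the node.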