Any time estimate as to when this fix will be released?
- In the next 3.10 update (rastar to confirm the date).

Any recommended workaround?
- You probably need to clear the reserved local ports setting (net.ipv4.ip_local_reserved_ports).

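For anyone who wants to check whether a host is likely to be affected before the fix lands, here is a minimal Python sketch (it is not part of glusterd). It parses a reserved-ports string in the comma-separated range format reported by "sysctl net.ipv4.ip_local_reserved_ports" and flags ranges whose upper bound reaches 32767, the maximum named in the RCA. The function names and the choice of 32767 as the threshold are assumptions made for illustration; the exact condition inside glusterd may differ.

```python
# Hypothetical helper, not glusterd code: flag reserved-port ranges that
# reach the signed 16-bit maximum mentioned in the RCA.
SHORT_INT_MAX = 32767  # assumption: threshold taken from the RCA text


def parse_reserved_ports(value):
    """Parse a list such as "30000-32767,40000" into (low, high) tuples."""
    ranges = []
    # Drop all whitespace (including a trailing newline) before splitting.
    for part in "".join(value.split()).split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        # A single port "N" is treated as the range N-N.
        ranges.append((int(low), int(high or low)))
    return ranges


def risky_ranges(value, limit=SHORT_INT_MAX):
    """Return the ranges whose upper bound reaches or exceeds `limit`."""
    return [(lo, hi) for lo, hi in parse_reserved_ports(value) if hi >= limit]


if __name__ == "__main__":
    # Value reported earlier in this thread.
    print(risky_ranges("30000-32767"))
```

With the value from this thread, risky_ranges("30000-32767") returns [(30000, 32767)], so that range would be the candidate to reset temporarily.
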
On Tue, Jun 20, 2017 at 2:36 PM, Guy Cukierman <[email protected]> wrote:

> Thanks Gaurav!
>
> 1. Any time estimate as to when this fix will be released?
> 2. Any recommended workaround?
>
> Best,
> Guy.
>
> *From:* Gaurav Yadav [mailto:[email protected]]
> *Sent:* Tuesday, June 20, 2017 9:46 AM
> *To:* Guy Cukierman <[email protected]>
> *Cc:* Atin Mukherjee <[email protected]>; [email protected]
> *Subject:* Re: [Gluster-users] gluster peer probe failing
>
> Hi,
>
> I am able to recreate the issue and here is my RCA.
>
> The maximum value, i.e. 32767, overflows while it is being manipulated, and this was previously not handled properly.
> Hence glusterd was crashing with SIGSEGV.
>
> The issue is being fixed with "https://bugzilla.redhat.com/show_bug.cgi?id=1454418" and is being backported as well.
>
> Thanks
> Gaurav
>
> On Tue, Jun 20, 2017 at 6:43 AM, Gaurav Yadav <[email protected]> wrote:
>
> Hi,
>
> I have tried on my host by setting the corresponding ports, but I didn't see the issue on my machine locally.
>
> However, with the logs you have sent it is pretty much clear that the issue is related to ports only.
>
> I will try to reproduce it on some other machine. Will update you as soon as possible.
>
> Thanks
> Gaurav
>
> On Sun, Jun 18, 2017 at 12:37 PM, Guy Cukierman <[email protected]> wrote:
>
> Hi,
>
> Below please find the reserved ports and log, thanks.
>
> sysctl net.ipv4.ip_local_reserved_ports:
>
> net.ipv4.ip_local_reserved_ports = 30000-32767
>
> glusterd.log:
>
> [2017-06-18 07:04:17.853162] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 192.168.1.17 24007
> [2017-06-18 07:04:17.853237] D [MSGID: 0] [common-utils.c:3361:gf_is_local_addr] 0-management: 192.168.1.17
> [2017-06-18 07:04:17.854093] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
> The message "D [MSGID: 0] [common-utils.c:3361:gf_is_local_addr] 0-management: 192.168.1.17 " repeated 2 times between [2017-06-18 07:04:17.853237] and [2017-06-18 07:04:17.853869]
> [2017-06-18 07:04:17.854093] D [MSGID: 0] [common-utils.c:3377:gf_is_local_addr] 0-management: 192.168.1.17 is not local
> [2017-06-18 07:04:17.854221] D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: 192.168.1.17
> [2017-06-18 07:04:17.854271] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
> [2017-06-18 07:04:17.854269] D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: 192.168.1.17
> [2017-06-18 07:04:17.854271] D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: 192.168.1.17
> [2017-06-18 07:04:17.854306] I [MSGID: 106129] [glusterd-handler.c:3690:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 192.168.1.17 (24007)
> [2017-06-18 07:04:17.854343] D [MSGID: 0] [glusterd-peer-utils.c:486:glusterd_peer_hostname_new] 0-glusterd: Returning 0
> [2017-06-18 07:04:17.854367] D [MSGID: 0] [glusterd-utils.c:7060:glusterd_sm_tr_log_init] 0-glusterd: returning 0
> [2017-06-18 07:04:17.854387] D [MSGID: 0] [glusterd-store.c:4092:glusterd_store_create_peer_dir] 0-glusterd: Returning with 0
> [2017-06-18 07:04:17.854918] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0
> [2017-06-18 07:04:17.855083] D [MSGID: 0] [store.c:374:gf_store_save_value] 0-management: returning: 0
> [2017-06-18 07:04:17.855130] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
> The message "D [MSGID: 0] [store.c:374:gf_store_save_value] 0-management: returning: 0" repeated 2 times between [2017-06-18 07:04:17.855083] and [2017-06-18 07:04:17.855128]
> [2017-06-18 07:04:17.855129] D [MSGID: 0] [glusterd-store.c:4221:glusterd_store_peer_write] 0-glusterd: Returning with 0
> [2017-06-18 07:04:17.856294] D [MSGID: 0] [glusterd-store.c:4247:glusterd_store_perform_peer_store] 0-glusterd: Returning 0
> [2017-06-18 07:04:17.856332] D [MSGID: 0] [glusterd-store.c:4268:glusterd_store_peerinfo] 0-glusterd: Returning with 0
> [2017-06-18 07:04:17.856365] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
> [2017-06-18 07:04:17.856387] D [MSGID: 0] [glusterd-handler.c:3474:glusterd_transport_inet_options_build] 0-glusterd: Returning 0
> [2017-06-18 07:04:17.856409] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2017-06-18 07:04:17.856421] D [rpc-clnt.c:1071:rpc_clnt_connection_init] 0-management: setting ping-timeout to 30
> [2017-06-18 07:04:17.856434] D [rpc-transport.c:279:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so
> [2017-06-18 07:04:17.856580] D [socket.c:4082:socket_init] 0-management: Configued transport.tcp-user-timeout=-1
> [2017-06-18 07:04:17.856594] D [socket.c:4165:socket_init] 0-management: SSL support on the I/O path is NOT enabled
> [2017-06-18 07:04:17.856625] D [socket.c:4168:socket_init] 0-management: SSL support for glusterd is NOT enabled
> [2017-06-18 07:04:17.856634] D [socket.c:4185:socket_init] 0-management: using system polling thread
> [2017-06-18 07:04:17.856664] D [name.c:168:client_fill_address_family] 0-management: address-family not specified, marking it as unspec for getaddrinfo to resolve from (remote-host: 192.168.1.17)
> [2017-06-18 07:04:17.861800] D [MSGID: 0] [common-utils.c:334:gf_resolve_ip6] 0-resolver: returning ip-192.168.1.17 (port-24007) for hostname: 192.168.1.17 and port: 24007
> [2017-06-18 07:04:17.861830] D [socket.c:2982:socket_fix_ssl_opts] 0-management: disabling SSL for portmapper connection
> [2017-06-18 07:04:17.861885] D [MSGID: 0] [common-utils.c:3106:gf_ports_reserved] 0-glusterfs: lower: 30000, higher: 32767
> [2017-06-18 07:04:17.861920] D [logging.c:1764:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 5 extra log messages
> [2017-06-18 07:04:17.861933] D [logging.c:1767:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 5 extra log messages
> pending frames:
> frame : type(0) op(0)
> patchset: git://git.gluster.org/glusterfs.git
> signal received: 11
> time of crash:
> 2017-06-18 07:04:17
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.10.3
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7fbdf7c964d0]
> /lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fbdf7c9fdd4]
> /lib64/libc.so.6(+0x35250)[0x7fbdf637a250]
> /lib64/libglusterfs.so.0(gf_ports_reserved+0x15c)[0x7fbdf7ca044c]
> /lib64/libglusterfs.so.0(gf_process_reserved_ports+0xbe)[0x7fbdf7ca070e]
> /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xd158)[0x7fbde9c24158]
> /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(client_bind+0x93)[0x7fbde9c245a3]
> /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xa875)[0x7fbde9c21875]
> /lib64/libgfrpc.so.0(rpc_clnt_reconnect+0xc9)[0x7fbdf7a5ff89]
> /lib64/libgfrpc.so.0(rpc_clnt_start+0x39)[0x7fbdf7a60049]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24218)[0x7fbdec7b5218]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24843)[0x7fbdec7b5843]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24ae0)[0x7fbdec7b5ae0]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27890)[0x7fbdec7b8890]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27e20)[0x7fbdec7b8e20]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x20f5e)[0x7fbdec7b1f5e]
> /lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7fbdf7ccd750]
> /lib64/libc.so.6(+0x46cf0)[0x7fbdf638bcf0]
> ---------
>
> *From:* Gaurav Yadav [mailto:[email protected]]
> *Sent:* Friday, June 16, 2017 5:47 AM
> *To:* Atin Mukherjee <[email protected]>
> *Cc:* Guy Cukierman <[email protected]>; [email protected]
> *Subject:* Re: [Gluster-users] gluster peer probe failing
>
> Could you please send me the output of the command "sysctl net.ipv4.ip_local_reserved_ports"?
>
> Apart from the output of the command, please send the logs so we can look into the issue.
>
> Thanks
> Gaurav
>
> On Thu, Jun 15, 2017 at 4:28 PM, Atin Mukherjee <[email protected]> wrote:
>
> +Gaurav, he is the author of the patch, can you please comment here?
>
> On Thu, Jun 15, 2017 at 3:28 PM, Guy Cukierman <[email protected]> wrote:
>
> Thanks, but my current settings are:
>
> net.ipv4.ip_local_reserved_ports = 30000-32767
> net.ipv4.ip_local_port_range = 32768 60999
>
> meaning the reserved ports are already in the short int range, so maybe I misunderstood something? Or is it a different issue?
>
> *From:* Atin Mukherjee [mailto:[email protected]]
> *Sent:* Thursday, June 15, 2017 10:56 AM
> *To:* Guy Cukierman <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [Gluster-users] gluster peer probe failing
>
> https://review.gluster.org/#/c/17494/ will fix it, and the next update of 3.10 should have this fix.
>
> If sysctl net.ipv4.ip_local_reserved_ports has any value greater than the short int range, then this would be a problem with the current version.
> Would you be able to reset the reserved ports temporarily to get this going?
>
> On Wed, Jun 14, 2017 at 8:32 PM, Guy Cukierman <[email protected]> wrote:
>
> Hi,
>
> I have a gluster (version 3.10.2) server running on a 3 node (centos7) cluster.
>
> Firewalld and SELinux are disabled, and I see I can telnet from each node to the other on port 24007.
>
> When I try to create the first peering by running on node1 the command:
>
> gluster peer probe <node2 ip address>
>
> I get the error:
>
> “Connection failed. Please check if gluster daemon is operational.”
>
> And Glusterd.log shows:
>
> [2017-06-14 14:46:09.927510] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 192.168.1.17 24007
> [2017-06-14 14:46:09.928560] I [MSGID: 106129] [glusterd-handler.c:3690:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 192.168.1.17 (24007)
> [2017-06-14 14:46:09.930783] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
> [2017-06-14 14:46:09.930837] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> pending frames:
> frame : type(0) op(0)
> patchset: git://git.gluster.org/glusterfs.git
> signal received: 11
> time of crash:
> 2017-06-14 14:46:09
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.10.3
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f69625da4d0]
> /lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7f69625e3dd4]
> /lib64/libc.so.6(+0x35250)[0x7f6960cbe250]
> /lib64/libglusterfs.so.0(gf_ports_reserved+0x15c)[0x7f69625e444c]
> /lib64/libglusterfs.so.0(gf_process_reserved_ports+0xbe)[0x7f69625e470e]
> /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xd158)[0x7f6954568158]
> /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(client_bind+0x93)[0x7f69545685a3]
> /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xa875)[0x7f6954565875]
> /lib64/libgfrpc.so.0(rpc_clnt_reconnect+0xc9)[0x7f69623a3f89]
> /lib64/libgfrpc.so.0(rpc_clnt_start+0x39)[0x7f69623a4049]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24218)[0x7f69570f9218]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24843)[0x7f69570f9843]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24ae0)[0x7f69570f9ae0]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27890)[0x7f69570fc890]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27e20)[0x7f69570fce20]
> /usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x20f5e)[0x7f69570f5f5e]
> /lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f6962611750]
> /lib64/libc.so.6(+0x46cf0)[0x7f6960ccfcf0]
>
> And a file is created under /var/lib/glusterd/peers/<node2 ip address> which contains:
>
> uuid=00000000-0000-0000-0000-000000000000
> state=0
> hostname1=192.168.1.17
>
> and the glusterd daemon exits and I cannot restart it until I delete this file from the peers folder.
>
> Any idea what is wrong?
>
> thanks!
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://lists.gluster.org/mailman/listinfo/gluster-users
