On Tue, Jan 15, 2019 at 2:13 PM Atin Mukherjee <atin.mukherje...@gmail.com> wrote:
> Interesting. I’ll do a deep dive at it sometime this week.
>
> On Tue, 15 Jan 2019 at 14:05, Xavi Hernandez <jaher...@redhat.com> wrote:
>
>> On Mon, Jan 14, 2019 at 11:08 AM Ashish Pandey <aspan...@redhat.com> wrote:
>>
>>> I downloaded the logs of regression runs 1077 and 1073 and tried to
>>> investigate them. In both regressions ec/bug-1236065.t hangs on TEST 70,
>>> which tries to get the online brick count.
>>>
>>> I can see in the mount/brick and glusterd logs that it has not moved
>>> forward after this test.
>>>
>>> glusterd.log -
>>>
>>> [2019-01-06 16:27:51.346408]:++++++++++ G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 70 5 online_brick_count ++++++++++
>>> [2019-01-06 16:27:51.645014] I [MSGID: 106499] [glusterd-handler.c:4404:__glusterd_handle_status_volume] 0-management: Received status volume req for volume patchy
>>> [2019-01-06 16:27:51.646664] I [dict.c:2745:dict_get_str_boolean] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x4a6c3) [0x7f4c37fe06c3] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x43b3a) [0x7f4c37fd9b3a] -->/build/install/lib/libglusterfs.so.0(dict_get_str_boolean+0x170) [0x7f4c433d83fb] ) 0-dict: key nfs.disable, integer type asked, has string type [Invalid argument]
>>> [2019-01-06 16:27:51.647177] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick0.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647227] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick1.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647292] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick2.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647333] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick3.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647371] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick4.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647409] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick5.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647447] I [dict.c:2361:dict_get_strn] (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32) [0x7f4c38095a32] -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac) [0x7f4c37fdd4ac] -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179) [0x7f4c433d7673] ) 0-dict: key brick6.rdma_port, string type asked, has integer type [Invalid argument]
>>> [2019-01-06 16:27:51.649335] E [MSGID: 101191] [event-epoll.c:759:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
>>> [2019-01-06 16:27:51.932871] I [MSGID: 106499] [glusterd-handler.c:4404:__glusterd_handle_status_volume] 0-management: Received status volume req for volume patchy
>>>
>>> It is just taking a lot of time to get the status at this point.
>>> It looks like there could be some issue with the connection or the
>>> handling of volume status when some bricks are down.
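A quick note on the dict.c entries in the glusterd.log excerpt above: they are
INFO-level, not failures. The typed dict getters log a message whenever the
type a value was stored with differs from the type the caller asks for, and
in these paths the lookup still proceeds. A standalone mimic of that pattern,
just to show the idea; this is not the real libglusterfs dict API, and the
key/values here are taken from the log for illustration only:

#include <stdio.h>

typedef enum { T_INT, T_STR } val_type_t;

typedef struct {
    const char *key;
    val_type_t type;    /* the type the value was stored with */
    long ival;
    const char *sval;
} entry_t;

/* Getter that wants an integer; logs when the stored type is a string,
 * then falls back to converting the stored string. */
static int get_int(const entry_t *e, long *out)
{
    if (e->type != T_INT) {
        /* This is the moment the log shows:
         *   "key <k>, integer type asked, has string type" */
        fprintf(stderr, "I 0-dict: key %s, integer type asked, has string type\n",
                e->key);
        return sscanf(e->sval, "%ld", out) == 1 ? 0 : -1;
    }
    *out = e->ival;
    return 0;
}

int main(void)
{
    /* nfs.disable is stored as a string, per the log above, but
     * dict_get_str_boolean() asks for an integer, hence the message. */
    entry_t e = { .key = "nfs.disable", .type = T_STR, .ival = 0, .sval = "1" };
    long v = 0;

    if (get_int(&e, &v) == 0)
        printf("%s = %ld (message logged, lookup still succeeded)\n", e.key, v);
    return 0;
}

The same class of message also shows up in the healthy runs below, so these
are most likely noise rather than the cause of the hang.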
>>
>> The 'online_brick_count' check uses 'gluster volume status' to get some
>> information, and it does that several times (currently 7). Looking at
>> cmd_history.log, I see that after the 'online_brick_count' at line 70,
>> only one 'gluster volume status' has completed. Apparently the second
>> 'gluster volume status' is hung.
>>
>> In cli.log I see that the second 'gluster volume status' seems to have
>> started, but not finished:
>>
>> Normal run:
>>
>> [2019-01-08 16:36:43.628821] I [cli.c:834:main] 0-cli: Started running gluster with version 6dev
>> [2019-01-08 16:36:43.808182] I [MSGID: 101190] [event-epoll.c:675:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
>> [2019-01-08 16:36:43.808287] I [MSGID: 101190] [event-epoll.c:675:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>> [2019-01-08 16:36:43.808432] E [MSGID: 101191] [event-epoll.c:759:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
>> [2019-01-08 16:36:43.816534] I [dict.c:1947:dict_get_uint32] (-->gluster(cli_cmd_process+0x1e4) [0x40db50] -->gluster(cli_cmd_volume_status_cbk+0x90) [0x415bec] -->/build/install/lib/libglusterfs.so.0(dict_get_uint32+0x176) [0x7fefe5694569] ) 0-dict: key cmd, unsigned integer type asked, has integer type [Invalid argument]
>> [2019-01-08 16:36:43.816716] I [dict.c:1947:dict_get_uint32] (-->gluster(cli_cmd_volume_status_cbk+0x1cb) [0x415d27] -->gluster(gf_cli_status_volume_all+0xc8) [0x42fa94] -->/build/install/lib/libglusterfs.so.0(dict_get_uint32+0x176) [0x7fefe5694569] ) 0-dict: key cmd, unsigned integer type asked, has integer type [Invalid argument]
>> [2019-01-08 16:36:43.824437] I [input.c:31:cli_batch] 0-: Exiting with: 0
>>
>> Bad run:
>>
>> [2019-01-08 16:36:43.940361] I [cli.c:834:main] 0-cli: Started running gluster with version 6dev
>> [2019-01-08 16:36:44.147364] I [MSGID: 101190] [event-epoll.c:675:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
>> [2019-01-08 16:36:44.147477] I [MSGID: 101190] [event-epoll.c:675:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>> [2019-01-08 16:36:44.147583] E [MSGID: 101191] [event-epoll.c:759:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
>>
>> In glusterd.log it seems as if it hasn't received any status request. It
>> looks like the cli has not even connected to glusterd.
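For anyone not familiar with the helper: online_brick_count is a shell
function in the test harness, and it essentially boils down to invoking
'gluster volume status' and counting the bricks reported as online. Because
the command runs synchronously, a hung 'gluster volume status' hangs the
whole check. A rough C mimic of that idea; the command string and the column
matching below are my assumptions, not the actual helper:

#include <stdio.h>
#include <string.h>

/* Counts bricks that 'gluster volume status' reports as online.
 * popen() blocks until the command exits, so if 'gluster volume status'
 * never returns, the caller hangs with it, matching what the test shows. */
static int online_brick_count(void)
{
    FILE *fp = popen("gluster --mode=script volume status", "r");
    char line[512];
    int count = 0;

    if (!fp)
        return -1;
    while (fgets(line, sizeof(line), fp)) {
        /* Brick rows carry 'Y' in the Online column; this match is
         * deliberately crude. */
        if (strstr(line, "Brick") && strstr(line, " Y "))
            count++;
    }
    pclose(fp);
    return count;
}

int main(void)
{
    printf("online bricks: %d\n", online_brick_count());
    return 0;
}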
>
Downloaded the logs for the recent failure from
https://build.gluster.org/job/regression-test-with-multiplex/1092/ and based
on log scanning this is what I see:

1. The test executes without any issues till line no 74, i.e. "TEST $CLI
volume start $V0 force", and cli.log along with cmd_history.log confirm the
same:

cli.log
====
[2019-01-16 16:28:46.871877]:++++++++++ G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 73 gluster --mode=script --wignore volume start patchy force ++++++++++
[2019-01-16 16:28:46.980780] I [cli.c:834:main] 0-cli: Started running gluster with version 6dev
[2019-01-16 16:28:47.185996] I [MSGID: 101190] [event-epoll.c:675:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-01-16 16:28:47.186113] I [MSGID: 101190] [event-epoll.c:675:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-01-16 16:28:47.186234] E [MSGID: 101191] [event-epoll.c:759:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-01-16 16:28:49.223376] I [cli-rpc-ops.c:1448:gf_cli_start_volume_cbk] 0-cli: Received resp to start volume <=== successfully processed the callback
[2019-01-16 16:28:49.223668] I [input.c:31:cli_batch] 0-: Exiting with: 0

cmd_history.log
============
[2019-01-16 16:28:49.220491] : volume start patchy force : SUCCESS

However, in both the cli and cmd_history log files these are the last entries
I see, which indicates the test script is completely paused at this point. I
don't see any possibility of the cli receiving the next command and dropping
it completely, as otherwise we should have at least seen the "Started running
gluster with version 6dev" and "Exiting with" log entries.

I managed to reproduce this once locally on my system, and when I ran
commands from another prompt, volume status and all the other basic gluster
commands went through. I also inspected the processes and don't see any sign
of them being hung. So the mystery continues, and we need to see why the test
script is not moving forward at all.
>> Xavi
>>
>>> ---
>>> Ashish
>>>
>>> ------------------------------
>>> *From: *"Mohit Agrawal" <moagr...@redhat.com>
>>> *To: *"Shyam Ranganathan" <srang...@redhat.com>
>>> *Cc: *"Gluster Devel" <gluster-devel@gluster.org>
>>> *Sent: *Saturday, January 12, 2019 6:46:20 PM
>>> *Subject: *Re: [Gluster-devel] Regression health for release-5.next and release-6
>>>
>>> The previous logs are related to the client, not the bricks; below are
>>> the brick logs:
>>>
>>> [2019-01-12 12:25:25.893485]:++++++++++ G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 68 rm -f 0.o 10.o 11.o 12.o 13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o ++++++++++
>>> The message "I [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.ec.size' would not be sent on wire in the future [Invalid argument]" repeated 199 times between [2019-01-12 12:25:25.283989] and [2019-01-12 12:25:25.899532]
>>> [2019-01-12 12:25:25.903375] E [MSGID: 113001] [posix-inode-fd-ops.c:4617:_posix_handle_xattr_keyvalue_pair] 8-patchy-posix: fgetxattr failed on gfid=d91f6331-d394-479d-ab51-6bcf674ac3e0 while doing xattrop: Key:trusted.ec.dirty (Bad file descriptor) [Bad file descriptor]
>>> [2019-01-12 12:25:25.903468] E [MSGID: 115073] [server-rpc-fops_v2.c:1805:server4_fxattrop_cbk] 0-patchy-server: 1486: FXATTROP 2 (d91f6331-d394-479d-ab51-6bcf674ac3e0), client: CTX_ID:b785c2b0-3453-4a03-b129-19e6ceeb5346-GRAPH_ID:0-PID:24147-HOST:softserve-moagrawa-test.1-PC_NAME:patchy-client-1-RECON_NO:-1, error-xlator: patchy-posix [Bad file descriptor]
>>>
>>> Thanks,
>>> Mohit Agrawal
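On the posix error itself: an FXATTROP on the brick is a read-modify-write of
the extended attribute through the file's fd, so if the fd is no longer valid
when the op arrives, the first fgetxattr fails with EBADF and the whole
xattrop is aborted, which is what the trusted.ec.dirty message shows. A
simplified Linux sketch of that shape; this is an illustrative helper, not
the actual posix xlator code:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/xattr.h>

/* Read-modify-write of a counter xattr over an fd.  Simplified: the real
 * code keeps the counter in network byte order and handles arrays. */
static int xattr_add_int64(int fd, const char *key, int64_t delta)
{
    int64_t cur = 0;

    if (fgetxattr(fd, key, &cur, sizeof(cur)) < 0 && errno != ENODATA) {
        /* A stale fd surfaces here as EBADF, the error in the brick log. */
        fprintf(stderr, "fgetxattr(%s): %s\n", key, strerror(errno));
        return -1;
    }
    cur += delta;
    if (fsetxattr(fd, key, &cur, sizeof(cur), 0) < 0) {
        fprintf(stderr, "fsetxattr(%s): %s\n", key, strerror(errno));
        return -1;
    }
    return 0;
}

int main(void)
{
    int fd = open("xattrop-demo", O_CREAT | O_RDWR, 0600);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Works on a live fd; close(fd) first and the same call fails with
     * EBADF, like the trusted.ec.dirty failure above. */
    printf("xattrop ret=%d\n", xattr_add_int64(fd, "user.demo.dirty", 1));
    close(fd);
    unlink("xattrop-demo");
    return 0;
}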
>>> On Sat, Jan 12, 2019 at 6:29 PM Mohit Agrawal <moagr...@redhat.com> wrote:
>>>
>>>> For the specific failure of "add-brick-and-validate-replicated-volume-options.t" I have posted a patch: https://review.gluster.org/22015.
>>>> For the test case "ec/bug-1236065.t" I think the issue needs to be checked by the ec team.
>>>>
>>>> On the brick side, it is showing the logs below:
>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>>
>>>> on wire in the future [Invalid argument]
>>>> The message "I [MSGID: 101016] [glusterfs3.h:746:dict_to_xdr] 0-dict: key 'trusted.ec.dirty' would not be sent on wire in the future [Invalid argument]" repeated 3 times between [2019-01-12 12:25:25.902828] and [2019-01-12 12:25:25.902992]
>>>> [2019-01-12 12:25:25.903553] W [MSGID: 114031] [client-rpc-fops_v2.c:1614:client4_0_fxattrop_cbk] 0-patchy-client-1: remote operation failed [Bad file descriptor]
>>>> [2019-01-12 12:25:25.903998] W [MSGID: 122040] [ec-common.c:1181:ec_prepare_update_cbk] 0-patchy-disperse-0: Failed to get size and version : FOP : 'FXATTROP' failed on gfid d91f6331-d394-479d-ab51-6bcf674ac3e0 [Input/output error]
>>>> [2019-01-12 12:25:25.904059] W [fuse-bridge.c:1907:fuse_unlink_cbk] 0-glusterfs-fuse: 3259: UNLINK() /test/0.o => -1 (Input/output error)
>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>>
>>>> The test case is timing out because the "volume heal $V0 full" command is stuck; it looks like shd is stuck in getxattr:
>>>>
>>>> >>>>>>>>>>>>>>.
>>>>
>>>> Thread 8 (Thread 0x7f83777fe700 (LWP 25552)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f83777fdbb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030880, child=<optimized out>, loc=0x7f83777fdbb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a80094b0, entry=<optimized out>, parent=0x7f83777fdde0, data=0x7f83a8030880) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a80094b0, loc=loc@entry=0x7f83777fdde0, pid=pid@entry=-6, data=data@entry=0x7f83a8030880, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a8030880, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a8030880) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 7 (Thread 0x7f8376ffd700 (LWP 25553)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f8376ffcbb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a80308f0, child=<optimized out>, loc=0x7f8376ffcbb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a800d110, entry=<optimized out>, parent=0x7f8376ffcde0, data=0x7f83a80308f0) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a800d110, loc=loc@entry=0x7f8376ffcde0, pid=pid@entry=-6, data=data@entry=0x7f83a80308f0, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a80308f0, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a80308f0) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 6 (Thread 0x7f83767fc700 (LWP 25554)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f83767fbbb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030960, child=<optimized out>, loc=0x7f83767fbbb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a8010af0, entry=<optimized out>, parent=0x7f83767fbde0, data=0x7f83a8030960) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a8010af0, loc=loc@entry=0x7f83767fbde0, pid=pid@entry=-6, data=data@entry=0x7f83a8030960, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a8030960, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a8030960) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 5 (Thread 0x7f8375ffb700 (LWP 25555)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f8375ffabb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a80309d0, child=<optimized out>, loc=0x7f8375ffabb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a80144d0, entry=<optimized out>, parent=0x7f8375ffade0, data=0x7f83a80309d0) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a80144d0, loc=loc@entry=0x7f8375ffade0, pid=pid@entry=-6, data=data@entry=0x7f83a80309d0, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a80309d0, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a80309d0) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 4 (Thread 0x7f83757fa700 (LWP 25556)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f83757f9bb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030a40, child=<optimized out>, loc=0x7f83757f9bb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a8017eb0, entry=<optimized out>, parent=0x7f83757f9de0, data=0x7f83a8030a40) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a8017eb0, loc=loc@entry=0x7f83757f9de0, pid=pid@entry=-6, data=data@entry=0x7f83a8030a40, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a8030a40, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a8030a40) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 3 (Thread 0x7f8374ff9700 (LWP 25557)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f8374ff8bb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030ab0, child=<optimized out>, loc=0x7f8374ff8bb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a801b890, entry=<optimized out>, parent=0x7f8374ff8de0, data=0x7f83a8030ab0) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a801b890, loc=loc@entry=0x7f8374ff8de0, pid=pid@entry=-6, data=data@entry=0x7f83a8030ab0, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a8030ab0, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a8030ab0) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 2 (Thread 0x7f8367fff700 (LWP 25558)):
>>>> #0  0x00007f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc910e5b in syncop_getxattr (subvol=<optimized out>, loc=loc@entry=0x7f8367ffebb0, dict=dict@entry=0x0, key=key@entry=0x7f83add06a28 "trusted.ec.heal", xdata_in=xdata_in@entry=0x0, xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>>> #2  0x00007f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030b20, child=<optimized out>, loc=0x7f8367ffebb0, full=<optimized out>) at ec-heald.c:161
>>>> #3  0x00007f83add0325b in ec_shd_full_heal (subvol=0x7f83a801f270, entry=<optimized out>, parent=0x7f8367ffede0, data=0x7f83a8030b20) at ec-heald.c:294
>>>> #4  0x00007f83bc930ac2 in syncop_ftw (subvol=0x7f83a801f270, loc=loc@entry=0x7f8367ffede0, pid=pid@entry=-6, data=data@entry=0x7f83a8030b20, fn=fn@entry=0x7f83add03140 <ec_shd_full_heal>) at syncop-utils.c:125
>>>> #5  0x00007f83add03534 in ec_shd_full_sweep (healer=healer@entry=0x7f83a8030b20, inode=<optimized out>) at ec-heald.c:311
>>>> #6  0x00007f83add0367b in ec_shd_full_healer (data=0x7f83a8030b20) at ec-heald.c:372
>>>> #7  0x00007f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>>> #8  0x00007f83bafd634d in clone () from /usr/lib64/libc.so.6
>>>> Thread 1 (Thread 0x7f83bcdd1780 (LWP 25383)):
>>>> #0  0x00007f83bb70af57 in pthread_join () from /usr/lib64/libpthread.so.0
>>>> #1  0x00007f83bc92eff8 in event_dispatch_epoll (event_pool=0x55af0a6dd560) at event-epoll.c:846
>>>> #2  0x000055af0a4116b8 in main (argc=15, argv=0x7fff75610898) at glusterfsd.c:2848
>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>.
>>>>
>>>> Thanks,
>>>> Mohit Agrawal
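All the healer threads above are parked at the same spot, and that pattern is
telling: the syncop_* wrappers turn an async FOP into a synchronous call by
winding the FOP and then blocking the calling thread on a condition variable
until the callback fires, with no timeout. If the getxattr reply for
trusted.ec.heal is never delivered, every full-heal thread sits in
pthread_cond_wait() indefinitely, which matches the stacks. A compilable
sketch of that wait pattern, using illustrative names rather than the actual
syncop.c internals:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

struct syncargs {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int done;       /* set by the callback */
    int op_ret;
};

/* Completion callback, normally invoked from an event/epoll thread. */
static void demo_cbk(struct syncargs *args, int op_ret)
{
    pthread_mutex_lock(&args->mutex);
    args->op_ret = op_ret;
    args->done = 1;
    pthread_cond_signal(&args->cond);   /* wake the waiting syncop */
    pthread_mutex_unlock(&args->mutex);
}

static struct syncargs *pending;        /* demo plumbing */

static void *event_thread(void *unused)
{
    sleep(1);                           /* pretend the reply took 1s */
    demo_cbk(pending, 0);               /* if this never runs: hang */
    return NULL;
}

/* Synchronous wrapper in the style of syncop_getxattr(). */
static int demo_syncop(void)
{
    struct syncargs args = { PTHREAD_MUTEX_INITIALIZER,
                             PTHREAD_COND_INITIALIZER, 0, 0 };
    pthread_t t;

    pending = &args;
    pthread_create(&t, NULL, event_thread, NULL);   /* "wind" the FOP */

    pthread_mutex_lock(&args.mutex);
    while (!args.done)                  /* no timeout here */
        pthread_cond_wait(&args.cond, &args.mutex);
    pthread_mutex_unlock(&args.mutex);

    pthread_join(t, NULL);
    return args.op_ret;
}

int main(void)
{
    printf("syncop returned %d\n", demo_syncop());
    return 0;
}

So the interesting question is probably why the trusted.ec.heal getxattr is
wound but never answered, rather than anything in shd itself spinning.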
>>>> On Fri 11 Jan, 2019, 21:20 Shyam Ranganathan <srang...@redhat.com> wrote:
>>>>
>>>>> We can check health on master post the patch, as stated by Mohit below.
>>>>>
>>>>> Release-5 is causing some concerns as we needed to tag the release
>>>>> yesterday, but we have the following 2 tests failing or coredumping
>>>>> pretty regularly; they need attention:
>>>>>
>>>>> ec/bug-1236065.t
>>>>> glusterd/add-brick-and-validate-replicated-volume-options.t
>>>>>
>>>>> Shyam
>>>>>
>>>>> On 1/10/19 6:20 AM, Mohit Agrawal wrote:
>>>>> > I think we should consider regression-builds after merging the patch
>>>>> > (https://review.gluster.org/#/c/glusterfs/+/21990/),
>>>>> > as we know this patch introduced some delay.
>>>>> >
>>>>> > Thanks,
>>>>> > Mohit Agrawal
>>>>> >
>>>>> > On Thu, Jan 10, 2019 at 3:55 PM Atin Mukherjee <amukh...@redhat.com> wrote:
>>>>> >
>>>>> >     Mohit, Sanju - request you to investigate the failures related to
>>>>> >     glusterd and brick-mux and report back to the list.
>>>>> >
>>>>> >     On Thu, Jan 10, 2019 at 12:25 AM Shyam Ranganathan <srang...@redhat.com> wrote:
>>>>> >
>>>>> >         Hi,
>>>>> >
>>>>> >         As part of branching preparation next week for release-6, please
>>>>> >         find test failures and respective test links here [1].
>>>>> >
>>>>> >         The top tests that are failing or dumping core are as below and
>>>>> >         need attention:
>>>>> >         - ec/bug-1236065.t
>>>>> >         - glusterd/add-brick-and-validate-replicated-volume-options.t
>>>>> >         - readdir-ahead/bug-1390050.t
>>>>> >         - glusterd/brick-mux-validation.t
>>>>> >         - bug-1432542-mpx-restart-crash.t
>>>>> >
>>>>> >         Others of interest:
>>>>> >         - replicate/bug-1341650.t
>>>>> >
>>>>> >         Please file a bug if needed against the test case and report the
>>>>> >         same here; in case a problem is already addressed, send back the
>>>>> >         patch details that address the issue as a response to this mail.
>>>>> >
>>>>> >         Thanks,
>>>>> >         Shyam
>>>>> >
>>>>> >         [1] Regression failures: https://hackmd.io/wsPgKjfJRWCP8ixHnYGqcA?view
>
> --
> --Atin
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel