On 28.08.2018 10:43, Amar Tumballi wrote:


On Tue, Aug 28, 2018 at 11:24 AM, Dmitry Melekhov <[email protected]> wrote:

    Hello!


    Yesterday we hit something like this on 4.1.2 (CentOS 7.5).


    The volume is replicated: two bricks plus one arbiter.
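
    That is a standard "replica 3 arbiter 1" volume, created along these lines
    (the hostnames and brick paths below are only illustrative, not our real ones):

        gluster volume create pool replica 3 arbiter 1 \
            node1:/data/brick/pool node2:/data/brick/pool node3:/data/brick/pool
        gluster volume start pool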


    We rebooted the arbiter, waited for the heal to finish, and then tried to
    live-migrate a VM to another node (we run the VMs on the gluster nodes
    themselves):
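
    By "waited for heal to finish" I mean we watched heal info until every brick
    reported zero pending entries - roughly the following, not an exact transcript
    of our session:

        # all bricks must show "Number of entries: 0" before migrating
        gluster volume heal pool info
        watch -n 10 'gluster volume heal pool info | grep "Number of entries"'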


    [2018-08-27 09:56:22.085411] I [MSGID: 115029] [server-handshake.c:763:server_setvolume] 0-pool-server: accepted client from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0 (version: 4.1.2)
    [2018-08-27 09:56:22.107609] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
    [2018-08-27 09:56:22.107747] I [MSGID: 101055] [client_t.c:444:gf_client_unref] 0-pool-server: Shutting down connection CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
    [2018-08-27 09:58:37.905829] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:pool-client-6-RECON_NO:-0
    [2018-08-27 09:58:37.905926] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}
    [2018-08-27 09:58:37.905959] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}
    [2018-08-27 09:58:37.905979] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=2880a7d6bc550000}
    [2018-08-27 09:58:37.905997] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28f031d8bc550000}
    [2018-08-27 09:58:37.906016] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28b07dd5bc550000}
    [2018-08-27 09:58:37.906034] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28e0a7d6bc550000}
    [2018-08-27 09:58:37.906056] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28b845d8bc550000}
    [2018-08-27 09:58:37.906079] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=2858a7d8bc550000}
    [2018-08-27 09:58:37.906098] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=2868a8d7bc550000}
    [2018-08-27 09:58:37.906121] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28f80bd7bc550000}
    ...

    [2018-08-27 09:58:37.907375] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=28a8cdd6bc550000}
    [2018-08-27 09:58:37.907393] W
    [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing
    lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by
    {client=0x7ffb58035bc0, pid=30292 lk-owner=2880cdd6bc550000}
    [2018-08-27 09:58:37.907476] I [socket.c:3837:socket_submit_reply] 0-tcp.pool-server: not connected (priv->connected = -1)
    [2018-08-27 09:58:37.907520] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cb, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
    [2018-08-27 09:58:37.910727] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
    [2018-08-27 09:58:37.910814] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88ce, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
    [2018-08-27 09:58:37.910861] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
    [2018-08-27 09:58:37.910904] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cf, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
    [2018-08-27 09:58:37.910940] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
    [2018-08-27 09:58:37.910979] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d1, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
    [2018-08-27 09:58:37.911012] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
    [2018-08-27 09:58:37.911050] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d8, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
    [2018-08-27 09:58:37.911083] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
    [2018-08-27 09:58:37.916217] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
    [2018-08-27 09:58:37.916520] I [MSGID: 115013] [server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd cleanup on /balamak.img


    After this, I/O on /balamak.img was blocked.


    The only solution we found was to reboot all 3 nodes.


    Is there any bug report in Bugzilla that we can add our logs to?


Not aware of such bugs!

    Is it possible to turn off these locks?


Not sure, I will get back to you on this one!


By the way, I found this link on troubleshooting file locks:
https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-filelocks/
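
The procedure described there is, roughly: take a statedump of the volume, grep
the dump files for the stuck inode locks, then release them with clear-locks.
Something like the sketch below - the lock range "0,0-0" is the doc's example
value, not something taken from our dumps:

# dump brick state; the dump files normally land under /var/run/gluster/
gluster volume statedump pool

# look for granted/blocked inode locks on the affected file
grep -A 5 inodelk /var/run/gluster/*.dump.*

# clear a stale granted inode lock on the file (range here is the doc's example)
gluster volume clear-locks pool /balamak.img kind granted inode 0,0-0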

I tried the first step (the statedump) on another (test) cluster:

 [root@marduk ~]# gluster volume statedump pool
Segmentation fault (core dumped)


That cluster runs 4.1.2 too...

Something is wrong here.
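
If a backtrace from that crash would help, I can pull one out of the core dump,
e.g. (the core path below is just a placeholder for wherever it was written on
that box):

gdb /usr/sbin/gluster /path/to/core   # open the core against the gluster CLI binary
(gdb) bt full                         # full backtrace of the crashing thread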


    Thank you!




--
Amar Tumballi (amarts)


