On Tue, Sep 11, 2018 at 2:13 AM, <g.vasilopou...@uoc.gr> wrote:

> It seems that a VM with 3 disks (boot in domain engine, another disk in
> domain vol1, and a third in domain v3) became unresponsive when one gluster
> host went down.
> To explain the situation a bit: I have 3 glusterfs hosts with 3 volumes.
> The hosts are g1, g2, g3 and each has 3 bricks:
>   g1 has vol1, vol2 and the vol3 arbiter
>   g2 has vol1, the vol2 arbiter and vol3
>   g3 has the vol1 arbiter, vol2 and vol3
> libgfapi is enabled. I put a host in maintenance to update the BIOS, and
> the VM that had disks in two domains became unresponsive.
> Is this normal? The qemu logs show what it tries. The domain configuration
> shows host1 as primary for vol1 and host2 as primary for vol3, with the
> other two as backup-volfile servers.
> It seems it always tries to connect to the server that is down and not to
> one of the alternative hosts.
> Is this a libgfapi/libvirt problem?
>
Yes, this is a gfapi + libvirt issue. See
https://bugzilla.redhat.com/show_bug.cgi?id=1484660 for details.

> Here are some libvirt logs showing what it tries to do:
> [2018-09-10 19:43:42.876114] T [socket.c:3133:socket_connect] 0-vol1-client-2: connecting 0x55ed673525c0, state=2 gen=0 sock=-1
> [2018-09-10 19:43:42.876124] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-vol1-client-2: option remote-port missing in volume vol1-client-2. Defaulting to 24007
> [2018-09-10 19:43:42.878566] D [socket.c:3051:socket_fix_ssl_opts] 0-vol1-client-2: disabling SSL for portmapper connection
> [2018-09-10 19:43:42.878770] T [socket.c:834:__socket_nodelay] 0-vol1-client-2: NODELAY enabled for socket 30
> [2018-09-10 19:43:42.878780] T [socket.c:920:__socket_keepalive] 0-vol1-client-2: Keep-alive enabled for socket: 30, (idle: 20, interval: 2, max-probes: 9, timeout: 0)
> [2018-09-10 19:43:42.878830] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol3-client-1: attempting reconnect
> [2018-09-10 19:43:42.878846] T [socket.c:3133:socket_connect] 0-vol3-client-1: connecting 0x55ed673546c0, state=2 gen=0 sock=-1
> [2018-09-10 19:43:42.878856] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-vol3-client-1: option remote-port missing in volume vol3-client-1.
> Defaulting to 24007
> [2018-09-10 19:43:42.881229] D [socket.c:3051:socket_fix_ssl_opts] 0-vol3-client-1: disabling SSL for portmapper connection
> [2018-09-10 19:43:42.881255] T [socket.c:834:__socket_nodelay] 0-vol3-client-1: NODELAY enabled for socket 38
> [2018-09-10 19:43:42.881264] T [socket.c:920:__socket_keepalive] 0-vol3-client-1: Keep-alive enabled for socket: 38, (idle: 20, interval: 2, max-probes: 9, timeout: 0)
> [2018-09-10 19:43:45.569298] T [socket.c:724:__socket_disconnect] 0-vol3-client-1: disconnecting 0x55ed673546c0, state=2 gen=0 sock=38
> [2018-09-10 19:43:45.569308] T [socket.c:724:__socket_disconnect] 0-vol1-client-2: disconnecting 0x55ed673525c0, state=2 gen=0 sock=30
> [2018-09-10 19:43:45.570000] T [socket.c:728:__socket_disconnect] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x4ea0)[0x7fdda7bbfea0] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x530a)[0x7fdda7bc030a] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x9a08)[0x7fdda7bc4a08] (--> /lib64/libglusterfs.so.0(+0x883c4)[0x7fddbae093c4] ))))) 0-vol3-client-1: tearing down socket connection
> [2018-09-10 19:43:45.570020] D [socket.c:686:__socket_shutdown] 0-vol3-client-1: shutdown() returned -1.
> Transport endpoint is not connected
> [2018-09-10 19:43:45.570038] D [socket.c:733:__socket_disconnect] 0-vol3-client-1: __socket_teardown_connection () failed: Transport endpoint is not connected
> [2018-09-10 19:43:45.570043] D [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2018-09-10 19:43:45.570907] T [socket.c:728:__socket_disconnect] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x4ea0)[0x7fdda7bbfea0] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x530a)[0x7fdda7bc030a] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x9a08)[0x7fdda7bc4a08] (--> /lib64/libglusterfs.so.0(+0x883c4)[0x7fddbae093c4] ))))) 0-vol1-client-2: tearing down socket connection
> [2018-09-10 19:43:45.570928] D [socket.c:686:__socket_shutdown] 0-vol1-client-2: shutdown() returned -1. Transport endpoint is not connected
> [2018-09-10 19:43:45.570936] D [socket.c:733:__socket_disconnect] 0-vol1-client-2: __socket_teardown_connection () failed: Transport endpoint is not connected
> [2018-09-10 19:43:45.570940] D [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2018-09-10 19:43:45.570960] D [rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fddbab7828b] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fddbab7460f] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fddbab75130] (--> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fddbab70ea3] ))))) 0-: 10.xxx.xxx.130:24007: ping timer event already removed
> [2018-09-10 19:43:45.571098] D [rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fddbab7828b] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fddbab7460f] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fddbab75130] (--> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fddbab70ea3] ))))) 0-: 10.xxx.xxx.130:24007: ping timer event already removed
> [2018-09-10 19:43:45.878885] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol1-client-2: attempting reconnect
> [2018-09-10 19:43:45.881546] T [socket.c:834:__socket_nodelay] 0-vol1-client-2: NODELAY enabled for socket 38
> [2018-09-10 19:43:45.881555] T [socket.c:920:__socket_keepalive] 0-vol1-client-2: Keep-alive enabled for socket: 38, (idle: 20, interval: 2, max-probes: 9, timeout: 0)
> [2018-09-10 19:43:45.883839] D [socket.c:3051:socket_fix_ssl_opts] 0-vol3-client-1: disabling SSL for portmapper connection
> [2018-09-10 19:43:45.883878] T [socket.c:834:__socket_nodelay] 0-vol3-client-1: NODELAY enabled for socket 30
> [2018-09-10 19:43:45.883886] T [socket.c:920:__socket_keepalive] 0-vol3-client-1: Keep-alive enabled for socket: 30, (idle: 20, interval: 2, max-probes: 9, timeout: 0)
> [2018-09-10 19:43:48.575316] T [socket.c:724:__socket_disconnect] 0-vol3-client-1: disconnecting 0x55ed673546c0, state=2 gen=0 sock=30
> [2018-09-10 19:43:48.575329] T [socket.c:724:__socket_disconnect] 0-vol1-client-2: disconnecting 0x55ed673525c0, state=2 gen=0 sock=38
> [2018-09-10 19:43:48.576022] T [socket.c:728:__socket_disconnect] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x4ea0)[0x7fdda7bbfea0] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x530a)[0x7fdda7bc030a] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x9a08)[0x7fdda7bc4a08] (--> /lib64/libglusterfs.so.0(+0x883c4)[0x7fddbae093c4] ))))) 0-vol3-client-1: tearing down socket connection
> [2018-09-10 19:43:48.576045] D [socket.c:686:__socket_shutdown] 0-vol3-client-1: shutdown() returned -1. Transport endpoint is not connected
> [2018-09-10 19:43:48.576054] D [socket.c:733:__socket_disconnect] 0-vol3-client-1: __socket_teardown_connection () failed: Transport endpoint is not connected
> [2018-09-10 19:43:48.576059] D [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2018-09-10 19:43:48.576079] T [socket.c:728:__socket_disconnect] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x4ea0)[0x7fdda7bbfea0] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x530a)[0x7fdda7bc030a] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x9a08)[0x7fdda7bc4a08] (--> /lib64/libglusterfs.so.0(+0x883c4)[0x7fddbae093c4] ))))) 0-vol1-client-2: tearing down socket connection
> [2018-09-10 19:43:48.576099] D [socket.c:686:__socket_shutdown] 0-vol1-client-2: shutdown() returned -1.
> Transport endpoint is not connected
> [2018-09-10 19:43:48.576106] D [socket.c:733:__socket_disconnect] 0-vol1-client-2: __socket_teardown_connection () failed: Transport endpoint is not connected
> [2018-09-10 19:43:48.576111] D [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2018-09-10 19:43:48.576879] D [rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fddbab7828b] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fddbab7460f] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fddbab75130] (--> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fddbab70ea3] ))))) 0-: 10.xxx.xxx.130:24007: ping timer event already removed
> [2018-09-10 19:43:48.576958] D [rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fddbab7828b] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fddbab7460f] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fddbab75130] (--> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fddbab70ea3] ))))) 0-: 10.xxx.xxx.130:24007: ping timer event already removed
> [2018-09-10 19:43:48.881651] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol1-client-2: attempting reconnect
> [2018-09-10 19:43:48.881667] T [socket.c:3133:socket_connect] 0-vol1-client-2: connecting 0x55ed673525c0, state=2 gen=0 sock=-1
> [2018-09-10 19:43:48.881689] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-vol1-client-2: option remote-port missing in volume vol1-client-2.
> Defaulting to 24007
> [2018-09-10 19:43:48.884056] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol3-client-1: attempting reconnect
> [2018-09-10 19:43:48.884072] T [socket.c:3133:socket_connect] 0-vol3-client-1: connecting 0x55ed673546c0, state=2 gen=0 sock=-1
> [2018-09-10 19:43:48.884084] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-vol3-client-1: option remote-port missing in volume vol3-client-1. Defaulting to 24007
> [2018-09-10 19:43:48.884190] D [socket.c:3051:socket_fix_ssl_opts] 0-vol1-client-2: disabling SSL for portmapper connection
> [2018-09-10 19:43:48.886524] T [socket.c:834:__socket_nodelay] 0-vol3-client-1: NODELAY enabled for socket 30
> [2018-09-10 19:43:48.886532] T [socket.c:920:__socket_keepalive] 0-vol3-client-1: Keep-alive enabled for socket: 30, (idle: 20, interval: 2, max-probes: 9, timeout: 0)
> [2018-09-10 19:43:51.581293] T [socket.c:724:__socket_disconnect] 0-vol3-client-1: disconnecting 0x55ed673546c0, state=2 gen=0 sock=30
> [2018-09-10 19:43:51.581293] T [socket.c:724:__socket_disconnect] 0-vol1-client-2: disconnecting 0x55ed673525c0, state=2 gen=0 sock=38
> [2018-09-10 19:43:51.582009] T [socket.c:728:__socket_disconnect] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x4ea0)[0x7fdda7bbfea0] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x530a)[0x7fdda7bc030a] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x9a08)[0x7fdda7bc4a08] (--> /lib64/libglusterfs.so.0(+0x883c4)[0x7fddbae093c4] ))))) 0-vol1-client-2: tearing down socket connection
> [2018-09-10 19:43:51.582030] D [socket.c:686:__socket_shutdown] 0-vol1-client-2: shutdown() returned -1.
> Transport endpoint is not connected
> [2018-09-10 19:43:51.582036] D [socket.c:733:__socket_disconnect] 0-vol1-client-2: __socket_teardown_connection () failed: Transport endpoint is not connected
> [2018-09-10 19:43:51.582040] D [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2018-09-10 19:43:51.582084] T [socket.c:728:__socket_disconnect] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x4ea0)[0x7fdda7bbfea0] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x530a)[0x7fdda7bc030a] (--> /usr/lib64/glusterfs/3.12.13/rpc-transport/socket.so(+0x9a08)[0x7fdda7bc4a08] (--> /lib64/libglusterfs.so.0(+0x883c4)[0x7fddbae093c4] ))))) 0-vol3-client-1: tearing down socket connection
> [2018-09-10 19:43:51.582105] D [socket.c:686:__socket_shutdown] 0-vol3-client-1: shutdown() returned -1. Transport endpoint is not connected
> [2018-09-10 19:43:51.582111] D [socket.c:733:__socket_disconnect] 0-vol3-client-1: __socket_teardown_connection () failed: Transport endpoint is not connected
> [2018-09-10 19:43:51.582116] D [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2018-09-10 19:43:51.582812] D [rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fddbab7828b] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fddbab7460f] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fddbab75130] (--> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fddbab70ea3] ))))) 0-: 10.xxx.xxx.130:24007: ping timer event already removed
> [2018-09-10 19:43:51.582865] D [rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fddbadade9b] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fddbab7828b] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fddbab7460f] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fddbab75130] (--> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fddbab70ea3] ))))) 0-: 10.xxx.xxx.130:24007: ping timer event already removed
> [2018-09-10 19:43:51.884349] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol1-client-2: attempting reconnect
> [2018-09-10 19:43:51.884367] T [socket.c:3133:socket_connect] 0-vol1-client-2: connecting 0x55ed673525c0, state=2 gen=0 sock=-1
> [2018-09-10 19:43:51.884376] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-vol1-client-2: option remote-port missing in volume vol1-client-2. Defaulting to 24007
> [2018-09-10 19:43:51.886644] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol3-client-1: attempting reconnect
> [2018-09-10 19:43:51.886659] T [socket.c:3133:socket_connect] 0-vol3-client-1: connecting 0x55ed673546c0, state=2 gen=0 sock=-1
> [2018-09-10 19:43:51.886669] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-vol3-client-1: option remote-port missing in volume vol3-client-1.
> Defaulting to 24007
> [2018-09-10 19:43:51.887251] D [socket.c:3051:socket_fix_ssl_opts] 0-vol1-client-2: disabling SSL for portmapper connection
> [2018-09-10 19:43:51.887281] T [socket.c:834:__socket_nodelay] 0-vol1-client-2: NODELAY enabled for socket 38
> [2018-09-10 19:43:51.887290] T [socket.c:920:__socket_keepalive] 0-vol1-client-2: Keep-alive enabled for socket: 38, (idle: 20, interval: 2, max-probes: 9, timeout: 0)
> [2018-09-10 19:43:51.889141] D [socket.c:3051:socket_fix_ssl_opts] 0-vol3-client-1: disabling SSL for portmapper connection
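
To check from a trace like the one quoted above whether the client keeps retrying the same endpoints, the "attempting reconnect" entries can be tallied per client. This is only a minimal sketch: the function name `count_reconnects` and the `sample` text are illustrative and not part of gluster or libvirt, and the pattern assumes one log entry per line in the format shown in the trace.

```python
import re
from collections import Counter

# Matches entries such as:
# [2018-09-10 19:43:45.878885] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol1-client-2: attempting reconnect
# The captured group is the client name (e.g. 0-vol1-client-2).
RECONNECT = re.compile(
    r"\[rpc-clnt\.c:\d+:rpc_clnt_reconnect\]\s+(\S+): attempting reconnect"
)


def count_reconnects(log_text):
    """Return a Counter mapping each client name to its reconnect attempts."""
    return Counter(RECONNECT.findall(log_text))


# Three entries copied from the trace, one per line.
sample = """\
[2018-09-10 19:43:45.878885] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol1-client-2: attempting reconnect
[2018-09-10 19:43:48.881651] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol1-client-2: attempting reconnect
[2018-09-10 19:43:48.884056] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-vol3-client-1: attempting reconnect
"""

for client, attempts in count_reconnects(sample).most_common():
    print(client, attempts)
```

A count that keeps growing for the same client names while the VM is stuck would be consistent with the behaviour described in the question, but it does not by itself confirm the cause; the bug report linked above is the place to follow that.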
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/GEGXFKOR5S62A27ZAKRPY53ORDXKVZJP/