I updated my Origin 3.9 install on CentOS.  After rebooting, Heketi won't
start anymore.  When I try to start the heketi-storage container, I get the
following message in the pod events:

(combined from similar events): MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/82562195-8ddd-11e8-bcc6-525400887c40/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o backup-volfile-servers=192.168.2.139:192.168.2.140:192.168.2.141,log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-2-tmv2w-glusterfs.log 192.168.2.139:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/82562195-8ddd-11e8-bcc6-525400887c40/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-19776.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2018-07-22 18:52:24.527924] I [fuse-bridge.c:5839:fini] 0-fuse: Closing fuse connection to '/var/lib/origin/openshift.local.volumes/pods/82562195-8ddd-11e8-bcc6-525400887c40/volumes/kubernetes.io~glusterfs/db'.
[2018-07-22 18:52:24.528231] W [glusterfsd.c:1300:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e25) [0x7f10eb904e25] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55675e9840d5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55675e983efb] ) 0-: received signum (15), shutting down

All 3 of my gluster containers are running.  I turned up logging to see if
I could find the issue.  Here's what the mount log has:

[2018-07-22 18:34:10.681591] I [rpc-clnt.c:2001:rpc_clnt_reconfig]
0-heketidbstorage-client-0: changing port to 49152 (from 0)
[2018-07-22 18:34:10.684468] E [MSGID: 114058]
[client-handshake.c:1564:client_query_portmap_cbk]
0-heketidbstorage-client-1: failed to get the port number for remote
subvolume. Please run 'gluster volume status' on server to see if brick
process is running.
[2018-07-22 18:34:10.684567] I [MSGID: 114018]
[client.c:2280:client_rpc_notify] 0-heketidbstorage-client-1: disconnected
from heketidbstorage-client-1. Client process will keep trying to connect
to glusterd until brick's port is available
[2018-07-22 18:34:10.686084] I [rpc-clnt.c:2001:rpc_clnt_reconfig]
0-heketidbstorage-client-2: changing port to 49152 (from 0)
[2018-07-22 18:34:10.688890] I [MSGID: 114057]
[client-handshake.c:1477:select_server_supported_programs]
0-heketidbstorage-client-2: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2018-07-22 18:34:10.688989] I [MSGID: 114057]
[client-handshake.c:1477:select_server_supported_programs]
0-heketidbstorage-client-0: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2018-07-22 18:34:10.689312] W [MSGID: 114043]
[client-handshake.c:1108:client_setvolume_cbk] 0-heketidbstorage-client-2:
failed to set the volume [Permission denied]
[2018-07-22 18:34:10.689334] W [MSGID: 114007]
[client-handshake.c:1137:client_setvolume_cbk] 0-heketidbstorage-client-2:
failed to get 'process-uuid' from reply dict [Invalid argument]
[2018-07-22 18:34:10.689346] E [MSGID: 114044]
[client-handshake.c:1143:client_setvolume_cbk] 0-heketidbstorage-client-2:
SETVOLUME on remote-host failed: Authentication failed [Permission denied]
[2018-07-22 18:34:10.689366] I [MSGID: 114049]
[client-handshake.c:1257:client_setvolume_cbk] 0-heketidbstorage-client-2:
sending AUTH_FAILED event
[2018-07-22 18:34:10.689389] E [fuse-bridge.c:5328:notify] 0-fuse: Server
authenication failed. Shutting down.
[2018-07-22 18:34:10.689407] I [fuse-bridge.c:5834:fini] 0-fuse: Unmounting
'/tmp/mnt'.
[2018-07-22 18:34:10.689456] I [fuse-bridge.c:5839:fini] 0-fuse: Closing
fuse connection to '/tmp/mnt'.
[2018-07-22 18:34:10.689511] I [MSGID: 114046]
[client-handshake.c:1230:client_setvolume_cbk] 0-heketidbstorage-client-0:
Connected to heketidbstorage-client-0, attached to remote volume
'/var/lib/heketi/mounts/vg_113359a58d40f5a772995c5db1cfa55f/brick_3e0b346013c97202e45fa662c28675b7/brick'.
[2018-07-22 18:34:10.689530] I [MSGID: 114047]
[client-handshake.c:1241:client_setvolume_cbk] 0-heketidbstorage-client-0:
Server and Client lk-version numbers are not same, reopening the fds
[2018-07-22 18:34:10.689587] I [MSGID: 108005]
[afr-common.c:4706:afr_notify] 0-heketidbstorage-replicate-0: Subvolume
'heketidbstorage-client-0' came back up; going online.
[2018-07-22 18:34:10.689893] W [glusterfsd.c:1300:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7e25) [0x7f68d0e51e25]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x556f6c5e00d5]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x556f6c5dfefb] ) 0-:
received signum (15), shutting down
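
Since most of that log is informational, a small filter can pull out only
the error and warning records. This is a rough sketch, not an official
gluster tool: it assumes every record begins with a bracketed timestamp
followed by a one-letter level (I, W, E) as in the paste above, and that
wrapped lines belong to the record before them. The names SAMPLE,
split_records and filter_by_level are made up for illustration.

```python
import re

# A record starts with "[YYYY-MM-DD HH:MM:SS.ffffff] <LEVEL> ".
# Assumption: continuation lines never begin with such a timestamp.
RECORD_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] ([A-Z]) "
)

# Two records copied from the mount log, wrapped the way the mail client left them.
SAMPLE = """\
[2018-07-22 18:34:10.681591] I [rpc-clnt.c:2001:rpc_clnt_reconfig]
0-heketidbstorage-client-0: changing port to 49152 (from 0)
[2018-07-22 18:34:10.689346] E [MSGID: 114044]
[client-handshake.c:1143:client_setvolume_cbk] 0-heketidbstorage-client-2:
SETVOLUME on remote-host failed: Authentication failed [Permission denied]
"""


def split_records(text):
    """Join wrapped lines back into one string per log record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line):
            records.append(line)          # a new record begins here
        elif records:
            records[-1] += " " + line     # continuation of the previous record
    return records


def filter_by_level(text, levels=("E", "W")):
    """Return (timestamp, level, rest) tuples for records at the given levels."""
    hits = []
    for record in split_records(text):
        match = RECORD_START.match(record)
        if match.group(2) in levels:
            hits.append((match.group(1), match.group(2), record[match.end():]))
    return hits


if __name__ == "__main__":
    for timestamp, level, rest in filter_by_level(SAMPLE):
        print(timestamp, level, rest)
```

Run against the full mount log, this leaves the portmap failure on
client-1 and the SETVOLUME "Authentication failed" records on client-2,
which are the ones that look relevant to me.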

Any thoughts?

Thanks
