Hi, yesterday I got a strange crash on almost all bricks: the same type of crash, repeated each time:
[2015-06-09 18:23:56.407520] I [login.c:81:gf_auth] 0-auth/login: allowed user
names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.407580] I [server-handshake.c:585:server_setvolume]
0-atlas-data-01-server: accepted client from
atlas-storage-10.roma1.infn.it-7546-2015/06/09-18:23:55:618600-atlas-data-01-client-0-0-0
(version: 3.7.1)
[2015-06-09 18:23:56.407707] I [login.c:81:gf_auth] 0-auth/login: allowed user
names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.407772] I [server-handshake.c:585:server_setvolume]
0-atlas-data-01-server: accepted client from
atlas-storage-09.roma1.infn.it-25429-2015/06/09-18:18:57:328935-atlas-data-01-client-0-0-0
(version: 3.7.1)
[2015-06-09 18:23:56.415905] I [login.c:81:gf_auth] 0-auth/login: allowed user
names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.415947] I [server-handshake.c:585:server_setvolume]
0-atlas-data-01-server: accepted client from
atlas-storage-10.roma1.infn.it-7530-2015/06/09-18:23:54:608880-atlas-data-01-client-0-0-0
(version: 3.7.1)
[2015-06-09 18:23:56.433956] E [posix-handle.c:157:posix_make_ancestryfromgfid]
0-atlas-data-01-posix: could not read the link from the gfid handle
/bricks/atlas/data01/data/.glusterfs/74/4b/744b7cf0-258f-4dea-b4d9-7001bb21ca56
(No such file or directory)
[2015-06-09 18:23:56.433954] E [posix-handle.c:157:posix_make_ancestryfromgfid]
0-atlas-data-01-posix: could not read the link from the gfid handle
/bricks/atlas/data01/data/.glusterfs/74/4b/744b7cf0-258f-4dea-b4d9-7001bb21ca56
(No such file or directory)
pending frames:
frame : type(0) op(11)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-06-09 18:23:56
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f0f6446ed92]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f0f644899ed]
/lib64/libc.so.6(+0x35650)[0x7f0f62e60650]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(upcall_cache_invalidate+0xb5)[0x7f0f5537cab5]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(up_readdir_cbk+0x1a2)[0x7f0f55376292]
/usr/lib64/glusterfs/3.7.1/xlator/features/locks.so(pl_readdirp_cbk+0x164)[0x7f0f5558dc94]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_readdirp_cbk+0x299)[0x7f0f557a6829]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_readdirp_cbk+0x181)[0x7f0f559b5fb1]
/usr/lib64/glusterfs/3.7.1/xlator/storage/posix.so(posix_readdirp+0x143)[0x7f0f56f0cfc3]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_readdirp+0x246)[0x7f0f559b0d46]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_readdirp+0x18d)[0x7f0f557a45cd]
/usr/lib64/glusterfs/3.7.1/xlator/features/locks.so(pl_readdirp+0x14e)[0x7f0f5558c7ee]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(up_readdirp+0x17a)[0x7f0f5537abfa]
/lib64/libglusterfs.so.0(default_readdirp_resume+0x134)[0x7f0f644809e4]
/lib64/libglusterfs.so.0(call_resume+0x7d)[0x7f0f64498c7d]
/usr/lib64/glusterfs/3.7.1/xlator/performance/io-threads.so(iot_worker+0x123)[0x7f0f5516b353]
/lib64/libpthread.so.0(+0x7df5)[0x7f0f635dadf5]
/lib64/libc.so.6(clone+0x6d)[0x7f0f62f211ad]
---------
I’m not sure whether the missing file is the culprit, but if it is, how can I
fix it? For the moment I’ve recreated the bricks from a backup, so I’m fine,
but it would be nice to understand what to do if it happens again. I still
have the contents of the old crashed brick.
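For the record, here is a sketch of how one could inspect the dangling gfid handle on the saved brick copy. The brick path and gfid are taken from the log above; the find pipeline is my own rough approach, not an official Gluster tool, and it will be slow on a large brick:

```shell
# Brick path and gfid taken from the crash log above (adjust as needed).
BRICK=/bricks/atlas/data01/data
GFID=744b7cf0-258f-4dea-b4d9-7001bb21ca56

# GlusterFS keeps the handle under .glusterfs/<first 2 hex>/<next 2 hex>/<gfid>;
# for a directory it is a symlink, for a regular file a hardlink.
P1=$(printf %s "$GFID" | cut -c1-2)
P2=$(printf %s "$GFID" | cut -c3-4)
HANDLE="$BRICK/.glusterfs/$P1/$P2/$GFID"
echo "expected handle: $HANDLE"

if [ -d "$BRICK" ]; then
    # Does the handle exist at all, and where does it point?
    ls -l "$HANDLE" 2>/dev/null || echo "handle missing: $HANDLE"
    # Search the brick for the object carrying this gfid xattr
    # (run it on the offline copy, not a live brick).
    find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -print0 |
        xargs -0 getfattr -n trusted.gfid -e hex 2>/dev/null |
        grep -B1 -i "$(printf %s "$GFID" | tr -d -- -)"
fi
```

If the object with that gfid still exists on the brick, the handle can in principle be recreated (symlink for a directory, hardlink for a file), but I’d be glad to hear from the developers whether that is the supported fix.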
The crash happened the same way every time I restarted glusterd.
I’m using gluster 3.7.1 on CentOS 7.1, with the following kind of configuration:
# gluster volume info atlas-data-01
Volume Name: atlas-data-01
Type: Replicate
Volume ID: 854620a1-3e88-4e76-91ce-486996bf6a12
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1:/bricks/atlas/data01/data
Brick2: node2:/bricks/atlas/data01/data
Brick3: node3:/bricks/atlas/data02/data
Options Reconfigured:
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
nfs.disable: true
server.allow-insecure: on
ganesha.enable: off
nfs-ganesha: disable
I was playing with ganesha and tried to enable it on the volumes (but failed,
as you can see from my other messages). I’m not sure it is related, but all
the crashed bricks belong to the volumes where I tried to enable ganesha.
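Since the backtrace dies in upcall_cache_invalidate, my guess is that the ganesha attempt left an upcall-related option enabled on those volumes. This is how I would check; I’m assuming the option name features.cache-invalidation here, so please verify it against your version’s `gluster volume get` output:

```shell
VOL=atlas-data-01

# Guard so this is a no-op on machines without the gluster CLI.
if command -v gluster >/dev/null 2>&1; then
    # Check whether the failed ganesha setup left upcall options on.
    gluster volume get "$VOL" features.cache-invalidation
    # If it reports "on" and no ganesha/NFS clients need it, disable it:
    #   gluster volume set "$VOL" features.cache-invalidation off
else
    echo "gluster CLI not found; run on a storage node"
fi
```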
Thanks,
Alessandro
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
