This is the core file from the crash just now:
[root@psanaoss213 /]# ls -al core*
-rw------- 1 root root 4073594880 Jun 8 15:05 core.22682
From yesterday:
[root@psanaoss214 /]# ls -al core*
-rw------- 1 root root 4362727424 Jun 8 00:58 core.13483
-rw------- 1 root root 4624773120 Jun 8 03:21 core.8792
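To make the core sizes easier to compare against the 48GB of RAM on these servers, here is a small sketch that converts the byte counts from the ls output above into GiB. The dictionary and the helper name `to_gib` are only illustrative; the sizes are copied from the listings.

```python
GIB = 1024 ** 3  # bytes per GiB

# Sizes in bytes, copied from the ls output above.
core_sizes = {
    "core.22682": 4073594880,
    "core.13483": 4362727424,
    "core.8792": 4624773120,
}


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


if __name__ == "__main__":
    for name, size in core_sizes.items():
        print(f"{name}: {to_gib(size):.2f} GiB")
```

All three cores come out at roughly 4 GiB each, well below the installed RAM.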
On 06/08/2012 04:34 PM, Anand Avati wrote:
Is it possible the system was running low on memory? I see you have
48GB, but memory registration failure typically would be because the
system limit on the number of pinnable pages in RAM was hit. Can you
tell us the size of your core dump files after the crash?
Avati
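One limit commonly associated with pinned (locked) memory on Linux is RLIMIT_MEMLOCK. Whether that is the limit involved in this crash is an assumption, and processes running as root may not be bound by it, but it is cheap to check. A minimal sketch using Python's standard `resource` module (Unix only); the helper name `describe_memlock_limit` is made up for illustration, and it reports the limit of the process running the script, not of the brick process:

```python
import resource


def describe_memlock_limit():
    """Return the soft and hard RLIMIT_MEMLOCK values as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def fmt(value):
        # RLIM_INFINITY means there is no cap on locked memory.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        # The limit is reported in bytes; show it in KiB.
        return f"{value // 1024} KiB"

    return {"soft": fmt(soft), "hard": fmt(hard)}


if __name__ == "__main__":
    limits = describe_memlock_limit()
    print("max locked memory (soft):", limits["soft"])
    print("max locked memory (hard):", limits["hard"])
```

For an already running process, the same limit appears as "Max locked memory" in /proc/<pid>/limits.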
On Fri, Jun 8, 2012 at 4:22 PM, Ling Ho <[email protected]> wrote:
Hello,
I have a brick that crashed twice today, and a different
brick that crashed just a while ago.
This is what I see in one of the brick logs:
patchset: git://git.gluster.com/glusterfs.git
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
signal received: 6
time of crash: 2012-06-08 15:05:11
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.6
/lib64/libc.so.6[0x34bc032900]
/lib64/libc.so.6(gsignal+0x35)[0x34bc032885]
/lib64/libc.so.6(abort+0x175)[0x34bc034065]
/lib64/libc.so.6[0x34bc06f977]
/lib64/libc.so.6[0x34bc075296]
/opt/glusterfs/3.2.6/lib64/libglusterfs.so.0(__gf_free+0x44)[0x7f1740ba25e4]
/opt/glusterfs/3.2.6/lib64/libgfrpc.so.0(rpc_transport_destroy+0x47)[0x7f1740956967]
/opt/glusterfs/3.2.6/lib64/libgfrpc.so.0(rpc_transport_unref+0x62)[0x7f1740956a32]
/opt/glusterfs/3.2.6/lib64/glusterfs/3.2.6/rpc-transport/rdma.so(+0xc135)[0x7f173ca27135]
/lib64/libpthread.so.0[0x34bc8077f1]
/lib64/libc.so.6(clone+0x6d)[0x34bc0e5ccd]
---------
And somewhere before these, there is also
[2012-06-08 15:05:07.512604] E [rdma.c:198:rdma_new_post] 0-rpc-transport/rdma: memory registration failed
I have 48GB of memory on the system:
# free
             total       used       free     shared    buffers     cached
Mem:      49416716   34496648   14920068          0      31692   28209612
-/+ buffers/cache:    6255344   43161372
Swap:      4194296       1740    4192556
# uname -a
Linux psanaoss213 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
The server gluster version is 3.2.6-1. I have both rdma
clients and tcp clients over a 10Gb/s network.
Any suggestions on what I should look for?
Is there a way to just restart the brick, and not glusterd on the
server? I have 8 bricks on the server.
Thanks,
...
ling
Here's the volume info:
# gluster volume info
Volume Name: ana12
Type: Distribute
Status: Started
Number of Bricks: 40
Transport-type: tcp,rdma
Bricks:
Brick1: psanaoss214:/brick1
Brick2: psanaoss214:/brick2
Brick3: psanaoss214:/brick3
Brick4: psanaoss214:/brick4
Brick5: psanaoss214:/brick5
Brick6: psanaoss214:/brick6
Brick7: psanaoss214:/brick7
Brick8: psanaoss214:/brick8
Brick9: psanaoss211:/brick1
Brick10: psanaoss211:/brick2
Brick11: psanaoss211:/brick3
Brick12: psanaoss211:/brick4
Brick13: psanaoss211:/brick5
Brick14: psanaoss211:/brick6
Brick15: psanaoss211:/brick7
Brick16: psanaoss211:/brick8
Brick17: psanaoss212:/brick1
Brick18: psanaoss212:/brick2
Brick19: psanaoss212:/brick3
Brick20: psanaoss212:/brick4
Brick21: psanaoss212:/brick5
Brick22: psanaoss212:/brick6
Brick23: psanaoss212:/brick7
Brick24: psanaoss212:/brick8
Brick25: psanaoss213:/brick1
Brick26: psanaoss213:/brick2
Brick27: psanaoss213:/brick3
Brick28: psanaoss213:/brick4
Brick29: psanaoss213:/brick5
Brick30: psanaoss213:/brick6
Brick31: psanaoss213:/brick7
Brick32: psanaoss213:/brick8
Brick33: psanaoss215:/brick1
Brick34: psanaoss215:/brick2
Brick35: psanaoss215:/brick4
Brick36: psanaoss215:/brick5
Brick37: psanaoss215:/brick7
Brick38: psanaoss215:/brick8
Brick39: psanaoss215:/brick3
Brick40: psanaoss215:/brick6
Options Reconfigured:
performance.io-thread-count: 16
performance.write-behind-window-size: 16MB
performance.cache-size: 1GB
nfs.disable: on
performance.cache-refresh-timeout: 1
network.ping-timeout: 42
performance.cache-max-file-size: 1PB
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users