Hi, GlusterFS expert,
We recently encountered a coredump of the client process “glusterfs”, and
it can be reproduced more easily when the I/O load and CPU/memory load are high
during stability testing.
Our GlusterFS release is 3.12.2.
I have copied the call trace of the core dump below, and I have some questions.
I hope you can help.
1) Have you encountered a related issue? From the call trace, we can see that
the fd variable looks abnormal in its “refcount” and “inode” fields, and
wb_inode->this holds the invalid value 0xff00. Is the value “0xff00”
meaningful in some way? In every coredump we have collected, the value of
inode->this is the same: 0xff00.
2) When checking the source code, I found that wb_enqueue_common calls
__wb_request_unref instead of wb_request_unref, and that wb_request_unref,
although defined, is never used. Firstly, this seems strange; secondly,
wb_request_unref takes a lock to avoid race conditions, while
__wb_request_unref does not, and __wb_request_unref is called from many more
places. Could this lead to a race condition? (My understanding of the locking
convention is sketched below.)
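
For reference, here is my reading of that convention, as a minimal sketch only.
The names below are made up for illustration: sketch_inode_t / sketch_request_t
stand in for wb_inode_t / wb_request_t, and a plain pthread mutex stands in for
the GlusterFS LOCK()/UNLOCK() macros. The double-underscore variant assumes the
caller already holds the inode lock, and the plain variant is just a locked
wrapper around it.

/* Minimal sketch of the locking convention as I understand it.
 * sketch_inode_t / sketch_request_t and the pthread mutex are simplified
 * stand-ins, not the real GlusterFS structures or macros. */
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    pthread_mutex_t lock;            /* plays the role of wb_inode->lock */
} sketch_inode_t;

typedef struct {
    sketch_inode_t *inode;
    int             refcount;
} sketch_request_t;

/* Double-underscore variant: the caller MUST already hold inode->lock,
 * which is how I read __wb_request_unref. */
static void
__sketch_request_unref (sketch_request_t *req)
{
    req->refcount--;
    if (req->refcount == 0)
        free (req);                  /* the real code also unlinks from lists */
}

/* Plain variant: a locked wrapper, safe to call without holding the lock,
 * which is how I read wb_request_unref. */
static void
sketch_request_unref (sketch_request_t *req)
{
    sketch_inode_t *inode = req->inode;

    pthread_mutex_lock (&inode->lock);
    {
        __sketch_request_unref (req);
    }
    pthread_mutex_unlock (&inode->lock);
}

If that reading is correct, the real question is whether every caller of
__wb_request_unref in write-behind.c is already inside LOCK (&wb_inode->lock);
if even one call path is not, the refcount update could race, and such a race
might explain the corrupted fd and inode values in the dump below.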
[Current thread is 1 (Thread 0x7f54e82a3700 (LWP 6078))]
(gdb) bt
#0 0x7f54e623197c in wb_fulfill (wb_inode=0x7f54d4066bd0,
liabilities=0x7f54d0824440) at write-behind.c:1155
#1 0x7f54e6233662 in wb_process_queue (wb_inode=0x7f54d4066bd0) at
write-behind.c:1728
#2 0x7f54e6234039 in wb_writev (frame=0x7f54d406d6c0, this=0x7f54e0014b10,
fd=0x7f54d8019d70, vector=0x7f54d0018000, count=1, offset=33554431,
flags=32770, iobref=0x7f54d021ec20, xdata=0x0)
at write-behind.c:1842
#3 0x7f54e6026fcb in du_writev_resume (ret=0, frame=0x7f54d0002260,
opaque=0x7f54d0002260) at disk-usage.c:490
#4 0x7f54ece07160 in synctask_wrap () at syncop.c:377
#5 0x7f54eb3a2660 in ?? () from /lib64/libc.so.6
#6 0x in ?? ()
(gdb) p wb_inode
$6 = (wb_inode_t *) 0x7f54d4066bd0
(gdb) p wb_inode->this
$1 = (xlator_t *) 0xff00
(gdb) frame 1
#1 0x7f54e6233662 in wb_process_queue (wb_inode=0x7f54d4066bd0) at
write-behind.c:1728
1728 in write-behind.c
(gdb) p wind_failure
$2 = 0
(gdb) p *wb_inode
$3 = {window_conf = 35840637416824320, window_current = 35840643167805440,
transit = 35839681019027968, all = {next = 0xb000, prev = 0x7f54d4066bd000},
todo = {next = 0x7f54deadc0de00,
prev = 0x7f54e00489e000}, liability = {next = 0x7f5400a200, prev =
0xb000}, temptation = {next = 0x7f54d4066bd000, prev = 0x7f54deadc0de00}, wip =
{next = 0x7f54e00489e000, prev = 0x7f5400a200},
gen = 45056, size = 35840591659782144, lock = {spinlock = 0, mutex = {__data
= {__lock = 0, __count = 8344798, __owner = 0, __nusers = 8344799, __kind =
41472, __spins = 21504, __elision = 127, __list = {
__prev = 0xb000, __next = 0x7f54d4066bd000}},
__size =
"\000\000\000\000\336T\177\000\000\000\000\000\337T\177\000\000\242\000\000\000T\177\000\000\260\000\000\000\000\000\000\000\320k\006\324T\177",
__align = 35840634501726208}},
this = 0xff00, dontsync = -1}
(gdb) frame 2
#2 0x7f54e6234039 in wb_writev (frame=0x7f54d406d6c0, this=0x7f54e0014b10,
fd=0x7f54d8019d70, vector=0x7f54d0018000, count=1, offset=33554431,
flags=32770, iobref=0x7f54d021ec20, xdata=0x0)
at write-behind.c:1842
1842 in write-behind.c
(gdb) p fd
$4 = (fd_t *) 0x7f54d8019d70
(gdb) p *fd
$5 = {pid = 140002378149040, flags = -670836240, refcount = 32596, inode_list =
{next = 0x7f54d8019d80, prev = 0x7f54d8019d80}, inode = 0x0, lock = {spinlock =
-536740032, mutex = {__data = {
__lock = -536740032, __count = 32596, __owner = -453505333, __nusers =
32596, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next =
0x0}},
__size = "@\377\001\340T\177\000\000\313\016\370\344T\177", '\000'
, __align = 140002512207680}}, _ctx = 0x, xl_count =
0, lk_ctx = 0x0, anonymous = (unknown: 3623984496)}
(gdb)
Thanks & Best Regards,
George
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel