For about a week we've been seeing radosgw abort with "buffer overflow
detected" on all the RGW nodes in one of our clusters (ten times on
p3cephrgw003 alone in the twelve hours logged below). This started a
day after we began weighting in some new OSD nodes, so we think it's
probably related. Could someone help us determine the root cause?
Cluster details:
Distro: CentOS 7.2
Release: 0.94.10-0.el7.x86_64
OSDs: 1120
RGW nodes: 10
The log messages are below. If you know how to get a more complete
call trace (several frames have no symbol names), I'd like to hear
that too. I tried installing the
ceph-debuginfo-0.94.10-0.el7.x86_64 package, but it didn't seem to
help.
Thanks,
Bryan
# From /var/log/messages:
Sep 7 20:06:11 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 7 21:01:55 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 7 21:37:00 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 7 23:14:54 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 7 23:17:08 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 8 00:12:39 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 8 07:04:07 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 8 07:17:49 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 8 07:41:39 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
Sep 8 07:59:29 p3cephrgw003 radosgw: *** buffer overflow detected ***: /bin/radosgw terminated
# From /var/log/ceph/client.radosgw.p3cephrgw003.log:
0> 2017-09-08 07:59:29.696615 7f7b296a2700 -1 *** Caught signal (Aborted) **
in thread 7f7b296a2700
ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
1: /bin/radosgw() [0x6d3d92]
2: (()+0xf100) [0x7f7f425e9100]
3: (gsignal()+0x37) [0x7f7f4141d5f7]
4: (abort()+0x148) [0x7f7f4141ece8]
5: (()+0x75317) [0x7f7f4145d317]
6: (__fortify_fail()+0x37) [0x7f7f414f5ac7]
7: (()+0x10bc80) [0x7f7f414f3c80]
8: (()+0x10da37) [0x7f7f414f5a37]
9: (OS_Accept()+0xc1) [0x7f7f435bd8b1]
10: (FCGX_Accept_r()+0x9c) [0x7f7f435bb91c]
11: (RGWFCGXProcess::run()+0x7bf) [0x58136f]
12: (RGWProcessControlThread::entry()+0xe) [0x5821fe]
13: (()+0x7dc5) [0x7f7f425e1dc5]
14: (clone()+0x6d) [0x7f7f414de21d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
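
In case it helps anyone looking at the unnamed frames: below is a rough Python sketch (not part of Ceph; the name `group_unnamed_frames` is made up here) that groups them by inferred load address. It assumes the offset in a `(()+0x...)` frame is relative to the start of the shared object containing it, so that address minus offset gives that object's base. The trace lines are copied from the log above.

```python
import re
from collections import defaultdict

# Matches frames that have no symbol name, e.g.
#   "5: (()+0x75317) [0x7f7f4145d317]"
# Named frames such as "3: (gsignal()+0x37) [...]" are skipped, because their
# offset is relative to the symbol rather than to the module.
UNNAMED_FRAME = re.compile(
    r"^\s*(\d+): \(\(\)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]"
)


def group_unnamed_frames(trace_lines):
    """Return {inferred module base: [(frame number, offset), ...]}.

    Assumption: for an unnamed frame, address - offset is the load
    address of the shared object the frame belongs to.
    """
    modules = defaultdict(list)
    for line in trace_lines:
        m = UNNAMED_FRAME.match(line)
        if m is None:
            continue
        frame = int(m.group(1))
        offset = int(m.group(2), 16)
        addr = int(m.group(3), 16)
        modules[addr - offset].append((frame, offset))
    return dict(modules)


# Frames copied verbatim from the radosgw log above.
TRACE = """\
1: /bin/radosgw() [0x6d3d92]
2: (()+0xf100) [0x7f7f425e9100]
3: (gsignal()+0x37) [0x7f7f4141d5f7]
4: (abort()+0x148) [0x7f7f4141ece8]
5: (()+0x75317) [0x7f7f4145d317]
6: (__fortify_fail()+0x37) [0x7f7f414f5ac7]
7: (()+0x10bc80) [0x7f7f414f3c80]
8: (()+0x10da37) [0x7f7f414f5a37]
9: (OS_Accept()+0xc1) [0x7f7f435bd8b1]
10: (FCGX_Accept_r()+0x9c) [0x7f7f435bb91c]
11: (RGWFCGXProcess::run()+0x7bf) [0x58136f]
12: (RGWProcessControlThread::entry()+0xe) [0x5821fe]
13: (()+0x7dc5) [0x7f7f425e1dc5]
14: (clone()+0x6d) [0x7f7f414de21d]
"""

if __name__ == "__main__":
    for base, frames in sorted(group_unnamed_frames(TRACE.splitlines()).items()):
        print(f"module loaded at {base:#x}:")
        for frame, offset in frames:
            print(f"  frame {frame}: offset {offset:#x}")
```

Under that assumption, frames 5, 7 and 8 fall in one module and frames 2 and 13 in another, and neither is /bin/radosgw itself. That may be why ceph-debuginfo made no difference: the offsets would need to be resolved against whichever libraries those are (for example with `addr2line -f -C -e <library> <offset>`), using debug symbols for those libraries.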