Re: [ceph-users] radosgw pegging down 5 CPU cores when no data is being transferred

Vladimir Brik Wed, 21 Aug 2019 08:00:53 -0700

Correction: the number of threads stuck using 100% of a CPU core varies from 1 to 5 (it's not always 5).


Vlad


On 8/21/19 8:54 AM, Vladimir Brik wrote:

Hello
I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically, the radosgw process on those machines starts consuming 100% of 5 CPU cores for days at a time, even though the machine is not being used for data transfers (nothing in the radosgw logs, a couple of KB/s of network traffic).
This situation can affect any number of our rados gateways, lasts from a few hours to a few days, and stops either when the radosgw process is restarted or on its own.
Does anybody have an idea what might be going on or how to debug it? I don't see anything obvious in the logs. Perf top says that the CPU time is consumed in the radosgw shared object, in the symbol get_obj_data::flush, which, if I interpret things correctly, is called from a symbol with a long name that contains the substring "boost9intrusive9list_impl".
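For anyone trying to narrow down which radosgw threads are the ones spinning, here is a rough sketch in Python (not something from the Ceph tooling). It assumes a Linux host, reads utime and stime from /proc/<pid>/task/<tid>/stat twice, and reports per-thread CPU usage over the interval. The function names (parse_stat_line, read_thread_ticks, busiest_threads, sample) are made up for illustration.

```python
import os
import time


def parse_stat_line(line):
    """Return (tid, comm, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    left, _, right = line.rpartition(")")
    tid_str, _, comm = left.partition("(")
    fields = right.split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, i.e. indexes 11 and 12 here.
    return int(tid_str), comm, int(fields[11]) + int(fields[12])


def read_thread_ticks(pid):
    """Map tid -> (comm, cpu_ticks) for every thread of pid."""
    ticks = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                t, comm, cpu = parse_stat_line(f.read())
        except OSError:
            continue  # thread exited between listdir and open
        ticks[t] = (comm, cpu)
    return ticks


def busiest_threads(before, after, interval, hz):
    """Return [(tid, comm, percent_of_one_core)], busiest first."""
    usage = []
    for tid, (comm, cpu_after) in after.items():
        if tid not in before:
            continue  # thread started during the interval; skip it
        delta = cpu_after - before[tid][1]
        usage.append((tid, comm, 100.0 * delta / (hz * interval)))
    return sorted(usage, key=lambda row: row[2], reverse=True)


def sample(pid, interval=5.0, top=10):
    """Sample a live process twice and return its busiest threads."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    before = read_thread_ticks(pid)
    time.sleep(interval)
    after = read_thread_ticks(pid)
    return busiest_threads(before, after, interval, hz)[:top]
```

Calling sample(pid) with the radosgw PID should list the threads sitting near 100%. The thread IDs it returns are the same LWP numbers that gdb shows in "info threads", so they can be used to pick which threads to get a backtrace from.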
This is our configuration:
rgw_frontends = civetweb num_threads=5000 port=443s ssl_certificate=/etc/ceph/rgw.crt error_log_file=/var/log/ceph/civetweb.error.log
(error log file doesn't exist)


Thanks,

Vlad
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
