I also encountered this issue on a cluster yesterday: one CPU got
stuck in an infinite loop in get_obj_data::flush and it stopped
serving requests. I've updated the tracker issue accordingly.
Paul
--
Paul Emmerich
n its own.
I’m going to check with others who’re more familiar with this code path.
Begin forwarded message:
*From:* Vladimir Brik <vladimir.b...@icecube.wisc.edu>
*Subject:* *Re: [ceph-users] radosgw pegging down 5 CPU cores when no
data is being transferred*
*Date:* August 21,
ven though the machine is not being used for data transfers
>> (nothing in radosgw logs, a couple of KB/s of network).
>>
>> This situation can affect any number of our rados gateways, lasts from a few
>> hours to a few days and stops if the radosgw process is restarted or on i
> Are you running multisite?
No
> Do you have dynamic bucket resharding turned on?
Yes. "radosgw-admin reshard list" prints "[]"
> Are you using lifecycle?
I am not sure. How can I check? "radosgw-admin lc list" says "[]"
> And just to be clear -- sometimes all 3 of your rados gateways are
> si
On 8/21/19 10:22 AM, Mark Nelson wrote:
> Hi Vladimir,
>
>
> On 8/21/19 8:54 AM, Vladimir Brik wrote:
>> Hello
>>
[much elided]
> You might want to try grabbing a callgraph from perf instead of just
> running perf top or using my wallclock profiler to see if you can drill
> down and find out
Correction: the number of threads stuck using 100% of a CPU core varies
from 1 to 5 (it's not always 5)
Vlad
On 8/21/19 8:54 AM, Vladimir Brik wrote:
Hello
I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically,
the radosgw process on those machines starts consuming 100% of 5 CPU
On Wed, Aug 21, 2019 at 3:55 PM Vladimir Brik
wrote:
>
> Hello
>
> I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically,
> the radosgw process on those machines starts consuming 100% of 5 CPU cores
> for days at a time, even though the machine is not being used for data
> transfers (
Hi Vladimir,
On 8/21/19 8:54 AM, Vladimir Brik wrote:
Hello
I am running a Ceph 14.2.1 cluster with 3 rados gateways.
Periodically, the radosgw process on those machines starts consuming 100%
of 5 CPU cores for days at a time, even though the machine is not
being used for data transfers (nothin
Hello
I am running a Ceph 14.2.1 cluster with 3 rados gateways. Periodically,
the radosgw process on those machines starts consuming 100% of 5 CPU cores
for days at a time, even though the machine is not being used for data
transfers (nothing in radosgw logs, a couple of KB/s of network).
This sit