Hi everyone
We have recently faced some RGW outages where RGW returns HTTP 503: first for
a few requests, then for most, then for all of them, over the course of 1-2
hours. This seems to have started after we updated from 15.2.4 to 15.2.5.
The line that accompanies these outages in the log is the following:
s3:list_bucket Scheduling request failed with -2218
It first pops up a few times here and there, until it eventually applies to all
requests. If I read the code correctly, -2218 is -ERR_RATE_LIMITED, i.e. the
throttler has hit its limit of concurrent requests and is rejecting new ones,
which would also explain the 503s the clients get.
We run a pair of HAProxy instances in front of RGW, which limit the number of
connections to the two RGW instances to 400, so this limit should never be
reached. We do use RGW metadata sync between the instances, which could account
for some extra connections, but looking at the open TCP connections between
them I never count more than 20 at any given time.
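(In case it matters how that limit is applied: it is a plain per-server
connection cap in the HAProxy backend, along the lines of the following sketch.
Addresses and numbers are made up for illustration; our real config differs in
the details.)

    backend rgw
        # illustrative only: maxconn caps the connections HAProxy will open
        # to each RGW instance, so RGW should never see more than ~400 total
        server rgw1 192.0.2.11:7480 check maxconn 200
        server rgw2 192.0.2.12:7480 check maxconn 200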
I also noticed that some requests in the RGW log never seem to complete. That
is, I can find a ‘starting new request’ line, but no associated ‘req done’ or
‘beast’ line.
I don’t think there are any hung connections around, as they are killed by
HAProxy after a short timeout.
Looking at the code, it seems as if the throttler in use (SimpleThrottler)
eventually reaches the maximum count of 1024 outstanding requests (presumably
rgw_max_concurrent_requests, which defaults to 1024) and never recovers. I
suspect that request_complete() is not called in all cases, so
outstanding_requests is never decremented back down, but I am not familiar with
the Ceph codebase, so I am not sure.
See
https://github.com/ceph/ceph/blob/cc17681b478594aa39dd80437256a54e388432f0/src/rgw/rgw_dmclock_async_scheduler.h#L166-L214
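To illustrate the failure mode I have in mind, here is a tiny standalone model
(not the actual Ceph code, just my reading of the linked header condensed into
a compilable example): if request_complete() is skipped for some requests,
outstanding_requests only ever grows, and once it reaches the limit every new
request is rejected with -2218 until the daemon is restarted.

    // Standalone model of the accounting in SimpleThrottler as I understand it;
    // names loosely follow the linked header, but this is NOT the real code.
    #include <atomic>
    #include <cstdint>
    #include <iostream>

    struct ThrottlerModel {
      int64_t max_queue_size = 1024;      // cf. rgw_max_concurrent_requests default
      std::atomic<int64_t> outstanding_requests{0};

      // Called when a request is scheduled; rejects once the limit is reached.
      int schedule_request() {
        if (outstanding_requests.load() >= max_queue_size)
          return -2218;                   // "Scheduling request failed with -2218"
        ++outstanding_requests;
        return 0;
      }

      // Must run exactly once per accepted request; if some code path skips it,
      // the counter never goes back down.
      void request_complete() { --outstanding_requests; }
    };

    int main() {
      ThrottlerModel t;
      int rejected = 0;
      for (int i = 0; i < 1500; ++i) {
        if (t.schedule_request() == -2218)
          ++rejected;
        // request_complete() is deliberately never called here, so the counter
        // "leaks" and everything after the first 1024 requests is rejected
      }
      std::cout << rejected << " of 1500 requests rejected\n";   // prints 476
    }

A leak like that would also fit the ‘starting new request’ lines without a
matching ‘req done’ that I mentioned above.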
Is anyone else seeing the same phenomenon? Could this be a bug in RGW's request
handling, or am I wrong in my assumptions?
For now we’re just restarting our RGWs regularly, which seems to keep the
problem at bay.
Thanks for any hints.
Denis