I have no experience with perf and the package is not installed.
I will take a look at it, thanks.

Jan


> On 11 Jun 2015, at 13:48, Dan van der Ster <[email protected]> wrote:
> 
> Hi Jan,
> 
> Can you get perf top running? It should show you where the OSDs are 
> spinning...
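
For reference, a minimal sketch of attaching perf top to a single OSD process (assuming the perf package is installed, e.g. `perf` or `linux-tools` on most distros, and that a ceph-osd process is running on the host; the guard just skips cleanly when it is not):

```shell
# Find the PID of one ceph-osd process; on a host with several OSDs you
# would instead pick the hot one from `top` output.
PID=$(pidof -s ceph-osd 2>/dev/null)

if [ -n "$PID" ]; then
    # -p attaches to that single PID; -g collects call graphs so you can
    # see which call chains the OSD is spinning in, not just leaf symbols.
    sudo perf top -g -p "$PID"
else
    echo "ceph-osd not running on this host"
fi
```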
> 
> Cheers, Dan
> 
> On Thu, Jun 11, 2015 at 11:21 AM, Jan Schermer <[email protected]> wrote:
>> Hi,
>> hoping someone can point me in the right direction.
>> 
>> Some of my OSDs have a larger CPU usage (and ops latencies) than others. If 
>> I restart the OSD everything runs nicely for some time, then it creeps up.
>> 
>> 1) most of my OSDs have ~40% CPU (core) usage (user+sys), some are closer to 
>> 80%. Restarting means the offending OSDs only use 40% again.
>> 2) average latencies and CPU usage on the host are the same - so it’s not 
>> caused by the host that the OSD is running on
>> 3) I can’t say exactly when or how the issue happens, and I can’t even say 
>> whether it’s always the same OSDs. It seems to either happen when something 
>> heavy happens in the cluster (like dropping very old snapshots, or 
>> rebalancing) and then never go away, or it creeps up slowly over time and I 
>> can’t find it in the graphs. Looking at the graphs, it seems to be the former.
>> 
>> I have just one suspicion, and that is the “fd cache size” - we have it set 
>> to 16384, but the number of open fds suggests there are more open files for 
>> the osd process (over 17K fds) - it varies by some hundreds between the 
>> osds. Maybe some are just slightly over the limit and the cache misses 
>> cause this? Restarting the OSD clears them (down to ~2K) and they increase 
>> over time. I increased the limit to 32768 yesterday and it has been 
>> consistently nice since, but it might take another few days to manifest…
>> Could this explain it? Any other tips?
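
To check whether an OSD has outgrown its fd cache, you can count its open descriptors via /proc and compare against the configured limit. A minimal sketch (demonstrated here on the shell's own PID, `$$`; on a Ceph host you would substitute the ceph-osd PID, e.g. `PID=$(pidof -s ceph-osd)`, and query the limit with `ceph daemon osd.0 config get filestore_fd_cache_size` - the osd id is a placeholder):

```shell
# Demo on this shell's own PID; replace with the ceph-osd PID on a real host.
PID=$$

# Every open file descriptor of a process appears as an entry under
# /proc/<pid>/fd, so counting the entries gives the current open-fd count.
NFDS=$(ls /proc/"$PID"/fd | wc -l)

echo "process $PID has $NFDS open fds"
```

If the count sits persistently above the configured cache size, the cache is thrashing on every miss, which would fit the slowly-creeping CPU pattern described above.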
>> 
>> Thanks
>> 
>> Jan
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
