TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES gets applied, or at least seems to, but it makes no positive difference. This is on a CentOS 6-ish distro. I can't really upgrade anything easily because of support constraints, and we still run 0.67.12 in production, so that's a no-go. I know upgrading to Giant is the best way to get more performance, but we're not ready for that yet either (working on it, though :))

I'd expect the tcmalloc issue to manifest almost immediately, wouldn't it? There are thousands of threads and hundreds of connections - surely it would show up sooner? People were seeing regressions with just two clients in benchmarks, so I'd have thought we are operating with a b0rked thread cache constantly…
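For anyone trying the same thing, a minimal sketch of how we set the variable before starting an OSD (the 128MB value is just an illustration; in this thread we tried 8MB up to 512MB against a 16MB default, and how you wire it into the init script or /etc/sysconfig is an assumption that varies by distro):

```shell
# Hypothetical wrapper snippet: raise tcmalloc's total thread-cache
# ceiling for the OSD process it launches. Note that the tcmalloc
# shipped with older distros may ignore this variable entirely.
TCMALLOC_CACHE_MB=128
export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=$((TCMALLOC_CACHE_MB * 1024 * 1024))
echo "$TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES"
# then exec the OSD from the same environment, e.g.:
#   exec /usr/bin/ceph-osd -i <osd-id> ...
```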
For the record, preloading jemalloc ends with a SIGSEGV within a few minutes, if anybody wanted to know… :)

Jan

> On 11 Jun 2015, at 21:14, Somnath Roy <[email protected]> wrote:
>
> Yeah! Then it is the tcmalloc issue.
> If you are using the version that comes with the OS, the
> TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES won't do anything.
> Try building the latest tcmalloc, set the env variable, and see if it
> improves things or not.
> Also, you can try a latest Ceph build with jemalloc enabled if you have a
> test cluster.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Jan Schermer [mailto:[email protected]]
> Sent: Thursday, June 11, 2015 12:10 PM
> To: Somnath Roy
> Cc: Dan van der Ster; [email protected]
> Subject: Re: [ceph-users] Restarting OSD leads to lower CPU usage
>
> Hi,
> I looked at it briefly before leaving; tcmalloc was at the top. I can provide
> a full listing tomorrow if it helps.
>
> 12.80%  libtcmalloc.so.4.1.0  [.] tcmalloc::CentralFreeList::FetchFromSpans()
>  8.40%  libtcmalloc.so.4.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
>  7.40%  [kernel]              [k] futex_wake
>  6.36%  libtcmalloc.so.4.1.0  [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>  6.09%  [kernel]              [k] futex_requeue
>
> Not much else to see. We tried setting the venerable
> TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, but it only made things much, much
> worse (the default is 16MB; we tried 8MB and up to 512MB, and it was unusably
> slow immediately after start). We haven't tried upgrading tcmalloc, though...
>
> We only use Ceph for RBD with OpenStack; the block size is the default (4MB).
> I tested different block sizes previously, and I got the best results from
> 8MB blocks (and I was benchmarking 4K random direct/sync writes) - strange, I
> think…
>
> I increased fdcache to 120000 (which should be enough for all objects on the
> OSD), and I will compare how it behaves tomorrow.
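For anyone who wants to reproduce the measurement: `perf top -p <osd-pid> --stdio` produces a listing like the one quoted above, and a quick (hypothetical) one-liner can total the tcmalloc share from a saved report. The sample lines below are pasted from this thread:

```shell
# Save a perf-top-style report (here: the figures from this thread),
# then sum the overhead of all libtcmalloc symbols with awk.
cat > perf.txt <<'EOF'
12.80% libtcmalloc.so.4.1.0 [.] tcmalloc::CentralFreeList::FetchFromSpans()
8.40% libtcmalloc.so.4.1.0 [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
7.40% [kernel] [k] futex_wake
6.36% libtcmalloc.so.4.1.0 [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
6.09% [kernel] [k] futex_requeue
EOF
# awk coerces "12.80%" to 12.80, so summing column 1 works directly.
awk '/libtcmalloc/ { sum += $1 } END { printf "%.2f%%\n", sum }' perf.txt
```

On these numbers that is over a quarter of the samples spent inside the allocator, which matches the "tcmalloc issue" diagnosis below.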
>
> Thanks a lot
>
> Jan
>
>> On 11 Jun 2015, at 20:59, Somnath Roy <[email protected]> wrote:
>>
>> Yeah, perf top will help you a lot..
>>
>> Some guesses:
>>
>> 1. If your block size is in the small 4-16K range, most probably you are
>> hitting the tcmalloc issue. 'perf top' will show a lot of tcmalloc traces
>> in that case.
>>
>> 2. fdcache should save you some CPU, but I don't see how it would be that
>> significant.
>>
>> Thanks & Regards
>> Somnath
>>
>> -----Original Message-----
>> From: ceph-users [mailto:[email protected]] On Behalf
>> Of Jan Schermer
>> Sent: Thursday, June 11, 2015 5:57 AM
>> To: Dan van der Ster
>> Cc: [email protected]
>> Subject: Re: [ceph-users] Restarting OSD leads to lower CPU usage
>>
>> I have no experience with perf, and the package is not installed.
>> I will take a look at it, thanks.
>>
>> Jan
>>
>>
>>> On 11 Jun 2015, at 13:48, Dan van der Ster <[email protected]> wrote:
>>>
>>> Hi Jan,
>>>
>>> Can you get perf top running? It should show you where the OSDs are
>>> spinning...
>>>
>>> Cheers, Dan
>>>
>>> On Thu, Jun 11, 2015 at 11:21 AM, Jan Schermer <[email protected]> wrote:
>>>> Hi,
>>>> hoping someone can point me in the right direction.
>>>>
>>>> Some of my OSDs have higher CPU usage (and op latencies) than others.
>>>> If I restart the OSD, everything runs nicely for some time, then it
>>>> creeps up again.
>>>>
>>>> 1) Most of my OSDs have ~40% CPU (core) usage (user+sys); some are
>>>> closer to 80%. After a restart, the offending OSDs only use 40% again.
>>>> 2) Average latencies and CPU usage on the host are the same - so it's
>>>> not caused by the host that the OSD is running on.
>>>> 3) I can't say exactly when or how the issue happens. I can't even say
>>>> if it's always the same OSDs. It seems it either happens when something
>>>> heavy happens in the cluster (like dropping very old snapshots, or
>>>> rebalancing) and then doesn't come back, or maybe it happens slowly over
>>>> time and I can't find it in the graphs.
>>>> Looking at the graphs, it seems to be the former.
>>>>
>>>> I have just one suspicion, and that is the "fd cache size" - we have it
>>>> set to 16384, but the open fds suggest there are more open files for the
>>>> osd process (over 17K fds) - it varies by some hundreds between the OSDs.
>>>> Maybe some are just slightly over the limit and the misses cause this?
>>>> Restarting the OSD clears them (down to ~2K), and they increase over
>>>> time. I increased it to 32768 yesterday and it is consistently nice now,
>>>> but it might take another few days to manifest… Could this explain it?
>>>> Any other tips?
>>>>
>>>> Thanks
>>>>
>>>> Jan
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> [email protected]
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
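The fd-cache suspicion above is easy to check from the shell. A sketch, assuming a Linux /proc filesystem and the 16384 limit from this thread (the function name is made up for illustration):

```shell
# Count each ceph-osd's open file descriptors and flag any process
# that is over the configured fd cache size.
count_fds() { ls "/proc/$1/fd" 2>/dev/null | wc -l; }

FD_CACHE_SIZE=16384
for pid in $(pgrep -x ceph-osd); do
    n=$(count_fds "$pid")
    if [ "$n" -gt "$FD_CACHE_SIZE" ]; then
        echo "osd pid $pid: $n open fds (over the fd cache limit)"
    fi
done
```

An OSD consistently over the limit would be churning cache misses, which is the behaviour described above.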
