TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES gets applied, or at least seems to, but it makes no positive difference. This is on a CentOS 6-ish distro. I can't really upgrade anything easily because of support constraints, and we still run 0.67.12 in production, so that's a no-go. I know upgrading to Giant is the best way to get more performance, but we're not ready for that yet either (working on it, though :))

I'd expect the tcmalloc issue to manifest almost immediately, wouldn't it? There are thousands of threads and hundreds of connections - surely it would show up sooner? People were seeing regressions with just two clients in benchmarks, so I'd have thought we are operating with a b0rked thread cache constantly…
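For anyone trying the same thing, a minimal sketch of how we set the variable before starting an OSD (the 128MB value is just an illustration; in this thread we tried 8MB up to 512MB against a 16MB default, and how you wire it into the init script or /etc/sysconfig is an assumption that varies by distro):

```shell
# Hypothetical wrapper snippet: raise tcmalloc's total thread-cache
# ceiling for the OSD process it launches. Note that the tcmalloc
# shipped with older distros may ignore this variable entirely.
TCMALLOC_CACHE_MB=128
export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=$((TCMALLOC_CACHE_MB * 1024 * 1024))
echo "$TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES"
# then exec the OSD from the same environment, e.g.:
#   exec /usr/bin/ceph-osd -i <osd-id> ...
```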
For the record, preloading jemalloc ends with a SIGSEGV within a few minutes, if anybody wanted to know… :)

Jan

> On 11 Jun 2015, at 21:14, Somnath Roy <[email protected]> wrote:
>
> Yeah! Then it is the tcmalloc issue.
> If you are using the version that comes with the OS, the
> TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES won't do anything.
> Try building the latest tcmalloc, set the env variable, and see if it
> improves things or not.
> Also, you can try a latest Ceph build with jemalloc enabled if you have a
> test cluster.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Jan Schermer [mailto:[email protected]]
> Sent: Thursday, June 11, 2015 12:10 PM
> To: Somnath Roy
> Cc: Dan van der Ster; [email protected]
> Subject: Re: [ceph-users] Restarting OSD leads to lower CPU usage
>
> Hi,
> I looked at it briefly before leaving; tcmalloc was at the top. I can provide
> a full listing tomorrow if it helps.
>
> 12.80%  libtcmalloc.so.4.1.0  [.] tcmalloc::CentralFreeList::FetchFromSpans()
>  8.40%  libtcmalloc.so.4.1.0  [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
>  7.40%  [kernel]              [k] futex_wake
>  6.36%  libtcmalloc.so.4.1.0  [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>  6.09%  [kernel]              [k] futex_requeue
>
> Not much else to see. We tried setting the venerable
> TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, but it only made things much, much
> worse (the default is 16MB; we tried 8MB and up to 512MB, and it was unusably
> slow immediately after start). We haven't tried upgrading tcmalloc, though...
>
> We only use Ceph for RBD with OpenStack; the block size is the default (4MB).
> I tested different block sizes previously, and I got the best results from
> 8MB blocks (and I was benchmarking 4K random direct/sync writes) - strange, I
> think…
>
> I increased fdcache to 120000 (which should be enough for all objects on the
> OSD), and I will compare how it behaves tomorrow.
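For anyone who wants to reproduce the measurement: `perf top -p <osd-pid> --stdio` produces a listing like the one quoted above, and a quick (hypothetical) one-liner can total the tcmalloc share from a saved report. The sample lines below are pasted from this thread:

```shell
# Save a perf-top-style report (here: the figures from this thread),
# then sum the overhead of all libtcmalloc symbols with awk.
cat > perf.txt <<'EOF'
12.80% libtcmalloc.so.4.1.0 [.] tcmalloc::CentralFreeList::FetchFromSpans()
8.40% libtcmalloc.so.4.1.0 [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
7.40% [kernel] [k] futex_wake
6.36% libtcmalloc.so.4.1.0 [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
6.09% [kernel] [k] futex_requeue
EOF
# awk coerces "12.80%" to 12.80, so summing column 1 works directly.
awk '/libtcmalloc/ { sum += $1 } END { printf "%.2f%%\n", sum }' perf.txt
```

On these numbers that is over a quarter of the samples spent inside the allocator, which matches the "tcmalloc issue" diagnosis below.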
>
> Thanks a lot
>
> Jan
>
>> On 11 Jun 2015, at 20:59, Somnath Roy <[email protected]> wrote:
>>
>> Yeah, perf top will help you a lot..
>>
>> Some guesses:
>>
>> 1. If your block size is in the small 4-16K range, most probably you are
>> hitting the tcmalloc issue. 'perf top' will show a lot of tcmalloc traces
>> in that case.
>>
>> 2. fdcache should save you some CPU, but I don't see how it would be that
>> significant.
>>
>> Thanks & Regards
>> Somnath
>>
>> -----Original Message-----
>> From: ceph-users [mailto:[email protected]] On Behalf
>> Of Jan Schermer
>> Sent: Thursday, June 11, 2015 5:57 AM
>> To: Dan van der Ster
>> Cc: [email protected]
>> Subject: Re: [ceph-users] Restarting OSD leads to lower CPU usage
>>
>> I have no experience with perf, and the package is not installed.
>> I will take a look at it, thanks.
>>
>> Jan
>>
>>
>>> On 11 Jun 2015, at 13:48, Dan van der Ster <[email protected]> wrote:
>>>
>>> Hi Jan,
>>>
>>> Can you get perf top running? It should show you where the OSDs are
>>> spinning...
>>>
>>> Cheers, Dan
>>>
>>> On Thu, Jun 11, 2015 at 11:21 AM, Jan Schermer <[email protected]> wrote:
>>>> Hi,
>>>> hoping someone can point me in the right direction.
>>>>
>>>> Some of my OSDs have higher CPU usage (and op latencies) than others.
>>>> If I restart the OSD, everything runs nicely for some time, then it
>>>> creeps up again.
>>>>
>>>> 1) Most of my OSDs have ~40% CPU (core) usage (user+sys); some are
>>>> closer to 80%. After a restart, the offending OSDs only use 40% again.
>>>> 2) Average latencies and CPU usage on the host are the same - so it's
>>>> not caused by the host that the OSD is running on.
>>>> 3) I can't say exactly when or how the issue happens. I can't even say
>>>> if it's always the same OSDs. It seems it either happens when something
>>>> heavy happens in the cluster (like dropping very old snapshots, or
>>>> rebalancing) and then doesn't come back, or maybe it happens slowly over
>>>> time and I can't find it in the graphs.
>>>> Looking at the graphs, it seems to be the former.
>>>>
>>>> I have just one suspicion, and that is the "fd cache size" - we have it
>>>> set to 16384, but the open fds suggest there are more open files for the
>>>> osd process (over 17K fds) - it varies by some hundreds between the OSDs.
>>>> Maybe some are just slightly over the limit and the misses cause this?
>>>> Restarting the OSD clears them (down to ~2K), and they increase over
>>>> time. I increased it to 32768 yesterday and it is consistently nice now,
>>>> but it might take another few days to manifest… Could this explain it?
>>>> Any other tips?
>>>>
>>>> Thanks
>>>>
>>>> Jan
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> [email protected]
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
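The fd-cache suspicion above is easy to check from the shell. A sketch, assuming a Linux /proc filesystem and the 16384 limit from this thread (the function name is made up for illustration):

```shell
# Count each ceph-osd's open file descriptors and flag any process
# that is over the configured fd cache size.
count_fds() { ls "/proc/$1/fd" 2>/dev/null | wc -l; }

FD_CACHE_SIZE=16384
for pid in $(pgrep -x ceph-osd); do
    n=$(count_fds "$pid")
    if [ "$n" -gt "$FD_CACHE_SIZE" ]; then
        echo "osd pid $pid: $n open fds (over the fd cache limit)"
    fi
done
```

An OSD consistently over the limit would be churning cache misses, which is the behaviour described above.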
