Hi Mark,

>> It would be really interesting if you could give jemalloc a try.
I have done a lot of benchmarks with the OSD and jemalloc (--with-jemalloc),
using libjemalloc1 (= 3.5.1-2) on Ubuntu Trusty, mainly with 4k randread and
randwrite, and I haven't seen any problems, hangs, or bugs. The speed was
about the same as with tcmalloc, maybe a little slower, but it's marginal:
for reads I was getting around 250k iops per OSD with jemalloc vs 260k iops
with tcmalloc.
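In case it's useful, this is roughly how I built and ran the benchmarks (a
sketch rather than my exact job files - the pool/image names, client name,
and fio parameters below are placeholders):

  # rebuild the OSD against jemalloc instead of tcmalloc
  ./autogen.sh
  ./configure --with-jemalloc
  make

  # 4k random read against an rbd image, via fio's rbd engine
  # (assumes a pool "rbd" with an image "bench" already exists)
  fio --name=randread-4k --ioengine=rbd --clientname=admin \
      --pool=rbd --rbdname=bench --rw=randread --bs=4k \
      --iodepth=32 --runtime=60 --time_based

  # same job with --rw=randwrite for the write side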
----- Original Message -----
From: "Mark Nelson" <mnel...@redhat.com>
To: "ceph-users" <ceph-users@lists.ceph.com>
Sent: Thursday, 25 June 2015 18:25:26
Subject: Re: [ceph-users] Switching from tcmalloc

It would be really interesting if you could give jemalloc a try. Originally
tcmalloc was used to get around some serious memory fragmentation issues in
the OSD. You can read the original bug tracker entry from 5 years ago here:

http://tracker.ceph.com/issues/138

It's definitely possible that glibc malloc has improved since then. I think
jemalloc is definitely worth considering. It appears to be a little slower
than tcmalloc when tcmalloc is working well, but far more consistent and
likely faster than glibc.

Mark

On 06/24/2015 12:59 PM, Jan Schermer wrote:
> We already had the migratepages in place before we disabled tcmalloc. It
> didn't do much.
>
> Disabling tcmalloc made an immediate difference, but there were still
> spikes and the latency wasn't that great (the CPU usage was, though).
> Migrating memory helped a lot after that - it didn't help (at least not
> visibly on graphs) while tcmalloc was in use; its overhead was so large
> that NUMA didn't matter at all.
> But we are running Dumpling, so it is possible that other bottlenecks
> resolved in later (Giant) releases would once again overshadow the gain
> we got from disabling tcmalloc, or that there would be a regression from
> disabling it.
> ... or our setup/workload is somehow completely different from what
> somebody else has?
>
> Jan
>
>> On 24 Jun 2015, at 19:41, Robert LeBlanc <rob...@leblancnet.us> wrote:
>>
>> From what I understand, you probably got most of your reduction from
>> co-locating your memory on the right NUMA nodes. tcmalloc/jemalloc
>> should be much higher in performance because of how they hold memory in
>> thread pools (less locking to allocate memory), and they try much harder
>> to reuse dirty free pages, so memory stays within the thread, again
>> reducing locking for memory allocations.
>>
>> I would do some more testing along with what Ben Hines mentioned about
>> overall client performance.
>>
>> ----------------
>> Robert LeBlanc
>> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>
>> On Wed, Jun 24, 2015 at 11:25 AM, Jan Schermer wrote:
>> There were essentially three things we had to do for such a drastic drop
>> (rough commands below):
>>
>> 1) recompile Ceph --without-tcmalloc
>> 2) pin the OSDs to a specific NUMA zone - we had this in place for a
>> long time and it really helped
>> 3) migrate the OSD memory to the correct CPU with migratepages
>> - we will use cgroups for this in the future; that should make life
>> easier and is the only correct solution
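>>
>> Roughly what that looks like in practice (a sketch - the PID, node
>> numbers, CPU list, and cpuset mount point are placeholders for our
>> actual values):
>>
>>   # start the OSD with CPU and memory pinned to NUMA node 0
>>   numactl --cpunodebind=0 --membind=0 /usr/bin/ceph-osd -i 0
>>
>>   # move pages of an already-running OSD from node 1 to node 0
>>   migratepages <osd-pid> 1 0
>>
>>   # cpuset-cgroup equivalent: confine the OSD to node 0's CPUs/memory
>>   mkdir /sys/fs/cgroup/cpuset/osd0
>>   echo 0-7 > /sys/fs/cgroup/cpuset/osd0/cpuset.cpus
>>   echo 0 > /sys/fs/cgroup/cpuset/osd0/cpuset.mems
>>   echo <osd-pid> > /sys/fs/cgroup/cpuset/osd0/tasks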
>>
>> It is similar to the effect of just restarting the OSD, but much better.
>> Since we immediately see hundreds of connections on a freshly restarted
>> OSD (and in the benchmark the tcmalloc issue manifested with just two
>> clients in parallel), I'd say we never saw the raw (undegraded)
>> performance with tcmalloc, but it was never this good: consistently low
>> latencies, much smaller spikes when something happens, and much lower
>> CPU usage (about 50% savings, though we're also backfilling a lot in the
>> background). Workloads are faster as well - reweighting OSDs on that
>> same node, for example, was much (hundreds of percent) faster.
>>
>> So far the effect has been drastic. I wonder why tcmalloc was even used
>> when people are having problems with it? The glibc malloc seems to work
>> just fine for us.
>>
>> The only concerning thing is the virtual memory usage - we are over
>> 400GB VSS with a few OSDs. That doesn't hurt anything, though.
>>
>> Jan
>>
>> On 24 Jun 2015, at 18:46, Robert LeBlanc wrote:
>>
>> Did you see what the effect of just restarting the OSDs was, before you
>> dropped tcmalloc? I've noticed that there is usually a good drop for us
>> just by restarting them. I don't think it is usually this drastic.
>>
>> ----------------
>> Robert LeBlanc
>> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>
>> On Wed, Jun 24, 2015 at 2:08 AM, Jan Schermer wrote:
>> Can you guess when we did that?
>> Still on Dumpling, btw...
>>
>> http://www.zviratko.net/link/notcmalloc.png
>>
>> Jan

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com