Thanks for the suggestion. NTP is fine in my case. Turns out it was a networking problem that wasn't triggering error counters on the NICs, so it took a bit to track down.

QH
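For anyone chasing a similar "quiet" network problem, a few generic checks that look past the basic per-NIC error counters may help; the interface name, MTU and peer host below are only examples:

    # driver/queue-level counters that don't always show up as interface errors
    ethtool -S eth0 | grep -iE 'err|drop|crc|miss'
    # kernel-side rx/tx drops and overruns
    ip -s link show eth0
    # catch MTU/fragmentation trouble on the cluster network
    # (8972 = 9000-byte MTU minus IP/ICMP headers; adjust to your MTU)
    ping -M do -s 8972 <other-ceph-node>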
On Thu, Jul 30, 2015 at 4:16 PM, Spillmann, Dieter <dieter.spillm...@arris.com> wrote:

> I saw this behavior when the servers are not in time sync.
> Check your ntp settings
>
> Dieter
>
> From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Quentin Hartman <qhart...@direwolfdigital.com>
> Date: Wednesday, July 29, 2015 at 5:47 PM
> To: Luis Periquito <periqu...@gmail.com>
> Cc: Ceph Users <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] ceph-mon cpu usage
>
> I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, cluster becomes unusably slow).
>
> It seems to be tied to deep scrubbing: the behavior surfaces almost immediately if scrubbing is turned on, but if it is off the behavior eventually returns to normal and stays that way while scrubbing stays off. I have not yet found anything in the cluster to indicate a hardware problem.
>
> Any thoughts or further insights on this subject would be appreciated.
>
> QH
>
> On Sat, Jul 25, 2015 at 12:31 AM, Luis Periquito <periqu...@gmail.com> wrote:
>
>> I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were sending massive amounts of auth requests to the monitors, seeming to overwhelm them.
>>
>> The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd the box and all of the disks, reinstalled and guess what? They are still sending a lot of requests to the MONs... this will require some further investigation.
>>
>> As this is happening during my holidays, I just disabled them, and will investigate further when I get back.
>>
>> On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen <kje...@medallia.com> wrote:
>>
>>> It sounds slightly similar to what I just experienced.
>>>
>>> I had one monitor out of three which seemed to run one core at full tilt continuously, and its virtual address space had grown to the point where top started reporting it in Tb. Requests hitting this monitor did not get very timely responses (although I don't know if this was happening consistently or arbitrarily).
>>>
>>> I ended up re-building the monitor from the two healthy ones I had, which made the problem go away for me.
>>>
>>> After-the-fact inspection of the monitor I ripped out clocked it in at 1.3Gb compared to the 250Mb of the other two; after the rebuild they're all comparable in size.
>>>
>>> In my case this started out on firefly and persisted after upgrading to hammer, which prompted the rebuild, on the suspicion that it was related to "something" persistent for this monitor.
>>>
>>> I don't have much more to contribute to this discussion, since I've more or less destroyed any evidence by re-building the monitor.
>>>
>>> Cheers,
>>> KJ
>>>
>>> On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito <periqu...@gmail.com> wrote:
>>>
>>>> The leveldb is smallish: around 70mb.
>>>>
>>>> I ran debug mon = 10 for a while, but couldn't find any interesting information. I would run out of space quite quickly though, as the log partition only has 10g.
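For reference, the two runtime knobs mentioned in the quoted messages above (pausing deep scrub and raising monitor debug logging) can be toggled on a live cluster without restarting any daemons; something along these lines, using the flag and option names from the firefly/hammer era:

    # stop new deep scrubs cluster-wide, and re-enable them later
    ceph osd set nodeep-scrub
    ceph osd unset nodeep-scrub

    # raise mon debug logging temporarily, then drop it back down
    ceph tell mon.* injectargs '--debug-mon 10'
    ceph tell mon.* injectargs '--debug-mon 1'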
>>>> On 24 Jul 2015 21:13, "Mark Nelson" <mnel...@redhat.com> wrote:
>>>>
>>>>> On 07/24/2015 02:31 PM, Luis Periquito wrote:
>>>>>
>>>>>> Now it's official, I have a weird one!
>>>>>>
>>>>>> I restarted one of the ceph-mons with jemalloc and it didn't make any difference. It's still using a lot of cpu and still not freeing up memory...
>>>>>>
>>>>>> The issue is that the cluster almost stops responding to requests, and if I restart the primary mon (which had almost no memory usage or cpu) the cluster goes back to its merry way, responding to requests.
>>>>>>
>>>>>> Does anyone have any idea what may be going on? The worst bit is that I have several clusters just like this one (well, they are smaller), and as we do everything with puppet they should be very similar... and all the other clusters are working fine, without any issues whatsoever...
>>>>>
>>>>> We've seen cases where leveldb can't compact fast enough and memory balloons, but it's usually associated with extreme CPU usage as well. It would be showing up in perf though if that were the case...
>>>>>
>>>>>> On 24 Jul 2015 10:11, "Jan Schermer" <j...@schermer.cz> wrote:
>>>>>>
>>>>>> You don't (shouldn't) need to rebuild the binary to use jemalloc. It should be possible to do something like
>>>>>>
>>>>>> LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …
>>>>>>
>>>>>> The last time we tried it, it segfaulted after a few minutes, so YMMV and be careful.
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>>> On 23 Jul 2015, at 18:18, Luis Periquito <periqu...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Greg,
>>>>>>>
>>>>>>> I've been looking at the tcmalloc issues, but they seemed to affect only the osd's, and I do notice them in heavy read workloads (even after the patch and after increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is affecting the mon process though.
>>>>>>>
>>>>>>> Looking at perf top, I'm getting most of the CPU usage in mutex lock/unlock:
>>>>>>>   5.02%  libpthread-2.19.so  [.] pthread_mutex_unlock
>>>>>>>   3.82%  libsoftokn3.so      [.] 0x000000000001e7cb
>>>>>>>   3.46%  libpthread-2.19.so  [.] pthread_mutex_lock
>>>>>>>
>>>>>>> I could try to use jemalloc; are you aware of any pre-built binaries? Can I mix binaries with different mallocs in one cluster?
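A low-risk way to test Jan's LD_PRELOAD suggestion against a single monitor, without rebuilding packages, is to stop that one mon and run it by hand in the foreground; a rough sketch, with the library path and mon ID only as examples:

    # stop the packaged mon first (init-system dependent), then:
    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 \
        ceph-mon -f --cluster ceph --id mon01

    # confirm the preload actually took effect
    grep jemalloc /proc/$(pidof ceph-mon)/maps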
>>>>>>> On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum <g...@gregs42.com> wrote:
>>>>>>>
>>>>>>> On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito <periqu...@gmail.com> wrote:
>>>>>>> > The ceph-mon is already taking a lot of memory, and I ran a heap stats
>>>>>>> > ------------------------------------------------
>>>>>>> > MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
>>>>>>> > MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
>>>>>>> > MALLOC: +     16598552 (   15.8 MiB) Bytes in central cache freelist
>>>>>>> > MALLOC: +     14693536 (   14.0 MiB) Bytes in transfer cache freelist
>>>>>>> > MALLOC: +     17441592 (   16.6 MiB) Bytes in thread cache freelists
>>>>>>> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
>>>>>>> > MALLOC: +     26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
>>>>>>> > MALLOC:
>>>>>>> > MALLOC:           5683 Spans in use
>>>>>>> > MALLOC:             21 Thread heaps in use
>>>>>>> > MALLOC:           8192 Tcmalloc page size
>>>>>>> > ------------------------------------------------
>>>>>>> >
>>>>>>> > after that I ran the heap release and it went back to normal.
>>>>>>> > ------------------------------------------------
>>>>>>> > MALLOC:       22919616 (   21.9 MiB) Bytes in use by application
>>>>>>> > MALLOC: +      4792320 (    4.6 MiB) Bytes in page heap freelist
>>>>>>> > MALLOC: +     18743448 (   17.9 MiB) Bytes in central cache freelist
>>>>>>> > MALLOC: +     20645776 (   19.7 MiB) Bytes in transfer cache freelist
>>>>>>> > MALLOC: +     18456088 (   17.6 MiB) Bytes in thread cache freelists
>>>>>>> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =    201945240 (  192.6 MiB) Actual memory used (physical + swap)
>>>>>>> > MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
>>>>>>> > MALLOC:
>>>>>>> > MALLOC:           5639 Spans in use
>>>>>>> > MALLOC:             29 Thread heaps in use
>>>>>>> > MALLOC:           8192 Tcmalloc page size
>>>>>>> > ------------------------------------------------
>>>>>>> >
>>>>>>> > So it just seems the monitor is not returning unused memory into the OS or
>>>>>>> > reusing already allocated memory it deems as free...
>>>>>>>
>>>>>>> Yep. This is a bug (best we can tell) in some versions of tcmalloc
>>>>>>> combined with certain distribution stacks, although I don't think
>>>>>>> we've seen it reported on Trusty (nor on a tcmalloc distribution that
>>>>>>> new) before. Alternatively some folks are seeing tcmalloc use up lots
>>>>>>> of CPU in other scenarios involving memory return and it may manifest
>>>>>>> like this, but I'm not sure. You could look through the mailing list
>>>>>>> for information on it.
>>>>>>> -Greg
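For what it's worth, the heap statistics above come from the daemon's tcmalloc, so the same check (and the "heap release" workaround Luis used) can be driven through the ceph CLI on builds linked against tcmalloc; roughly, with the mon ID as an example:

    # dump tcmalloc heap statistics for one monitor
    ceph tell mon.mon01 heap stats

    # ask tcmalloc to return freelist pages to the OS
    ceph tell mon.mon01 heap release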
>>>
>>> --
>>> Kjetil Joergensen <kje...@medallia.com>
>>> Operations Engineer, Medallia Inc
>>> Phone: +1 (650) 739-6580
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com