Thanks for the suggestion. NTP is fine in my case. Turns out it was a networking problem that wasn't triggering error counters on the NICs, so it took a bit to track down.

QH
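For anyone chasing a similar "quiet" network problem, a few generic checks that look past the basic per-NIC error counters may help; the interface name, MTU and peer host below are only examples:

    # driver/queue-level counters that don't always show up as interface errors
    ethtool -S eth0 | grep -iE 'err|drop|crc|miss'
    # kernel-side rx/tx drops and overruns
    ip -s link show eth0
    # catch MTU/fragmentation trouble on the cluster network
    # (8972 = 9000-byte MTU minus IP/ICMP headers; adjust to your MTU)
    ping -M do -s 8972 <other-ceph-node>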
On Thu, Jul 30, 2015 at 4:16 PM, Spillmann, Dieter <dieter.spillm...@arris.com> wrote:

> I saw this behavior when the servers are not in time sync.
> Check your ntp settings
>
> Dieter
>
> From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Quentin Hartman <qhart...@direwolfdigital.com>
> Date: Wednesday, July 29, 2015 at 5:47 PM
> To: Luis Periquito <periqu...@gmail.com>
> Cc: Ceph Users <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] ceph-mon cpu usage
>
> I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, cluster becomes unusably slow).
>
> It seems to be tied to deep scrubbing: the behavior surfaces almost immediately if scrubbing is turned on, but if it is off the behavior eventually returns to normal and stays that way while scrubbing stays off. I have not yet found anything in the cluster to indicate a hardware problem.
>
> Any thoughts or further insights on this subject would be appreciated.
>
> QH
>
> On Sat, Jul 25, 2015 at 12:31 AM, Luis Periquito <periqu...@gmail.com> wrote:
>
>> I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were sending massive amounts of auth requests to the monitors, seeming to overwhelm them.
>>
>> The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd the box and all of the disks, reinstalled and guess what? They are still sending a lot of requests to the MONs... this will require some further investigation.
>>
>> As this is happening during my holidays, I just disabled them, and will investigate further when I get back.
>>
>> On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen <kje...@medallia.com> wrote:
>>
>>> It sounds slightly similar to what I just experienced.
>>>
>>> I had one monitor out of three which seemed to run one core at full tilt continuously, and its virtual address space had grown to the point where top started reporting it in Tb. Requests hitting this monitor did not get very timely responses (although I don't know if this was happening consistently or arbitrarily).
>>>
>>> I ended up re-building the monitor from the two healthy ones I had, which made the problem go away for me.
>>>
>>> After-the-fact inspection of the monitor I ripped out clocked it in at 1.3Gb compared to the 250Mb of the other two; after the rebuild they're all comparable in size.
>>>
>>> In my case this started out on firefly and persisted after upgrading to hammer, which prompted the rebuild, on the suspicion that it was related to "something" persistent for this monitor.
>>>
>>> I don't have much more to contribute to this discussion, since I've more or less destroyed any evidence by re-building the monitor.
>>>
>>> Cheers,
>>> KJ
>>>
>>> On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito <periqu...@gmail.com> wrote:
>>>
>>>> The leveldb is smallish: around 70mb.
>>>>
>>>> I ran debug mon = 10 for a while, but couldn't find any interesting information. I would run out of space quite quickly though, as the log partition only has 10g.
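For reference, the two runtime knobs mentioned in the quoted messages above (pausing deep scrub and raising monitor debug logging) can be toggled on a live cluster without restarting any daemons; something along these lines, using the flag and option names from the firefly/hammer era:

    # stop new deep scrubs cluster-wide, and re-enable them later
    ceph osd set nodeep-scrub
    ceph osd unset nodeep-scrub

    # raise mon debug logging temporarily, then drop it back down
    ceph tell mon.* injectargs '--debug-mon 10'
    ceph tell mon.* injectargs '--debug-mon 1'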
>>>> On 24 Jul 2015 21:13, "Mark Nelson" <mnel...@redhat.com> wrote:
>>>>
>>>>> On 07/24/2015 02:31 PM, Luis Periquito wrote:
>>>>>
>>>>>> Now it's official, I have a weird one!
>>>>>>
>>>>>> I restarted one of the ceph-mons with jemalloc and it didn't make any difference. It's still using a lot of cpu and still not freeing up memory...
>>>>>>
>>>>>> The issue is that the cluster almost stops responding to requests, and if I restart the primary mon (which had almost no memory usage or cpu) the cluster goes back to its merry way, responding to requests.
>>>>>>
>>>>>> Does anyone have any idea what may be going on? The worst bit is that I have several clusters just like this one (well, they are smaller), and as we do everything with puppet they should be very similar... and all the other clusters are working fine, without any issues whatsoever...
>>>>>
>>>>> We've seen cases where leveldb can't compact fast enough and memory balloons, but it's usually associated with extreme CPU usage as well. It would be showing up in perf though if that were the case...
>>>>>
>>>>>> On 24 Jul 2015 10:11, "Jan Schermer" <j...@schermer.cz> wrote:
>>>>>>
>>>>>> You don't (shouldn't) need to rebuild the binary to use jemalloc. It should be possible to do something like
>>>>>>
>>>>>> LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …
>>>>>>
>>>>>> The last time we tried it, it segfaulted after a few minutes, so YMMV and be careful.
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>>> On 23 Jul 2015, at 18:18, Luis Periquito <periqu...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Greg,
>>>>>>>
>>>>>>> I've been looking at the tcmalloc issues, but they seemed to affect only the osd's, and I do notice them in heavy read workloads (even after the patch and after increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is affecting the mon process though.
>>>>>>>
>>>>>>> Looking at perf top, I'm getting most of the CPU usage in mutex lock/unlock:
>>>>>>>   5.02%  libpthread-2.19.so  [.] pthread_mutex_unlock
>>>>>>>   3.82%  libsoftokn3.so      [.] 0x000000000001e7cb
>>>>>>>   3.46%  libpthread-2.19.so  [.] pthread_mutex_lock
>>>>>>>
>>>>>>> I could try to use jemalloc; are you aware of any pre-built binaries? Can I mix binaries with different mallocs in one cluster?
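A low-risk way to test Jan's LD_PRELOAD suggestion against a single monitor, without rebuilding packages, is to stop that one mon and run it by hand in the foreground; a rough sketch, with the library path and mon ID only as examples:

    # stop the packaged mon first (init-system dependent), then:
    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 \
        ceph-mon -f --cluster ceph --id mon01

    # confirm the preload actually took effect
    grep jemalloc /proc/$(pidof ceph-mon)/maps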
>>>>>>> On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum <g...@gregs42.com> wrote:
>>>>>>>
>>>>>>> On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito <periqu...@gmail.com> wrote:
>>>>>>> > The ceph-mon is already taking a lot of memory, and I ran a heap stats
>>>>>>> > ------------------------------------------------
>>>>>>> > MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
>>>>>>> > MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
>>>>>>> > MALLOC: +     16598552 (   15.8 MiB) Bytes in central cache freelist
>>>>>>> > MALLOC: +     14693536 (   14.0 MiB) Bytes in transfer cache freelist
>>>>>>> > MALLOC: +     17441592 (   16.6 MiB) Bytes in thread cache freelists
>>>>>>> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
>>>>>>> > MALLOC: +     26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
>>>>>>> > MALLOC:
>>>>>>> > MALLOC:           5683 Spans in use
>>>>>>> > MALLOC:             21 Thread heaps in use
>>>>>>> > MALLOC:           8192 Tcmalloc page size
>>>>>>> > ------------------------------------------------
>>>>>>> >
>>>>>>> > after that I ran the heap release and it went back to normal.
>>>>>>> > ------------------------------------------------
>>>>>>> > MALLOC:       22919616 (   21.9 MiB) Bytes in use by application
>>>>>>> > MALLOC: +      4792320 (    4.6 MiB) Bytes in page heap freelist
>>>>>>> > MALLOC: +     18743448 (   17.9 MiB) Bytes in central cache freelist
>>>>>>> > MALLOC: +     20645776 (   19.7 MiB) Bytes in transfer cache freelist
>>>>>>> > MALLOC: +     18456088 (   17.6 MiB) Bytes in thread cache freelists
>>>>>>> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =    201945240 (  192.6 MiB) Actual memory used (physical + swap)
>>>>>>> > MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
>>>>>>> > MALLOC:   ------------
>>>>>>> > MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
>>>>>>> > MALLOC:
>>>>>>> > MALLOC:           5639 Spans in use
>>>>>>> > MALLOC:             29 Thread heaps in use
>>>>>>> > MALLOC:           8192 Tcmalloc page size
>>>>>>> > ------------------------------------------------
>>>>>>> >
>>>>>>> > So it just seems the monitor is not returning unused memory into the OS or
>>>>>>> > reusing already allocated memory it deems as free...
>>>>>>>
>>>>>>> Yep. This is a bug (best we can tell) in some versions of tcmalloc
>>>>>>> combined with certain distribution stacks, although I don't think
>>>>>>> we've seen it reported on Trusty (nor on a tcmalloc distribution that
>>>>>>> new) before. Alternatively some folks are seeing tcmalloc use up lots
>>>>>>> of CPU in other scenarios involving memory return and it may manifest
>>>>>>> like this, but I'm not sure. You could look through the mailing list
>>>>>>> for information on it.
>>>>>>> -Greg
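For what it's worth, the heap statistics above come from the daemon's tcmalloc, so the same check (and the "heap release" workaround Luis used) can be driven through the ceph CLI on builds linked against tcmalloc; roughly, with the mon ID as an example:

    # dump tcmalloc heap statistics for one monitor
    ceph tell mon.mon01 heap stats

    # ask tcmalloc to return freelist pages to the OS
    ceph tell mon.mon01 heap release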
>>>
>>> --
>>> Kjetil Joergensen <kje...@medallia.com>
>>> Operations Engineer, Medallia Inc
>>> Phone: +1 (650) 739-6580
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com