Re: [ceph-users] ceph-mon cpu usage

2015-07-30 Thread Spillmann, Dieter
I have seen this behavior when the servers were not in time sync.
Check your NTP settings.
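
For example, a minimal sanity check might look something like this (the output and mon names will differ per cluster):

  # the monitors flag clock skew in cluster health
  ceph health detail | grep -i "clock skew"

  # check NTP peer status on each mon host
  ntpq -p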

Dieter

From: ceph-users ceph-users-boun...@lists.ceph.com on behalf of Quentin Hartman qhart...@direwolfdigital.com
Date: Wednesday, July 29, 2015 at 5:47 PM
To: Luis Periquito periqu...@gmail.com
Cc: Ceph Users ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-mon cpu usage

I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of
three mons eat all the CPU and the cluster becomes unusably slow).

It seems to be tied to deep scrubbing, as the behavior almost immediately 
surfaces if that is turned on, but if it is off the behavior eventually seems 
to return to normal and stays that way while scrubbing is off. I have not yet 
found anything in the cluster to indicate a hardware problem.

Any thoughts or further insights on this subject would be appreciated.

QH

On Sat, Jul 25, 2015 at 12:31 AM, Luis Periquito periqu...@gmail.com wrote:
I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were sending
massive amounts of auth requests to the monitors, seeming to overwhelm them.

The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd the
box and all of the disks, reinstalled, and guess what? They are still making a lot
of requests to the MONs... this will require some further investigation.

As this is happening during my holidays, I just disabled them, and will 
investigate further when I get back.


On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen kje...@medallia.com wrote:
It sounds slightly similar to what I just experienced.

I had one monitor out of three which seemed to essentially run one core at
full tilt continuously, and had its virtual address space allocated to the
point where top started reporting it in TB. Requests hitting this monitor did not
get very timely responses (although I don't know whether this was happening
consistently or only intermittently).

I ended up re-building the monitor from the two healthy ones I had, which made
the problem go away for me.

After-the-fact inspection of the monitor I ripped out clocked it in at 1.3 GB,
compared to the 250 MB of the other two; after the rebuild they're all comparable
in size.

In my case this started out on firefly and persisted after upgrading to hammer,
which prompted the rebuild, suspecting that in my case it was related to
something persistent for this monitor.

I do not have that much more useful to contribute to this discussion, since 
I've more-or-less destroyed any evidence by re-building the monitor.

Cheers,
KJ

On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito periqu...@gmail.com wrote:

The leveldb is smallish: around 70mb.

I ran debug mon = 10 for a while,  but couldn't find any interesting 
information. I would run out of space quite quickly though as the log partition 
only has 10g.

On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:
On 07/24/2015 02:31 PM, Luis Periquito wrote:
Now it's official,  I have a weird one!

Restarted one of the ceph-mons with jemalloc and it didn't make any
difference. It's still using a lot of cpu and still not freeing up memory...

The issue is that the cluster almost stops responding to requests, and
if I restart the primary mon (that had almost no memory usage nor cpu)
the cluster goes back to its merry way responding to requests.

Does anyone have any idea what may be going on? The worst bit is that I
have several clusters just like this (well they are smaller), and as we
do everything with puppet, they should be very similar... and all the
other clusters are just working fine, without any issues whatsoever...

We've seen cases where leveldb can't compact fast enough and memory balloons, 
but it's usually associated with extreme CPU usage as well. It would be showing 
up in perf though if that were the case...


On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
should be possible to do something like

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

The last time we tried it segfaulted after a few minutes, so YMMV
and be careful.

Jan

On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

Hi Greg,

I've been looking at the tcmalloc issues, but they did seem to affect
OSDs, and I do notice them in heavy read workloads (even after the
patch and
increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
is affecting the mon process, though.

looking at perf top I'm getting most of the CPU usage

Re: [ceph-users] ceph-mon cpu usage

2015-07-30 Thread Quentin Hartman
Thanks for the suggestion. NTP is fine in my case. It turns out it was a
networking problem that wasn't triggering error counters on the NICs, so it
took a bit to track down.
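
For anyone chasing something similar, a couple of places worth looking (eth0 here
is just a placeholder interface name):

  # kernel-level interface statistics, including drops
  ip -s link show eth0

  # driver/firmware counters, which sometimes catch what the kernel stats miss
  ethtool -S eth0 | grep -iE "err|drop|crc"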

QH

On Thu, Jul 30, 2015 at 4:16 PM, Spillmann, Dieter 
dieter.spillm...@arris.com wrote:

 I saw this behavior when the servers are not in time sync.
 Check your ntp settings

 Dieter

 From: ceph-users ceph-users-boun...@lists.ceph.com on behalf of Quentin
 Hartman qhart...@direwolfdigital.com
 Date: Wednesday, July 29, 2015 at 5:47 PM
 To: Luis Periquito periqu...@gmail.com
 Cc: Ceph Users ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] ceph-mon cpu usage

 I just had my ceph cluster, which is running 0.87.1, exhibit this behavior
 (two of three mons eat all the CPU and the cluster becomes unusably slow).

 It seems to be tied to deep scrubbing, as the behavior almost immediately
 surfaces if that is turned on, but if it is off the behavior eventually
 seems to return to normal and stays that way while scrubbing is off. I have
 not yet found anything in the cluster to indicate a hardware problem.

 Any thoughts or further insights on this subject would be appreciated.

 QH

 On Sat, Jul 25, 2015 at 12:31 AM, Luis Periquito periqu...@gmail.com
 wrote:

 I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were
 sending massive amounts of auth requests to the monitors, seeming to
 overwhelm them.

 The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd
 the box and all of the disks, reinstalled, and guess what? They are still
 making a lot of requests to the MONs... this will require some further
 investigation.

 As this is happening during my holidays, I just disabled them, and will
 investigate further when I get back.


 On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen kje...@medallia.com
 wrote:

 It sounds slightly similar to what I just experienced.

 I had one monitor out of three which seemed to essentially run one core
 at full tilt continuously, and had its virtual address space allocated to
 the point where top started reporting it in TB. Requests hitting this monitor
 did not get very timely responses (although I don't know whether this was
 happening consistently or only intermittently).

 I ended up re-building the monitor from the two healthy ones I had, which
 made the problem go away for me.

 After-the-fact inspection of the monitor I ripped out clocked it in at
 1.3 GB, compared to the 250 MB of the other two; after the rebuild they're
 all comparable in size.

 In my case this started out on firefly and persisted after upgrading to
 hammer, which prompted the rebuild, suspecting that in my case it was
 related to something persistent for this monitor.

 I do not have that much more useful to contribute to this discussion,
 since I've more-or-less destroyed any evidence by re-building the monitor.

 Cheers,
 KJ

 On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito periqu...@gmail.com
 wrote:

 The leveldb is smallish: around 70mb.

 I ran debug mon = 10 for a while,  but couldn't find any interesting
 information. I would run out of space quite quickly though as the log
 partition only has 10g.
 On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:

 On 07/24/2015 02:31 PM, Luis Periquito wrote:

 Now it's official,  I have a weird one!

 Restarted one of the ceph-mons with jemalloc and it didn't make any
 difference. It's still using a lot of cpu and still not freeing up
 memory...

 The issue is that the cluster almost stops responding to requests, and
 if I restart the primary mon (that had almost no memory usage nor cpu)
 the cluster goes back to its merry way responding to requests.

 Does anyone have any idea what may be going on? The worst bit is that
 I
 have several clusters just like this (well they are smaller), and as
 we
 do everything with puppet, they should be very similar... and all the
 other clusters are just working fine, without any issues whatsoever...


 We've seen cases where leveldb can't compact fast enough and memory
 balloons, but it's usually associated with extreme CPU usage as well. It
 would be showing up in perf though if that were the case...


 On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc.
 It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV
 and be careful.

 Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

 I've been looking at the tcmalloc issues, but they did seem to affect
 OSDs, and I do notice them in heavy read workloads (even after the
 patch and
 increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
 is affecting the mon process, though.

 looking at perf top I'm getting most of the CPU usage

Re: [ceph-users] ceph-mon cpu usage

2015-07-29 Thread Quentin Hartman
I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of
three mons eat all the CPU and the cluster becomes unusably slow).

It seems to be tied to deep scrubbing, as the behavior almost immediately
surfaces if that is turned on, but if it is off the behavior eventually
seems to return to normal and stays that way while scrubbing is off. I have
not yet found anything in the cluster to indicate a hardware problem.
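
For reference, a correlation like this can be tested by toggling deep scrubbing
cluster-wide with the nodeep-scrub flag; a rough sketch, to be used with care on a
production cluster:

  # stop new deep scrubs from being scheduled
  ceph osd set nodeep-scrub

  # ... observe mon CPU / memory for a while ...

  # re-enable deep scrubbing afterwards
  ceph osd unset nodeep-scrub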

Any thoughts or further insights on this subject would be appreciated.

QH

On Sat, Jul 25, 2015 at 12:31 AM, Luis Periquito periqu...@gmail.com
wrote:

 I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were
 sending massive amounts of auth requests to the monitors, seeming to
 overwhelm them.

 The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd
 the box and all of the disks, reinstalled, and guess what? They are still
 making a lot of requests to the MONs... this will require some further
 investigation.

 As this is happening during my holidays, I just disabled them, and will
 investigate further when I get back.


 On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen kje...@medallia.com
 wrote:

 It sounds slightly similar to what I just experienced.

 I had one monitor out of three which seemed to essentially run one core
 at full tilt continuously, and had its virtual address space allocated to
 the point where top started reporting it in TB. Requests hitting this monitor
 did not get very timely responses (although I don't know whether this was
 happening consistently or only intermittently).

 I ended up re-building the monitor from the two healthy ones I had, which
 made the problem go away for me.

 After-the-fact inspection of the monitor I ripped out clocked it in at
 1.3 GB, compared to the 250 MB of the other two; after the rebuild they're
 all comparable in size.

 In my case this started out on firefly and persisted after upgrading to
 hammer, which prompted the rebuild, suspecting that in my case it was
 related to something persistent for this monitor.

 I do not have that much more useful to contribute to this discussion,
 since I've more-or-less destroyed any evidence by re-building the monitor.

 Cheers,
 KJ

 On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito periqu...@gmail.com
 wrote:

 The leveldb is smallish: around 70mb.

 I ran debug mon = 10 for a while,  but couldn't find any interesting
 information. I would run out of space quite quickly though as the log
 partition only has 10g.
 On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:

 On 07/24/2015 02:31 PM, Luis Periquito wrote:

 Now it's official,  I have a weird one!

 Restarted one of the ceph-mons with jemalloc and it didn't make any
 difference. It's still using a lot of cpu and still not freeing up
 memory...

 The issue is that the cluster almost stops responding to requests, and
 if I restart the primary mon (that had almost no memory usage nor cpu)
 the cluster goes back to its merry way responding to requests.

 Does anyone have any idea what may be going on? The worst bit is that I
 have several clusters just like this (well they are smaller), and as we
 do everything with puppet, they should be very similar... and all the
 other clusters are just working fine, without any issues whatsoever...


 We've seen cases where leveldb can't compact fast enough and memory
 balloons, but it's usually associated with extreme CPU usage as well. It
 would be showing up in perf though if that were the case...


 On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc.
 It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV
 and be careful.

 Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

 I've been looking at the tcmalloc issues, but they did seem to affect
 OSDs, and I do notice them in heavy read workloads (even after the
 patch and
 increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
 is affecting the mon process, though.

 looking at perf top I'm getting most of the CPU usage in mutex
 lock/unlock
   5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
   3.82%  libsoftokn3.so        [.] 0x0001e7cb
   3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

 I could try to use jemalloc, are you aware of any built binaries?
 Can I mix a cluster with different malloc binaries?


 On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
  The ceph-mon is 

Re: [ceph-users] ceph-mon cpu usage

2015-07-25 Thread Luis Periquito
I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were
sending massive amounts of auth requests to the monitors, seeming to
overwhelm them.

The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd
the box and all of the disks, reinstalled, and guess what? They are still
making a lot of requests to the MONs... this will require some further
investigation.
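
For context, the removal sequence referred to above is roughly the standard one
(osd.107 used here only as an example):

  # take the OSD out of the CRUSH map, drop its key, and delete it
  ceph osd crush remove osd.107
  ceph auth del osd.107
  ceph osd rm osd.107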

As this is happening during my holidays, I just disabled them, and will
investigate further when I get back.


On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen kje...@medallia.com
wrote:

 It sounds slightly similar to what I just experienced.

 I had one monitor out of three which seemed to essentially run one core
 at full tilt continuously, and had its virtual address space allocated to
 the point where top started reporting it in TB. Requests hitting this monitor
 did not get very timely responses (although I don't know whether this was
 happening consistently or only intermittently).

 I ended up re-building the monitor from the two healthy ones I had, which
 made the problem go away for me.

 After-the-fact inspection of the monitor I ripped out clocked it in at
 1.3 GB, compared to the 250 MB of the other two; after the rebuild they're
 all comparable in size.

 In my case this started out on firefly and persisted after upgrading to
 hammer, which prompted the rebuild, suspecting that in my case it was
 related to something persistent for this monitor.

 I do not have that much more useful to contribute to this discussion,
 since I've more-or-less destroyed any evidence by re-building the monitor.

 Cheers,
 KJ

 On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito periqu...@gmail.com
 wrote:

 The leveldb is smallish: around 70mb.

 I ran debug mon = 10 for a while,  but couldn't find any interesting
 information. I would run out of space quite quickly though as the log
 partition only has 10g.
 On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:

 On 07/24/2015 02:31 PM, Luis Periquito wrote:

 Now it's official,  I have a weird one!

 Restarted one of the ceph-mons with jemalloc and it didn't make any
 difference. It's still using a lot of cpu and still not freeing up
 memory...

 The issue is that the cluster almost stops responding to requests, and
 if I restart the primary mon (that had almost no memory usage nor cpu)
 the cluster goes back to its merry way responding to requests.

 Does anyone have any idea what may be going on? The worst bit is that I
 have several clusters just like this (well they are smaller), and as we
 do everything with puppet, they should be very similar... and all the
 other clusters are just working fine, without any issues whatsoever...


 We've seen cases where leveldb can't compact fast enough and memory
 balloons, but it's usually associated with extreme CPU usage as well. It
 would be showing up in perf though if that were the case...


 On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV
 and be careful.

 Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

 I've been looking at the tcmalloc issues, but they did seem to affect
 OSDs, and I do notice them in heavy read workloads (even after the
 patch and
 increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
 is affecting the mon process, though.

 looking at perf top I'm getting most of the CPU usage in mutex
 lock/unlock
   5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
   3.82%  libsoftokn3.so        [.] 0x0001e7cb
   3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

 I could try to use jemalloc, are you aware of any built binaries?
 Can I mix a cluster with different malloc binaries?


 On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
  The ceph-mon is already taking a lot of memory, and I ran a
 heap stats
  
  MALLOC:   32391696 (   30.9 MiB) Bytes in use by
 application
  MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap
 freelist
  MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache
 freelist
  MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache
 freelist
  MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache
 freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in 

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Jan Schermer
You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It should be 
possible to do something like

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

The last time we tried it segfaulted after a few minutes, so YMMV and be 
careful.
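
A minimal way to test this against a monitor (rather than an OSD) without touching
the init scripts might look like the following; the library path and mon ID are
examples and will vary by distro:

  # stop the packaged mon first, then run it in the foreground with jemalloc preloaded
  LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-mon -i mon01 -f

  # confirm which allocator actually got mapped into the process
  grep -E "jemalloc|tcmalloc" /proc/$(pidof ceph-mon)/maps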

Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:
 
 Hi Greg,
 
 I've been looking at the tcmalloc issues, but they did seem to affect OSDs, and I
 do notice them in heavy read workloads (even after the patch and increasing
 TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is affecting the mon
 process, though.

 looking at perf top I'm getting most of the CPU usage in mutex lock/unlock
   5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
   3.82%  libsoftokn3.so        [.] 0x0001e7cb
   3.46%  libpthread-2.19.so    [.] pthread_mutex_lock
 
 I could try to use jemalloc, are you aware of any built binaries? Can I mix a 
 cluster with different malloc binaries?
 
 
 On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:
 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
  The ceph-mon is already taking a lot of memory, and I ran a heap stats
  
  MALLOC:   32391696 (   30.9 MiB) Bytes in use by application
  MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
  MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache freelist
  MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache freelist
  MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
  MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5683  Spans in use
  MALLOC: 21  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  after that I ran the heap release and it went back to normal.
  
  MALLOC:   22919616 (   21.9 MiB) Bytes in use by application
  MALLOC: +  4792320 (4.6 MiB) Bytes in page heap freelist
  MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache freelist
  MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache freelist
  MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =201945240 (  192.6 MiB) Actual memory used (physical + swap)
  MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5639  Spans in use
  MALLOC: 29  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  So it just seems the monitor is not returning unused memory into the OS or
  reusing already allocated memory it deems as free...
 
 Yep. This is a bug (best we can tell) in some versions of tcmalloc
 combined with certain distribution stacks, although I don't think
 we've seen it reported on Trusty (nor on a tcmalloc distribution that
 new) before. Alternatively some folks are seeing tcmalloc use up lots
 of CPU in other scenarios involving memory return and it may manifest
 like this, but I'm not sure. You could look through the mailing list
 for information on it.
 -Greg
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Mark Nelson

On 07/24/2015 02:31 PM, Luis Periquito wrote:

Now it's official,  I have a weird one!

Restarted one of the ceph-mons with jemalloc and it didn't make any
difference. It's still using a lot of cpu and still not freeing up memory...

The issue is that the cluster almost stops responding to requests, and
if I restart the primary mon (that had almost no memory usage nor cpu)
the cluster goes back to its merry way responding to requests.

Does anyone have any idea what may be going on? The worst bit is that I
have several clusters just like this (well they are smaller), and as we
do everything with puppet, they should be very similar... and all the
other clusters are just working fine, without any issues whatsoever...


We've seen cases where leveldb can't compact fast enough and memory 
balloons, but it's usually associated with extreme CPU usage as well. 
It would be showing up in perf though if that were the case...
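
If anyone wants to rule that out, the mon store size and a manual compaction can be
checked with something like the following (paths and the mon ID are examples):

  # size of the monitor's leveldb store on disk
  du -sh /var/lib/ceph/mon/*/store.db

  # ask a monitor to compact its store
  ceph tell mon.mon01 compact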




On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
should be possible to do something like

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

The last time we tried it segfaulted after a few minutes, so YMMV
and be careful.

Jan


On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

Hi Greg,

I've been looking at the tcmalloc issues, but they did seem to affect
OSDs, and I do notice them in heavy read workloads (even after the
patch and
increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
is affecting the mon process, though.

looking at perf top I'm getting most of the CPU usage in mutex
lock/unlock
  5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
  3.82%  libsoftokn3.so        [.] 0x0001e7cb
  3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

I could try to use jemalloc, are you aware of any built binaries?
Can I mix a cluster with different malloc binaries?


On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
 The ceph-mon is already taking a lot of memory, and I ran a
heap stats
 
 MALLOC:   32391696 (   30.9 MiB) Bytes in use by application
 MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
 MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache
freelist
 MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache
freelist
 MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache
freelists
 MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
 MALLOC:   
 MALLOC: =  27794649240 (26507.0 MiB) Actual memory used
(physical + swap)
 MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS
(aka unmapped)
 MALLOC:   
 MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
 MALLOC:
 MALLOC:   5683  Spans in use
 MALLOC: 21  Thread heaps in use
 MALLOC:   8192  Tcmalloc page size
 

 after that I ran the heap release and it went back to normal.
 
 MALLOC:   22919616 (   21.9 MiB) Bytes in use by application
 MALLOC: +  4792320 (4.6 MiB) Bytes in page heap freelist
 MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache
freelist
 MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache
freelist
 MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache
freelists
 MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
 MALLOC:   
 MALLOC: =201945240 (  192.6 MiB) Actual memory used
(physical + swap)
 MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
 MALLOC:   
 MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
 MALLOC:
 MALLOC:   5639  Spans in use
 MALLOC: 29  Thread heaps in use
 MALLOC:   8192  Tcmalloc page size
 

 So it just seems the monitor is not returning unused memory into the 
OS or
 reusing already allocated memory it deems as free...

Yep. This is a bug (best we can tell) in some 

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Luis Periquito
The leveldb is smallish: around 70 MB.

I ran debug mon = 10 for a while, but couldn't find any interesting
information. I would run out of space quite quickly though, as the log
partition only has 10 GB.
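
For what it's worth, the debug level can also be bumped and dropped at runtime
without a restart, which helps when log space is tight (the mon ID is an example):

  # raise mon debugging temporarily
  ceph tell mon.mon01 injectargs '--debug-mon 10/10'

  # put it back once enough has been captured
  ceph tell mon.mon01 injectargs '--debug-mon 1/5'
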
On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:

 On 07/24/2015 02:31 PM, Luis Periquito wrote:

 Now it's official,  I have a weird one!

 Restarted one of the ceph-mons with jemalloc and it didn't make any
 difference. It's still using a lot of cpu and still not freeing up
 memory...

 The issue is that the cluster almost stops responding to requests, and
 if I restart the primary mon (that had almost no memory usage nor cpu)
 the cluster goes back to its merry way responding to requests.

 Does anyone have any idea what may be going on? The worst bit is that I
 have several clusters just like this (well they are smaller), and as we
 do everything with puppet, they should be very similar... and all the
 other clusters are just working fine, without any issues whatsoever...


 We've seen cases where leveldb can't compact fast enough and memory
 balloons, but it's usually associated with extreme CPU usage as well. It
 would be showing up in perf though if that were the case...


 On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV
 and be careful.

 Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

 I've been looking at the tcmalloc issues, but they did seem to affect
 OSDs, and I do notice them in heavy read workloads (even after the
 patch and
 increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
 is affecting the mon process, though.

 looking at perf top I'm getting most of the CPU usage in mutex
 lock/unlock
   5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
   3.82%  libsoftokn3.so        [.] 0x0001e7cb
   3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

 I could try to use jemalloc, are you aware of any built binaries?
 Can I mix a cluster with different malloc binaries?


 On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
  The ceph-mon is already taking a lot of memory, and I ran a
 heap stats
  
  MALLOC:   32391696 (   30.9 MiB) Bytes in use by
 application
  MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap
 freelist
  MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache
 freelist
  MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache
 freelist
  MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache
 freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =  27794649240 (26507.0 MiB) Actual memory used
 (physical + swap)
  MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS
 (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5683  Spans in use
  MALLOC: 21  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  after that I ran the heap release and it went back to normal.
  
  MALLOC:   22919616 (   21.9 MiB) Bytes in use by
 application
  MALLOC: +  4792320 (4.6 MiB) Bytes in page heap
 freelist
  MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache
 freelist
  MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache
 freelist
  MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache
 freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =201945240 (  192.6 MiB) Actual memory used
 (physical + swap)
  MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5639  Spans in use
  MALLOC: 29  

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Luis Periquito
Now it's official,  I have a weird one!

Restarted one of the ceph-mons with jemalloc and it didn't make any
difference. It's still using a lot of cpu and still not freeing up memory...

The issue is that the cluster almost stops responding to requests, and if I
restart the primary mon (that had almost no memory usage nor cpu) the
cluster goes back to its merry way responding to requests.

Does anyone have any idea what may be going on? The worst bit is that I
have several clusters just like this (well they are smaller), and as we do
everything with puppet, they should be very similar... and all the other
clusters are just working fine, without any issues whatsoever...
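
As an aside, which monitor currently holds the lead (the "primary" above) should be
visible in the quorum status, e.g.:

  # the leader and quorum members are reported here
  ceph quorum_status --format json-pretty
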
 On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV and be
 careful.

 Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

 I've been looking at the tcmalloc issues, but they did seem to affect OSDs,
 and I do notice them in heavy read workloads (even after the patch and
 increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is
 affecting the mon process, though.

 looking at perf top I'm getting most of the CPU usage in mutex lock/unlock
   5.02%  libpthread-2.19.so[.] pthread_mutex_unlock
   3.82%  libsoftokn3.so[.] 0x0001e7cb
   3.46%  libpthread-2.19.so[.] pthread_mutex_lock

 I could try to use jemalloc, are you aware of any built binaries? Can I
 mix a cluster with different malloc binaries?


 On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com
 wrote:
  The ceph-mon is already taking a lot of memory, and I ran a heap stats
  
  MALLOC:   32391696 (   30.9 MiB) Bytes in use by application
  MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
  MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache freelist
  MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache freelist
  MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical +
 swap)
  MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5683  Spans in use
  MALLOC: 21  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  after that I ran the heap release and it went back to normal.
  
  MALLOC:   22919616 (   21.9 MiB) Bytes in use by application
  MALLOC: +  4792320 (4.6 MiB) Bytes in page heap freelist
  MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache freelist
  MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache freelist
  MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =201945240 (  192.6 MiB) Actual memory used (physical +
 swap)
  MALLOC: + 27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5639  Spans in use
  MALLOC: 29  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  So it just seems the monitor is not returning unused memory into the OS
 or
  reusing already allocated memory it deems as free...

 Yep. This is a bug (best we can tell) in some versions of tcmalloc
 combined with certain distribution stacks, although I don't think
 we've seen it reported on Trusty (nor on a tcmalloc distribution that
 new) before. Alternatively some folks are seeing tcmalloc use up lots
 of CPU in other scenarios involving memory return and it may manifest
 like this, but I'm not sure. You could look through the mailing list
 for information on it.
 -Greg


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Kjetil Jørgensen
It sounds slightly similar to what I just experienced.

I had one monitor out of three which seemed to essentially run one core at
full tilt continuously, and had its virtual address space allocated to the
point where top started reporting it in TB. Requests hitting this monitor did
not get very timely responses (although I don't know whether this was
happening consistently or only intermittently).

I ended up re-building the monitor from the two healthy ones I had, which
made the problem go away for me.

After-the-fact inspection of the monitor I ripped out clocked it in at
1.3 GB, compared to the 250 MB of the other two; after the rebuild they're all
comparable in size.

In my case this started out on firefly and persisted after upgrading to
hammer, which prompted the rebuild, suspecting that in my case it was related
to something persistent for this monitor.

I do not have that much more useful to contribute to this discussion, since
I've more-or-less destroyed any evidence by re-building the monitor.
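
For reference, the rebuild was essentially "remove the bad mon, then re-add it by
following the usual add-a-monitor procedure"; a rough sketch, with mon03 and the
paths below only as examples (check the docs before trying this on a live cluster):

  # drop the broken monitor from the monmap and wipe its data directory
  ceph mon remove mon03
  rm -rf /var/lib/ceph/mon/ceph-mon03

  # rebuild its store from the surviving quorum's monmap and mon. key
  ceph mon getmap -o /tmp/monmap
  ceph auth get mon. -o /tmp/mon.keyring
  ceph-mon -i mon03 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring

and then start ceph-mon on that host again so it can rejoin the quorum.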

Cheers,
KJ

On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito periqu...@gmail.com wrote:

 The leveldb is smallish: around 70mb.

 I ran debug mon = 10 for a while,  but couldn't find any interesting
 information. I would run out of space quite quickly though as the log
 partition only has 10g.
 On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:

 On 07/24/2015 02:31 PM, Luis Periquito wrote:

 Now it's official,  I have a weird one!

 Restarted one of the ceph-mons with jemalloc and it didn't make any
 difference. It's still using a lot of cpu and still not freeing up
 memory...

 The issue is that the cluster almost stops responding to requests, and
 if I restart the primary mon (that had almost no memory usage nor cpu)
 the cluster goes back to its merry way responding to requests.

 Does anyone have any idea what may be going on? The worst bit is that I
 have several clusters just like this (well they are smaller), and as we
 do everything with puppet, they should be very similar... and all the
 other clusters are just working fine, without any issues whatsoever...


 We've seen cases where leveldb can't compact fast enough and memory
 balloons, but it's usually associated with extreme CPU usage as well. It
 would be showing up in perf though if that were the case...


 On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV
 and be careful.

 Jan

 On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

 I've been looking at the tcmalloc issues, but they did seem to affect
 OSDs, and I do notice them in heavy read workloads (even after the
 patch and
 increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This
 is affecting the mon process, though.

 looking at perf top I'm getting most of the CPU usage in mutex
 lock/unlock
   5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
   3.82%  libsoftokn3.so        [.] 0x0001e7cb
   3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

 I could try to use jemalloc, are you aware of any built binaries?
 Can I mix a cluster with different malloc binaries?


 On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
  The ceph-mon is already taking a lot of memory, and I ran a
 heap stats
  
  MALLOC:   32391696 (   30.9 MiB) Bytes in use by
 application
  MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap
 freelist
  MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache
 freelist
  MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache
 freelist
  MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache
 freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =  27794649240 (26507.0 MiB) Actual memory used
 (physical + swap)
  MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS
 (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space
 used
  MALLOC:
  MALLOC:   5683  Spans in use
  MALLOC: 21  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  

Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Luis Periquito
Hi Greg,

I've been looking at the tcmalloc issues, but they did seem to affect OSDs, and
I do notice them in heavy read workloads (even after the patch and
increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is
affecting the mon process, though.

looking at perf top I'm getting most of the CPU usage in mutex lock/unlock
  5.02%  libpthread-2.19.so[.] pthread_mutex_unlock
  3.82%  libsoftokn3.so[.] 0x0001e7cb
  3.46%  libpthread-2.19.so[.] pthread_mutex_lock
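
For completeness, that kind of profile can be captured against just the monitor
process with something like this (the sampling duration is arbitrary here):

  # live view, monitor process only
  perf top -p $(pidof ceph-mon)

  # or record 60 seconds of samples for later inspection with "perf report"
  perf record -g -p $(pidof ceph-mon) -- sleep 60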

I could try to use jemalloc; are you aware of any pre-built binaries? Can I mix
a cluster with different malloc binaries?


On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

 On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com
 wrote:
  The ceph-mon is already taking a lot of memory, and I ran a heap stats
  
  MALLOC:   32391696 (   30.9 MiB) Bytes in use by application
  MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
  MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache freelist
  MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache freelist
  MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
  MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5683  Spans in use
  MALLOC: 21  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  after that I ran the heap release and it went back to normal.
  
  MALLOC:   22919616 (   21.9 MiB) Bytes in use by application
  MALLOC: +  4792320 (4.6 MiB) Bytes in page heap freelist
  MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache freelist
  MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache freelist
  MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache freelists
  MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
  MALLOC:   
  MALLOC: =201945240 (  192.6 MiB) Actual memory used (physical + swap)
  MALLOC: + 27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
  MALLOC:   
  MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
  MALLOC:
  MALLOC:   5639  Spans in use
  MALLOC: 29  Thread heaps in use
  MALLOC:   8192  Tcmalloc page size
  
 
  So it just seems the monitor is not returning unused memory into the OS
 or
  reusing already allocated memory it deems as free...

 Yep. This is a bug (best we can tell) in some versions of tcmalloc
 combined with certain distribution stacks, although I don't think
 we've seen it reported on Trusty (nor on a tcmalloc distribution that
 new) before. Alternatively some folks are seeing tcmalloc use up lots
 of CPU in other scenarios involving memory return and it may manifest
 like this, but I'm not sure. You could look through the mailing list
 for information on it.
 -Greg

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Luis Periquito
The ceph-mon is already taking a lot of memory, so I ran heap stats:

MALLOC:   32391696 (   30.9 MiB) Bytes in use by application
MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache freelist
MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache freelist
MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache freelists
MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
MALLOC:   
MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
MALLOC:   
MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
MALLOC:
MALLOC:   5683  Spans in use
MALLOC: 21  Thread heaps in use
MALLOC:   8192  Tcmalloc page size


after that I ran the heap release and it went back to normal.

MALLOC:   22919616 (   21.9 MiB) Bytes in use by application
MALLOC: +  4792320 (4.6 MiB) Bytes in page heap freelist
MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache freelist
MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache freelist
MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache freelists
MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
MALLOC:   
MALLOC: =201945240 (  192.6 MiB) Actual memory used (physical + swap)
MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
MALLOC:   
MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
MALLOC:
MALLOC:   5639  Spans in use
MALLOC: 29  Thread heaps in use
MALLOC:   8192  Tcmalloc page size


So it just seems the monitor is neither returning unused memory to the OS nor
reusing already-allocated memory it deems free...
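
For anyone wanting to reproduce this, heap stats and release of this kind are
exposed through the daemons' tcmalloc hooks; roughly (mon01 is an example ID):

  # dump allocator statistics for a monitor
  ceph tell mon.mon01 heap stats

  # ask tcmalloc to hand freelist pages back to the OS
  ceph tell mon.mon01 heap release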


On Wed, Jul 22, 2015 at 4:29 PM, Luis Periquito periqu...@gmail.com wrote:

 This cluster is serving RBD storage for OpenStack, and today all the I/O
 just stopped.
 After looking at the boxes, ceph-mon was using 17 GB of RAM - and this was on
 *all* the mons. Restarting the main one made it work again (I
 restarted the other ones because they were also using a lot of RAM).
 This has happened twice now (the first time was last Monday).

 As this is considered a prod cluster there is no logging enabled, and I
 can't reproduce it - our test/dev clusters have been working fine and show
 none of these symptoms, but they were upgraded from firefly.
 What can we do to help debug the issue? Any ideas on how to identify the
 underlying issue?

 thanks,

 On Mon, Jul 20, 2015 at 1:59 PM, Luis Periquito periqu...@gmail.com
 wrote:

 Hi all,

 I have a cluster with 28 nodes (all physical, 4Cores, 32GB Ram), each
 node has 4 OSDs for a total of 112 OSDs. Each OSD has 106 PGs (counted
 including replication). There are 3 MONs on this cluster.
 I'm running on Ubuntu trusty with kernel 3.13.0-52-generic, with Hammer
 (0.94.2).

 This cluster was installed with Hammer (0.94.1) and has only been
 upgraded to the latest available version.

 On the three mons one is mostly idle, one is using ~170% CPU, and one is
 using ~270% CPU. They will change as I restart the process (usually the
 idle one is the one with the lowest uptime).

 Running perf top against the ceph-mon PID on the non-idle boxes yields
 something like this:

   4.62%  libpthread-2.19.so[.] pthread_mutex_unlock
   3.95%  libpthread-2.19.so[.] pthread_mutex_lock
   3.91%  libsoftokn3.so[.] 0x0001db26
   2.38%  [kernel]  [k] _raw_spin_lock
   2.09%  libtcmalloc.so.4.1.2  [.] operator new(unsigned long)
   1.79%  ceph-mon  [.] DispatchQueue::enqueue(Message*, int,
 unsigned long)
   1.62%  ceph-mon  [.] RefCountedObject::get()
   1.58%  libpthread-2.19.so[.] pthread_mutex_trylock
   1.32%  libtcmalloc.so.4.1.2  [.] operator delete(void*)
   1.24%  libc-2.19.so  [.] 0x00097fd0
   1.20%  ceph-mon  [.] ceph::buffer::ptr::release()
   1.18%  ceph-mon  [.] RefCountedObject::put()
   1.15%  libfreebl3.so [.] 0x000542a8
   1.05%  [kernel]  [k] update_cfs_shares
   1.00%  [kernel]  [k] tcp_sendmsg

 The cluster is mostly idle, and it's healthy. The store is 69MB big, and
 the MONs are consuming around 700MB of RAM.

 Any ideas on this situation? Is it safe to ignore?



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Gregory Farnum
On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
 The ceph-mon is already taking a lot of memory, and I ran a heap stats
 
 MALLOC:   32391696 (   30.9 MiB) Bytes in use by application
 MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
 MALLOC: + 16598552 (   15.8 MiB) Bytes in central cache freelist
 MALLOC: + 14693536 (   14.0 MiB) Bytes in transfer cache freelist
 MALLOC: + 17441592 (   16.6 MiB) Bytes in thread cache freelists
 MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
 MALLOC:   
 MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
 MALLOC: + 26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
 MALLOC:   
 MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
 MALLOC:
 MALLOC:   5683  Spans in use
 MALLOC: 21  Thread heaps in use
 MALLOC:   8192  Tcmalloc page size
 

 after that I ran the heap release and it went back to normal.
 
 MALLOC:   22919616 (   21.9 MiB) Bytes in use by application
 MALLOC: +  4792320 (4.6 MiB) Bytes in page heap freelist
 MALLOC: + 18743448 (   17.9 MiB) Bytes in central cache freelist
 MALLOC: + 20645776 (   19.7 MiB) Bytes in transfer cache freelist
 MALLOC: + 18456088 (   17.6 MiB) Bytes in thread cache freelists
 MALLOC: +116387992 (  111.0 MiB) Bytes in malloc metadata
 MALLOC:   
 MALLOC: =201945240 (  192.6 MiB) Actual memory used (physical + swap)
 MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
 MALLOC:   
 MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
 MALLOC:
 MALLOC:   5639  Spans in use
 MALLOC: 29  Thread heaps in use
 MALLOC:   8192  Tcmalloc page size
 

 So it just seems the monitor is not returning unused memory into the OS or
 reusing already allocated memory it deems as free...

Yep. This is a bug (best we can tell) in some versions of tcmalloc
combined with certain distribution stacks, although I don't think
we've seen it reported on Trusty (nor on a tcmalloc distribution that
new) before. Alternatively some folks are seeing tcmalloc use up lots
of CPU in other scenarios involving memory return and it may manifest
like this, but I'm not sure. You could look through the mailing list
for information on it.
-Greg
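
Checking which tcmalloc build a given daemon is actually linked against is quick,
for anyone comparing notes against the versions discussed here:

  dpkg -l | grep -i tcmalloc
  ldd $(which ceph-mon) | grep -i tcmalloc
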
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-mon cpu usage

2015-07-22 Thread Luis Periquito
This cluster is serving RBD storage for OpenStack, and today all the I/O
just stopped.
After looking at the boxes, ceph-mon was using 17 GB of RAM - and this was on
*all* the mons. Restarting the main one made it work again (I
restarted the other ones because they were also using a lot of RAM).
This has happened twice now (the first time was last Monday).

As this is considered a prod cluster there is no logging enabled, and I
can't reproduce it - our test/dev clusters have been working fine and show
none of these symptoms, but they were upgraded from firefly.
What can we do to help debug the issue? Any ideas on how to identify the
underlying issue?
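
One low-impact thing that can be done even on a production cluster is to track the
monitors' memory and internal counters over time, e.g. (the mon ID is an example):

  # RSS / virtual size / CPU of the mon process, sampled every minute
  while sleep 60; do date; ps -o rss,vsz,pcpu,etime -p $(pidof ceph-mon); done

  # internal perf counters from the admin socket, useful to diff between samples
  ceph daemon mon.mon01 perf dump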

thanks,

On Mon, Jul 20, 2015 at 1:59 PM, Luis Periquito periqu...@gmail.com wrote:

 Hi all,

 I have a cluster with 28 nodes (all physical, 4Cores, 32GB Ram), each node
 has 4 OSDs for a total of 112 OSDs. Each OSD has 106 PGs (counted including
 replication). There are 3 MONs on this cluster.
 I'm running on Ubuntu trusty with kernel 3.13.0-52-generic, with Hammer
 (0.94.2).

 This cluster was installed with Hammer (0.94.1) and has only been upgraded
 to the latest available version.

 On the three mons one is mostly idle, one is using ~170% CPU, and one is
 using ~270% CPU. They will change as I restart the process (usually the
 idle one is the one with the lowest uptime).

 Running perf top against the ceph-mon PID on the non-idle boxes yields
 something like this:

   4.62%  libpthread-2.19.so[.] pthread_mutex_unlock
   3.95%  libpthread-2.19.so[.] pthread_mutex_lock
   3.91%  libsoftokn3.so[.] 0x0001db26
   2.38%  [kernel]  [k] _raw_spin_lock
   2.09%  libtcmalloc.so.4.1.2  [.] operator new(unsigned long)
   1.79%  ceph-mon  [.] DispatchQueue::enqueue(Message*, int,
 unsigned long)
   1.62%  ceph-mon  [.] RefCountedObject::get()
   1.58%  libpthread-2.19.so[.] pthread_mutex_trylock
   1.32%  libtcmalloc.so.4.1.2  [.] operator delete(void*)
   1.24%  libc-2.19.so  [.] 0x00097fd0
   1.20%  ceph-mon  [.] ceph::buffer::ptr::release()
   1.18%  ceph-mon  [.] RefCountedObject::put()
   1.15%  libfreebl3.so [.] 0x000542a8
   1.05%  [kernel]  [k] update_cfs_shares
   1.00%  [kernel]  [k] tcp_sendmsg

 The cluster is mostly idle, and it's healthy. The store is 69MB big, and
 the MONs are consuming around 700MB of RAM.

 Any ideas on this situation? Is it safe to ignore?

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-mon cpu usage

2015-07-20 Thread Luis Periquito
Hi all,

I have a cluster with 28 nodes (all physical, 4 cores, 32 GB RAM), each node
has 4 OSDs for a total of 112 OSDs. Each OSD has 106 PGs (counted including
replication). There are 3 MONs on this cluster.
I'm running on Ubuntu trusty with kernel 3.13.0-52-generic, with Hammer
(0.94.2).

This cluster was installed with Hammer (0.94.1) and has only been upgraded
to the latest available version.

Of the three mons one is mostly idle, one is using ~170% CPU, and one is
using ~270% CPU. The roles change as I restart the processes (usually the
idle one is the one with the lowest uptime).

Running perf top against the ceph-mon PID on the non-idle boxes yields
something like this:

  4.62%  libpthread-2.19.so[.] pthread_mutex_unlock
  3.95%  libpthread-2.19.so[.] pthread_mutex_lock
  3.91%  libsoftokn3.so[.] 0x0001db26
  2.38%  [kernel]  [k] _raw_spin_lock
  2.09%  libtcmalloc.so.4.1.2  [.] operator new(unsigned long)
  1.79%  ceph-mon  [.] DispatchQueue::enqueue(Message*, int,
unsigned long)
  1.62%  ceph-mon  [.] RefCountedObject::get()
  1.58%  libpthread-2.19.so[.] pthread_mutex_trylock
  1.32%  libtcmalloc.so.4.1.2  [.] operator delete(void*)
  1.24%  libc-2.19.so  [.] 0x00097fd0
  1.20%  ceph-mon  [.] ceph::buffer::ptr::release()
  1.18%  ceph-mon  [.] RefCountedObject::put()
  1.15%  libfreebl3.so [.] 0x000542a8
  1.05%  [kernel]  [k] update_cfs_shares
  1.00%  [kernel]  [k] tcp_sendmsg

The cluster is mostly idle, and it's healthy. The store is 69 MB, and
the MONs are consuming around 700 MB of RAM.

Any ideas on this situation? Is it safe to ignore?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com