Re: [ceph-users] ceph-mon cpu usage
I saw this behavior when the servers are not in time sync. Check your NTP settings.

Dieter

From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Quentin Hartman <qhart...@direwolfdigital.com>
Date: Wednesday, July 29, 2015 at 5:47 PM
To: Luis Periquito <periqu...@gmail.com>
Cc: Ceph Users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] ceph-mon cpu usage

I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, the cluster becomes unusably slow). It seems to be tied to deep scrubbing: the behavior surfaces almost immediately if scrubbing is turned on, and when it is off things eventually return to normal and stay that way while scrubbing stays off. I have not yet found anything in the cluster to indicate a hardware problem. Any thoughts or further insights on this subject would be appreciated.

QH
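For reference, a quick way to check for the clock problem Dieter describes is to ask the cluster itself and then compare the mon hosts against NTP. This is only a minimal sketch, assuming the standard ntp package is in use on the mon nodes:

    # Ceph reports clock skew between monitors directly
    ceph health detail | grep -i 'clock skew'

    # On each mon host, confirm NTP is actually synchronised to a peer
    ntpq -p
    ntpstat

If the mons drift apart the cluster raises "clock skew detected" warnings, so an empty grep above is a reasonable sanity check.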
Re: [ceph-users] ceph-mon cpu usage
Thanks for the suggestion. NTP is fine in my case. It turns out it was a networking problem that wasn't triggering error counters on the NICs, so it took a bit to track it down.

QH

On Thu, Jul 30, 2015 at 4:16 PM, Spillmann, Dieter <dieter.spillm...@arris.com> wrote:

I saw this behavior when the servers are not in time sync. Check your NTP settings.

Dieter
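As a footnote to the "no errors on the NICs" point: the summary counters miss some classes of problem (flaky optics, a marginal cable, an overloaded switch port), so it can be worth looking past them. A rough sketch of what to check on each node, assuming eth0 is the cluster-facing interface:

    # driver-level counters, not just the headline error count
    ethtool -S eth0 | egrep -i 'err|drop|crc|miss'

    # kernel-level drops and overruns
    ip -s link show dev eth0

    # end-to-end check between mon/OSD hosts (loss, latency spikes)
    ping -c 100 -i 0.2 <peer-host>

None of these prove the network is healthy, but a clean run makes a cabling or switch problem less likely.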
Re: [ceph-users] ceph-mon cpu usage
I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, the cluster becomes unusably slow). It seems to be tied to deep scrubbing: the behavior surfaces almost immediately if scrubbing is turned on, and when it is off things eventually return to normal and stay that way while scrubbing stays off. I have not yet found anything in the cluster to indicate a hardware problem. Any thoughts or further insights on this subject would be appreciated.

QH

On Sat, Jul 25, 2015 at 12:31 AM, Luis Periquito <periqu...@gmail.com> wrote:

I think I figured it out! All 4 of the OSDs on one host (osd.107-110) were sending massive amounts of auth requests to the monitors, seemingly overwhelming them. The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd the box and all of its disks, reinstalled, and guess what? They are still making a lot of requests to the MONs... this will require some further investigation.
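If anyone else wants to test the deep-scrub correlation Quentin describes, the cluster-wide flags make it easy to toggle; a minimal sketch (the flag names are the standard ones, the timing is up to you):

    # stop scheduling new deep scrubs cluster-wide
    ceph osd set nodeep-scrub

    # watch mon CPU for a while, then re-enable
    ceph osd unset nodeep-scrub

    # confirm the flag state
    ceph osd dump | grep flags

Note this only prevents new deep scrubs from being scheduled; scrubs already in flight will finish.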
Re: [ceph-users] ceph-mon cpu usage
I think I figured it out! All 4 of the OSDs on one host (osd.107-110) were sending massive amounts of auth requests to the monitors, seemingly overwhelming them. The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd'd the box and all of its disks, reinstalled, and guess what? They are still making a lot of requests to the MONs... this will require some further investigation.

As this is happening during my holidays, I have just disabled them and will investigate further when I get back.

On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen <kje...@medallia.com> wrote:

It sounds slightly similar to what I just experienced. I had one monitor out of three which seemed to run one core at full tilt continuously, and had its virtual address space allocated to the point where top started quoting it in TB. Requests hitting this monitor did not get very timely responses. I ended up re-building the monitor from the two healthy ones I had, which made the problem go away for me.
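For anyone hitting something similar, it can help to see which clients a busy mon is actually talking to before ripping OSDs out. A rough sketch, assuming mon id "a", that the admin socket is in its default location, and that your build exposes the "sessions" admin socket command; the osd.107 in the removal sequence is just the example from above:

    # list current sessions on the monitor (entity names and addresses)
    ceph daemon mon.a sessions

    # the removal sequence mentioned above, for osd.107
    ceph osd crush remove osd.107
    ceph auth del osd.107
    ceph osd rm osd.107

If reinstalled OSDs keep hammering the mons, stale cephx keys or an old ceph.conf left on the OSD host are worth ruling out.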
Re: [ceph-users] ceph-mon cpu usage
You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It should be possible to do something like

    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

The last time we tried it segfaulted after a few minutes, so YMMV and be careful.

Jan

On 23 Jul 2015, at 18:18, Luis Periquito <periqu...@gmail.com> wrote:

Hi Greg,

I've been looking at the tcmalloc issues, but those did seem to affect the OSDs, and I do notice them in heavy read workloads (even after the patch and after increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is affecting the mon process, though.

Looking at perf top, I'm getting most of the CPU usage in mutex lock/unlock:

    5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
    3.82%  libsoftokn3.so        [.] 0x0001e7cb
    3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

I could try to use jemalloc; are you aware of any pre-built binaries? Can I mix a cluster with different malloc binaries?
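If anyone tries the LD_PRELOAD route, it is worth verifying the preload actually took effect before drawing conclusions. A minimal sketch, assuming the path Jan quoted and that the libjemalloc1 package provides it (the package name is an assumption for Ubuntu trusty):

    # make sure the library is installed where expected
    dpkg -l libjemalloc1
    ls /usr/lib/x86_64-linux-gnu/libjemalloc.so.1

    # after (re)starting the daemon with LD_PRELOAD, confirm it is mapped
    grep jemalloc /proc/$(pidof ceph-mon)/maps

If the grep comes back empty, the daemon is still running on tcmalloc and any before/after comparison is meaningless.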
Re: [ceph-users] ceph-mon cpu usage
On 07/24/2015 02:31 PM, Luis Periquito wrote:

Now it's official, I have a weird one! I restarted one of the ceph-mons with jemalloc and it didn't make any difference: it's still using a lot of CPU and still not freeing up memory. The issue is that the cluster almost stops responding to requests, and if I restart the primary mon (which had almost no memory or CPU usage) the cluster goes back to its merry way, responding to requests.

Does anyone have any idea what may be going on? The worst bit is that I have several clusters just like this one (well, they are smaller), and as we do everything with puppet they should be very similar... and all the other clusters are working just fine, without any issues whatsoever.

We've seen cases where leveldb can't compact fast enough and memory balloons, but it's usually associated with extreme CPU usage as well. It would be showing up in perf though if that were the case...
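On the leveldb-compaction theory: the mon store can be compacted on demand or at startup, which at least tells you whether store growth is part of the picture. A minimal sketch, assuming mon id "a" and the default mon data path:

    # current store size
    du -sh /var/lib/ceph/mon/*/store.db

    # ask the monitor to compact its store now
    ceph tell mon.a compact

    # or compact every time the mon starts (ceph.conf, [mon] section)
    # mon compact on start = true

If the store is only ~70 MB, as Luis reports elsewhere in this thread, compaction is unlikely to be the bottleneck, but it is cheap to rule out.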
Re: [ceph-users] ceph-mon cpu usage
The leveldb is smallish: around 70 MB. I ran debug mon = 10 for a while, but couldn't find any interesting information. I would run out of space quite quickly though, as the log partition only has 10 GB.

On 24 Jul 2015 21:13, Mark Nelson <mnel...@redhat.com> wrote:

We've seen cases where leveldb can't compact fast enough and memory balloons, but it's usually associated with extreme CPU usage as well. It would be showing up in perf though if that were the case...
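On the logging-versus-disk-space trade-off: debug levels can be raised and lowered at runtime without a restart, so a short burst of debug mon = 10 doesn't have to fill a 10 GB partition. A rough sketch, assuming mon id "a"; the 1/5 value at the end is the usual out-of-the-box default, but check your own ceph.conf before relying on it:

    # raise mon debugging temporarily
    ceph tell mon.a injectargs '--debug-mon 10/10'

    # ... reproduce the problem for a few minutes ...

    # drop it back down
    ceph tell mon.a injectargs '--debug-mon 1/5'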
Re: [ceph-users] ceph-mon cpu usage
Now it's official, I have a weird one! I restarted one of the ceph-mons with jemalloc and it didn't make any difference: it's still using a lot of CPU and still not freeing up memory.

The issue is that the cluster almost stops responding to requests, and if I restart the primary mon (which had almost no memory or CPU usage) the cluster goes back to its merry way, responding to requests.

Does anyone have any idea what may be going on? The worst bit is that I have several clusters just like this one (well, they are smaller), and as we do everything with puppet they should be very similar... and all the other clusters are working just fine, without any issues whatsoever.

On 24 Jul 2015 10:11, Jan Schermer <j...@schermer.cz> wrote:

You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It should be possible to do something like

    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

The last time we tried it segfaulted after a few minutes, so YMMV and be careful.

Jan
Re: [ceph-users] ceph-mon cpu usage
It sounds slightly similar to what I just experienced. I had one monitor out of three which seemed to run one core at full tilt continuously, and had its virtual address space allocated to the point where top started quoting it in TB. Requests hitting this monitor did not get very timely responses (although I don't know whether this happened consistently or arbitrarily).

I ended up re-building the monitor from the two healthy ones I had, which made the problem go away for me. After-the-fact inspection of the monitor I ripped out clocked it in at 1.3 GB, compared to the 250 MB of the other two; after the rebuild they're all comparable in size.

In my case this started out on firefly and persisted after upgrading to hammer, which prompted the rebuild, since I suspected it was related to something persistent on this monitor. I don't have much more of use to contribute to this discussion, since I've more or less destroyed any evidence by re-building the monitor.

Cheers,
KJ

On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito <periqu...@gmail.com> wrote:

The leveldb is smallish: around 70 MB. I ran debug mon = 10 for a while, but couldn't find any interesting information. I would run out of space quite quickly though, as the log partition only has 10 GB.
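For reference, the "rebuild the monitor from the healthy ones" approach KJ describes roughly follows the documented replace-a-monitor procedure. The outline below is only a sketch: mon id "b", the addresses and the paths are placeholders, and the authoritative steps are the add/remove monitor sections of the Ceph docs.

    # from a healthy node: drop the bad mon from quorum
    ceph mon remove b

    # on the bad mon's host: move the old store out of the way
    mv /var/lib/ceph/mon/ceph-b /var/lib/ceph/mon/ceph-b.broken

    # grab the current monmap and mon. keyring from the healthy cluster
    ceph mon getmap -o /tmp/monmap
    ceph auth get mon. -o /tmp/mon.keyring

    # recreate the mon data directory, re-add the mon, and start it
    ceph-mon -i b --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
    ceph mon add b <mon-ip>:6789
    start ceph-mon id=b        # upstart on Ubuntu trusty

Only do this while the remaining monitors form a healthy quorum, and keep the broken store around; as KJ found, it is the only evidence left afterwards.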
Re: [ceph-users] ceph-mon cpu usage
Hi Greg,

I've been looking at the tcmalloc issues, but those did seem to affect the OSDs, and I do notice them in heavy read workloads (even after the patch and after increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is affecting the mon process, though.

Looking at perf top, I'm getting most of the CPU usage in mutex lock/unlock:

    5.02%  libpthread-2.19.so    [.] pthread_mutex_unlock
    3.82%  libsoftokn3.so        [.] 0x0001e7cb
    3.46%  libpthread-2.19.so    [.] pthread_mutex_lock

I could try to use jemalloc; are you aware of any pre-built binaries? Can I mix a cluster with different malloc binaries?

On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum <g...@gregs42.com> wrote:

Yep. This is a bug (best we can tell) in some versions of tcmalloc combined with certain distribution stacks, although I don't think we've seen it reported on Trusty (nor on a tcmalloc distribution that new) before. Alternatively, some folks are seeing tcmalloc use up lots of CPU in other scenarios involving memory return, and it may manifest like this, but I'm not sure. You could look through the mailing list for information on it.
-Greg
Re: [ceph-users] ceph-mon cpu usage
The ceph-mon is already taking a lot of memory, and I ran a heap stats:

    MALLOC:        32391696 (   30.9 MiB) Bytes in use by application
    MALLOC: +   27597135872 (26318.7 MiB) Bytes in page heap freelist
    MALLOC: +      16598552 (   15.8 MiB) Bytes in central cache freelist
    MALLOC: +      14693536 (   14.0 MiB) Bytes in transfer cache freelist
    MALLOC: +      17441592 (   16.6 MiB) Bytes in thread cache freelists
    MALLOC: +     116387992 (  111.0 MiB) Bytes in malloc metadata
    MALLOC:   ------------
    MALLOC: =   27794649240 (26507.0 MiB) Actual memory used (physical + swap)
    MALLOC: +      26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
    MALLOC:   ------------
    MALLOC: =   27820765336 (26531.9 MiB) Virtual address space used
    MALLOC:
    MALLOC:           5683  Spans in use
    MALLOC:             21  Thread heaps in use
    MALLOC:           8192  Tcmalloc page size

After that I ran the heap release and it went back to normal:

    MALLOC:        22919616 (   21.9 MiB) Bytes in use by application
    MALLOC: +       4792320 (    4.6 MiB) Bytes in page heap freelist
    MALLOC: +      18743448 (   17.9 MiB) Bytes in central cache freelist
    MALLOC: +      20645776 (   19.7 MiB) Bytes in transfer cache freelist
    MALLOC: +      18456088 (   17.6 MiB) Bytes in thread cache freelists
    MALLOC: +     116387992 (  111.0 MiB) Bytes in malloc metadata
    MALLOC:   ------------
    MALLOC: =     201945240 (  192.6 MiB) Actual memory used (physical + swap)
    MALLOC: +   27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
    MALLOC:   ------------
    MALLOC: =   27820765336 (26531.9 MiB) Virtual address space used
    MALLOC:
    MALLOC:           5639  Spans in use
    MALLOC:             29  Thread heaps in use
    MALLOC:           8192  Tcmalloc page size

So it just seems the monitor is not returning unused memory to the OS, or reusing already allocated memory it deems as free...

On Wed, Jul 22, 2015 at 4:29 PM, Luis Periquito <periqu...@gmail.com> wrote:

This cluster is serving RBD storage for OpenStack, and today all I/O just stopped. After looking in the boxes, ceph-mon was using 17 GB of RAM - and this was on *all* the mons. Restarting the main one made it work again (I restarted the other ones because they were using a lot of RAM). This has happened twice now (the first time was last Monday).
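For anyone who wants to reproduce the heap stats / heap release steps above, the tcmalloc heap commands can be driven through the monitor itself; a minimal sketch, assuming mon id "a" and a daemon linked against tcmalloc:

    # dump the tcmalloc heap statistics shown above
    ceph tell mon.a heap stats

    # hand the freelist pages back to the kernel
    ceph tell mon.a heap release

The release is a workaround rather than a fix: the page heap freelist will grow back if whatever is churning memory keeps running.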
Re: [ceph-users] ceph-mon cpu usage
On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito <periqu...@gmail.com> wrote:

So it just seems the monitor is not returning unused memory to the OS, or reusing already allocated memory it deems as free...

Yep. This is a bug (best we can tell) in some versions of tcmalloc combined with certain distribution stacks, although I don't think we've seen it reported on Trusty (nor on a tcmalloc distribution that new) before. Alternatively, some folks are seeing tcmalloc use up lots of CPU in other scenarios involving memory return, and it may manifest like this, but I'm not sure. You could look through the mailing list for information on it.
-Greg
Re: [ceph-users] ceph-mon cpu usage
This cluster is serving RBD storage for OpenStack, and today all I/O just stopped. After looking in the boxes, ceph-mon was using 17 GB of RAM - and this was on *all* the mons. Restarting the main one made it work again (I restarted the other ones because they were using a lot of RAM). This has happened twice now (the first time was last Monday).

As this is considered a prod cluster there is no logging enabled, and I can't reproduce it - our test/dev clusters have been working fine and show none of these symptoms, but they were upgraded from firefly.

What can we do to help debug the issue? Any ideas on how to identify the underlying issue?

thanks,

On Mon, Jul 20, 2015 at 1:59 PM, Luis Periquito <periqu...@gmail.com> wrote:

Hi all,

I have a cluster with 28 nodes (all physical: 4 cores, 32 GB RAM); each node has 4 OSDs, for a total of 112 OSDs. Each OSD holds 106 PGs (counted including replication). There are 3 MONs on this cluster. I'm running Ubuntu trusty with kernel 3.13.0-52-generic, with Hammer (0.94.2). The cluster was installed with Hammer (0.94.1) and has only been upgraded to the latest available version.
[ceph-users] ceph-mon cpu usage
Hi all,

I have a cluster with 28 nodes (all physical: 4 cores, 32 GB RAM); each node has 4 OSDs, for a total of 112 OSDs. Each OSD holds 106 PGs (counted including replication). There are 3 MONs on this cluster. I'm running Ubuntu trusty with kernel 3.13.0-52-generic, with Hammer (0.94.2). The cluster was installed with Hammer (0.94.1) and has only been upgraded to the latest available version.

Of the three mons, one is mostly idle, one is using ~170% CPU, and one is using ~270% CPU. They change as I restart the processes (usually the idle one is the one with the lowest uptime). Running perf top against the ceph-mon PID on the non-idle boxes yields something like this:

    4.62%  libpthread-2.19.so    [.] pthread_mutex_unlock
    3.95%  libpthread-2.19.so    [.] pthread_mutex_lock
    3.91%  libsoftokn3.so        [.] 0x0001db26
    2.38%  [kernel]              [k] _raw_spin_lock
    2.09%  libtcmalloc.so.4.1.2  [.] operator new(unsigned long)
    1.79%  ceph-mon              [.] DispatchQueue::enqueue(Message*, int, unsigned long)
    1.62%  ceph-mon              [.] RefCountedObject::get()
    1.58%  libpthread-2.19.so    [.] pthread_mutex_trylock
    1.32%  libtcmalloc.so.4.1.2  [.] operator delete(void*)
    1.24%  libc-2.19.so          [.] 0x00097fd0
    1.20%  ceph-mon              [.] ceph::buffer::ptr::release()
    1.18%  ceph-mon              [.] RefCountedObject::put()
    1.15%  libfreebl3.so         [.] 0x000542a8
    1.05%  [kernel]              [k] update_cfs_shares
    1.00%  [kernel]              [k] tcp_sendmsg

The cluster is mostly idle and healthy. The store is 69 MB in size, and the MONs are consuming around 700 MB of RAM.

Any ideas on this situation? Is it safe to ignore?
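For anyone who wants to reproduce the profile above, this is roughly how perf top is pointed at a single daemon; a minimal sketch, assuming perf (linux-tools matching the running kernel on Ubuntu) is installed and there is one ceph-mon per host:

    # profile just the monitor process
    perf top -p $(pidof ceph-mon)

    # or record for a minute and inspect offline
    perf record -g -p $(pidof ceph-mon) -- sleep 60
    perf report

Recording with -g keeps call graphs, which makes it easier to tell whether the pthread_mutex time is coming from the messenger, leveldb, or elsewhere.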