Re: [ceph-users] Switching from tcmalloc

2015-06-26 Thread Alexandre DERUMIER
It would be really interesting if you could give jemalloc a try. Originally tcmalloc was used to get around some serious memory fragmentation issues in the OSD. You can read the original bug tracker entry from 5 years ago here: http://tracker.ceph.com/issues

Re: [ceph-users] Switching from tcmalloc

2015-06-25 Thread Dzianis Kahanovich
IMHO this must be tested with different glibc versions. Old glibc had optional experimental threaded extensions for malloc, disabled by default (with no option to enable them even in Gentoo without a hack; maybe some distros were compiled that way, I don't know). But now these malloc features are mostly ON by default, so

Re: [ceph-users] Switching from tcmalloc

2015-06-25 Thread Dzianis Kahanovich
...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Wednesday, June 24, 2015 10:54 AM To: Ben Hines Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Switching from tcmalloc We did, but I don’t have the numbers. I have lots of graphs, though. We were mainly trying to solve the CPU usage, since

Re: [ceph-users] Switching from tcmalloc

2015-06-25 Thread Jan Schermer
Our first thought was jemalloc when we became aware of the issue, but that one requires support in code which is AFAIK not present in Dumpling. Am I right? We did try simply preloading jemalloc when starting the OSD and that experiment ended with SIGSEGV within minutes, we didn’t investigate it any
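
For readers unfamiliar with "preloading" an allocator: it usually means setting LD_PRELOAD in the daemon's environment so the dynamic linker resolves malloc/free from another library first. Below is a minimal sketch of building such an environment. The helper name preload_env and the library path in the usage comment are assumptions for illustration, not taken from the thread, and note that the poster reports this approach crashed on Dumpling.

```python
import os
from typing import Dict, Optional


def preload_env(lib_path: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_path placed first in LD_PRELOAD.

    lib_path is whatever allocator library is installed on the host
    (hypothetical here); base defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_PRELOAD")
    # Keep any library that was already preloaded, but let ours win.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


# Usage sketch (path and OSD id are placeholders, adjust for your system):
#   import subprocess
#   env = preload_env("/usr/lib/x86_64-linux-gnu/libjemalloc.so.1")
#   subprocess.Popen(["ceph-osd", "-i", "0", "-f"], env=env)
```

Building the environment separately from launching the process makes it easy to inspect what the daemon will actually see before restarting anything.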

[ceph-users] Switching from tcmalloc

2015-06-24 Thread Jan Schermer
Can you guess when we did that? Still on dumpling, btw... http://www.zviratko.net/link/notcmalloc.png Jan

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Robert LeBlanc
Did you see what the effect was of just restarting the OSDs before using tcmalloc? I've noticed that there is usually a good drop for us just by restarting them. I don't think it is usually this drastic. - Robert LeBlanc GPG

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Jan Schermer
...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Wednesday, June 24, 2015 10:54 AM To: Ben Hines Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Switching from tcmalloc We did, but I don’t have the numbers. I have lots of graphs, though. We were mainly trying to solve the CPU usage, since our

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Somnath Roy
-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Wednesday, June 24, 2015 10:54 AM To: Ben Hines Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Switching from tcmalloc We did, but I don’t have the numbers. I have lots of graphs, though. We were mainly trying to solve the CPU

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Ben Hines
Did you do before/after Ceph performance benchmarks? I don't care if my systems are using 80% cpu, if Ceph performance is better than when it's using 20% cpu. Can you share any scripts you have to automate these things? (NUMA pinning, migratepages) thanks, -Ben On Wed, Jun 24, 2015 at 10:25 AM,

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Jan Schermer
We did, but I don’t have the numbers. I have lots of graphs, though. We were mainly trying to solve the CPU usage, since our nodes are converged QEMU+CEPH OSDs, so this made a difference. We were also seeing the performance capped on CPUs when deleting snapshots or backfilling, all this should

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Jan Schermer
We already had migratepages in place before we disabled tcmalloc. It didn’t do much. Disabling tcmalloc made an immediate difference but there were still spikes and the latency wasn’t that great. (CPU usage was) Migrating memory helped a lot after that - it didn’t help (at least not the

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Jan Schermer
There were essentially three things we had to do for such a drastic drop: 1) recompile CEPH --without-tcmalloc 2) pin the OSDs to a set of a specific NUMA zone - we had this for a long time and it really helped 3) migrate the OSD memory to the correct CPU with migratepages - we will use cgroups
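
The three steps listed above could be scripted roughly as follows. This is only a sketch that builds the command lines without running them. It assumes the autotools build of that era for step 1, and it uses numactl for step 2 as one possible way to pin a process (the thread does not say which tool was used). The function names, OSD id, PID and node numbers are hypothetical.

```python
import shlex
from typing import List


def build_command() -> List[str]:
    """Step 1: configure the Ceph build without tcmalloc."""
    return ["./configure", "--without-tcmalloc"]


def pin_command(node: int, osd_argv: List[str]) -> List[str]:
    """Step 2: start a process with CPUs and memory bound to one NUMA node.

    numactl is an assumption here; taskset or cgroup cpusets are alternatives.
    """
    return ["numactl", f"--cpunodebind={node}", f"--membind={node}", *osd_argv]


def migrate_command(pid: int, from_nodes: List[int], to_node: int) -> List[str]:
    """Step 3: move pages of a running process onto the target node.

    migratepages takes: pid, source node list, destination node list.
    """
    src = ",".join(str(n) for n in from_nodes)
    return ["migratepages", str(pid), src, str(to_node)]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    print(shlex.join(build_command()))
    print(shlex.join(pin_command(0, ["ceph-osd", "-i", "3", "-f"])))
    print(shlex.join(migrate_command(12345, [1], 0)))
```

Printing the commands first lets you check the node numbers against the output of numactl --hardware before touching a production OSD.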

Re: [ceph-users] Switching from tcmalloc

2015-06-24 Thread Robert LeBlanc
From what I understand, you probably got most of your reduction from co-locating your memory to the right NUMA nodes. tcmalloc/jemalloc should be much higher in performance because of how they hold memory in thread pools (less locking to allocate