I remember CERN had a ~30PB test ceph cluster whose OSDs used more memory than usual, and they tuned osdmap_epochs. If it is the osdmap that makes the OSDs use more memory, I think you could try running with a smaller osdmap_epochs setting and see whether anything changes.
The default mon_min_osdmap_epochs is 500.

zphj1987

2016-11-08 22:08 GMT+08:00 Sage Weil <[email protected]>:
> > ---------- Forwarded message ----------
> > From: Dong Wu <[email protected]>
> > Date: 2016-10-27 18:50 GMT+08:00
> > Subject: Re: [ceph-users] Hammer OSD memory increase when add new machine
> > To: huang jun <[email protected]>
> > Cc: ceph-users <[email protected]>
> >
> > 2016-10-27 17:50 GMT+08:00 huang jun <[email protected]>:
> > > How do you add the new machine?
> > > Is it first added to the default ruleset before you add the new rule
> > > for this group?
> > > Do any pools that contain data use the default rule?
> >
> > We don't use the default ruleset. When we add a new group of machines,
> > crush_location automatically generates the root and chassis, and then
> > we add a new rule for this group.
> >
> > > 2016-10-27 17:34 GMT+08:00 Dong Wu <[email protected]>:
> > >> Hi all,
> > >>
> > >> We have a ceph cluster used only for rbd. The cluster contains
> > >> several groups of machines, each group contains several machines,
> > >> and each machine has 12 SSDs, each SSD being an OSD (journal and
> > >> data together), e.g.:
> > >> group1: machine1~machine12
> > >> group2: machine13~machine24
> > >> ......
> > >> Each group is separated from the other groups, which means each
> > >> group has its own pools.
> > >>
> > >> We use Hammer (0.94.6) compiled with jemalloc (4.2).
> > >>
> > >> We have found that when we add a new group of machines, the other
> > >> groups' machines' memory increases by roughly 5% (OSD usage).
> > >>
> > >> Each group's data is separated from the others, so backfill happens
> > >> only within a group, not across groups.
> > >> Why does adding a group of machines cause the others' memory to
> > >> increase? Is this reasonable?
>
> It could be cached OSDMaps (they get slightly larger when you add OSDs),
> but it's hard to say. It seems more likely that the pools and crush rules
> aren't configured right and you're adding OSDs to the wrong group.
>
> If you look at the 'ceph daemon osd.NNN perf dump' output you can see,
> among other things, how many PGs are on the OSD. Can you capture the
> output before and after the change (and 5% memory footprint increase)?
>
> sage
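As a first check on the cached-OSDMap theory, each OSD reports the range of map epochs it currently holds via its admin socket ('ceph daemon osd.N status' shows oldest_map and newest_map). Below is a minimal sketch of collecting that range for the OSDs on one machine; the OSD id range is just an example, and it assumes the ceph CLI and the OSDs' admin sockets are on the local host.

#!/usr/bin/env python
# Sketch: report how many OSDMap epochs each local OSD is currently holding,
# using the admin socket "status" output (oldest_map / newest_map).
import json
import subprocess

OSD_IDS = range(0, 12)  # example: the 12 OSDs on one machine

for osd_id in OSD_IDS:
    raw = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "status"])
    status = json.loads(raw)
    cached = status["newest_map"] - status["oldest_map"]
    print("osd.%d holds %d osdmap epochs (%d..%d)" %
          (osd_id, cached, status["oldest_map"], status["newest_map"]))

Comparing that range before and after adding the new group shows whether the OSDs are holding on to many more maps; lowering mon_min_osdmap_epochs (under [mon] in ceph.conf) is the experiment zphj1987 suggests, though whether it actually reduces OSD memory here would still need to be measured.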
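For Sage's before/after comparison, something along these lines could snapshot each OSD's perf dump to a file and print its PG count, so a "before" snapshot can be diffed against one taken after the new group is added. The "numpg" counter name matches Hammer's perf dump output; the OSD id range is a placeholder.

#!/usr/bin/env python
# Sketch: save each OSD's `perf dump` to a timestamped file and print its PG
# count, so a "before" snapshot can be compared with one taken after the new
# group of machines (and the ~5% memory increase) appears.
import json
import subprocess
import time

OSD_IDS = range(0, 12)               # example: the 12 OSDs on one machine
LABEL = time.strftime("%Y%m%d-%H%M%S")

for osd_id in OSD_IDS:
    raw = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
    perf = json.loads(raw)
    # "numpg" is the Hammer-era counter for the number of PGs on this OSD.
    print("osd.%d: %d PGs" % (osd_id, perf["osd"]["numpg"]))
    with open("perf-dump-osd.%d-%s.json" % (osd_id, LABEL), "wb") as f:
        f.write(raw)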
