> On 12 May 2016, at 19:27, Vincenzo Pii <[email protected]> wrote:
>
> I have installed a new Ceph cluster with ceph-ansible (using the same version
> and playbook that had worked before, with some necessary changes to
> variables).
>
> The only major difference is that one OSD (osd3) now has a disk twice as big
> as the others, and thus a different weight (see the crushmap excerpt below).
>
> The Ceph version is jewel (10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9))
> and the setup has a single monitor node (it will be three in production) and
> three OSDs.
>
> Any help in finding the issue would be highly appreciated!
>
> # ceph status
>     cluster f7f42c59-b8ec-4d68-bb09-41f7a10c6223
>      health HEALTH_ERR
>             448 pgs are stuck inactive for more than 300 seconds
>             448 pgs stuck inactive
>      monmap e1: 1 mons at {sbb=10.2.48.205:6789/0}
>             election epoch 3, quorum 0 sbb
>       fsmap e8: 0/0/1 up
>      osdmap e10: 3 osds: 0 up, 0 in
>             flags sortbitwise
>       pgmap v11: 448 pgs, 4 pools, 0 bytes data, 0 objects
>             0 kB used, 0 kB / 0 kB avail
>                  448 creating
>
> From the crushmap:
>
> host osd1 {
>         id -2           # do not change unnecessarily
>         # weight 1.811
>         alg straw
>         hash 0  # rjenkins1
>         item osd.0 weight 1.811
> }
> host osd2 {
>         id -3           # do not change unnecessarily
>         # weight 1.811
>         alg straw
>         hash 0  # rjenkins1
>         item osd.1 weight 1.811
> }
> host osd3 {
>         id -4           # do not change unnecessarily
>         # weight 3.630
>         alg straw
>         hash 0  # rjenkins1
>         item osd.2 weight 3.630
> }
>
> Vincenzo Pii | TERALYTICS
> DevOps Engineer
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone: +41 (0) 79 191 11 08
> email: [email protected]
> www.teralytics.net
> Company registration number: CH-020.3.037.709-7 | Trade register Canton Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann de
> Vries
Problem found: I had misconfigured the public_network and cluster_network variables for some of the hosts (I had moved some configuration to host_vars). It was easy to spot once I checked the logs on those hosts.
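For the record, the fix amounts to making the per-host overrides agree with the rest of the cluster: every host must use the same public_network as the monitors, or the OSDs never register as up/in. A minimal sketch of such a ceph-ansible override file follows; the file name and subnet values are assumptions for illustration, not taken from the actual cluster:

```shell
# Sketch only: recreate a per-host ceph-ansible override file.
# The subnets below are assumed; they must match what the monitors
# and all other hosts use, which was exactly the bug here.
mkdir -p host_vars
cat > host_vars/osd3.yml <<'EOF'
# host_vars/osd3.yml
public_network: 10.2.48.0/24     # network the monitors listen on (assumed)
cluster_network: 10.2.49.0/24    # OSD replication traffic (assumed)
EOF
grep -c '_network' host_vars/osd3.yml    # prints 2
```

If the overrides live in host_vars like this, it is worth diffing them against group_vars after any reorganisation, since a host that silently keeps a stale network will show the same "0 up, 0 in" symptom.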
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
