Re: [ceph-users] One host failure bring down the whole cluster
1) But Ceph says "...You can run a cluster with 1 monitor" (http://ceph.com/docs/master/rados/operations/add-or-rm-mons/), so I assume it should work. And split brain is not my current concern.

2) I've already written objects to Ceph; now I just want to get them back.

Anyway, I tried to reduce the monitor count to 1. But after removing the second monitor with the steps below, the cluster cannot start up any more:

1. [root~]# service ceph -a stop mon.serverB
2. [root~]# ceph mon remove serverB      ## hangs here forever
3. Remove the monitor entry from ceph.conf.
4. Restart the ceph service.

[root@serverA~]# systemctl status ceph.service -l
ceph.service - LSB: Start Ceph distributed file system daemons at boot time
   Loaded: loaded (/etc/rc.d/init.d/ceph)
   Active: failed (Result: timeout) since Tue 2015-03-31 15:46:25 CST; 3min 15s ago
  Process: 2937 ExecStop=/etc/rc.d/init.d/ceph stop (code=exited, status=0/SUCCESS)
  Process: 3670 ExecStart=/etc/rc.d/init.d/ceph start (code=killed, signal=TERM)

Mar 31 15:44:26 serverA ceph[3670]: === osd.6 ===
Mar 31 15:44:56 serverA ceph[3670]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.6 --keyring=/var/lib/ceph/osd/ceph-6/keyring osd crush create-or-move -- 6 3.64 host=serverA root=default'
Mar 31 15:44:56 serverA ceph[3670]: === osd.7 ===
Mar 31 15:45:26 serverA ceph[3670]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.7 --keyring=/var/lib/ceph/osd/ceph-7/keyring osd crush create-or-move -- 7 3.64 host=serverA root=default'
Mar 31 15:45:26 serverA ceph[3670]: === osd.8 ===
Mar 31 15:45:57 serverA ceph[3670]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.8 --keyring=/var/lib/ceph/osd/ceph-8/keyring osd crush create-or-move -- 8 3.64 host=serverA root=default'
Mar 31 15:45:57 serverA ceph[3670]: === osd.9 ===
Mar 31 15:46:25 serverA systemd[1]: ceph.service operation timed out. Terminating.
Mar 31 15:46:25 serverA systemd[1]: Failed to start LSB: Start Ceph distributed file system daemons at boot time.
Mar 31 15:46:25 serverA systemd[1]: Unit ceph.service entered failed state.

/var/log/ceph/ceph.log says:

2015-03-31 15:55:57.648800 mon.0 10.???.78:6789/0 1048 : cluster [INF] osd.21 10.???.78:6855/25598 failed (39 reports from 9 peers after 20.118062 >= grace 20.00)
2015-03-31 15:55:57.931889 mon.0 10.???.78:6789/0 1055 : cluster [INF] osd.15 10..78:6825/23894 failed (39 reports from 9 peers after 20.401379 >= grace 20.00)

Obviously serverB is down, but that should not prevent serverA from functioning, right?

________________________________________
From: Gregory Farnum [g...@gregs42.com]
Sent: Tuesday, March 31, 2015 11:53 AM
To: Lindsay Mathieson; Kai KH Huang
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] One host failure bring down the whole cluster

> On Mon, Mar 30, 2015 at 8:02 PM, Lindsay Mathieson <lindsay.mathie...@gmail.com> wrote:
>> On Tue, 31 Mar 2015 02:42:27 AM Kai KH Huang wrote:
>>> Hi, all. I have a two-node Ceph cluster, and both nodes are monitors and OSDs. When they're both up, the OSDs are all up and in, and everything is fine... almost.
>>
>> Two things.
>>
>> 1 - You *really* need a min of three monitors. Ceph cannot form a quorum with just two monitors and you run a risk of split brain.
>
> You can form quorums with an even number of monitors, and Ceph does so -- there's no risk of split brain. The problem with 2 monitors is that a quorum is always 2 -- which is exactly what you're seeing right now. You can't run with only one monitor up (assuming you have a non-zero number of them).
>
>> 2 - You also probably have a min size of two set (the default). This means that you need a minimum of two copies of each data object for writes to work. So with just two nodes, if one goes down you can't write to the other.
>
> Also this.
>
>> So:
>> - Install an extra monitor node - it doesn't have to be powerful, we just use an Intel Celeron NUC for that.
>> - Reduce your minimum size to 1 (one).
>
> Yep.
> -Greg
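For reference: when "ceph mon remove" and the init script hang like this, the usual cause is that the surviving monitor cannot form a quorum, so every cluster-wide command blocks. One way to confirm that locally, without needing quorum, is to query the monitor's admin socket. A minimal sketch, assuming a default installation and that the mon ID on serverA really is "serverA" (check the socket name under /var/run/ceph/ first):

    [root@serverA~]# ls /var/run/ceph/                                          # find the mon admin socket name
    [root@serverA~]# ceph --admin-daemon /var/run/ceph/ceph-mon.serverA.asok mon_status
    # A "state" of "probing" or "electing" with an empty quorum list means the
    # monitor is still waiting for its (now dead) peer and will not serve requests.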
Re: [ceph-users] One host failure bring down the whole cluster
On 3/31/15 11:27, Kai KH Huang wrote:
> 1) But Ceph says "...You can run a cluster with 1 monitor" (http://ceph.com/docs/master/rados/operations/add-or-rm-mons/), so I assume it should work. And split brain is not my current concern.

The point is that you must have a majority of the monitors up:

* with one monitor, you need that one monitor running,
* with two monitors, you need both running, because if one goes down you no longer have a majority,
* with three monitors, you need at least two up, because if one goes down you still have a majority,
* with four - at least three,
* with five - at least three,
* and so on.

> 2) I've already written objects to Ceph; now I just want to get them back.
>
> Anyway, I tried to reduce the monitor count to 1. But after removing the second monitor with the steps below, the cluster cannot start up any more:
>
> 1. [root~]# service ceph -a stop mon.serverB
> 2. [root~]# ceph mon remove serverB      ## hangs here forever
> 3. Remove the monitor entry from ceph.conf.
> 4. Restart the ceph service.

This is a grey area for me, but I think the removal failed because you no longer had a quorum for the operation to succeed. I think you'll need to edit the monmap manually and remove the second monitor from it.
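For reference, the documented procedure for removing a monitor from a cluster that has lost quorum is to edit the monmap offline. A rough sketch follows; the monitor IDs here mirror the names used in the commands above (serverA/serverB), but the earlier monmap output shows server_a/server_b, so verify the real IDs with monmaptool --print before removing anything. /tmp/monmap is just a scratch path.

    [root@serverA~]# service ceph stop mon.serverA                      # stop the surviving monitor
    [root@serverA~]# ceph-mon -i serverA --extract-monmap /tmp/monmap   # dump its current monmap
    [root@serverA~]# monmaptool --print /tmp/monmap                     # check which mon names are actually listed
    [root@serverA~]# monmaptool --rm serverB /tmp/monmap                # drop the dead monitor
    [root@serverA~]# ceph-mon -i serverA --inject-monmap /tmp/monmap    # load the edited map back
    [root@serverA~]# service ceph start mon.serverA

With only serverA left in the monmap, that single monitor is a majority of one and can form quorum by itself, so the OSDs on serverA should be able to register and start again.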
[ceph-users] One host failure bring down the whole cluster
Hi, all

I have a two-node Ceph cluster, and both nodes are monitors and OSDs. When they're both up, the OSDs are all up and in, and everything is fine... almost:

[root~]# ceph -s
    health HEALTH_WARN 25 pgs degraded; 316 pgs incomplete; 85 pgs stale; 24 pgs stuck degraded; 316 pgs stuck inactive; 85 pgs stuck stale; 343 pgs stuck unclean; 24 pgs stuck undersized; 25 pgs undersized; recovery 11/153 objects degraded (7.190%)
    monmap e1: 2 mons at {server_b=10.???.78:6789/0,server_a=10.???.80:6789/0}, election epoch 14, quorum 0,1 server_b,server_a
    osdmap e116375: 22 osds: 22 up, 22 in
    pgmap v238656: 576 pgs, 2 pools, 224 MB data, 59 objects
          56175 MB used, 63420 GB / 63475 GB avail
          11/153 objects degraded (7.190%)
                 15 active+undersized+degraded
                 75 stale+active+clean
                  2 active+remapped
                158 active+clean
                 10 stale+active+undersized+degraded
                316 incomplete

But if I bring down one server, the whole cluster seems to stop functioning:

[root~]# ceph -s
2015-03-31 10:32:43.848125 7f57e4105700  0 -- :/1017540 >> 10.???.78:6789/0 pipe(0x7f57e0027120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f57e00273b0).fault

This should not happen... Any thoughts?
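As a quick check of whether the default replication settings contribute to this, the per-pool size and min_size can be queried while both nodes are up. A sketch; "rbd" here is only a placeholder for whatever pools actually exist:

    [root~]# ceph osd lspools                   # list the pool names
    [root~]# ceph osd pool get rbd size         # number of replicas kept
    [root~]# ceph osd pool get rbd min_size     # replicas required before I/O is allowed
    # size 2 / min_size 2 on a two-node cluster means any single host
    # failure leaves too few replicas for writes to proceed.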
Re: [ceph-users] One host failure bring down the whole cluster
On Tue, 31 Mar 2015 02:42:27 AM Kai KH Huang wrote:
> Hi, all. I have a two-node Ceph cluster, and both nodes are monitors and OSDs. When they're both up, the OSDs are all up and in, and everything is fine... almost.

Two things.

1 - You *really* need a min of three monitors. Ceph cannot form a quorum with just two monitors and you run a risk of split brain.

2 - You also probably have a min size of two set (the default). This means that you need a minimum of two copies of each data object for writes to work. So with just two nodes, if one goes down you can't write to the other.

So:
- Install an extra monitor node - it doesn't have to be powerful, we just use an Intel Celeron NUC for that.
- Reduce your minimum size to 1 (one).
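A sketch of what "reduce your minimum size to 1" looks like in practice, once the monitors have quorum again. The pool name "rbd" is a placeholder; repeat for each pool returned by ceph osd lspools:

    [root~]# ceph osd pool set rbd min_size 1   # allow I/O with a single surviving replica
    # Leave "size" at 2 so data is still replicated across both hosts when both are up;
    # min_size 1 only lowers how many replicas must be available for I/O to continue.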
Re: [ceph-users] One host failure bring down the whole cluster
On Mon, Mar 30, 2015 at 8:02 PM, Lindsay Mathieson <lindsay.mathie...@gmail.com> wrote:
> On Tue, 31 Mar 2015 02:42:27 AM Kai KH Huang wrote:
>> Hi, all. I have a two-node Ceph cluster, and both nodes are monitors and OSDs. When they're both up, the OSDs are all up and in, and everything is fine... almost.
>
> Two things.
>
> 1 - You *really* need a min of three monitors. Ceph cannot form a quorum with just two monitors and you run a risk of split brain.

You can form quorums with an even number of monitors, and Ceph does so -- there's no risk of split brain. The problem with 2 monitors is that a quorum is always 2, which is exactly what you're seeing right now. You can't run with only one monitor up (assuming you have a non-zero number of them).

> 2 - You also probably have a min size of two set (the default). This means that you need a minimum of two copies of each data object for writes to work. So with just two nodes, if one goes down you can't write to the other.

Also this.

> So:
> - Install an extra monitor node - it doesn't have to be powerful, we just use an Intel Celeron NUC for that.
> - Reduce your minimum size to 1 (one).

Yep.
-Greg
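For completeness, a rough sketch of adding that third monitor with ceph-deploy. Both the use of ceph-deploy and the new host name "serverC" are assumptions, not details from this thread; if the cluster was deployed by hand, the manual add-a-monitor procedure in the docs linked above applies instead.

    [root@admin~]# ceph-deploy install serverC                 # install the ceph packages on the new node
    [root@admin~]# ceph-deploy mon add serverC                 # create the monitor, add it to the monmap, start it
    [root@admin~]# ceph quorum_status --format json-pretty     # confirm all three monitors are in quorum

With three monitors, any single host can fail and the remaining two still form a majority, so the cluster keeps serving requests.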