> On 26/03/2015, at 23.36, Somnath Roy <[email protected]> wrote:
>
> Got most of it, thanks!
> But I still don't get why, when the second node is down, the client is not
> able to connect with a single monitor left in the cluster.
> 1 monitor can form a quorum and should be sufficient for a cluster to run.

To have quorum you need a strict majority (more than 50%) of the monitors in
the monmap, which is not possible with one monitor out of two: the required
majority is floor(2/2) + 1 = 2, and only 1 is left. Hence the usual
recommendation of at least 3 monitors, so one can fail while the remaining
two still form a majority.
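To make that arithmetic concrete, here is a quick shell sketch (the monitor
counts below are just illustrative, not taken from this cluster):

$ # quorum needs a strict majority of the monmap: floor(n/2) + 1
$ for n in 1 2 3 4 5; do echo "mons=$n quorum_needed=$((n/2 + 1)) can_lose=$((n - n/2 - 1))"; done
mons=1 quorum_needed=1 can_lose=0
mons=2 quorum_needed=2 can_lose=0
mons=3 quorum_needed=2 can_lose=1
mons=4 quorum_needed=3 can_lose=1
mons=5 quorum_needed=3 can_lose=2

A 2-monitor cluster therefore tolerates no monitor failures at all, which is
exactly the situation described below.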
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Gregory Farnum [mailto:[email protected]]
> Sent: Thursday, March 26, 2015 3:29 PM
> To: Somnath Roy
> Cc: Lee Revell; [email protected]
> Subject: Re: [ceph-users] All client writes block when 2 of 3 OSDs down
>
> On Thu, Mar 26, 2015 at 3:22 PM, Somnath Roy <[email protected]> wrote:
>> Greg,
>> Couple of dumb questions, maybe.
>>
>> 1. If you see, the clients are connecting fine with two monitors in the
>> cluster. 2 monitors can never form a quorum, but 1 can, so why, with 1
>> monitor (which is I guess what happens after taking 2 nodes down), is it
>> not able to connect?
>
> A quorum is a strict majority of the total membership. 2 monitors can form a
> quorum just fine if the total membership is either 2 or 3.
> (As long as those two agree on every action, it cannot be lost.)
>
> We don't *recommend* configuring systems with an even number of monitors,
> because it increases the number of total possible failures without increasing
> the number of failures that can be tolerated. (3 monitors require 2 in quorum
> and so tolerate 1 failure; 4 require 3 and still tolerate only 1. Same for 5
> and 6, 7 and 8, etc.)
>
>>
>> 2. Also, my understanding is that while IO is going on there is *no* monitor
>> interaction on that path, so why will the client IO be stopped because the
>> monitor quorum is not there? If min_size = 1 is properly set, it should be
>> able to serve IO as long as 1 OSD (node) is up, shouldn't it?
>
> Well, the remaining OSD won't be able to process IO because it's lost its
> peers, and it can't reach any monitors to do updates or get new maps.
> (Monitors which are not in quorum will not allow clients to connect.)
> The clients will eventually stop issuing IO if they know they can't reach a
> monitor, although I don't remember exactly how that's triggered.
>
> In this particular case, though, the client probably just tried to do an op
> against the dead OSD, realized it couldn't, and tried to fetch a map from the
> monitors. When that failed it went into search mode, which is what the logs
> are showing you.
> -Greg
>
>>
>> Thanks & Regards
>> Somnath
>>
>> -----Original Message-----
>> From: ceph-users [mailto:[email protected]] On Behalf
>> Of Gregory Farnum
>> Sent: Thursday, March 26, 2015 2:40 PM
>> To: Lee Revell
>> Cc: [email protected]
>> Subject: Re: [ceph-users] All client writes block when 2 of 3 OSDs down
>>
>> On Thu, Mar 26, 2015 at 2:30 PM, Lee Revell <[email protected]> wrote:
>>> On Thu, Mar 26, 2015 at 4:40 PM, Gregory Farnum <[email protected]> wrote:
>>>>
>>>> Has the OSD actually been detected as down yet?
>>>>
>>>
>>> I believe it has, however I can't directly check because "ceph health"
>>> starts to hang when I down the second node.
>>
>> Oh. You need to keep a quorum of your monitors running (just the monitor
>> processes, not of everything in the system) or nothing at all is going to
>> work. That's how we prevent split-brain issues.
>>
>>>
>>>>
>>>> You'll also need to set that min_size on your existing pools ("ceph
>>>> osd pool set <pool> min_size 1" or similar) to change their
>>>> behavior; the config option only takes effect for newly-created
>>>> pools. (Thus the "default".)
>>>
>>>
>>> I've done this, however the behavior is the same:
>>>
>>> $ for f in `ceph osd lspools | sed 's/[0-9]//g' | sed 's/,//g'`; do ceph osd pool set $f min_size 1; done
>>> set pool 0 min_size to 1
>>> set pool 1 min_size to 1
>>> set pool 2 min_size to 1
>>> set pool 3 min_size to 1
>>> set pool 4 min_size to 1
>>> set pool 5 min_size to 1
>>> set pool 6 min_size to 1
>>> set pool 7 min_size to 1
>>>
>>> $ ceph -w
>>>     cluster db460aa2-5129-4aaa-8b2e-43eac727124e
>>>      health HEALTH_WARN 1 mons down, quorum 0,1 ceph-node-1,ceph-node-2
>>>      monmap e3: 3 mons at {ceph-node-1=192.168.122.121:6789/0,ceph-node-2=192.168.122.131:6789/0,ceph-node-3=192.168.122.141:6789/0},
>>>             election epoch 194, quorum 0,1 ceph-node-1,ceph-node-2
>>>      mdsmap e94: 1/1/1 up {0=ceph-node-1=up:active}
>>>      osdmap e362: 3 osds: 2 up, 2 in
>>>       pgmap v5913: 840 pgs, 8 pools, 7441 MB data, 994 objects
>>>             25329 MB used, 12649 MB / 40059 MB avail
>>>                  840 active+clean
>>>
>>> 2015-03-26 17:23:56.009938 mon.0 [INF] pgmap v5913: 840 pgs: 840 active+clean; 7441 MB data, 25329 MB used, 12649 MB / 40059 MB avail
>>> 2015-03-26 17:25:51.042802 mon.0 [INF] pgmap v5914: 840 pgs: 840 active+clean; 7441 MB data, 25329 MB used, 12649 MB / 40059 MB avail; 0 B/s rd, 260 kB/s wr, 13 op/s
>>> 2015-03-26 17:25:56.046491 mon.0 [INF] pgmap v5915: 840 pgs: 840 active+clean; 7441 MB data, 25333 MB used, 12645 MB / 40059 MB avail; 0 B/s rd, 943 kB/s wr, 38 op/s
>>> 2015-03-26 17:26:01.058167 mon.0 [INF] pgmap v5916: 840 pgs: 840 active+clean; 7441 MB data, 25335 MB used, 12643 MB / 40059 MB avail; 0 B/s rd, 10699 kB/s wr, 621 op/s
>>>
>>> <this is where I kill the second OSD>
>>>
>>> 2015-03-26 17:26:26.778461 7f4ebeffd700  0 monclient: hunting for new mon
>>> 2015-03-26 17:26:30.701099 7f4ec45f5700  0 -- 192.168.122.111:0/1007741 >> 192.168.122.141:6789/0 pipe(0x7f4ec0023200 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f4ec0023490).fault
>>> 2015-03-26 17:26:42.701154 7f4ec44f4700  0 -- 192.168.122.111:0/1007741 >> 192.168.122.131:6789/0 pipe(0x7f4ec00251b0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f4ec0025440).fault
>>>
>>> And all writes block until I bring back an OSD.
>>>
>>> Lee

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
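For completeness, the effect of those per-pool min_size commands can be
checked directly; a minimal sketch, assuming a pool named rbd (the pool names
are not shown by name in the output above):

$ ceph osd pool get rbd min_size
min_size: 1

Note that min_size only controls how many replicas a PG needs in order to
accept IO; it cannot help once the monitors themselves have lost quorum, which
is why the writes above still block.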

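Related to the "ceph health starts to hang" point above: once quorum is lost,
commands that go through the cluster will hang, but a surviving monitor can
still be asked for its own view over its admin socket. A sketch, assuming the
monitor is named ceph-node-1, the default socket path is in use, and the
command is run on that monitor's host:

$ ceph daemon mon.ceph-node-1 mon_status
$ # equivalently, pointing at the socket directly:
$ ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node-1.asok mon_status

The reported state (probing, electing, leader, or peon) and the monmap it
returns make it easy to confirm that the monitor is merely stuck outside
quorum rather than down.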